Pavel Vazharov created TS-3744:
----------------------------------

             Summary: Crash (Seg Fault) when reenabling a VIO from a 
continuation which is different from the VIO's continuation.
                 Key: TS-3744
                 URL: https://issues.apache.org/jira/browse/TS-3744
             Project: Traffic Server
          Issue Type: Bug
            Reporter: Pavel Vazharov


Hi,

I'm trying to create an ATS plugin that uses the cache write API 
(TSCacheWrite, TSVConnWrite). For the write part, from a transformation, I'm 
trying to stream the data to both the client and the cache at the same time. 
The problem described below can, in my opinion, be summarized as: a crash when 
reenabling a VIO from a continuation which is different from the VIO's 
continuation.
Here is the backtrace of the crash. 

traffic_server: Segmentation fault (Address not mapped to object [0x28])
traffic_server - STACK TRACE: 
/usr/local/bin/traffic_server(_Z19crash_logger_invokeiP9siginfo_tPv+0x8e)[0x4ad13e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x2b9092c2a340]
/usr/local/bin/traffic_server(_ZN7CacheVC8reenableEP3VIO+0x28)[0x6db868]
/home/freak82/ats/src/plugins/ccontent/ccontent.so(+0x29e5)[0x2b9096bce9e5]
/home/freak82/ats/src/plugins/ccontent/ccontent.so(+0x3094)[0x2b9096bcf094]
/usr/local/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x120)[0x767ea0]
/usr/local/bin/traffic_server(_ZN7EThread7executeEv+0x81b)[0x768aab]
/usr/local/bin/traffic_server(main+0xee6)[0x495436]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2b909387dec5]
/usr/local/bin/traffic_server[0x49ba6f]

It looks like the VIO's mutex->thread_holding, or the VIO object itself, is in 
an invalid or inconsistent state. The VIO has the same memory address as the 
originally created one, and its continuation has not been explicitly destroyed 
(TSContDestroy). The associated buffer and reader are also still alive.

I'm not sure whether what I'm trying to do (writing "in parallel") is possible 
with the current API by design. Is it possible/allowed by design to copy bytes 
into one VIO's buffer and then reenable that VIO from another continuation, 
i.e. not the continuation the VIO was created with?
If it is possible, am I doing something wrong, or is this a bug?

Basically, I'm trying to do it in the following way. The explanation skips the 
error handling.
1. On transformation start, on the first EVENT_IMMEDIATE from the upstream, the 
code initializes the client stream (TSIOBuffer, TSIOBufferReader and TSVIO, as 
in the null-transform plugin) and then starts the cache write (TSCacheWrite) 
with a created and digested cache key (TSCacheKey).
2. On EVENT_CACHE_OPEN_WRITE, the code initializes the cache stream 
(TSIOBuffer, TSIOBufferReader and TSVIO) in the same way as the client stream, 
but using the TSCont and TSVConn passed with the event data. So far, this works 
as expected. (A sketch of this setup follows after the list.)
3. Both continuation callbacks, the one for the transformation and the one for 
the cache write, handle the WRITE_READY and WRITE_COMPLETE events. The 
transformation callback also handles EVENT_IMMEDIATE, to know when there is 
more data from the upstream.
The idea is to mark each stream as ready when the corresponding callback 
receives WRITE_READY; once both streams are ready, copy the available data from 
the upstream to them, then reenable both streams and the upstream. Then, when 
new data becomes available from the upstream, copy it again once both streams 
are ready, and so on.
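Here is a minimal sketch of how steps 1 and 2 are wired up. The names 
(stream_data, start_streams, cache_cont_handler, key_str) are hypothetical and 
all error handling is omitted; this is just the pattern described above, not 
the actual plugin code.

#include <stdint.h>
#include <string.h>
#include <ts/ts.h>

typedef struct {
  TSVIO            client_vio;   /* write VIO towards the client/downstream */
  TSIOBuffer       client_buf;
  TSIOBufferReader client_rdr;
  TSVIO            cache_vio;    /* write VIO towards the cache */
  TSIOBuffer       cache_buf;
  TSIOBufferReader cache_rdr;
  TSVConn          cache_vc;
  int              client_ready; /* set on WRITE_READY for the client stream */
  int              cache_ready;  /* set on WRITE_READY for the cache stream */
} stream_data;

static int cache_cont_handler(TSCont cont, TSEvent event, void *edata);

/* Step 1: called from the transform handler on the first TS_EVENT_IMMEDIATE. */
static void
start_streams(TSCont transform_cont, stream_data *d, const char *key_str)
{
  TSVConn output    = TSTransformOutputVConnGet(transform_cont);
  TSVIO   input_vio = TSVConnWriteVIOGet(transform_cont);

  /* Client stream, set up as in the null-transform example plugin. */
  d->client_buf = TSIOBufferCreate();
  d->client_rdr = TSIOBufferReaderAlloc(d->client_buf);
  d->client_vio = TSVConnWrite(output, transform_cont, d->client_rdr,
                               TSVIONBytesGet(input_vio));

  /* Start the cache write with a created and digested cache key. */
  TSCacheKey key = TSCacheKeyCreate();
  TSCacheKeyDigestSet(key, key_str, (int)strlen(key_str));
  TSCont cache_cont = TSContCreate(cache_cont_handler, TSMutexCreate());
  TSContDataSet(cache_cont, d);
  TSCacheWrite(cache_cont, key);
}

/* Step 2: TS_EVENT_CACHE_OPEN_WRITE delivers the cache TSVConn. */
static int
cache_cont_handler(TSCont cont, TSEvent event, void *edata)
{
  stream_data *d = (stream_data *)TSContDataGet(cont);

  switch (event) {
  case TS_EVENT_CACHE_OPEN_WRITE:
    d->cache_vc  = (TSVConn)edata;
    d->cache_buf = TSIOBufferCreate();
    d->cache_rdr = TSIOBufferReaderAlloc(d->cache_buf);
    /* Total length not known yet in this sketch, hence INT64_MAX. */
    d->cache_vio = TSVConnWrite(d->cache_vc, cont, d->cache_rdr, INT64_MAX);
    break;
  case TS_EVENT_VCONN_WRITE_READY:
    d->cache_ready = 1; /* step 3: mark the cache stream as ready */
    break;
  default:
    break;
  }
  return 0;
}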

Usually the first writes/copies and reenables happen from inside the 
TSCacheWrite call itself, because it is reentrant and generates WRITE_READY for 
the cache continuation. These operations succeed. The problem is that the 
plugin crashes ATS when it tries to reenable the cache VIO from inside the 
transform continuation (see the sketch below).
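A sketch of that step-3 logic, using the same hypothetical stream_data struct 
as above; the marked TSVIOReenable(d->cache_vio) call, made from the transform 
continuation, is where the CacheVC::reenable segfault from the backtrace is 
triggered.

/* Hypothetical helper, called from the transform continuation once both
 * streams have reported WRITE_READY. */
static void
copy_if_both_ready(TSCont transform_cont, stream_data *d)
{
  if (!d->client_ready || !d->cache_ready) {
    return;
  }

  TSVIO   input_vio = TSVConnWriteVIOGet(transform_cont);
  int64_t avail     = TSIOBufferReaderAvail(TSVIOReaderGet(input_vio));

  if (avail > 0) {
    /* Copy the available upstream data into both output buffers. */
    TSIOBufferCopy(d->client_buf, TSVIOReaderGet(input_vio), avail, 0);
    TSIOBufferCopy(d->cache_buf,  TSVIOReaderGet(input_vio), avail, 0);
    TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), avail);
    TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + avail);

    d->client_ready = 0;
    d->cache_ready  = 0;

    TSVIOReenable(d->client_vio);
    TSVIOReenable(d->cache_vio); /* <-- reenable from the transform
                                    continuation: CacheVC::reenable
                                    segfaults here */
    TSVIOReenable(input_vio);    /* ask the upstream for more data */
  }
}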

I also tried passing the whole body from the upstream to the client first, 
copying it (TSIOBufferCopy) "in parallel" into a temporary buffer, then 
initiating the cache write at the end of the transformation and writing the 
data from that buffer to the cache VIO (similarly to the metalink plugin). This 
also works as expected (sketched below).
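For reference, a sketch of that working fallback, assuming the stream_data 
struct above is extended with a hold_buf/hold_rdr pair that was filled with 
TSIOBufferCopy while the transform was streaming to the client. Here the cache 
VIO is created and reenabled only from its own continuation, and it does not 
crash. Names are again hypothetical.

static int
cache_write_handler(TSCont cont, TSEvent event, void *edata)
{
  stream_data *d = (stream_data *)TSContDataGet(cont);

  switch (event) {
  case TS_EVENT_CACHE_OPEN_WRITE:
    /* The whole body is already buffered in hold_buf at this point. */
    d->cache_vc  = (TSVConn)edata;
    d->cache_vio = TSVConnWrite(d->cache_vc, cont, d->hold_rdr,
                                TSIOBufferReaderAvail(d->hold_rdr));
    break;
  case TS_EVENT_VCONN_WRITE_READY:
    TSVIOReenable(d->cache_vio); /* reenabled from the cache continuation only */
    break;
  case TS_EVENT_VCONN_WRITE_COMPLETE:
    TSVConnClose(d->cache_vc);
    break;
  default:
    break;
  }
  return 0;
}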

Thanks,
Pavel.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
