Small correction or clarification: two (or more) operators connected by a
stream that is neither THREAD_LOCAL nor CONTAINER_LOCAL pass data through
a buffer server. The upstream operator (publisher) sends data to the
buffer server, and the buffer server sends data to the downstream
operator(s) (subscribers). The current implementation of the buffer
server makes no assumption that it resides in the same container as the
upstream operator, so the communication is implemented using a network
library (Netlet). At the same time, to avoid complications with fault
tolerance and high availability, the buffer server is always deployed
into the same container as the upstream operator, and its lifetime
matches that of the operator: a fault in the buffer server causes
redeployment of the upstream operator, and similarly a fault in the
upstream operator causes an automatic redeploy of the buffer server.
This leads to the first optimization, already implemented in the buffer
server: communication from the publisher (upstream operator) to the
buffer server goes over the local loopback (not an actual network device,
but a virtual adapter). We still pay the price of passing data through
multiple call stacks: from the publisher to the AbstractClient in Netlet
(connected via a CircularBuffer), from Netlet to the kernel TCP/IP stack
(which requires copying data from the JVM heap to direct buffers), and
back from the TCP/IP stack through Netlet to the buffer server.
I filed APEX-259 to see how much we can gain by further leveraging the
fact that the upstream operator resides in the same container as the
buffer server: data can be passed within the same process (instead of
over IPC), with BlockingQueue being one of the most efficient in-process
channels inside the JVM.
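To illustrate the idea, here is a minimal, hypothetical sketch (not the
actual Apex/Netlet API) of a publisher handing serialized tuples to a
buffer-server consumer through a bounded BlockingQueue instead of the
loopback socket. All class and variable names below are illustrative
assumptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: an in-process channel between the publisher
// (upstream operator) and the buffer server, replacing the loopback
// socket path described above.
public class QueueChannelSketch {
  public static void main(String[] args) throws InterruptedException {
    // A bounded queue gives backpressure, similar to a full socket
    // send buffer blocking the writer.
    final BlockingQueue<byte[]> channel = new ArrayBlockingQueue<>(1024);

    // Publisher thread: put() hands off the serialized tuple directly
    // on the JVM heap; no copy into kernel direct buffers is needed.
    Thread publisher = new Thread(() -> {
      for (int i = 0; i < 3; i++) {
        byte[] serializedTuple = ("tuple-" + i).getBytes();
        try {
          channel.put(serializedTuple);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    });
    publisher.start();
    publisher.join();

    // Buffer-server side: drain tuples without touching the TCP/IP stack.
    byte[] tuple;
    while ((tuple = channel.poll()) != null) {
      System.out.println(new String(tuple));
    }
  }
}
```

The queue keeps ordering and blocking semantics comparable to the socket
path while avoiding the heap-to-direct-buffer copies, which is the
hypothesis APEX-259 sets out to measure.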
Thank you,
Vlad
On 11/13/15 11:19, Isha Arkatkar wrote:
Hi all,
For APEX-259 (https://malhar.atlassian.net/browse/APEX-259), I am
exploring the option of passing serialized tuples from the publisher to
the buffer server through a blocking queue.
Right now, the publisher and buffer server reside within the same
container; however, communication between the two goes through sockets.
We want to check whether we get any performance benefit by changing this
communication to a queue-based one.
This is in the exploration phase right now, but if we do see an
improvement, we may want to provide it as a pluggable option.
Please let me know your thoughts!
Thanks,
Isha