You are correct: ES nodes consume data request by request before it is passed on through the cluster. The same holds for bulk indexing requests. Such requests are temporarily held in buffers, but they are split by lines and executed as single actions.
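To make the line-by-line handling concrete, here is a minimal Python sketch of how a bulk body is laid out and how the per-action results can be checked. It does not talk to a cluster. The helper names `build_bulk_body` and `failed_items`, the index name, and the sample response are made up for illustration; the optional `_type` field is an assumption that depends on your ES version.

```python
import json


def build_bulk_body(index, docs, doc_type=None):
    """Serialize (doc_id, source) pairs into a newline-delimited bulk body.

    Hypothetical helper: every document becomes two lines, an action line
    followed by a source line, which is why ES can split the request by
    lines and execute each action on its own.
    """
    lines = []
    for doc_id, source in docs:
        meta = {"_index": index, "_id": doc_id}
        if doc_type is not None:
            # Assumption: only older ES versions expect a type here.
            meta["_type"] = doc_type
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(source))           # source line, one line only
    # The body has to end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(bulk_response):
    """Return (doc_id, status, error) for every action that did not succeed."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a one-key dict keyed by the action name, e.g. "index".
        for result in item.values():
            status = result.get("status")
            has_bad_status = status is not None and status >= 300
            if "error" in result or has_bad_status:
                failures.append((result.get("_id"), status, result.get("error")))
    return failures


body = build_bulk_body("logs", [("1", {"msg": "a"}), ("2", {"msg": "b"})])

# Made-up response in the shape the bulk API returns: one entry per action.
sample_response = {
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"index": {"_id": "2", "status": 409, "error": "version conflict"}},
    ]
}
failures = failed_items(sample_response)
```

The body would be sent as a single POST to the `_bulk` endpoint. Because every action is executed individually, an HTTP 200 for the whole request does not mean every document was indexed, so the per-item check is what tells you which documents need a retry.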
So to reduce network roundtrips, the best thing is to use the bulk API. What is left to optimize is a few percent, which is hardly worth the effort. With gzip, ES HTTP provides transparent compression. The main challenges are HTTP overhead (headers can't be compressed) and base64, if you use binary data with ES.

Please note that you must evaluate the bulk responses too, in order to verify on the document level that each bulk action succeeded.

It is possible to extend the whole ES API to Websocket as well, so it would be possible to transfer JSON text frames or SMILE/binary frames on a single bidirectional channel. HTTP must use two channels for this, so with Websocket you can halve the connection resources. In this sense, the Netty channel / REST / Java API could be extended for special realtime WS streaming applications, such as pubsub applications. I experimented with that some time ago on ES 0.20:

https://github.com/jprante/elasticsearch-transport-websocket (needs updating)

From what I understand, the thrift transport plugin compiles the ES API, operates in a streaming-like fashion, and provides a solution that reduces HTTP overhead:

https://github.com/elasticsearch/elasticsearch-transport-thrift

Jörg
