You are correct: ES nodes consume data request by request before passing
it on through the cluster. The same holds for bulk indexing requests. They
are temporarily buffered, but then they are split line by line and executed
as single actions.
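
To illustrate what "split by lines" means, here is a minimal sketch in Python. It is not the ES implementation, only an illustration of the bulk body format, in which each action line is followed by a source line (except for delete). The function name split_bulk_body and the sample documents are made up for this example.

```python
import json


def split_bulk_body(body):
    """Yield (action, metadata, source) tuples from a bulk request body.

    Hypothetical helper for illustration only; ES does this internally
    in its own way.
    """
    # Skip blank lines, including the one after the trailing newline.
    lines = iter(line for line in body.split("\n") if line.strip())
    for action_line in lines:
        # Each action line is a JSON object with exactly one key,
        # e.g. {"index": {...}} or {"delete": {...}}.
        action, meta = next(iter(json.loads(action_line).items()))
        source = None
        if action != "delete":
            # Every action except delete carries a source line.
            source_line = next(lines, None)
            if source_line is None:
                raise ValueError("missing source line for %r action" % action)
            source = json.loads(source_line)
        yield action, meta, source


body = (
    '{"index": {"_index": "test", "_id": "1"}}\n'
    '{"field": "value1"}\n'
    '{"delete": {"_index": "test", "_id": "2"}}\n'
)
actions = list(split_bulk_body(body))
for action, meta, source in actions:
    print(action, meta, source)
```

So one HTTP request with N actions still ends up as N single actions on the node side; what you save is the roundtrips.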

So to reduce network roundtrips, the best thing is to use the bulk API.
What is left after that is a few percent to optimize, which is hardly worth
the effort. With gzip, ES HTTP provides transparent compression. The main
remaining costs are HTTP overhead (headers can't be compressed) and base64
encoding, if you use binary data with ES.
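
A rough way to get a feeling for these sizes is the following Python sketch, which uses only the standard library. The index name "test", the sample documents, and the helper build_bulk_body are assumptions for illustration; nothing is sent to a cluster here.

```python
import base64
import gzip
import json


def build_bulk_body(index, docs):
    """Build a newline-delimited bulk body from (doc_id, doc) pairs.

    Hypothetical helper; the index name and documents are made up.
    """
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


docs = [
    (str(i), {"title": "document %d" % i, "body": "lorem ipsum " * 20})
    for i in range(1000)
]
raw = build_bulk_body("test", docs)
compressed = gzip.compress(raw)
print("bulk body: %d bytes, gzipped: %d bytes" % (len(raw), len(compressed)))

# base64 turns every 3 bytes of binary data into 4 characters,
# so binary payloads grow by about one third before compression.
binary = bytes(3000)
encoded = base64.b64encode(binary)
print("binary: %d bytes, base64: %d bytes" % (len(binary), len(encoded)))
```

The actual ratio depends entirely on your documents; the repetitive sample text here compresses far better than real data usually does.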

Please note that you must evaluate the bulk responses too, in order to
verify on doc level that each action in the bulk actually succeeded.
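
As a sketch of such a check in Python: the code below assumes the bulk response has already been parsed from JSON into a dict, and that it has an "items" list with one entry per action, each carrying a "status" and, on failure, an "error". That shape varies between ES versions, so treat the field names and the sample response as assumptions; failed_items is a made-up helper name.

```python
def failed_items(bulk_response):
    """Return (position, action, doc_id, error) for every failed action.

    Assumes a parsed bulk response dict with an "items" list; the exact
    response shape depends on the ES version.
    """
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item has a single key naming the action (index, delete, ...).
        action, result = next(iter(item.items()))
        if result.get("status", 500) >= 300 or "error" in result:
            failures.append(
                (position, action, result.get("_id"), result.get("error"))
            )
    return failures


# Made-up sample response, for illustration only.
response = {
    "took": 5,
    "errors": True,
    "items": [
        {"index": {"_index": "test", "_id": "1", "status": 201}},
        {"index": {"_index": "test", "_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "failed to parse"}}},
        {"delete": {"_index": "test", "_id": "3", "status": 200}},
    ],
}
failures = failed_items(response)
for position, action, doc_id, error in failures:
    print("action #%d (%s, id=%s) failed: %s" % (position, action, doc_id, error))
```

The items come back in the same order as the actions in the request, so the position can be used to find the document that needs a retry.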

It is possible to extend the whole ES API to Websocket as well, so that
JSON text frames or SMILE/binary frames could be transferred on a single
bi-directional channel. HTTP must use two channels for this, so with
Websocket you can cut connection resources in half. In this sense, the
Netty channel / REST / Java API could be extended for special realtime WS
streaming mode applications, such as pubsub applications. I experimented
with that some time ago on ES 0.20:
https://github.com/jprante/elasticsearch-transport-websocket (needs
updating)

From what I understand, the thrift transport plugin compiles the ES API,
operates in a streaming-like fashion, and is providing a solution that
reduces HTTP overhead:
https://github.com/elasticsearch/elasticsearch-transport-thrift

Jörg
