Hello everyone,

I wanted to know whether it is possible to index documents through a stream 
that pushes data to the Elasticsearch cluster.

Our current problem is indexing a huge set of data from Postgres into 
Elasticsearch while processing the data in between. We have been able to 
stream data out of Postgres, which lets our Ruby code run in constant 
memory, but there is a significant delay when posting these docs in batches 
to ES through the Bulk API.
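For reference, each batch is serialized into the newline-delimited body the 
Bulk API expects: one action line plus one source line per document. A rough 
sketch of what we build per batch (the index name and doc shape here are 
made up, not our actual schema):

```ruby
require "json"

# Build the newline-delimited Bulk API body: for each document, an
# action line ({"index": ...}) followed by the document source, with
# a trailing newline as the API requires.
def bulk_body(index, docs)
  docs.flat_map { |doc|
    [{ index: { _index: index, _id: doc[:id] } }.to_json, doc.to_json]
  }.join("\n") + "\n"
end

payload = bulk_body("articles", [{ id: 1, title: "Hello" },
                                 { id: 2, title: "World" }])
```

The whole payload for a batch has to be materialized before the POST, which 
is where the per-batch delay shows up for us.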

I think it would be ideal if there were a mechanism to push our docs 
continuously into the ES cluster, reducing the bottleneck currently 
created by the bulk call.

Ideally, I would also have liked to post the batches of docs on a 
different thread, but that would create memory issues, so I thought 
streaming would be a good alternative.
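For what it's worth, the threaded variant I had in mind would use a bounded 
queue so memory stays constant: the Postgres stream blocks whenever the bulk 
poster falls behind. A rough sketch using Ruby's SizedQueue (the real 
`client.bulk` call is stubbed out with an array here):

```ruby
# Bounded producer/consumer: SizedQueue caps how many batches can sit
# in memory, so the producer blocks when the indexer thread lags.
queue = SizedQueue.new(4)   # at most 4 batches buffered at once
indexed = []

indexer = Thread.new do
  while (batch = queue.pop)        # nil signals end of stream
    indexed.concat(batch)          # real code: client.bulk(body: batch)
  end
end

# Simulated Postgres stream yielding batches of docs
10.times { |i| queue.push([{ id: i }]) }
queue.push(nil)
indexer.join
```

My worry was that without the size cap the queue grows unboundedly whenever 
ES is slower than Postgres, which is exactly the memory issue I mentioned.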

I apologise that this is somewhat of a subjective question, but please do 
ask for the code if you want to know something.
