Hello everyone, I wanted to know whether it is possible to index docs through a stream that pushes data to the Elasticsearch cluster.
Our current problem is indexing a huge dataset from Postgres into Elasticsearch while processing the data in between. We are able to stream data out of Postgres, which lets our Ruby code run in constant memory, but there is a significant delay when posting the docs in batches to ES through the Bulk API. It would be ideal if there were a mechanism to push our docs continuously into the ES cluster, removing the bottleneck the bulk call currently creates. I would also have liked to post the batches of docs from a separate thread, but that would create memory issues, so I thought streaming would be a good alternative.

I apologise that this is a somewhat subjective question, but please do ask for the code if you want to know something.
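For context, here is a minimal sketch of the approach we are describing, using only Ruby's standard library (JSON, Net::HTTP). The index name, batch size, document shape, and cluster URL are illustrative assumptions, not our actual code; a real setup would more likely use the elasticsearch-ruby client. The point is that `each_slice` over an enumerator consumes the Postgres stream incrementally, so only one batch lives in memory at a time:

```ruby
require "json"
require "net/http"

# Assumed values for illustration only.
INDEX = "my_index"
BATCH_SIZE = 500

# Build a newline-delimited (NDJSON) Bulk API payload for one batch.
# Each doc becomes an action line plus a source line.
def bulk_payload(docs)
  docs.flat_map { |doc|
    [{ index: { _index: INDEX, _id: doc[:id] } }.to_json, doc.to_json]
  }.join("\n") + "\n" # Bulk API requires a trailing newline
end

# doc_stream is any Enumerator yielding one doc hash at a time
# (e.g. wrapped around a Postgres cursor). each_slice with a block
# pulls from the stream lazily, so memory stays constant per batch.
def index_stream(doc_stream, es_url: "http://localhost:9200")
  doc_stream.each_slice(BATCH_SIZE) do |batch|
    uri = URI("#{es_url}/_bulk")
    req = Net::HTTP::Post.new(uri, "Content-Type" => "application/x-ndjson")
    req.body = bulk_payload(batch)
    Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
  end
end
```

This doesn't eliminate the per-request latency of the bulk call, but it does overlap reading from Postgres with payload construction and keeps memory bounded by the batch size.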
