Hi tdjb, I am also working on a similar task. My requirement is to store around 70 million documents on a single node. I have an 8 GB, 8-core machine. Please advise how I should approach this.
Could you also share some code samples showing how to use the BulkProcessor?

Regards,
Geet

On Thursday, January 2, 2014 5:08:57 AM UTC+5:30, tdjb wrote:
> Hmm, ok, thank you for that info Jörg. I had previously been using one client with 64 concurrent requests as the hardware we are running on has 32 cores. It sounds like I might need to try bumping that number up to see what happens.
>
> On Wednesday, January 1, 2014 5:27:40 AM UTC-7, Jörg Prante wrote:
>> There is no need for more than one client instance per JVM. You can increase the bulk request concurrency in the BulkProcessor with "setConcurrentRequests" to avoid blocking threads, until you reach the sweet spot where client submitting resources matches the indexing capacity of the cluster.
>>
>> This is a matter of dynamic balance, which is different from setup to setup. The default request concurrency is 1. For a higher value, you have to prepare enough heap resources and maybe run your doc construction in multiple threads to exploit the advantages.
>>
>> As a rule of thumb, use 4 * available cores for the concurrency, and ~1-10MB for the bulk size.
>>
>> For example, I often operate with a bulk size of 1000 docs and a concurrency level of 32.
>>
>> Jörg
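To make the tuning advice quoted above concrete, here is a minimal sketch of the mechanism, written with only the Java standard library so it can be run without a cluster. It is not the Elasticsearch API: `BulkSketch`, `concurrencyFor`, and the `sender` callback are hypothetical names invented for this illustration. In the real Java client, the corresponding knobs are on the builder returned by `BulkProcessor.builder(client, listener)`, namely `setBulkActions`, `setBulkSize`, and `setConcurrentRequests`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical stand-in for a bulk indexer; this is NOT the Elasticsearch BulkProcessor.
class BulkSketch {
    // Rule of thumb from the thread: 4 * available cores.
    static int concurrencyFor(int cores) {
        return 4 * cores;
    }

    private final int bulkActions;                 // docs per bulk request
    private final Semaphore inFlight;              // caps concurrent bulk requests
    private final ExecutorService pool;
    private final Consumer<List<String>> sender;   // assumed callback that ships one batch
    private List<String> buffer = new ArrayList<>();

    BulkSketch(int bulkActions, int concurrentRequests, Consumer<List<String>> sender) {
        this.bulkActions = bulkActions;
        this.inFlight = new Semaphore(concurrentRequests);
        this.pool = Executors.newFixedThreadPool(concurrentRequests);
        this.sender = sender;
    }

    // Buffers one document and sends a batch once bulkActions docs have accumulated.
    synchronized void add(String doc) {
        buffer.add(doc);
        if (buffer.size() >= bulkActions) {
            flush();
        }
    }

    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<String> batch = buffer;
        buffer = new ArrayList<>();
        // Blocks the caller when every request slot is busy (back-pressure).
        inFlight.acquireUninterruptibly();
        pool.execute(() -> {
            try {
                sender.accept(batch);
            } finally {
                inFlight.release();
            }
        });
    }

    // Sends the remaining partial batch and waits for in-flight batches to finish.
    synchronized void close() {
        flush();
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AtomicInteger batches = new AtomicInteger();
        AtomicInteger docs = new AtomicInteger();
        int concurrency = concurrencyFor(Runtime.getRuntime().availableProcessors());
        BulkSketch bulk = new BulkSketch(1000, concurrency, batch -> {
            batches.incrementAndGet();
            docs.addAndGet(batch.size());
        });
        for (int i = 0; i < 2500; i++) {
            bulk.add("{\"id\":" + i + "}");
        }
        bulk.close();
        System.out.println(batches.get() + " batches, " + docs.get() + " docs");
    }
}
```

For an 8-core machine the rule of thumb gives a concurrency of 32, but that is only a starting point. With 8 GB of RAM on a single node, heap is likely the tighter constraint, so it is probably worth starting lower and increasing the value while watching indexing throughput.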
