Re: Bulk throughput issues

Geet Gangwar Thu, 02 Jan 2014 02:52:05 -0800

Hi tdjb,

I am also working on a similar task: I need to store around 70 
million documents on a single node. I have an 8 GB, 8-core machine. Please 
advise on what approach I should take.


Could you also share some code samples showing how to use the BulkProcessor?

Regards

Geet

On Thursday, January 2, 2014 5:08:57 AM UTC+5:30, tdjb wrote:
>
> Hmm, ok, thank you for that info Jörg. I had previously been using one 
> client with 64 concurrent requests as the hardware we are running on has 32 
> cores. It sounds like I might need to try bumping that number up to see 
> what happens.
>
> On Wednesday, January 1, 2014 5:27:40 AM UTC-7, Jörg Prante wrote:
>>
>> There is no need for more than one client instance per JVM. You can 
>> increase the bulk request concurrency in the BulkProcessor with 
>> "setConcurrentRequests" to avoid blocking threads, until you reach the 
>> sweet spot where client submitting resources matches the indexing capacity 
>> of the cluster. 
>>
>> This is a matter of dynamic balance, which is different from setup to 
>> setup. The default request concurrency is 1. For a higher value, you have 
>> to prepare enough heap resources and maybe run your doc construction in 
>> multiple threads to exploit the advantages.
>>
>> As a rule of thumb, use 4 * available cores for the concurrency, and 
>> ~1-10MB for the bulk size.
>>
>> For example, I often operate with a bulk size of 1000 docs and a 
>> concurrency level of 32.
>>
>> Jörg
>>
>>
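To make the rule of thumb quoted above concrete, here is a minimal sizing sketch in plain Java. It only does the arithmetic; it does not talk to Elasticsearch. The class BulkSizing and its method names are hypothetical helpers invented for this illustration, and the 8 cores, 70 million documents, 1000 docs per bulk and 10 MB per bulk are simply the figures mentioned in this thread, not tuned recommendations.

```java
// Sizing sketch only. BulkSizing and its methods are hypothetical helpers,
// not part of the Elasticsearch client API.
final class BulkSizing {

    // Rule of thumb from the thread: concurrency = 4 * available cores.
    static int concurrencyFor(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        return 4 * cores;
    }

    // Number of bulk requests needed to send totalDocs in batches of
    // docsPerBulk (ceiling division, so a partial last batch counts).
    static long bulkRequestsFor(long totalDocs, int docsPerBulk) {
        if (totalDocs < 0 || docsPerBulk < 1) {
            throw new IllegalArgumentException("invalid document or batch count");
        }
        return (totalDocs + docsPerBulk - 1) / docsPerBulk;
    }

    // Rough bound on request payload held in memory at once, assuming every
    // concurrent bulk request is full. Real heap use will be higher.
    static long approxInFlightBytes(int concurrency, long bytesPerBulk) {
        return (long) concurrency * bytesPerBulk;
    }

    public static void main(String[] args) {
        int cores = 8;                       // machine described in this thread
        long totalDocs = 70_000_000L;        // documents to index
        int docsPerBulk = 1000;              // bulk size in the quoted example
        long bytesPerBulk = 10L * 1024 * 1024; // upper end of the ~1-10MB range

        int concurrency = concurrencyFor(cores);
        System.out.println("concurrent requests: " + concurrency);
        System.out.println("bulk requests needed: "
                + bulkRequestsFor(totalDocs, docsPerBulk));
        System.out.println("approx. in-flight payload bytes: "
                + approxInFlightBytes(concurrency, bytesPerBulk));
    }
}
```

The concurrency figure is what would be passed to "setConcurrentRequests" on the BulkProcessor builder, and the docs-per-bulk figure to the builder's bulk actions setting. The in-flight estimate is there because of the remark about preparing enough heap: on an 8 GB machine that also hosts the node itself, a lower concurrency or a smaller bulk size may well be the better starting point.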

