Have you tried any of the Elasticsearch health monitoring plugins?
For example, 'ElasticSearch HQ' has a 'Node Diagnostics' option that
points out the weak spots of your cluster and suggests possible solutions (very
useful if you are just starting your adventure with Elasticsearch).
'bigdesk' is also very good for real-time monitoring.
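
If you would rather not install a plugin, a rough equivalent of the heap check can be scripted against the node stats API. The sketch below is only an illustration: the endpoint path `/_nodes/stats/jvm`, the `localhost:9200` address, and the field names `heap_used_in_bytes` / `heap_max_in_bytes` are assumptions based on 1.x-era responses and may differ on your version, and `fetch_node_stats` and `heap_report` are made-up helper names.

```python
import json
import urllib.request


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes (endpoint path is an assumption)."""
    with urllib.request.urlopen(base_url + "/_nodes/stats/jvm") as resp:
        return json.loads(resp.read().decode("utf-8"))


def heap_report(node_stats):
    """Return (node name, heap used %) pairs, fullest heap first."""
    report = []
    for node in node_stats["nodes"].values():
        mem = node["jvm"]["mem"]
        pct = 100.0 * mem["heap_used_in_bytes"] / mem["heap_max_in_bytes"]
        report.append((node["name"], round(pct, 1)))
    return sorted(report, key=lambda item: item[1], reverse=True)


# Hand-written sample shaped like the assumed response, for illustration only.
sample = {
    "nodes": {
        "a1": {"name": "node-1", "jvm": {"mem": {
            "heap_used_in_bytes": 4294967296,
            "heap_max_in_bytes": 8589934592}}},
        "b2": {"name": "node-2", "jvm": {"mem": {
            "heap_used_in_bytes": 6442450944,
            "heap_max_in_bytes": 8589934592}}},
    }
}

for name, pct in heap_report(sample):
    print(name, pct)
```

Against a live cluster you would call `heap_report(fetch_node_stats())` instead of using the sample. A node that stays close to 100% between collections is the one to look at first.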

Do you have a parent/child relationship configured on your documents?
It is quite often the cause of high heap usage (and, as a consequence, heavy GC'ing).
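
One quick way to answer the parent/child question is to look for a `_parent` entry in the type mappings of the index. The sketch below assumes you have already loaded the per-type mappings of a single index into a dict (for example from the JSON returned by the get-mapping API, whose outer nesting differs between versions); `child_types` and the sample mapping are hypothetical.

```python
def child_types(mappings):
    """Map each child type to its parent type, given {type_name: mapping}."""
    return {
        type_name: body["_parent"]["type"]
        for type_name, body in mappings.items()
        if "_parent" in body
    }


# Hand-written sample: 'comment' documents are children of 'post' documents.
sample_mappings = {
    "post": {"properties": {"title": {"type": "string"}}},
    "comment": {
        "_parent": {"type": "post"},
        "properties": {"text": {"type": "string"}},
    },
}

print(child_types(sample_mappings))
```

An empty result means no type declares a parent, so parent/child is unlikely to be what is filling the heap.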


On Monday, January 13, 2014 12:22:47 PM UTC, Eric Lu wrote:
>
> Hi, guys
> I'm using Elasticsearch to index a large number of documents. Each document 
> is about 0.5KB. 
> My Elasticsearch cluster has 5 nodes (all data nodes). Each node is 
> running Oracle Java 1.7.0_13 and has 16GB RAM, with 8GB 
> allocated to the JVM. The index has 50 shards and 1 replica.
> I set the bulk thread pool to size: 30 and queue: 1000.
> I use one thread to index documents in bulk; the bulk size is 1000.
> In the beginning, the performance is very good: it can index about 10 
> million documents per hour. But as the number of indexed documents grows, 
> it slows down. When the cluster had 500 million documents indexed, I noticed 
> that it took about 12 hours to index 10 million documents.
>
> Is this normal? Or what is the bottleneck that is throttling it?
>
> Any help?
>
> Regards
> Eric
>
>
