Yes, I can push more than 10k docs per second (about 10 MB/sec) with the
Java bulk API, on a single node.

How many docs can your app generate if you disable the bulk indexing and
run a "dry feed"?
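A dry feed could be measured with something like the sketch below. It only calls the document generator and counts documents and bytes, without talking to Elasticsearch at all. The class name `DryFeed`, the `Supplier<String>` source and the sample document are placeholders for illustration (Java 8 or later is assumed); plug in whatever your app uses to build its JSON.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

// Hypothetical helper: measures how fast the app can generate docs
// when nothing is sent to the cluster.
final class DryFeed {

    /** Totals from one dry run. */
    static final class Result {
        final long docs;
        final long bytes;
        final double seconds;

        Result(long docs, long bytes, double seconds) {
            this.docs = docs;
            this.bytes = bytes;
            this.seconds = seconds;
        }

        double docsPerSecond() { return docs / seconds; }
        double megabytesPerSecond() { return bytes / 1_000_000.0 / seconds; }
    }

    /** Pulls `count` docs from the source and sums their UTF-8 size. */
    static Result run(Supplier<String> source, int count) {
        long bytes = 0;
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            // Encoding the doc also keeps the JIT from skipping the work.
            bytes += source.get().getBytes(StandardCharsets.UTF_8).length;
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return new Result(count, bytes, elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        long[] id = {0};
        // Placeholder generator; replace with the real document builder.
        Supplier<String> source =
                () -> "{\"id\":" + (id[0]++) + ",\"title\":\"example\"}";
        Result r = run(source, 1_000_000);
        System.out.printf("%.0f docs/sec, %.1f MB/sec%n",
                r.docsPerSecond(), r.megabytesPerSecond());
    }
}
```

If the dry feed is not clearly faster than what you see with bulk indexing enabled, the bottleneck is in the client, not in the cluster.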

Whether 7k per sec is good also depends on doc size: the larger the docs,
the slower the indexing. Have you checked how much network bandwidth is in
use between client and servers, and between servers? Maybe it is saturated.
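A rough way to check the bandwidth question on paper is sketched below. The average doc size of 1000 bytes is an assumption (it matches 10k docs at about 10 MB/sec), and the 1000 Mbit/s link speed is only an example value; substitute your own numbers. The estimate covers payload only, not protocol overhead, and replicas add to the traffic between servers.

```java
// Hypothetical helper for a back-of-the-envelope bandwidth estimate.
final class BandwidthCheck {

    /** Payload bandwidth in Mbit/s for a given feed rate and doc size. */
    static double requiredMbitPerSec(double docsPerSec, double avgDocBytes) {
        return docsPerSec * avgDocBytes * 8 / 1_000_000.0;
    }

    /** Fraction of the link that the payload alone would occupy. */
    static double utilization(double requiredMbit, double linkMbit) {
        return requiredMbit / linkMbit;
    }

    public static void main(String[] args) {
        double docsPerSec = 7_000;   // rate mentioned in this thread
        double avgDocBytes = 1_000;  // assumption, measure your own docs
        double linkMbit = 1_000;     // assumption, e.g. gigabit Ethernet

        double needed = requiredMbitPerSec(docsPerSec, avgDocBytes);
        System.out.printf("payload: %.1f Mbit/s, link utilization: %.1f%%%n",
                needed, 100 * utilization(needed, linkMbit));
    }
}
```

With these example numbers the payload is about 56 Mbit/s, which would be far from saturating a gigabit link but more than half of a 100 Mbit one.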

Here is a small checklist for bulk:

- most important: disable the refresh interval (index.refresh_interval set to -1)

- shard number should be reasonable (for four nodes, maybe four or eight
shards to distribute the load evenly)

- replica level should be 0 during bulk

- the Java client should connect to all nodes (I prefer TransportClient)

- if you have predefined mappings, create the index and the mappings before
the bulk starts; this saves some of the overhead of dynamic mapping

- after bulk, re-enable the refresh interval and increase the replica level
(and maybe send an optimize request)
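The settings part of the checklist can be sketched as the JSON bodies below, built with plain Java strings so the example has no dependencies. The class and method names are made up for illustration, and the values (four shards, "1s" refresh, one replica) are example choices, not recommendations for your cluster. The setting names `index.refresh_interval`, `index.number_of_shards` and `index.number_of_replicas` are the standard index settings.

```java
// Hypothetical helper: JSON bodies for the index settings around a bulk load.
final class BulkIndexSettings {

    /** Body for creating the index up front, with mappings and bulk-friendly settings. */
    static String createIndex(int shards, String mappingsJson) {
        return "{\"settings\":{\"index\":{"
                + "\"number_of_shards\":" + shards + ","
                + "\"number_of_replicas\":0,"
                + "\"refresh_interval\":\"-1\"}},"
                + "\"mappings\":" + mappingsJson + "}";
    }

    /** Body for an existing index: no refresh, no replicas during bulk. */
    static String beforeBulk() {
        return "{\"index\":{\"refresh_interval\":\"-1\",\"number_of_replicas\":0}}";
    }

    /** Body to restore refresh and replicas once the bulk load is done. */
    static String afterBulk(String refreshInterval, int replicas) {
        // No escaping is done here; pass only simple values such as "1s".
        return "{\"index\":{\"refresh_interval\":\"" + refreshInterval
                + "\",\"number_of_replicas\":" + replicas + "}}";
    }

    public static void main(String[] args) {
        System.out.println(createIndex(4, "{}"));
        System.out.println(beforeBulk());
        System.out.println(afterBulk("1s", 1));
    }
}
```

Over REST, the create body would go in a PUT to the index URL, and the other two in a PUT to its `_settings` endpoint; the optimize request is a POST to `_optimize` on the index. The same settings can be applied through the Java client instead.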

Some more tunables exist for advanced usage. I'm quite sure you do not need
to modify the advanced settings, since with 32 cores, ES selects reasonable
thread pool sizes.

I recommend moving from 0.90.5 to 0.90.7 or 0.90.8.

Jörg
