Hi, I can index 70M small (~1 KB) records in 40 minutes.
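As a quick sanity check, the net rate implied by those figures works out like this (a back-of-the-envelope calculation, not from the original post):

```python
# Net indexing rate: 70 million documents over 40 minutes.
records = 70_000_000          # total documents indexed
seconds = 40 * 60             # 40 minutes in seconds
rate = records / seconds
print(round(rate))            # ~29167 records/s, i.e. roughly 30,000/s net
```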
Would that performance be good or bad?

Configuration: 6 Elasticsearch nodes, each with 16 GB of dedicated memory, each an 8-processor Intel Linux server. There are 6 clients, one running locally on each node (localhost), each using elasticsearch-py's helpers.bulk and spawning 8 client processes (48 processes total). The settings are index.store.type: memory, refresh_interval: 120s, and threadpool.bulk.queue_size: 200.

Marvel reports an index rate of up to 80,000 records per second, but in practice the net rate over the 40 minutes is closer to 30,000 records/s.

Given the hardware, my questions are: is this good, or should I expect faster? And what can be done to increase throughput? Throwing more clients at the servers does seem to drive up performance, but how do I measure where the bottleneck is?

Should I be concerned about the IOPS Marvel reports in the cluster summary?

node 1: 344
node 2: 466
node 3: 246
node 4: 261
node 5: 162
node 6: 93

Thanks.
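For anyone comparing client setups, a minimal sketch of the bulk-indexing side is below, assuming elasticsearch-py's helpers.bulk. The index name, document shape, `_type` value, and chunk_size are illustrative assumptions, not the poster's actual code; only the helpers.bulk call and the action dict format come from the elasticsearch-py API.

```python
def gen_actions(docs, index="records"):
    """Yield bulk actions in the dict format that elasticsearch-py's
    helpers.bulk expects. The index name and '_type' are assumptions
    (this post appears to be from the Elasticsearch 1.x / Marvel era,
    where documents still carried a type)."""
    for doc in docs:
        yield {"_index": index, "_type": "record", "_source": doc}


def run_bulk(hosts=("localhost:9200",), docs=()):
    """Sketch of one client's bulk loop; requires elasticsearch-py
    and a live cluster, so the import is kept local to this function.
    chunk_size controls how many docs go into each bulk request and
    should be tuned alongside threadpool.bulk.queue_size on the server."""
    from elasticsearch import Elasticsearch, helpers
    es = Elasticsearch(list(hosts))
    return helpers.bulk(es, gen_actions(docs), chunk_size=1000)
```

In the setup described above, each node would run several processes executing something like run_bulk against its local node, so the cluster sees many concurrent bulk requests.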
