You'll actually get better indexing performance if you leave refresh enabled, maybe at 5s. This is because ES has a separate refresh thread which will do the flushing, instead of having your bulk indexing threads do it when RAM is full, effectively giving you one more thread of concurrency.
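
For illustration only, a minimal sketch of how a 5s refresh interval might be applied through the index settings endpoint. The host `http://localhost:9200` and the index name `my_index` are placeholders, not values from this thread, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_refresh_request(base_url, index, interval="5s"):
    """Build (but do not send) a PUT request setting index.refresh_interval."""
    # Settings body for the index update-settings endpoint.
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host and index name (assumptions, not from the thread).
req = build_refresh_request("http://localhost:9200", "my_index")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running node; the same setting can instead be kept in the node configuration, as in the original post.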

Mike McCandless
http://blog.mikemccandless.com

On Mon, Jun 23, 2014 at 6:56 AM, [email protected] <[email protected]> wrote:

> Your bulk insert size is too large. It makes no sense to insert 100.000
> with one request. Use 1000-10000 instead.
>
> Also you should submit bulk requests in parallel and not sequentially like
> you do. Sequential bulk is slow if client CPU/network is not saturated.
>
> Check if you have disabled the index refresh from 1 (1s) to -1 while bulk
> indexing is active. 30s does not make much sense if you can execute the
> bulk in this time.
>
> Do not limit indexing memory to 50%.
>
> It makes no sense to increase queue_size for bulk thread pool to 1000.
> This means you want a single ES node to accept 1000 x 100000 =
> 100 000 000 = 100m docs at once. This will simply exceed all reasonable
> limits and bring the node down with an OOM (if you really have 100m docs).
>
> More advice is possible if you can show your client code how you push docs
> to ES.
>
> Jörg
>
> On Mon, Jun 23, 2014 at 12:30 PM, Frederic Esnault <[email protected]> wrote:
>
>> Hi everyone,
>>
>> I'm inserting around 265 000 documents into an elastic search cluster
>> composed of 3 nodes (real servers).
>> On two servers i give elastic search 20g of heap, on third one which has
>> 64g ram, i set 30g of heap for elastic search.
>>
>> I set elastic search configuration to :
>>
>> - 3 shards (1 per server)
>> - 0 replicas
>> - discovery.zen.ping.multicast.enabled: false (and giving on each node
>>   the unicast hostnames of the two other nodes);
>> - and this :
>>
>> indices.memory.index_buffer_size: 50%
>> index.refresh_interval: 30s
>> threadpool:
>>   index:
>>     type: fixed
>>     size: 30
>>     queue_size: 1000
>>   bulk:
>>     queue_size: 1000
>>   bulk:
>>     type: fixed
>>     size: 30
>>     queue_size: 1000
>>   search:
>>     type: fixed
>>     size: 100
>>     queue_size: 200
>>   get:
>>     type: fixed
>>     size: 100
>>     queue_size: 200
>>
>> Indexing is done by groups of 100 000 docs, and here is my application
>> log :
>>
>> INFO: Adding records to bulk insert batch
>> INFO: Added 100000 records to bulk insert batch. Inserting batch...
>> -- Bulk insert took 38.724 secondes
>> INFO: Adding records to bulk insert batch
>> INFO: Added 100000 records to bulk insert batch. Inserting batch...
>> -- Bulk insert took 31.134 secondes
>> INFO: Adding records to bulk insert batch
>> INFO: Added 64201 records to bulk insert batch. Inserting batch...
>> -- Bulk insert took 17.366 secondes
>>
>> --- Import CSV file took 108.905 secondes ---
>>
>> I'm wondering if this time is correct or not, or if there is something i
>> can do to improve performances ?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/3a38e79e-9afb-4146-a7e1-7984ec082e22%40googlegroups.com
>> <https://groups.google.com/d/msgid/elasticsearch/3a38e79e-9afb-4146-a7e1-7984ec082e22%40googlegroups.com?utm_medium=email&utm_source=footer>.
>> For more options, visit https://groups.google.com/d/optout.
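
As a rough illustration of the advice quoted in this thread (batches of 1000-10000 documents, submitted in parallel rather than one after another), here is a minimal sketch. The `send_bulk` callable is hypothetical and stands in for whatever client code actually posts a batch to ES; the batch size of 5000 and the 4 workers are arbitrary picks, not recommendations from the thread.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def index_in_parallel(docs, send_bulk, batch_size=5000, workers=4):
    """Split docs into batches and submit them concurrently.

    `send_bulk` is a hypothetical callable taking one batch (a list of docs)
    and returning some per-batch result. Results come back in batch order.
    """
    batches = list(chunked(docs, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send_bulk, batches))


def fake_send_bulk(batch):
    # Stand-in for a real bulk request: just report the batch size.
    return len(batch)


# Roughly the document count mentioned in the original post.
docs = [{"id": i} for i in range(265_000)]
counts = index_in_parallel(docs, fake_send_bulk)
print(len(counts), sum(counts))
```

With a real `send_bulk`, the number of workers would normally be tuned until client CPU or network becomes the bottleneck, and rejected bulk requests would need retry handling that this sketch leaves out.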
