May I ask, since you are seeking better indexing performance, what your current performance is? How many nodes (= hardware machines) do you have?

Jörg

On Tue, Apr 14, 2015 at 1:36 PM, Hajime <[email protected]> wrote:

> It is possibly IO bound, but I don't see much IO wait on the CPU or much
> write activity in iostat. By the way, the setup uses SSD storage with XFS as
> the file system and the default Directory (I think it becomes MMapDirectory).
>
> > each single bulk request to one index is done concurrently 5X so you
> > only need enough concurrent bulk requests to saturate the number of CPUs
>
> I suppose that IndexWriter will lock at some point, but will this strategy
> work on the same index?
>
> However, setting index.merge.async_interval higher than the default "1s"
> seems better for heavy indexing (I'm still using 1.4.0). I found that it
> was removed in the recent 1.5.0 release. Do you know why? Will I see
> better indexing performance if I simply upgrade to >= 1.5.0?
>
> On Tue, Apr 14, 2015 at 1:14 AM, Michael McCandless <[email protected]>
> wrote:
>
>> Hmm maybe your nodes are IO bound? What IO system are you using?
>>
>> You should not need to increase the default bulk thread pool size, and if
>> you are using default 5 shards then each single bulk request to one index
>> is done concurrently 5X so you only need enough concurrent bulk requests to
>> saturate the number of CPUs, e.g. 40 / 5 = 8 concurrent bulk indexing
>> clients.
>>
>> Mike McCandless
>>
>> On Mon, Apr 13, 2015 at 7:41 AM, Hajime <[email protected]>
>> wrote:
>>
>>> Hi Mike,
>>>
>>> I guess you are right. CPU usage was about 10~20% (of 40 CPU cores) and
>>> load average was about 10. I might as well test again under conditions
>>> that use about 50~70% of CPU while keeping the load average low.
>>>
>>> However, do you have any idea how I can use more CPU when indexing? I'm
>>> increasing the input amount, but CPU usage remains almost the same (and the
>>> processing speed seems almost the same).
>>> Should I configure something like the "bulk.thread_pool" size or
>>> "indices.memory.max_shard_index_buffer_size"
>>> (https://github.com/elastic/elasticsearch/blob/97559c0614d900a682d01afc241615cf5627fb4c/src/main/java/org/elasticsearch/indices/memory/IndexingMemoryController.java#L96)?
>>>
>>> On Mon, Apr 13, 2015 at 6:00 PM, Michael McCandless <[email protected]>
>>> wrote:
>>>
>>>> You should see better performance with -1 refresh_interval, because
>>>> Lucene will flush larger, single segments, causing less merging pressure.
>>>>
>>>> Are both of your tests (-1 vs 10s) fully saturating CPU and/or IO on
>>>> your nodes?
>>>>
>>>> If not, then that can explain it: when you have 10s refresh_interval, a
>>>> separate thread (refresh thread) bears the cost of moving the new segments
>>>> to disk, but with -1, the bulk index threads themselves bear the cost.
>>>>
>>>> But if you test with enough client-side concurrency to saturate your
>>>> resources you should see the opposite (-1 refresh_interval is faster
>>>> indexing throughput).
>>>>
>>>> Mike McCandless
>>>>
>>>> On Mon, Apr 13, 2015 at 4:05 AM, Hajime <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm trying to improve indexing performance. I followed these
>>>>> instructions:
>>>>>
>>>>> https://www.elastic.co/blog/performance-considerations-elasticsearch-indexing
>>>>> http://www.elastic.co/guide/en/elasticsearch/guide/master/indexing-performance.html
>>>>>
>>>>> and created 20 different indexes by changing the translog, merge,
>>>>> refresh, etc. settings, while "number_of_shards: 1, number_of_replicas: 0"
>>>>> is common to all of them.
>>>>>
>>>>> I found that the best-performing index actually used the default
>>>>> settings + refresh_interval: "10s" (not refresh_interval: "-1"), in
>>>>> terms of doc count accuracy, load average, and a smaller bulk queue.
>>>>>
>>>>> Can anyone explain why these settings are the best?
>>>>>
>>>>> In addition, I can observe that when the refresh thread is active, the
>>>>> segment count decreases. What exactly is the refresh thread doing?
>>>>>
>>>>> thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHxAFKRZtC_DQ2ss_7cx_T%2BiZhvq2xL39RpG2wP-uSbeg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
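
For readers of this thread, the sizing rule Mike gives (40 cores / 5 shards = 8 concurrent bulk clients) and the two refresh_interval values being compared can be sketched roughly as below. This is only an illustration under assumptions: the function names bulk_clients_needed and refresh_settings are hypothetical, rounding up when the division is not exact is my own choice, and the JSON body merely follows the usual index settings shape. Nothing is sent to a cluster.

```python
import json


def bulk_clients_needed(cpu_cores: int, shards_per_index: int) -> int:
    """Estimate how many concurrent bulk clients keep all cores busy.

    Rule of thumb from the thread: one bulk request to an index is
    processed concurrently once per shard, so divide cores by shards.
    Rounding up (an assumption, not stated in the thread) avoids
    leaving a core idle when the division is not exact.
    """
    if cpu_cores < 1 or shards_per_index < 1:
        raise ValueError("cpu_cores and shards_per_index must be >= 1")
    return -(-cpu_cores // shards_per_index)  # ceiling division


def refresh_settings(interval: str) -> str:
    """Build a JSON settings body that sets index.refresh_interval.

    "-1" disables periodic refresh; "10s" refreshes every ten seconds.
    The body is only constructed here, not submitted anywhere.
    """
    return json.dumps({"index": {"refresh_interval": interval}})


if __name__ == "__main__":
    # Mike's example: 40 cores, default 5 shards.
    print(bulk_clients_needed(40, 5))
    # The two settings compared in the original post.
    print(refresh_settings("-1"))
    print(refresh_settings("10s"))
```

With the single-shard indexes described in the original post, the same rule of thumb would suggest about 40 concurrent bulk clients rather than 8, which may be part of why CPU usage stayed low in those tests.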
