May I ask, since you are seeking better indexing performance, what your
current performance is? How many nodes (= hardware machines) do you have?

Jörg

On Tue, Apr 14, 2015 at 1:36 PM, Hajime <[email protected]> wrote:

> Possibly it is IO bound, but I don't see much IO wait on the CPU or much
> write activity in iostat. By the way, I use SSDs with XFS as the file system
> and the default Directory (I think it becomes MMapDirectory).
>
> > each single bulk request to one index is done concurrently 5X so you
> > only need enough concurrent bulk requests to saturate the number of CPUs
>
> I suppose that IndexWriter will lock at some point, but will this strategy
> work on the same index?
>
> However, setting *index.merge.async_interval* higher than the default "1s"
> seems better for heavy indexing (I'm still using 1.4.0). I found that it
> was removed in the recent 1.5.0 release. Do you know why? Will I see
> better indexing performance simply by upgrading to >=1.5.0?
>
>
> On Tue, Apr 14, 2015 at 1:14 AM, Michael McCandless <[email protected]>
> wrote:
>
>> Hmm maybe your nodes are IO bound?  What IO system are you using?
>>
>> You should not need to increase the default bulk thread pool size, and if
>> you are using the default 5 shards then each single bulk request to one
>> index is done concurrently 5X, so you only need enough concurrent bulk
>> requests to saturate the number of CPUs, e.g. 40 / 5 = 8 concurrent bulk
>> indexing clients.
>>
>> Mike McCandless
>>
>> On Mon, Apr 13, 2015 at 7:41 AM, Hajime <[email protected]>
>> wrote:
>>
>>> Hi Mike,
>>>
>>> I guess you are right. CPU usage was about 10~20% (of 40 CPU cores) and
>>> the load average was about 10. I might as well test under a condition
>>> that uses about 50~70% of the CPU while keeping the load average low.
>>>
>>> However, do you have any idea how I can use more CPU when indexing? I'm
>>> increasing the input amount, but CPU usage remains almost the same (and
>>> the processing speed seems almost the same).
>>> Should I configure something like the "*bulk.thread_pool*" size or
>>> "indices.memory.max_shard_index_buffer_size"
>>> (
>>> https://github.com/elastic/elasticsearch/blob/97559c0614d900a682d01afc241615cf5627fb4c/src/main/java/org/elasticsearch/indices/memory/IndexingMemoryController.java#L96
>>> )?
>>>
>>>
>>> On Mon, Apr 13, 2015 at 6:00 PM, Michael McCandless <[email protected]>
>>> wrote:
>>>
>>>> You should see better performance with -1 refresh_interval, because
>>>> Lucene will flush larger, single segments, causing less merging pressure.
>>>>
>>>> Are both of your tests (-1 vs 10s) fully saturating CPU and/or IO on
>>>> your nodes?
>>>>
>>>> If not, then that can explain it: when you have 10s refresh_interval, a
>>>> separate thread (refresh thread) bears the cost of moving the new segments
>>>> to disk, but with -1, the bulk index threads themselves bear the cost.
>>>>
>>>> But if you test with enough client-side concurrency to saturate your
>>>> resources you should see the opposite (-1 refresh_interval gives faster
>>>> indexing throughput).
>>>>
>>>> Mike McCandless
>>>>
>>>> On Mon, Apr 13, 2015 at 4:05 AM, Hajime <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm trying to improve indexing performance. I followed these
>>>>> instructions:
>>>>>
>>>>> https://www.elastic.co/blog/performance-considerations-elasticsearch-indexing
>>>>>
>>>>> http://www.elastic.co/guide/en/elasticsearch/guide/master/indexing-performance.html
>>>>>
>>>>> and created 20 different indexes by changing translog, merge, refresh,
>>>>> etc., while the "number_of_shards:1, number_of_replicas:0" part is
>>>>> common to all of them.
>>>>> I found that the best-performing index was actually default settings +
>>>>> refresh_interval:"10s" (not refresh_interval:"-1") in terms of doc count
>>>>> accuracy, load average, and a smaller bulk queue.
>>>>>
>>>>> Can anyone explain why this setting is the best?
>>>>>
>>>>> In addition, I can observe that when the refresh thread is active, the
>>>>> segment count decreases. What exactly is the refresh thread doing?
>>>>>
>>>>> thanks
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/elasticsearch/CAHm3ZsrZNVJtOj__YO6enFqKDt4T1Hxi_pT94W9YQx7bNe%3Dg1g%40mail.gmail.com
>>>>> <https://groups.google.com/d/msgid/elasticsearch/CAHm3ZsrZNVJtOj__YO6enFqKDt4T1Hxi_pT94W9YQx7bNe%3Dg1g%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>
>>
>
>
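
The sizing rule quoted in this thread (enough concurrent bulk requests to
saturate the CPUs, e.g. 40 / 5 = 8 clients) can be written down as a small
helper. This is only a rough sketch: the function name and its parameters are
made up for illustration, and it assumes that one bulk request is processed
concurrently once per shard and that all of those shards compete for the same
pool of cores. Real clusters spread shards across nodes, so treat the result
as a starting point for testing, not as a guarantee.

```python
import math


def bulk_clients_needed(cpu_cores: int, shards_per_index: int = 5) -> int:
    """Estimate how many concurrent bulk requests keep all cores busy.

    Hypothetical helper: assumes each bulk request to one index is handled
    concurrently once per shard, as described in the quoted advice.
    """
    if cpu_cores < 1 or shards_per_index < 1:
        raise ValueError("cpu_cores and shards_per_index must be positive")
    # Round up so that a remainder of cores is not left idle.
    return math.ceil(cpu_cores / shards_per_index)


if __name__ == "__main__":
    # The example from the thread: 40 cores, default 5 shards.
    print(bulk_clients_needed(40, 5))
    # The test indexes in the thread use number_of_shards:1.
    print(bulk_clients_needed(40, 1))
```

Under these assumptions the single-shard test indexes would need far more
concurrent bulk requests than the 5-shard default before the CPUs are
saturated, which is consistent with the low CPU usage reported above.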

