Adrien,

What you're saying makes sense.  I batch load the records very quickly,
indexing about 100g of data in a little over an hour.  However, I've tried
your suggestions and am not seeing any improvement (I set
index.store.throttle.max_bytes_per_sec = 500mb).  File system reads and
writes are flatlined, CPU usage is 0.  The optimize command is just waiting
to return.  When I first kick off the optimize I occasionally see a very
brief burst in disk writes on the node on which the command is called, but
that's it.
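
For reference, a minimal sketch of the two calls discussed in this thread: turning off store throttling and then re-running optimize. It assumes the node-level setting name indices.store.throttle.type described on the store-throttling page linked below, and the 1.x _optimize endpoint with max_num_segments. The helper names, the base URL and the index name are hypothetical, and nothing is sent unless send() is called.

```python
import json
from urllib import request


def throttle_settings_request(base_url, throttle_type="none"):
    """Build (method, url, body) for a transient cluster-settings update.

    Assumption: the node-level setting is named indices.store.throttle.type
    (taken from the ES 1.x store-throttling docs), and "none" disables it.
    """
    body = {"transient": {"indices.store.throttle.type": throttle_type}}
    url = base_url.rstrip("/") + "/_cluster/settings"
    return "PUT", url, json.dumps(body)


def optimize_request(base_url, index, max_num_segments=5):
    """Build (method, url, body) for an optimize call on one index."""
    url = "{}/{}/_optimize?max_num_segments={}".format(
        base_url.rstrip("/"), index, max_num_segments
    )
    return "POST", url, None


def send(method, url, body=None, timeout=None):
    """Send a prepared request and decode the JSON response."""
    data = body.encode("utf-8") if body is not None else None
    req = request.Request(url, data=data, method=method)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like send(*throttle_settings_request("http://localhost:9200")) followed by send(*optimize_request("http://localhost:9200", "myindex")). The optimize call blocks until the merges finish, so no timeout is set by default.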

Still stumped.


On Fri, Apr 4, 2014 at 11:53 AM, Adrien Grand <
[email protected]> wrote:

> Elasticsearch throttles merges by default so that they don't slow search
> down too much. This is usually preferable for read/write loads, but in
> your case it looks like you batch-indexed a lot of documents at once and
> merges couldn't keep up with the indexing rate so you ended up with a very
> high number of segments. The thing is that merge throttling also applies to
> optimize calls, which might explain why your calls to the optimize API last
> forever.
>
> Could you try to disable merge throttling[1] before running a call to
> optimize again to see if the situation improves?
>
> [1]
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#store-throttling
>
>
> On Fri, Apr 4, 2014 at 5:27 PM, Elliott Bradshaw <[email protected]> wrote:
>
>> Any thoughts on this?  I've run optimize several more times, and the
>> number of segments falls each time, but I'm still over 1000 segments per
>> shard.  Has anyone else run into something similar?
>>
>>
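
The per-shard segment counts mentioned above can be read from the index segments API (GET /<index>/_segments). A minimal sketch of tallying them, assuming the 1.x response layout (indices -> shards -> list of shard copies, each with a "routing" object and a "segments" map); the helper name segments_per_shard is hypothetical.

```python
def segments_per_shard(segments_response):
    """Return sorted (index, shard_id, role, segment_count) tuples.

    Assumption: segments_response is the parsed JSON body of
    GET /<index>/_segments, where each shard id maps to a list of
    shard copies (primary and replicas).
    """
    rows = []
    for index_name, index_data in segments_response.get("indices", {}).items():
        for shard_id, copies in index_data.get("shards", {}).items():
            for copy in copies:
                is_primary = copy.get("routing", {}).get("primary")
                role = "primary" if is_primary else "replica"
                rows.append(
                    (index_name, shard_id, role, len(copy.get("segments", {})))
                )
    return sorted(rows)
```

Running this before and after each optimize call shows whether the count for any shard copy is actually moving toward max_num_segments.
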
>> On Thursday, April 3, 2014 11:21:29 AM UTC-4, Elliott Bradshaw wrote:
>>>
>>> OK.  Optimize finally returned, so I suppose something was happening in
>>> the background, but I'm still seeing over 6500 segments.  Even after
>>> setting max_num_segments=5.  Does this seem right?  Queries are a little
>>> faster (350-400ms) but still not great.  Bigdesk is still showing a fair
>>> amount of file IO.
>>>
>>> On Thursday, April 3, 2014 8:47:32 AM UTC-4, Elliott Bradshaw wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I've recently upgraded to Elasticsearch 1.1.0.  I've got a 4 node
>>>> cluster, each with 64G of ram, with 24G allocated to Elasticsearch on
>>>> each.  I've batch loaded approximately 86 million documents into a single
>>>> index (4 shards) and have started benchmarking cross_field/multi_match
>>>> queries on them.  The index has one replica and takes up a total of 111G.
>>>> I've run several batches of warming queries, but queries are not as fast as
>>>> I had hoped, approximately 400-500ms each.  Given that *top* (on
>>>> Centos) shows 5-8 GB of free memory on each server, I would assume that the
>>>> entire index has been paged into memory (I had worried about disk
>>>> performance previously, as we are working in a virtualized environment).
>>>>
>>>> A stats query on the index in question shows that the index is
>>>> composed of > 7000 segments.  This seemed high to me, but maybe it's
>>>> appropriate.  Regardless, I dispatched an optimize command, but I am not
>>>> seeing any progress and the command has not returned.  Current merges
>>>> remains at zero, and the segment count is not changing.  Checking out hot
>>>> threads in ElasticHQ, I initially saw an optimize call in the stack that
>>>> was blocked on a waitForMerge call.  This however has disappeared, and I'm
>>>> seeing no evidence that the optimize is occurring.
>>>>
>>>> Does any of this seem out of the norm or unusual?  Has anyone else had
>>>> similar issues?  This is the second time I have tried to optimize an index
>>>> since upgrading.  I've gotten the same result both times.
>>>>
>>>> Thanks in advance for any help/tips!
>>>>
>>>> - Elliott
>>>>
>>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/5391291f-5c5e-4088-a1f2-93272beef0bb%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Adrien Grand
>

