Re: Elasticsearch 1.1.0 - Optimize broken?

Robin Wallin Fri, 11 Apr 2014 00:46:23 -0700

Hello,

We are experiencing a related problem with 1.1.0. Segments do not seem to 
merge as they should during indexing. The optimize API does practically 
nothing to lower the segment count either. The problem 
persists through a cluster restart. The large number of segments seems to 
be hurting the performance of the cluster considerably.


We currently have 414 million documents across 3 nodes, and each shard has 
1200 segments on average(!).

With 1.0.1 we had even more documents, ~650 million, without any segment 
problems. Looking in Marvel we were hovering at around 30-40 segments per 
shard back then.
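
A minimal sketch of how per-shard segment counts like these could be tallied 
without Marvel. It assumes the JSON returned by the indices segments API 
(GET /_segments) has the shape indices -> <index> -> shards -> <shard id> -> 
list of shard copies, each carrying a "segments" map. The function names and 
the sample data are hypothetical, and no real cluster is queried here.

```python
from statistics import mean


def segments_per_shard(segments_response):
    """Return {(index, shard_id, node): segment_count} for every shard copy."""
    counts = {}
    for index_name, index_data in segments_response.get("indices", {}).items():
        for shard_id, copies in index_data.get("shards", {}).items():
            for copy in copies:
                # "routing.node" is assumed to identify the node holding this copy.
                node = copy.get("routing", {}).get("node", "unknown")
                counts[(index_name, shard_id, node)] = len(copy.get("segments", {}))
    return counts


def summarize(counts):
    """Return the number of shard copies plus the max and mean segment count."""
    values = list(counts.values())
    if not values:
        return {"shard_copies": 0, "max": 0, "mean": 0}
    return {"shard_copies": len(values), "max": max(values), "mean": mean(values)}


# Hypothetical, hand-written sample shaped like a _segments response.
sample = {
    "indices": {
        "myindex": {
            "shards": {
                "0": [{"routing": {"node": "node-a", "primary": True},
                       "segments": {"_0": {}, "_1": {}, "_2": {}}}],
                "1": [{"routing": {"node": "node-b", "primary": True},
                       "segments": {"_0": {}}}],
            }
        }
    }
}

counts = segments_per_shard(sample)
summary = summarize(counts)
print(summary)
```

In practice the dictionary would come from parsing the body of a 
curl -XGET http://localhost:9200/_segments call instead of the sample above.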

Best Regards,
Robin

On Friday, April 11, 2014 1:35:42 AM UTC+2, Adrien Grand wrote:
>
> Thanks for reporting this, the behavior is definitely unexpected. I'll 
> test _optimize on very large numbers of shards to see if I can reproduce 
> the issue.
>
>
> On Thu, Apr 10, 2014 at 2:10 PM, Elliott Bradshaw 
> <[email protected]> wrote:
>
>> Adrien,
>>
>> Just an FYI, after resetting the cluster, things seem to have improved.  
>> Optimize calls now lead to CPU/IO activity over their duration.  
>> Max_num_segments=1 does not seem to be working for me on any given call, as 
>> each call would only reduce the segment count by about 600-700.  I ran 10 
>> calls in sequence overnight, and actually got down to 4 segments (1/shard)!
>>
>> I'm glad I got the index optimized; searches are literally 10-20 times 
>> faster without 1500 segments per shard to deal with.  It's awesome.
>>
>> That said, any thoughts on why the index wasn't merging on its own, or 
>> why optimize was returning prematurely?
>>
>>
>> On Wednesday, April 9, 2014 11:10:56 AM UTC-4, Elliott Bradshaw wrote:
>>>
>>> Hi Adrien,
>>>
>>> I kept the logs up over the last optimize call, and I did see an 
>>> exception.  I Ctrl-C'd a curl optimize call before making another one, but 
>>> I don't think that that caused this exception.  The error is essentially as 
>>> follows:
>>>
>>> netty - Caught exception while handling client http traffic, closing 
>>> connection [id: 0x4d8f1a90, /127.0.0.1:33480 :> /127.0.0.1:9200]
>>>
>>> java.nio.channels.ClosedChannelException
>>> at AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:433)
>>> at AbstractNioWorker.writeFromUserCode
>>> at NioServerSocketPipelineSink.handleAcceptedSocket
>>> at NioServerSocketPipelineSink.eventSunk
>>> at DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream
>>> at Channels.write
>>> at OneToOneEncoder.doEncode
>>> at OneToOneEncoder.handleDownstream
>>> at DefaultChannelPipeline.sendDownstream
>>> at DefaultChannelPipeline.sendDownstream
>>> at Channels.write
>>> at AbstractChannel.write
>>> at NettyHttpChannel.sendResponse
>>> at RestOptimizeAction$1.onResponse(95)
>>> at RestOptimizeAction$1.onResponse(85)
>>> at TransportBroadcastOperationAction$AsyncBroadcastAction.finishHim
>>> at TransportBroadcastOperationAction$AsyncBroadcastAction.onOperation
>>> at TransportBroadcastOperationAction$AsyncBroadcastAction$2.run
>>>
>>> Sorry about the crappy stack trace.  Still, looks like this might point 
>>> to a problem!  The exception fired about an hour after I kicked off the 
>>> optimize.  Any thoughts?
>>>
>>> On Wednesday, April 9, 2014 10:06:57 AM UTC-4, Elliott Bradshaw wrote:
>>>>
>>>> Hi Adrien,
>>>>
>>>> I did customize my merge policy, although I did so only because I was 
>>>> so surprised by the number of segments left over after the load.  I'm 
>>>> pretty sure the optimize problem was happening before I made this change, 
>>>> but either way here are my settings:
>>>>
>>>> "index" : {
>>>>   "merge" : {
>>>>     "policy" : {
>>>>       "max_merged_segment" : "20gb",
>>>>       "segments_per_tier" : 5,
>>>>       "floor_segment" : "10mb"
>>>>     },
>>>>     "scheduler" : "concurrentmergescheduler"
>>>>   }
>>>> }
>>>>
>>>> Not sure whether this setup could be a contributing factor or not.  
>>>> Nothing really jumps out at me in the logs.  In fact, when I kick off the 
>>>> optimize, I don't see any logging at all.  Should I?
>>>>
>>>> I'm running the following command: curl -XPOST 
>>>> http://localhost:9200/index/_optimize
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> On Wednesday, April 9, 2014 8:56:35 AM UTC-4, Adrien Grand wrote:
>>>>>
>>>>> Hi Elliott,
>>>>>
>>>>> 1500 segments per shard is certainly way too much, and it is not 
>>>>> normal that optimize doesn't manage to reduce the number of segments.
>>>>>  - Is there anything suspicious in the logs?
>>>>>  - Have you customized the merge policy or scheduler?[1]
>>>>>  - Does the issue still reproduce if you restart your cluster?
>>>>>
>>>>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Apr 9, 2014 at 2:38 PM, Elliott Bradshaw 
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Any other thoughts on this?  Would 1500 segments per shard be 
>>>>>> significantly impacting performance?  Have you guys noticed this 
>>>>>> behavior elsewhere?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>> On Monday, April 7, 2014 8:56:38 AM UTC-4, Elliott Bradshaw wrote:
>>>>>>>
>>>>>>> Adrien,
>>>>>>>
>>>>>>> I ran the following command:
>>>>>>>
>>>>>>> curl -XPUT http://localhost:9200/_settings -d 
>>>>>>> '{"indices.store.throttle.max_bytes_per_sec" : "10gb"}'
>>>>>>>
>>>>>>> and received a { "acknowledged" : "true" } response.  The logs 
>>>>>>> showed "cluster state updated".
>>>>>>>
>>>>>>> I did have to close my index prior to changing the setting and 
>>>>>>> reopen afterward.
>>>>>>>
>>>>>>>
>>>>>>> I've since begun another optimize, but again it doesn't look like 
>>>>>>> much is happening.  The optimize isn't returning and the total CPU 
>>>>>>> usage on every node is holding at about 2% of a single core.  I would 
>>>>>>> copy a hot_threads stack trace, but I'm unfortunately on a closed 
>>>>>>> network and this isn't possible.  I can tell you that refreshes of 
>>>>>>> hot_threads show very little happening.  The occasional [merge] thread 
>>>>>>> (always in a LinkedTransferQueue.awaitMatch() state) or [optimize] 
>>>>>>> thread (doing nothing on a waitForMerge() call) shows up, but it's 
>>>>>>> always consuming 0-1% CPU.  It sure feels like something isn't right.  
>>>>>>> Any thoughts?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 4, 2014 at 3:24 PM, Adrien Grand <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Did you see a message in the logs confirming that the setting has 
>>>>>>>> been updated? It would be interesting to see the output of hot 
>>>>>>>> threads[1] to see what your node is doing.
>>>>>>>>
>>>>>>>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Apr 4, 2014 at 7:18 PM, Elliott Bradshaw <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Yes. I have run max_num_segments=1 every time.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Apr 4, 2014 at 12:26 PM, Michael Sick <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Have you tried max_num_segments=1 on your optimize? 
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 4, 2014 at 11:27 AM, Elliott Bradshaw <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Any thoughts on this?  I've run optimize several more times, and 
>>>>>>>>>>> the number of segments falls each time, but I'm still over 1000 
>>>>>>>>>>> segments per shard.  Has anyone else run into something similar?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thursday, April 3, 2014 11:21:29 AM UTC-4, Elliott Bradshaw 
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> OK.  Optimize finally returned, so I suppose something was 
>>>>>>>>>>>> happening in the background, but I'm still seeing over 6500 
>>>>>>>>>>>> segments, even after setting max_num_segments=5.  Does this seem 
>>>>>>>>>>>> right?  Queries are a little faster (350-400ms) but still not 
>>>>>>>>>>>> great.  Bigdesk is still showing a fair amount of file IO.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thursday, April 3, 2014 8:47:32 AM UTC-4, Elliott Bradshaw 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've recently upgraded to Elasticsearch 1.1.0.  I've got a 4 
>>>>>>>>>>>>> node cluster, each with 64G of RAM, with 24G allocated to 
>>>>>>>>>>>>> Elasticsearch on each.  I've batch loaded approximately 86 
>>>>>>>>>>>>> million documents into a single index (4 shards) and have 
>>>>>>>>>>>>> started benchmarking cross_field/multi_match queries on them.  
>>>>>>>>>>>>> The index has one replica and takes up a total of 111G.  I've 
>>>>>>>>>>>>> run several batches of warming queries, but queries are not as 
>>>>>>>>>>>>> fast as I had hoped, approximately 400-500ms each.  Given that 
>>>>>>>>>>>>> *top* (on CentOS) shows 5-8 GB of free memory on each server, I 
>>>>>>>>>>>>> would assume that the entire index has been paged into memory 
>>>>>>>>>>>>> (I had worried about disk performance previously, as we are 
>>>>>>>>>>>>> working in a virtualized environment).
>>>>>>>>>>>>>
>>>>>>>>>>>>> A stats query on the index in question shows that the index 
>>>>>>>>>>>>> is composed of > 7000 segments.  This seemed high to me, but 
>>>>>>>>>>>>> maybe it's appropriate.  Regardless, I dispatched an optimize 
>>>>>>>>>>>>> command, but I am not seeing any progress and the command has 
>>>>>>>>>>>>> not returned.  Current merges remains at zero, and the segment 
>>>>>>>>>>>>> count is not changing.  Checking out hot threads in ElasticHQ, 
>>>>>>>>>>>>> I initially saw an optimize call in the stack that was blocked 
>>>>>>>>>>>>> on a waitForMerge call.  This however has disappeared, and I'm 
>>>>>>>>>>>>> seeing no evidence that the optimize is occurring.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does any of this seem out of the norm or unusual?  Has anyone 
>>>>>>>>>>>>> else had similar issues?  This is the second time I have tried 
>>>>>>>>>>>>> to optimize an index since upgrading.  I've gotten the same 
>>>>>>>>>>>>> result both times.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance for any help/tips!
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Elliott
>>>>>>>>>>>>>
>>>>>>>>>>>>  -- 
>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>> Google Groups "elasticsearch" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>> https://groups.google.com/d/msgid/elasticsearch/5391291f-5c5e-4088-a1f2-93272beef0bb%40googlegroups.com.
>>>>>>>>>>>
>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Adrien Grand
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Adrien Grand
>
>
>
> -- 
> Adrien Grand
>  

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/eda52a43-94ec-4574-b989-32727cf3cfe4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
