For reference, I'm also on 1.1.0, but I'm not seeing more segments than I expect. I see an average of ~28 per shard on an index I write to constantly. I don't write all that quickly: fewer than 50 updates a second.
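For anyone who wants to compare their own numbers, here is a rough sketch of how a per-shard segment count and its average could be derived from the output of the indices segments API (`GET /<index>/_segments`). The response layout assumed here (`indices` -> index name -> `shards` -> shard id -> list of shard copies, each with a `segments` map) and the index name `myindex` are assumptions for illustration, and the sample data is made up; check it against what your cluster actually returns.

```python
from statistics import mean


def segments_per_shard(segments_response, index):
    """Count segments for every shard copy of one index.

    `segments_response` is the parsed JSON body of GET /<index>/_segments.
    Returns a dict keyed by (shard_id, copy_number).
    """
    shards = segments_response["indices"][index]["shards"]
    counts = {}
    for shard_id, copies in shards.items():
        # Each shard id maps to a list of copies (primary and replicas).
        for copy_number, copy in enumerate(copies):
            counts[(shard_id, copy_number)] = len(copy["segments"])
    return counts


# Hypothetical, heavily trimmed response used only to exercise the function.
SAMPLE_RESPONSE = {
    "indices": {
        "myindex": {
            "shards": {
                "0": [
                    {"segments": {"_0": {}, "_1": {}}},
                    {"segments": {"_0": {}}},
                ],
                "1": [
                    {"segments": {"_a": {}, "_b": {}, "_c": {}}},
                ],
            }
        }
    }
}

counts = segments_per_shard(SAMPLE_RESPONSE, "myindex")
average = mean(counts.values())
print("segments per shard copy:", counts)
print("average segments per shard copy:", average)
```

In practice the dictionary would come from parsing the HTTP response body with `json.loads` instead of the hard-coded sample.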
Nik

On Fri, Apr 11, 2014 at 5:46 AM, Adrien Grand <[email protected]> wrote:

> I managed to reproduce the issue locally; I'm looking into it.
>
> On Fri, Apr 11, 2014 at 9:45 AM, Robin Wallin <[email protected]> wrote:
>
>> Hello,
>>
>> We are experiencing a related problem with 1.1.0. Segments do not seem to merge as they should during indexing. The optimize API does practically nothing in terms of lowering the segment count either. The problem persists through a cluster restart. The vast number of segments seems to greatly impact the performance of the cluster, in a very negative way.
>>
>> We currently have 414 million documents across 3 nodes; each shard has on average 1200 segments(!).
>>
>> With 1.0.1 we had even more documents, ~650 million, without any segment problems. Looking in Marvel, we were hovering at around 30-40 segments per shard back then.
>>
>> Best Regards,
>> Robin
>>
>> On Friday, April 11, 2014 1:35:42 AM UTC+2, Adrien Grand wrote:
>>
>>> Thanks for reporting this, the behavior is definitely unexpected. I'll test _optimize on very large numbers of shards to see if I can reproduce the issue.
>>>
>>> On Thu, Apr 10, 2014 at 2:10 PM, Elliott Bradshaw <[email protected]> wrote:
>>>
>>>> Adrien,
>>>>
>>>> Just an FYI, after resetting the cluster, things seem to have improved. Optimize calls now lead to CPU/IO activity over their duration. max_num_segments=1 does not seem to be working for me on any given call, as each call would only reduce the segment count by about 600-700. I ran 10 calls in sequence overnight, and actually got down to 4 segments (1/shard)!
>>>>
>>>> I'm glad I got the index optimized; searches are literally 10-20 times faster without 1500 segments per shard to deal with. It's awesome.
>>>>
>>>> That said, any thoughts on why the index wasn't merging on its own, or why optimize was returning prematurely?
>>>>
>>>> On Wednesday, April 9, 2014 11:10:56 AM UTC-4, Elliott Bradshaw wrote:
>>>>>
>>>>> Hi Adrien,
>>>>>
>>>>> I kept the logs up over the last optimize call, and I did see an exception. I Ctrl-C'd a curl optimize call before making another one, but I don't think that caused this exception. The error is essentially as follows:
>>>>>
>>>>> netty - Caught exception while handling client http traffic, closing connection [id: 0x4d8f1a90, /127.0.0.1:33480 :> /127.0.0.1:9200]
>>>>>
>>>>> java.nio.channels.ClosedChannelException
>>>>>     at AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:433)
>>>>>     at AbstractNioWorker.writeFromUserCode
>>>>>     at NioServerSocketPipelineSink.handleAcceptedSocket
>>>>>     at NioServerSocketPipelineSink.eventSunk
>>>>>     at DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream
>>>>>     at Channels.write
>>>>>     at OneToOneEncoder.doEncode
>>>>>     at OneToOneEncoder.handleDownstream
>>>>>     at DefaultChannelPipeline.sendDownstream
>>>>>     at DefaultChannelPipeline.sendDownstream
>>>>>     at Channels.write
>>>>>     at AbstractChannel.write
>>>>>     at NettyHttpChannel.sendResponse
>>>>>     at RestOptimizeAction$1.onResponse(95)
>>>>>     at RestOptimizeAction$1.onResponse(85)
>>>>>     at TransportBroadcastOperationAction$AsyncBroadcastAction.finishHim
>>>>>     at TransportBroadcastOperationAction$AsyncBroadcastAction.onOperation
>>>>>     at TransportBroadcastOperationAction$AsyncBroadcastAction$2.run
>>>>>
>>>>> Sorry about the crappy stack trace. Still, looks like this might point to a problem! The exception fired about an hour after I kicked off the optimize. Any thoughts?
>>>>>
>>>>> On Wednesday, April 9, 2014 10:06:57 AM UTC-4, Elliott Bradshaw wrote:
>>>>>>
>>>>>> Hi Adrien,
>>>>>>
>>>>>> I did customize my merge policy, although I did so only because I was so surprised by the number of segments left over after the load.
>>>>>> I'm pretty sure the optimize problem was happening before I made this change, but either way here are my settings:
>>>>>>
>>>>>> "index" : {
>>>>>>   "merge" : {
>>>>>>     "policy" : {
>>>>>>       "max_merged_segment" : "20gb",
>>>>>>       "segments_per_tier" : 5,
>>>>>>       "floor_segment" : "10mb"
>>>>>>     },
>>>>>>     "scheduler" : "concurrentmergescheduler"
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> Not sure whether this setup could be a contributing factor or not. Nothing really jumps out at me in the logs. In fact, when I kick off the optimize, I don't see any logging at all. Should I?
>>>>>>
>>>>>> I'm running the following command:
>>>>>>
>>>>>> curl -XPOST http://localhost:9200/index/_optimize
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Wednesday, April 9, 2014 8:56:35 AM UTC-4, Adrien Grand wrote:
>>>>>>>
>>>>>>> Hi Elliott,
>>>>>>>
>>>>>>> 1500 segments per shard is certainly way too many, and it is not normal that optimize doesn't manage to reduce the number of segments.
>>>>>>> - Is there anything suspicious in the logs?
>>>>>>> - Have you customized the merge policy or scheduler?[1]
>>>>>>> - Does the issue still reproduce if you restart your cluster?
>>>>>>>
>>>>>>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html
>>>>>>>
>>>>>>> On Wed, Apr 9, 2014 at 2:38 PM, Elliott Bradshaw <[email protected]> wrote:
>>>>>>>
>>>>>>>> Any other thoughts on this? Would 1500 segments per shard be significantly impacting performance? Have you guys noticed this behavior elsewhere?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> On Monday, April 7, 2014 8:56:38 AM UTC-4, Elliott Bradshaw wrote:
>>>>>>>>>
>>>>>>>>> Adrien,
>>>>>>>>>
>>>>>>>>> I ran the following command:
>>>>>>>>>
>>>>>>>>> curl -XPUT http://localhost:9200/_settings -d '{"indices.store.throttle.max_bytes_per_sec" : "10gb"}'
>>>>>>>>>
>>>>>>>>> and received a { "acknowledged" : "true" } response. The logs showed "cluster state updated".
>>>>>>>>>
>>>>>>>>> I did have to close my index prior to changing the setting and reopen it afterward.
>>>>>>>>>
>>>>>>>>> I've since begun another optimize, but again it doesn't look like much is happening. The optimize isn't returning and the total CPU usage on every node is holding at about 2% of a single core. I would copy a hot_threads stack trace, but I'm unfortunately on a closed network and this isn't possible. I can tell you that refreshes of hot_threads show very little happening. The occasional [merge] thread (always in a LinkedTransferQueue.awaitMatch() state) or [optimize] thread (doing nothing on a waitForMerge() call) shows up, but it's always consuming 0-1% CPU. It sure feels like something isn't right. Any thoughts?
>>>>>>>>>
>>>>>>>>> On Fri, Apr 4, 2014 at 3:24 PM, Adrien Grand <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Did you see a message in the logs confirming that the setting has been updated? It would be interesting to see the output of hot threads[1] to see what your node is doing.
>>>>>>>>>>
>>>>>>>>>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 4, 2014 at 7:18 PM, Elliott Bradshaw <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes. I have run max_num_segments=1 every time.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 4, 2014 at 12:26 PM, Michael Sick <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Have you tried max_num_segments=1 on your optimize?
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Apr 4, 2014 at 11:27 AM, Elliott Bradshaw <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Any thoughts on this? I've run optimize several more times, and the number of segments falls each time, but I'm still over 1000 segments per shard. Has anyone else run into something similar?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thursday, April 3, 2014 11:21:29 AM UTC-4, Elliott Bradshaw wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> OK. Optimize finally returned, so I suppose something was happening in the background, but I'm still seeing over 6500 segments, even after setting max_num_segments=5. Does this seem right? Queries are a little faster (350-400ms) but still not great. Bigdesk is still showing a fair amount of file IO.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thursday, April 3, 2014 8:47:32 AM UTC-4, Elliott Bradshaw wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've recently upgraded to Elasticsearch 1.1.0. I've got a 4-node cluster, each node with 64G of RAM, with 24G allocated to Elasticsearch on each. I've batch loaded approximately 86 million documents into a single index (4 shards) and have started benchmarking cross_field/multi_match queries on them. The index has one replica and takes up a total of 111G. I've run several batches of warming queries, but queries are not as fast as I had hoped, approximately 400-500ms each.
>>>>>>>>>>>>>>> Given that *top* (on CentOS) shows 5-8 GB of free memory on each server, I would assume that the entire index has been paged into memory (I had worried about disk performance previously, as we are working in a virtualized environment).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A stats query on the index in question shows that the index is composed of more than 7000 segments. This seemed high to me, but maybe it's appropriate. Regardless, I dispatched an optimize command, but I am not seeing any progress and the command has not returned. Current merges remains at zero, and the segment count is not changing. Checking out hot threads in ElasticHQ, I initially saw an optimize call in the stack that was blocked on a waitForMerge call. This however has disappeared, and I'm seeing no evidence that the optimize is occurring.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does any of this seem out of the norm or unusual? Has anyone else had similar issues? This is the second time I have tried to optimize an index since upgrading. I've gotten the same result both times.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks in advance for any help/tips!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Elliott
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>> You received this message because you are subscribed to the Google Groups "elasticsearch" group.
>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>>>>>>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/5391291f-5c5e-4088-a1f2-93272beef0bb%40googlegroups.com.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
