Re: unstable results on refresh
My user interface shows some boxes to describe result categories. After half a day of small updates and deletes I noticed, with various queries, that the boxes started swapping while browsing. I certainly relied too much on getting the same results on each call; now I'm keeping the category order in request parameters to avoid the blink effect while browsing. The optimize process is really slow, and I can't use it.

Since I have many other parameters that must be carried along with the request to keep the navigation consistent, I would like to understand whether there is a setting that can limit the idf change and keep it low enough.

I tried with

  <indexConfig>
    <mergeFactor>5</mergeFactor>
  </indexConfig>

in solrconfig.xml, but this morning /solr/admin/cores?action=STATUS still reports a number of segments above ten for all cores of the shard. (I'm sure I have reloaded each core after changing the value.)

Now I'm trying with expungeDeletes called from SolrJ, but I still don't see the segment count decrease:

  UpdateRequest commitRequest = new UpdateRequest();
  // (action, waitFlush, waitSearcher, maxSegments, softCommit, expungeDeletes)
  commitRequest.setAction(ACTION.COMMIT, true, true, 10, false, true);
  commitRequest.process(solrServer);

2014-10-22 15:48 GMT+02:00 Erick Erickson erickerick...@gmail.com:

I would rather ask whether such small differences matter enough to do this. Is this something users will _ever_ notice? Optimization is quite a heavyweight operation and is generally not recommended on indexes that change often, and a 5-minute update cycle is certainly below the recommended threshold for optimizing. There is/has been work done on distributed IDF that should address this (I think), but I don't know its current status. But other than in a test setup, is it worth the effort?
Best,
Erick

On Wed, Oct 22, 2014 at 3:54 AM, Giovanni Bricconi giovanni.bricc...@banzai.it wrote:

I have made some small patches to the application to make this problem less visible, and I'm trying to perform the optimize once per hour; yesterday it took 5 minutes, this morning 15 minutes. Today I will collect some statistics, but the publication process sends documents every 5 minutes, and I think the optimize is taking too much time. I have no default mergeFactor configured for this collection; do you think that setting it to a small value could improve the situation? If I have understood correctly, having to merge segments will keep similar stats on all nodes. It's OK for the indexing process to be a little bit slower.

2014-10-21 18:44 GMT+02:00 Erick Erickson erickerick...@gmail.com:

Giovanni:

To see how this happens, consider a shard with a leader and two followers. Assume your autocommit interval is 60 seconds on each. This interval can expire at slightly different wall-clock times. Even if the servers started perfectly in sync, they can get slightly out of sync. So you index a bunch of docs, and these replicas close the current segment and open a new segment with slightly different contents. Now docs come in that replace older docs. The tf/idf statistics _include_ deleted document data (which is purged on optimize). Given that doc X can be in different segments (or, more accurately, in segments that get merged at different times on different machines), replica 1 may have slightly different stats than replica 2, thus computing slightly different scores. Optimizing purges all data related to deleted documents, so it all regularizes itself on optimize.

Best,
Erick

On Tue, Oct 21, 2014 at 11:08 AM, Giovanni Bricconi giovanni.bricc...@banzai.it wrote:

I noticed the problem again, and this time I was able to collect some data. In my paste http://pastebin.com/nVwf327c you can see the result of the same query issued twice; the 2nd and 3rd groups are swapped. I also pasted the clusterstate and the core state for each core. The logs didn't show any problem related to indexing, only some malformed queries. After doing an optimize the problem disappeared. So, is the problem related to documents that were deleted from the index? The optimization took 5 minutes to complete.

2014-10-21 11:41 GMT+02:00 Giovanni Bricconi giovanni.bricc...@banzai.it:

Nice! I will monitor the index and try this if the problem comes back. The problem was actually due to small differences in score, so I think it has the same origin.

2014-10-21 8:10 GMT+02:00 lboutros boutr...@gmail.com:

Hi Giovanni,

we had this problem as well. The cause was that the different nodes had slightly different idf values. We solved this problem by doing an optimize operation, which really removes the deleted data.

Ludovic.
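Erick's point about deleted documents skewing scores can be made concrete with a little arithmetic. The sketch below uses the classic Lucene idf formula, idf = 1 + ln(numDocs / (docFreq + 1)); the document counts are invented numbers, assumed purely for illustration:

```java
public class IdfDrift {

    // Classic Lucene (DefaultSimilarity) idf: 1 + ln(numDocs / (docFreq + 1))
    static double idf(long numDocs, long docFreq) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        // Replica 1 has merged away its deleted documents.
        double idf1 = idf(100_000, 1_200);
        // Replica 2 still counts 5,000 deleted documents in its statistics.
        double idf2 = idf(105_000, 1_200);
        System.out.printf("replica1=%.4f replica2=%.4f diff=%.4f%n",
                idf1, idf2, idf2 - idf1);
    }
}
```

The difference is tiny (about 0.05 here), which is why it only shows up as occasional swaps between results or groups whose scores are nearly tied.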
Re: unstable results on refresh
On 10/23/2014 2:44 AM, Giovanni Bricconi wrote:
<snip>

It's completely normal to have more segments than the mergeFactor. Think about this scenario with a mergeFactor of 5:

You index five segments. They get merged to one segment. Let's say that this happens a total of four times, so you've indexed a total of 20 segments and merging has reduced that to four larger segments. Let's say that you now index four more segments. You'll be completely stable with eight segments. If you index another one, that will result in a fifth larger segment. This sets conditions up just right for another merge, to one even larger segment. This represents three levels of merging, and there can be even more levels, each of which can have four segments and remain stable.
Starting at the last state I described, if you then indexed 24 more segments, you'd have a stable index with a total of nine segments: four of them would be normal sized, four of them would be about five times normal size, and the first one would be about 25 times normal size.

The Solr default for the merge policy in all recent versions is TieredMergePolicy, and this can make things slightly more complicated than I've described, because it can merge *any* segments, not just those indexed sequentially, and I believe that it can delay merging until the right number of segments with suitable characteristics appear. I've got merge settings equivalent to a mergeFactor of 35, but I regularly see the segment count approach 100, and there's absolutely nothing wrong with my merging.

If I understand it correctly, expungeDeletes will not decrease the segment count. It will simply rewrite the segments that have deleted documents so that there are none. I'm not 100% sure that I know exactly what expungeDeletes does, though.

Thanks,
Shawn
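Shawn's arithmetic can be reproduced with a toy simulation of the classic log-merge scheme: a merge fires whenever mergeFactor same-sized segments accumulate at one level. This is a sketch of that older behavior only, not of TieredMergePolicy:

```java
import java.util.ArrayList;
import java.util.List;

public class MergeSim {

    // Count index segments after 'flushed' small segments have been written,
    // merging whenever 'mergeFactor' segments accumulate at the same level.
    static int segmentCount(int flushed, int mergeFactor) {
        // levels.get(i) = number of segments of size mergeFactor^i
        List<Integer> levels = new ArrayList<>();
        for (int n = 0; n < flushed; n++) {
            if (levels.isEmpty()) levels.add(0);
            levels.set(0, levels.get(0) + 1);
            // Cascade merges upward through the levels.
            for (int lvl = 0; lvl < levels.size(); lvl++) {
                if (levels.get(lvl) == mergeFactor) {
                    levels.set(lvl, 0);
                    if (lvl + 1 == levels.size()) levels.add(0);
                    levels.set(lvl + 1, levels.get(lvl + 1) + 1);
                }
            }
        }
        return levels.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(segmentCount(24, 5)); // 8: four merged + four small
        System.out.println(segmentCount(25, 5)); // 1: everything cascades into one
        System.out.println(segmentCount(49, 5)); // 9: Shawn's nine-segment state
    }
}
```

Even this simple model shows the steady-state segment count bouncing well above the mergeFactor, which matches what the cores STATUS call reports.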
Re: unstable results on refresh
I noticed the problem looking at a grouped query: the groups returned were sorted on the score of their first result and then shown to the user. Repeating the same query, I noticed that the order of two of the groups kept switching.

Thank you, I will look for the thread you mentioned.

2014-10-20 22:07 GMT+02:00 Alexandre Rafalovitch arafa...@gmail.com:

What are the differences in? The document count, or things like facets? This could be important. Also, I think there was a similar thread on the mailing list a week or two ago; it might be worth looking for it.

Regards,
Alex.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 20 October 2014 04:49, Giovanni Bricconi giovanni.bricc...@banzai.it wrote:
<snip>
unstable results on refresh
Hello,

I have a procedure that sends small data changes during the day to a SolrCloud cluster, version 4.8. The cluster is made of three nodes and three shards; each node contains two shards. The procedure has been running for days; I don't know when, but at some point one of the cores went out of sync, and repeating the same query began to show small differences. The core graph was not useful; everything seemed active. I solved the problem by reindexing everything, because the collection is quite small, but is there a way to fix this problem? Suppose I can figure out which core returns different results: is there a command to force that core to refetch the whole index from its master?

Thanks,
Giovanni
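On the last question: Solr's replication handler exposes a fetchindex command that tells a core to pull the full index from its master/leader. A sketch of the call; the host, port, and core name below are placeholders, and in SolrCloud it is usually better to let the normal recovery process handle resync:

```text
http://localhost:8983/solr/collection1_shard2_replica1/replication?command=fetchindex
```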
Re: unstable results on refresh
Can you please provide the exception from when the shard goes out of sync? Please monitor the logs.