[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580328#comment-13580328 ]

Nagendra Nagarajayya commented on SOLR-3816:

@Yonik: Please find a new patch. This patch removes the code that you had highlighted and introduces request-granularity realtime search. The mechanism guarantees that the underlying NRT reader does not change during a request, so all components of a request (search, faceting, highlighting, etc.) see the same view of the index. Different requests, though, may return different sets of results.

I have also implemented an intra-request granularity, in which each component (search, faceting, highlighting, etc.) may return different results. This is not included in the patch.

> Need a more granular nrt system that is close to a realtime system.
> ---
>
> Key: SOLR-3816
> URL: https://issues.apache.org/jira/browse/SOLR-3816
> Project: Solr
> Issue Type: Improvement
> Components: clients - java, replication (java), search, SearchComponents - other, SolrCloud, update
> Affects Versions: 4.0
> Reporter: Nagendra Nagarajayya
> Labels: nrt, realtime, replication, search, solrcloud, update
> Attachments: alltests_passed_with_realtime_turnedoff.log, SOLR-3816_4.0_branch.patch, SOLR-3816-4.x.trunk.patch, solr-3816-realtime_nrt.patch
>
>
> Need a more granular NRT system that is close to a realtime system. A realtime system should be able to reflect changes to the index as and when docs are added/updated to the index. soft-commit offers NRT and is more realtime friendly than hard commit but is limited by the dependency on the SolrIndexSearcher being closed and reopened and offers a coarse granular NRT. Closing and reopening of the SolrIndexSearcher may impact performance also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
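The request-granularity guarantee described in the comment above can be pictured with a small, Lucene-free sketch. This is not the code from the patch: `IndexSnapshot`, `ToyIndex` and `ToyRequest` are hypothetical names invented here, and an immutable list stands in for the NRT reader. The only point illustrated is that a request takes its view once and every component reads from that same view, while a later request may see more documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Immutable point-in-time view of a toy index (hypothetical stand-in for an NRT reader). */
final class IndexSnapshot {
    private final List<String> docs;

    IndexSnapshot(List<String> docs) {
        // Copy so that later writes cannot change what this snapshot sees.
        this.docs = Collections.unmodifiableList(new ArrayList<>(docs));
    }

    int maxDoc() { return docs.size(); }

    String doc(int docId) { return docs.get(docId); }
}

/** Toy writer that publishes a fresh snapshot after every add (hypothetical). */
final class ToyIndex {
    private final List<String> docs = new ArrayList<>();
    private final AtomicReference<IndexSnapshot> current =
            new AtomicReference<>(new IndexSnapshot(new ArrayList<String>()));

    synchronized void add(String doc) {
        docs.add(doc);
        current.set(new IndexSnapshot(docs));
    }

    /** Intended to be called once per request; the result is shared by all components. */
    IndexSnapshot acquire() { return current.get(); }
}

/** One request: every component reads from the snapshot pinned at construction. */
final class ToyRequest {
    private final IndexSnapshot pinned;

    ToyRequest(ToyIndex index) { this.pinned = index.acquire(); }

    /** "Search" component: number of documents visible to this request. */
    int searchCount() { return pinned.maxDoc(); }

    /** "Facet" component: count of documents equal to the given value. */
    int facetCount(String value) {
        int n = 0;
        for (int i = 0; i < pinned.maxDoc(); i++) {
            if (pinned.doc(i).equals(value)) n++;
        }
        return n;
    }
}
```

In this sketch a request created before an add keeps answering from its old view, and a request created afterwards sees the new document, which mirrors the "same view within a request, possibly different results across requests" behaviour described above.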
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509460#comment-13509460 ]

Nagendra Nagarajayya commented on SOLR-3816:

@radim: Faceting should be working. Let me know if you see any problems.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508785#comment-13508785 ]

Otis Gospodnetic commented on SOLR-3816:

bq. I have implemented a TimedSerialMergeScheduler so that merges are postponed to known time intervals. It works with both tiered and log policies. I can make available that as a patch. It seems to work well on my system here.

Sounds like that would be a good separate patch. I think [~yo...@apache.org] can address your other comments.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508776#comment-13508776 ]

Nagendra Nagarajayya commented on SOLR-3816:

The highlighted check protects the data structure being used, so that it does not overflow. Doc ids should not change except on deletes and a commit. A commit opens a new searcher and a new reader, so this can be watched for in order not to change the NRT reader. In the case of merges, I have implemented a TimedSerialMergeScheduler so that merges are postponed to known time intervals. It works with both the tiered and log merge policies. I can make that available as a patch. It seems to work well on my system here.
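The TimedSerialMergeScheduler mentioned above was not posted in this thread, so its design is unknown here. As a rough illustration of the general idea only (queue merge work and run it serially at known intervals), here is a Lucene-free sketch. `TimedSerialMergeQueue`, its methods, and the explicit clock argument are all assumptions made for the example; a real scheduler would plug into Lucene's merge scheduling instead.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Hypothetical sketch: merge tasks are queued and only run, one after another,
 * once a fixed interval has elapsed since the last merge window.
 */
final class TimedSerialMergeQueue {
    private final long intervalMillis;
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private long lastRunMillis;

    TimedSerialMergeQueue(long intervalMillis, long startMillis) {
        this.intervalMillis = intervalMillis;
        this.lastRunMillis = startMillis;
    }

    /** Postpones a merge: it is recorded but not executed yet. */
    synchronized void submit(Runnable merge) { pending.add(merge); }

    synchronized int pendingCount() { return pending.size(); }

    /**
     * Runs all queued merges serially if the interval has elapsed.
     * The caller passes the current time so the behaviour is deterministic.
     * Returns the number of merges that ran.
     */
    synchronized int maybeRun(long nowMillis) {
        if (nowMillis - lastRunMillis < intervalMillis) {
            return 0; // still inside the quiet period, keep postponing
        }
        int ran = 0;
        Runnable merge;
        while ((merge = pending.poll()) != null) {
            merge.run(); // one merge at a time, never concurrently
            ran++;
        }
        lastRunMillis = nowMillis;
        return ran;
    }
}
```

With a 1000 ms interval, for example, merges submitted at time 0 stay queued when `maybeRun(500)` is called and all run in order on `maybeRun(1000)`. Between merge windows, doc ids are not renumbered by merges, which appears to be the property the comment above relies on.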
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501198#comment-13501198 ]

Radim Kolar commented on SOLR-3816:

What kind of incorrect results can it deliver? For me it's okay to live with the possibility of broken faceting because I do not use that feature.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501194#comment-13501194 ]

Otis Gospodnetic commented on SOLR-3816:

[~nnagarajayya] should we close this then?
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494815#comment-13494815 ]

Yonik Seeley commented on SOLR-3816:

So I took a quick look at the patch, and it's as I feared from earlier discussions - the reader is changed out from under the searcher at random times. This approach simply won't work. Lucene and Solr have searchers that work on a non-changing reader. You'll get incorrect documents back, incorrect facets back, pretty much any number of random-looking bugs, because internal docids will be changing underneath you.

One only has to look at this snippet of the patch as an example of an attempt to be defensive about these changing docids:

{code}
+  /* realtime NRT changes */
+  // We may be getting docs that are beyond maxDoc, ignore for this request.
+  if (doc < maxDoc) {
+    bits.fastSet(doc);
+  }
{code}

So how is it that tests can pass? Well, the vast majority of our tests (like querying, faceting, etc.) index documents and then test the requests. They do not currently test requests while concurrently indexing (and such tests would be much more difficult to write, of course... one would need to know exactly which documents made it into the index in order to know what the correct results should be).

I'm sorry folks, but this really looks like a dead end.
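To make the docid objection above concrete, here is a toy, Lucene-free sketch. `DocIdShiftDemo` and its methods are hypothetical; a plain list stands in for a reader, list positions stand in for internal doc ids, and the "after" view assumes that a delete followed by a merge has renumbered the surviving documents. It shows that a bounds check like the one in the quoted snippet avoids out-of-range ids but cannot detect an id that is still in range and now points at a different document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DocIdShiftDemo {
    /** Returns the internal ids (list positions) of docs equal to term in the given view. */
    static List<Integer> matchIds(List<String> view, String term) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < view.size(); i++) {
            if (view.get(i).equals(term)) ids.add(i);
        }
        return ids;
    }

    /** Resolves ids against a view, skipping ids >= maxDoc in the spirit of the quoted guard. */
    static List<String> resolve(List<String> view, List<Integer> ids) {
        List<String> out = new ArrayList<>();
        int maxDoc = view.size();
        for (int id : ids) {
            if (id < maxDoc) out.add(view.get(id));
        }
        return out;
    }

    public static void main(String[] args) {
        // View the query was matched against.
        List<String> before = Arrays.asList("apple", "pear", "plum");
        // Assumed view after doc 0 is deleted and merged away: survivors are renumbered.
        List<String> after = Arrays.asList("pear", "plum");

        List<Integer> pearIds = matchIds(before, "pear");
        System.out.println(resolve(before, pearIds)); // same view: the matched document
        System.out.println(resolve(after, pearIds));  // swapped view: id is in range but stale
    }
}
```

In this toy model the id collected for "pear" resolves to "plum" once the view is swapped, and the id collected for "plum" is silently dropped by the bounds check, so the request returns a wrong document in one case and loses a hit in the other.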
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494554#comment-13494554 ]

Otis Gospodnetic commented on SOLR-3816:

H... I didn't check the sources just now, but I'm not sure the above is all correct. Lucene gets the new Reader from IndexWriter, and I would think Solr uses that on soft commit and not something else, big and heavy. Yes, there is Searcher/cache warming, but I'm not sure if that comes into play any more with NRT and soft commits.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494105#comment-13494105 ]

Nagendra Nagarajayya commented on SOLR-3816:

@Otis: Yes, you could set it to something low or 0, but this means it has to close and reopen the SolrIndexSearcher that often. SolrIndexSearcher is a heavy, reference-counted object, so there may be searches in progress, etc. It has lots of critical sections that need to be synchronized in order to close and reopen a new searcher, warm it up, etc.; it was not meant for this kind of use ...

Realtime-search just gets a new NRT reader from the writer and passes it along to the Searcher, a lean searcher with no state. In the future, if Lucene's developers make the reader more realtime, so that it sees more changes as they happen at the writer, realtime-search should be able to handle it ...

"Quote from the user using realtime-search"
Insertion speed – while we can’t really explain this, we are able to insert 70k records per second at a steady rate over time with RA, while we can only do 40k at a descending rate with normal Solr. Granted we haven’t even slightly configured regular Solr for high speed insertion with regard to segment configs, but this was good for us to get us quickly off the ground.
"end quote"

I think this has gotten better with the 4.0 release. I have also asked the user to benchmark and update the JIRA, as I don't have the required hardware.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492551#comment-13492551 ]

Otis Gospodnetic commented on SOLR-3816:

[~nnagarajayya] Hmmm, maybe I'm missing something, but if you set the soft commit interval in Solr to something very, very low, then yes, while it is still technically a point-in-time view, that point in time is shifted so frequently that it looks like RT search to a human - new results can show up with every new search. So the effect can be as (N)RT as you choose via the soft commit frequency.

I think the only question is whether that approach or the approach in your patch yields better performance, and it looks like [~hsn] will test that soon and we're all anxiously waiting to see the results! :)
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492457#comment-13492457 ]

Nagendra Nagarajayya commented on SOLR-3816:

@Otis: Apart from the performance improvement, realtime-search makes available a realtime (NRT) view of the index, as opposed to the current Solr implementation of point-in-time snapshots of the index. So each search may return new results ...
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489424#comment-13489424 ]

Radim Kolar commented on SOLR-3816:

I will test the performance of this as soon as I have enough free disk space to load 100 GB into Solr.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489391#comment-13489391 ]

Nagendra Nagarajayya commented on SOLR-3816:

@Otis: Solr config was a standard config.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488905#comment-13488905 ]

Otis Gospodnetic commented on SOLR-3816:

[~nnagarajayya] Right, that is what I thought. Unfortunately, this doesn't make it clear that this change is actually a performance improvement - who knows if those people configured Solr optimally in the first place. :) Is there any way you can put together a side-by-side comparison?
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488755#comment-13488755 ]

Nagendra Nagarajayya commented on SOLR-3816:

@David: You have to disable the caches for now to return results in realtime (nrt).
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488752#comment-13488752 ] Nagendra Nagarajayya commented on SOLR-3816: @Otis: The 70,000 docs / sec update figure is from a user site. I don't have access to their system beyond what they tell me, so I can only share the performance info. They have also tried soft-commit, which gave around 20k-40k docs / sec. I don't have a formal benchmark. The performance tests I ran some time back used the MBArtists index (from the Solr Enterprise book) and measured about 10k docs / sec. My storage system is not the fastest: it is the default internal disk that came with a two-core system.
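To put the figures quoted in this reply side by side, the short sketch below divides the user-reported 70,000 docs / sec by the two ends of the 20k-40k docs / sec soft-commit range. The variable names are illustrative only, and the inputs are the anecdotal numbers from this thread, not a formal benchmark.

```python
# Rates as reported in the thread (docs per second); anecdotal, not benchmarked.
realtime_rate = 70_000
soft_commit_rates = (20_000, 40_000)

# Ratio of the realtime NRT rate to each end of the soft-commit range.
speedups = [realtime_rate / rate for rate in soft_commit_rates]

for rate, speedup in zip(soft_commit_rates, speedups):
    print(f"vs soft-commit at {rate} docs/sec: {speedup:.2f}x")
```

The ratio depends heavily on which end of the soft-commit range is taken as the baseline, which is one reason a controlled benchmark would be more informative than these numbers.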
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488720#comment-13488720 ] Nagendra Nagarajayya commented on SOLR-3816: Attached a patch for the 4.x trunk. All tests pass with realtime turned off and turned on. Will attach the test logs also.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487423#comment-13487423 ] Otis Gospodnetic commented on SOLR-3816: [~nnagarajayya] Have you done any benchmarks against Solr 4.0 that show any performance differences?
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487000#comment-13487000 ] Daniel Collins commented on SOLR-3816: @David, the original post does say "The cache (Query Result Cache, etc.) needs to be disabled for realtime NRT.", so presumably it skips the caches to provide its NRT capability (I'm still working through what the patch does myself).
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486927#comment-13486927 ] David Smiley commented on SOLR-3816: Since this NRT system does not re-open the SolrIndexSearcher, does it also not refresh its caches? If it doesn't, wouldn't a cached filter query return incorrect results if the system has received updates since Solr started?
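The concern raised here can be illustrated with a toy model. The sketch below is not Solr code: `ToySearcher`, its `filter_cache` dictionary, and the color field are hypothetical stand-ins, used only to show how a cache that outlives an index update can keep serving an old answer when the searcher is never reopened.

```python
class ToySearcher:
    """Toy stand-in for a long-lived searcher with a filter cache (not Solr code)."""

    def __init__(self, docs):
        self.docs = docs           # live reference to the index contents
        self.filter_cache = {}     # filter value -> cached list of matching ids

    def filter(self, color):
        # Compute on a miss; afterwards always serve the cached entry.
        if color not in self.filter_cache:
            self.filter_cache[color] = sorted(
                doc_id for doc_id, value in self.docs.items() if value == color
            )
        return self.filter_cache[color]


index = {1: "red", 2: "blue"}
searcher = ToySearcher(index)

first = list(searcher.filter("red"))   # computed and cached
index[3] = "red"                       # an update arrives; searcher is not reopened
stale = list(searcher.filter("red"))   # cache hit: document 3 is missing
searcher.filter_cache.clear()          # comparable to running without the cache
fresh = list(searcher.filter("red"))   # recomputed against the current index

print(first, stale, fresh)
```

In this model the only ways to get a correct answer after the update are to invalidate the cache or not to use one, which matches the advice elsewhere in the thread to disable the caches.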
Re: [jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
On 29.10.2012 17:31, Erick Erickson wrote:
> Radim: Do you mean branch 4x (as opposed to 4.0) or trunk (5.x)?

5.x; it can be merged to 4x if it proves to be useful.
Re: [jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
Radim: Do you mean branch 4x (as opposed to 4.0) or trunk (5.x)?

On Mon, Oct 29, 2012 at 10:10 AM, Radim Kolar (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486034#comment-13486034 ]
> Radim Kolar commented on SOLR-3816:
> patch should be against solr trunk. Solr 4.0 branch is for bugfixes only.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486034#comment-13486034 ] Radim Kolar commented on SOLR-3816: patch should be against solr trunk. Solr 4.0 branch is for bugfixes only.
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486012#comment-13486012 ] Nagendra Nagarajayya commented on SOLR-3816:

Lucene/Solr search and commit architecture is designed to work off “point-in-time snapshots” of the index. Any add/update/delete needs a commit (or at least a soft-commit) to be visible to searches. A soft-commit re-opens the SolrIndexSearcher object and can be a performance limitation if soft-commits happen more than once per second; see the blog post: http://searchhub.org/dev/2011/09/07/realtime-get/.

Realtime NRT makes available a near realtime view of the index, so any changes made to the index are immediately visible. Performance is not a limitation, as it does not close the SolrIndexSearcher object as soft-commit does.

Realtime NRT is also different from realtime-get, which is a simple lookup by id and needs the transaction log to be enabled. realtime-get does not have search capability. Realtime NRT allows full search, so you could search by id, text, location, etc. using boolean, dismax, faceting, range queries, i.e. there is no change to existing functionality. No new request handlers need to be defined in solrconfig.xml, so all of your existing queries work as they are with no changes, except that the results returned are in near real time.

Realtime NRT also does not need the transaction update log needed by realtime-get, so you can turn this off for improved performance. The autoCommit interval can also be increased to an hour from the default of 15 secs for improved performance (remember, commits can slow down your update performance).

More info about Realtime NRT and an integrated download of Solr 4.0 with Realtime NRT is available here: http://solr-ra.tgels.org/realtime-nrt.jsp

(Attached a patch for Solr 4.0 release branch)
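For readers comparing the three mechanisms contrasted in this comment, here is a minimal sketch of how the corresponding requests are usually addressed in a stock Solr 4.0 install. It only builds URLs and sends nothing. The host, port, and helper names are assumptions for illustration; the `/get` path assumes the realtime-get handler is registered as in the example solrconfig.xml, and `softCommit=true` is the update-handler parameter of stock Solr 4.0, not something introduced by the patch.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr"  # assumed host, port, and default core


def soft_commit_url():
    """Update request that asks for a soft commit (reopens the searcher)."""
    return f"{BASE}/update?" + urlencode({"softCommit": "true"})


def realtime_get_url(doc_id):
    """Realtime-get: lookup by id only, served from the transaction log."""
    return f"{BASE}/get?" + urlencode({"id": doc_id})


def search_url(query, **params):
    """Ordinary search request; per the comment, Realtime NRT leaves this unchanged."""
    return f"{BASE}/select?" + urlencode({"q": query, **params})


print(soft_commit_url())
print(realtime_get_url("doc1"))
print(search_url("name:beatles", rows=10))
```

The point of the comparison is that only the last form supports arbitrary queries, and it is the one the comment says keeps working as-is under Realtime NRT.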
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452045#comment-13452045 ] Nagendra Nagarajayya commented on SOLR-3816:

The attached patch provides a realtime NRT that is closer to a realtime system and offers fine-grained NRT at very high performance. It also offers a path to a true realtime system as and when the underlying IndexWriter system supports it. A performance of about 70,000 document adds / sec has been seen on a user system with realtime queries (almost 1.5-2x improvement over soft-commit).

The realtime nrt can be enabled by adding the following tag to solrconfig.xml: true/false

A true enables realtime nrt while false disables it. The visible attribute controls the time a document may not be visible in a search. Setting this to 0 means new documents are visible immediately in searches. Very high performance can be observed with visible around 150-200ms. For example: true

The other parameters supported are: true/false

A true deletes duplicates while a false allows duplicates to exist until a commit. Setting this to true may impact performance. The same may be achieved by setting to 1.

Note:
1. The cache (Query Result Cache, etc.) needs to be disabled for realtime NRT.
2. Increase the number of file descriptors to around 64k before starting Solr, i.e. ulimit -n 65536
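The file-descriptor note above can be checked before starting Solr. The following is a small pre-flight sketch using Python's Unix-only `resource` module; the function name `fd_limit_ok` and the constant `REQUIRED_FDS` are made up for this example, and the threshold simply mirrors the `ulimit -n 65536` value from the comment.

```python
import resource  # Unix-only standard library module

REQUIRED_FDS = 65536  # value suggested in the note (ulimit -n 65536)


def fd_limit_ok(required=REQUIRED_FDS):
    """Return True if the soft open-file limit is at least `required`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unlimited soft limit is reported as RLIM_INFINITY.
    return soft == resource.RLIM_INFINITY or soft >= required


soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft_limit} hard={hard_limit} ok={fd_limit_ok()}")
```

If the check reports `ok=False`, the limit would need to be raised in the shell that launches Solr, since the limit is inherited by the child process.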