Re: is there a way to remove deleted documents from index without optimize
On 10/12/2017 10:01 PM, Erick Erickson wrote:
> You can use the IndexUpgrader tool that ships with each version of Solr
> (well, actually Lucene) to, well, upgrade your index. So you can use
> the IndexUpgrader that ships with 5x to upgrade from 4x, the one that
> ships with 6x to upgrade from 5x, etc.
>
> That said, none of that is necessary _if_ you
> - have the Lucene version in solrconfig.xml be the one that corresponds to
>   your current Solr. I.e. a solrconfig for 6x should have a
>   luceneMatchVersion of 6something.
> - update your index enough to rewrite all segments before moving to the
>   _next_ version. When Lucene merges a segment, it writes the new segment
>   according to the luceneMatchVersion in solrconfig.xml. So as long as you
>   are on a version long enough for all segments to be merged into new
>   segments, you don't have to worry.

As far as I am aware, luceneMatchVersion in Solr will not change the segment format, but only how some Lucene components (primarily analysis) function. Have I got incorrect information?

Something else that might be worth mentioning: The IndexUpgrader is a fairly simple piece of code. It runs forceMerge (optimize) on the index, creating a single new segment from the entire existing index. That ties into this thread's initial subject and LUCENE-7976. I wonder if perhaps the upgrade merge policy should be changed so that it just rewrites all existing segments instead of fully merging them.

Thanks,
Shawn
Re: is there a way to remove deleted documents from index without optimize
Thanks for the clarification. I use ${lucene.version} in solrconfig.xml and pass -Dlucene.version when I launch Solr, to keep the versions aligned.
Re: is there a way to remove deleted documents from index without optimize
You can use the IndexUpgrader tool that ships with each version of Solr (well, actually Lucene) to, well, upgrade your index. So you can use the IndexUpgrader that ships with 5x to upgrade from 4x, the one that ships with 6x to upgrade from 5x, etc.

That said, none of that is necessary _if_ you

- have the Lucene version in solrconfig.xml be the one that corresponds to your current Solr. I.e. a solrconfig for 6x should have a luceneMatchVersion of 6something.
- update your index enough to rewrite all segments before moving to the _next_ version. When Lucene merges a segment, it writes the new segment according to the luceneMatchVersion in solrconfig.xml. So as long as you are on a version long enough for all segments to be merged into new segments, you don't have to worry.

Best,
Erick
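The step-by-step upgrade described above (the 5x tool for a 4x index, then the 6x tool for a 5x index) could be scripted roughly as follows. This is only a sketch: the directory layout, the index path, and the release numbers 5.5.4 and 6.6.1 are placeholders, and it assumes both the lucene-core and lucene-backward-codecs jars of each release are on the classpath. It builds the command lines without running them; the index should be backed up and must not be open in a running Solr when the tool is actually run.

```python
import os
from typing import List


def index_upgrader_command(lib_dir: str, lucene_version: str, index_dir: str,
                           delete_prior_commits: bool = False) -> List[str]:
    """Build the argv for one IndexUpgrader run using one Lucene release's jars."""
    jars = [
        os.path.join(lib_dir, f"lucene-core-{lucene_version}.jar"),
        # Assumed to be needed so the newer release can read the older format.
        os.path.join(lib_dir, f"lucene-backward-codecs-{lucene_version}.jar"),
    ]
    cmd = ["java", "-cp", os.pathsep.join(jars),
           "org.apache.lucene.index.IndexUpgrader"]
    if delete_prior_commits:
        cmd.append("-delete-prior-commits")
    cmd.append(index_dir)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and versions: one hop per major version, oldest first.
    index_dir = "/var/solr/data/core1/data/index"
    for version in ("5.5.4", "6.6.1"):
        lib_dir = f"/opt/lucene-{version}"
        print(" ".join(index_upgrader_command(lib_dir, version, index_dir)))
```

Each printed command can then be reviewed and run by hand, one major version at a time.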
Re: is there a way to remove deleted documents from index without optimize
I should have read this. My project started on Apache Solr 4.x, moved to 5.x, and recently migrated to 6.6.1. Do you think Solr will take care of old-version indexes as well? I want to make sure my indexes are updated to the 6.x Lucene version so that they will be supported when I move to Solr 7.x.

Is there any best practice for managing Solr indexes?

Harry
Re: is there a way to remove deleted documents from index without optimize
Don’t do anything. Solr will automatically clean up the deleted documents for you.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
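One way to watch that automatic cleanup happen is to compare the numDocs (live documents) and maxDoc (live plus deleted) figures that Solr reports for a core. The helper below is a minimal sketch of that arithmetic; the function name and the sample numbers are made up for illustration.

```python
def deleted_fraction(num_docs: int, max_doc: int) -> float:
    """Fraction of the index still occupied by deleted documents.

    num_docs counts live documents; max_doc also counts deleted documents
    that have not yet been merged away.
    """
    if max_doc <= 0:
        return 0.0
    return (max_doc - num_docs) / max_doc


if __name__ == "__main__":
    # Illustrative numbers only: 900,000 live docs out of 1,000,000 slots.
    print(f"{deleted_fraction(900_000, 1_000_000):.1%} of the index is deleted docs")
```

If this fraction falls as indexing continues, background merges are reclaiming the space without an optimize.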
Re: is there a way to remove deleted documents from index without optimize
Avoid optimize like the plague. Instead, focus on tuning the segment merging process. As you commit, segments are created, and they are periodically merged. Merging removes the remnants of tombstoned (deleted) docs. You can tune this process, and if you're dealing with a lot of updates, it is something you definitely want to tune. See this document and scroll down to the merge parameters:

https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig

There are other options for dealing with high update rates. You could shard SolrCloud further and minimize replication. You could put updates in Kafka and work through them as you can, catching up whenever you have a slow period. You can tune your hard and soft commits to create segments of an appropriate size, etc.

-Doug

--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections <http://opensourceconnections.com>, LLC
Author: Relevant Search <http://manning.com/turnbull>
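Besides tuning the merge policy, Solr's update handler accepts an expungeDeletes option on commit, which asks for segments containing deleted documents to be merged without forcing the whole index down to one segment. It is not mentioned in this thread, so take the sketch below as an aside. The host, port, and collection name are placeholders, and the request can still be expensive on a large index.

```python
from urllib.parse import urlencode


def expunge_deletes_commit_url(solr_base: str, collection: str) -> str:
    """Build an update URL that commits and asks Solr to expunge deletes."""
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{solr_base.rstrip('/')}/{collection}/update?{params}"


if __name__ == "__main__":
    # Hypothetical host and collection; this only prints the URL, it sends nothing.
    print(expunge_deletes_commit_url("http://localhost:8983/solr", "mycollection"))
```

The URL can then be issued with any HTTP client during a quiet period.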
is there a way to remove deleted documents from index without optimize
My index is updated frequently, and I need to remove unused documents from the index after an update/reindex. Optimization is very expensive, so what should I do?

--
View this message in context: http://lucene.472066.n3.nabble.com/is-there-a-way-to-remove-deleted-documents-from-index-without-optimize-tp4230691.html
Sent from the Solr - User mailing list archive at Nabble.com.