On 10/17/2020 6:23 AM, Vinay Rajput wrote:
That said, one more time I want to come back to the same question: why
can Solr/Lucene not handle this when we are updating all the documents?
Let's take a couple of examples:

*Ex 1:*
Let's say I have only 10 documents in my index and all of them are in a
single segment (Segment 1). Now, I change the schema (update field type in
this case) and reindex all of them.
This is what (according to me) should happen internally :-

1st update req: Solr will mark the 1st doc as deleted and index it again
(might run the analyser chain based on config)
2nd update req: Solr will mark the 2nd doc as deleted and index it again
(might run the analyser chain based on config)
And so on...
Based on the autoSoftCommit/autoCommit configuration, all new documents
will be indexed and probably flushed to disk as part of a new segment
(Segment 2). Roughly, my update loop looks like the sketch below.
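Just to make that flow concrete, a rough SolrJ sketch of the reindexing
client (collection name and field names are made up, commits left to
autoCommit/autoSoftCommit unless you call commit() yourself):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ReindexAll {
    public static void main(String[] args) throws Exception {
        // Hypothetical core/collection name; adjust the URL for your setup.
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build()) {
            for (int i = 1; i <= 10; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                // Same id as before, so the old version is marked deleted
                doc.addField("id", Integer.toString(i));
                // Field whose type changed in the schema; indexed with the NEW analysis chain
                doc.addField("title_txt", "document " + i);
                client.add(doc);
            }
            // Or rely on autoCommit/autoSoftCommit instead of an explicit commit
            client.commit();
        }
    }
}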

<snip>

*Ex 2:*
I see that it can be an issue if we think about reindexing millions of
docs, because in that case merging can be triggered when indexing is only
halfway through, and since there are some live docs in the old segments
(with the old config), things will blow up. Please correct me if I am wrong.

If you could guarantee a few things, you could be sure this will work. But it's a serious long shot.

The change in schema might be such that when Lucene tries to merge them, it fails because the data in the old segments is incompatible with the new segments. If that happens, then you're sunk ... it won't work at all.

If the merges of old and new segments are successful, then you would have to optimize the index after you're done indexing to be SURE there were no old documents remaining. Lucene calls that operation "ForceMerge". This operation is disruptive and can take a very long time.
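If you wanted to trigger that from a client, a SolrJ call along these lines
would do it. The collection name is hypothetical, and optimize() is Solr's
name for Lucene's forceMerge:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class ForceMergeAll {
    public static void main(String[] args) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build()) {
            // waitFlush=true, waitSearcher=true, merge down to a single segment.
            // Expect this to take a very long time on a large index.
            client.optimize(true, true, 1);
        }
    }
}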

You would also have to be sure there was no query activity until the update/merge is completely done. Which probably means that you'd want to work on a copy of the index in another collection. And if you're going to do that, you might as well start indexing from scratch into a new/empty collection. That would also allow you to continue querying the old collection until the new one was ready.
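For the new-collection approach, the usual trick in SolrCloud is to index
into a fresh collection and then repoint an alias at it once it is ready,
so queries never see the half-built index. A sketch, with made-up names
("products" is the alias your queries use, "products_v2" is the newly
reindexed collection):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class SwapAlias {
    public static void main(String[] args) throws Exception {
        // Point at the Solr base URL (not a specific collection) for admin requests.
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr").build()) {
            // Atomically repoint the "products" alias at the new collection.
            CollectionAdminRequest.createAlias("products", "products_v2")
                                  .process(client);
        }
    }
}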

Thanks,
Shawn
