Because Solr is not updating documents. Solr is adding to indexes
of fields. You cannot add a TextField document to a StringField index.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 17, 2020, at 5:23 AM, Vinay Rajput <vinayrajput4...@gmail.com> wrote:
> 
> Sorry to jump into this discussion. I also get confused whenever I see this
> strange Solr/Lucene behaviour. Probably, As @Erick said in his last year
> talk, this is how it has been designed to avoid many problems that are
> hard/impossible to solve.
> 
> That said, one more time I want to come back to the same question: why
> solr/lucene can not handle this when we are updating all the documents?
> Let's take a couple of examples :-
> 
> *Ex 1:*
> Let's say I have only 10 documents in my index and all of them are in a
> single segment (Segment 1). Now, I change the schema (update field type in
> this case) and reindex all of them.
> This is what (according to me) should happen internally :-
> 
> 1st update req : Solr will mark 1st doc as deleted and index it again
> (might run the analyser chain based on config)
> 2nd update req : Solr will mark 2st doc as deleted and index it again
> (might run the analyser chain based on config)
> And so on......
> based on autoSoftCommit/autoCommit configuration, all new documents will be
> indexed and probably flushed to disk as part of new segment (Segment 2)
> 
> 
> Now, whenever segment merging happens (during commit or later in time),
> lucene will create a new segment (Segment 3) can discard all the docs
> present in segment 1 as there are no live docs in it. And there would *NOT*
> be any situation to decide whether to choose the old config or new config
> as there is not even a single live document with the old config. Isn't it?
> 
> *Ex 2:*
> I see that it can be an issue if we think about reindexing millions of
> docs. Because in that case, merging can be triggered when indexing is half
> way through, and since there are some live docs in the old segment (with
> old cofig), things will blow up. Please correct me if I am wrong.
> 
> I am *NOT* a Solr/Lucene expert and just started learning the ways things
> are working internally. In the above example, I can be wrong at many
> places. Can someone confirm if scenarios like Ex-2 are the reasons behind
> the fact that even re-indexing all documents doesn't help if some
> incompatible schema changes are done?  Any other insight would also be
> helpful.
> 
> Thanks,
> Vinay
> 
> On Sat, Oct 17, 2020 at 5:48 AM Shawn Heisey <apa...@elyograg.org> wrote:
> 
>> On 10/16/2020 2:36 PM, David Hastings wrote:
>>> sorry, i was thinking just using the
>>> <delete><query>*:*</query></delete>
>>> method for clearing the index would leave them still
>> 
>> In theory, if you delete all documents at the Solr level, Lucene will
>> delete all the segment files on the next commit, because they are empty.
>>  I have not confirmed with testing whether this actually happens.
>> 
>> It is far safer to use a new index as Erick has said, or to delete the
>> index directories completely and restart Solr ... so you KNOW the index
>> has nothing in it.
>> 
>> Thanks,
>> Shawn
>> 

Reply via email to