Owen:

Collection reload is necessary but not sufficient. You’ll still get wonky 
results even if you re-index everything unless you delete _all_ the documents 
first or start with a whole new collection. Each Lucene segment is a “mini 
index” with its own picture of the structure of that index (i.e. the schema in 
force when it was created). If you have segments created with the old schema 
and other segments with the new schema, when they get merged the result is 
undefined. It may not blow up, but it also won’t do what you want.
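
The error in your original message is consistent with that picture. Here’s a 
toy model in plain Java. It is not Lucene code, and the class and method names 
are made up for illustration. It only mimics the idea that an index remembers 
how a field was first indexed and refuses a conflicting change:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, NOT Lucene code: names here are hypothetical.
class FieldOptionsSketch {

    // Only the two values that appear in the error message from this thread.
    enum IndexOptions { DOCS, DOCS_AND_FREQS_AND_POSITIONS }

    // Field name -> options recorded when the field was first seen.
    private final Map<String, IndexOptions> seen = new HashMap<>();

    // Record the options for a field, or fail if they differ from what
    // earlier documents already established.
    void addField(String name, IndexOptions options) {
        IndexOptions previous = seen.putIfAbsent(name, options);
        if (previous != null && previous != options) {
            throw new IllegalArgumentException(
                "cannot change field \"" + name + "\" from index options=" + previous
                + " to inconsistent index options=" + options);
        }
    }

    public static void main(String[] args) {
        FieldOptionsSketch index = new FieldOptionsSketch();
        // Documents indexed while "title" was still text_general.
        index.addField("title", IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
        try {
            // Documents indexed after "title" became a string field.
            index.addField("title", IndexOptions.DOCS);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a brand-new collection there is no earlier record for “title”, so the 
second call would be the first one and nothing would conflict.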

Take your change from a text type to a string type, and a doc whose title is 
“my dog has fleas”. In a segment where the field is defined as a text type, 
you’ll be able to search for “dog” and get the doc. Similarly for “Dog” 
(assuming you have lowercasing in your analysis chain). The phrase “has fleas” 
would hit, as would “dog fleas”~2.

For a segment where the field is defined as a string type, the whole value is 
indexed as a single unanalyzed token, so you will only get a hit if you search 
for exactly “my dog has fleas”. You wouldn’t find the doc if you searched for 
any of the following:
- my AND dog AND has AND fleas
- “My dog has fleas”
- fleas
- “dog has fleas my”
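
If it helps, the difference can be sketched in plain Java with no Solr 
involved. This is a toy: it assumes the text field’s analysis chain is just a 
whitespace tokenizer plus lowercasing, it treats every text query as a phrase, 
and it does not model AND queries or slop (~2). All names are made up for 
illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model only: real Solr analysis and query parsing do far more.
class FieldMatchSketch {

    // Assumed analysis chain: split on whitespace, then lowercase.
    static List<String> analyze(String value) {
        return Arrays.asList(value.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Text field: the query terms must appear consecutively and in order
    // among the indexed terms. A single word is a one-term phrase.
    static boolean textFieldMatches(String indexed, String query) {
        return Collections.indexOfSubList(analyze(indexed), analyze(query)) >= 0;
    }

    // String field: the whole value is one term, so only an identical
    // value (same case, same spacing) matches.
    static boolean stringFieldMatches(String indexed, String query) {
        return indexed.equals(query);
    }

    public static void main(String[] args) {
        String title = "my dog has fleas";
        String[] queries = {
            "dog", "Dog", "has fleas", "fleas",
            "My dog has fleas", "dog has fleas my", "my dog has fleas"
        };
        for (String q : queries) {
            System.out.println(q + " | text: " + textFieldMatches(title, q)
                + " | string: " + stringFieldMatches(title, q));
        }
    }
}
```

Only the last query matches in the string case. The text case matches 
everything except the reordered phrase.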

When those segments are merged, Lucene doesn’t have the information to “do the 
right thing”, and even if it did the cost would be prohibitive because it’d be 
like re-indexing all the docs in one segment or the other.

You cannot spoof this by simply reindexing the corpus over the top of an 
existing index, since that’ll involve a bunch of merges between old-schema and 
new-schema segments.
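
A rough sketch of the “reload, then delete everything, then reindex” route, 
using the JDK’s HTTP client against Solr’s Collections API RELOAD action and 
the JSON delete-by-query on the update handler. The base URL is an assumption 
(a local Solr on the default port), “mycollection” comes from your snippet, and 
the class and method names are made up. Treat it as a starting point, not a 
tested procedure:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: base URL and names are assumptions for illustration.
class ReindexPrepSketch {

    // Collections API RELOAD so the collection picks up the changed schema.
    static HttpRequest reloadRequest(String solrBase, String collection) {
        URI uri = URI.create(solrBase + "/admin/collections?action=RELOAD&name=" + collection);
        return HttpRequest.newBuilder(uri).GET().build();
    }

    // Delete-by-query matching every document, committed immediately.
    static HttpRequest deleteAllRequest(String solrBase, String collection) {
        URI uri = URI.create(solrBase + "/" + collection + "/update?commit=true");
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString("{\"delete\":{\"query\":\"*:*\"}}"))
            .build();
    }

    public static void main(String[] args) throws Exception {
        String base = "http://localhost:8983/solr"; // assumed local Solr
        HttpRequest[] steps = {
            reloadRequest(base, "mycollection"),
            deleteAllRequest(base, "mycollection")
        };
        HttpClient client = HttpClient.newHttpClient();
        for (HttpRequest step : steps) {
            System.out.println(step.method() + " " + step.uri());
            // Nothing is sent unless "send" is passed as the first argument.
            if (args.length > 0 && args[0].equals("send")) {
                HttpResponse<String> response =
                    client.send(step, HttpResponse.BodyHandlers.ofString());
                System.out.println("  HTTP " + response.statusCode());
            }
        }
    }
}
```

Run without arguments it only prints the two requests. Reindexing the corpus 
comes after both have succeeded.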

You’re seeing consistent results here because you started with a _new_ 
collection that had no old segments lying around.

Best,
Erick

> On Oct 20, 2020, at 4:37 AM, Cox, Owen <o...@deloitte.co.uk> wrote:
> 
> Hi Konstantinos, I think you're onto something there.  I don't think the 
> collection was reloaded.  I've just tried the same code against a different 
> collection that uses the same configset; the only difference is that this 
> collection was created after the schema changes.  That works, so it must've 
> been the reload that was missing.
> 
> Thanks!
> 
> Owen Cox
> Senior Consultant | Deloitte MCS Limited
> D: +44 20 7007 1657
> o...@deloitte.co.uk | www.deloitte.co.uk
> 
> 
> -----Original Message-----
> From: Konstantinos Koukouvis <konstantinos.koukou...@mecenat.com>
> Sent: 20 October 2020 09:04
> To: solr-user@lucene.apache.org
> Subject: [EXT: NEWSLETTER] Re: SolrDocument difference between String and 
> text_general
> 
> Hi Owen,
> 
> If I understand correctly, you have changed the schema, then reloaded the 
> core and reindexed all data, right? Whenever I’ve got this error, I’ve 
> usually forgotten to do one of those two things…
> 
> Regards,
> Konstantinos
> 
>> On 20 Oct 2020, at 09:53, Cox, Owen <o...@deloitte.co.uk> wrote:
>> 
>> Hi folks,
>> 
>> I'm using Solr 8.5.2 and populating documents which include a string field 
>> called "title".  This field used to be text_general, but the data was 
>> reindexed and we've been inserting data happily with REST calls and it's 
>> been behaving as desired.
>> 
>> I've now written a Java Spring-Boot program to populate documents (snippet 
>> below) using SolrCrudRepository.  This works when I don't index the "title" 
>> field, but when I try to include title I get the following error: "cannot 
>> change field "title" from index options=DOCS_AND_FREQS_AND_POSITIONS to 
>> inconsistent index options=DOCS"
>> 
>> To me that looks like it's trying to index the title as text_general and 
>> store it in a string field.  But the Solr schema states that field is 
>> string, all of the data in it is string, and every other string field in 
>> the document is indexed correctly.
>> 
>> Could there be any hanging reference to the field's type anywhere?  Or some 
>> requirement that a field named "title" is always text_general or something 
>> odd like that?
>> 
>> Any help appreciated, thanks
>> Owen
>> 
>> 
>> 
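>> // Imports are assumed; the original snippet did not show them.
>> import lombok.Data;
>> import org.apache.solr.client.solrj.beans.Field;
>> import org.springframework.data.annotation.Id;
>> import org.springframework.data.solr.core.mapping.SolrDocument;
>> 
>> @Data
>> @SolrDocument(collection="mycollection")
>> public class Node {
>> 
>>   // Unique key of the document.
>>   @Id
>>   @Field
>>   private String id;
>> 
>>   // Declared as a string field in the Solr schema.
>>   @Field
>>   private String title;
>> 
>>   // (the original snippet ends here; any further members are not shown)
>> }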
>> 
>> 
>> 
>> 
>> IMPORTANT NOTICE
>> 
>> This communication is from Deloitte LLP, a limited liability partnership 
>> registered in England and Wales with registered number OC303675. Its 
>> registered office is 1 New Street Square, London EC4A 3HQ, United Kingdom. 
>> Deloitte LLP is the United Kingdom affiliate of Deloitte NSE LLP, a member 
>> firm of Deloitte Touche Tohmatsu Limited, a UK private company limited by 
>> guarantee ("DTTL"). DTTL and each of its member firms are legally separate 
>> and independent entities. DTTL and Deloitte NSE LLP do not provide services 
>> to clients. Please see 
>> www.deloitte.co.uk/about<https://www.deloitte.co.uk/about> to learn more 
>> about our global network of member firms. For details of our professional 
>> regulation please see 
>> Regulators<https://www2.deloitte.com/uk/en/footerlinks1/regulators-and-provision-service-regulations.html>.
>> 
>> This communication contains information which is confidential and may also 
>> be privileged. It is for the exclusive use of the intended recipient(s). If 
>> you are not the intended recipient(s), please notify 
>> it.security...@deloitte.co.uk<mailto:it.security...@deloitte.co.uk> and 
>> destroy this message immediately. Email communications cannot be guaranteed 
>> to be secure or free from error or viruses. All emails sent to or from a 
>> @deloitte.co.uk email account are securely archived and stored by an 
>> external supplier within the European Union.
>> 
>> You can understand more about how we collect and use (process) your personal 
>> information in our Privacy 
>> Notice<https://www2.deloitte.com/uk/en/legal/privacy.html>.
>> 
>> Deloitte LLP does not accept any liability for use of or reliance on the 
>> contents of this email by any person save by the intended recipient(s) to 
>> the extent agreed in a Deloitte LLP engagement contract.
>> 
>> Opinions, conclusions and other information in this email which have not 
>> been delivered by way of the business of Deloitte LLP are neither given nor 
>> endorsed by it.
> 
> ==================================================
> Konstantinos Koukouvis
> konstantinos.koukou...@mecenat.com
> 
> Using Golang and Solr? Try this: https://github.com/mecenat/solr
