The message is telling you that you previously indexed the field boe.search.wild_description with offsets and now you are trying to index it without offsets. This probably indicates you are using a different Analyzer or a different field configuration (FieldType) for that field, which is generally *not ok*, since indexed fields must be indexed in a consistent way in order to be searchable in a consistent way.
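
To make the rule concrete, here is a minimal plain-Java sketch (no Lucene dependency) of a per-field registry that rejects a second registration of the same field with different index options. The class FieldOptionsRegistry, its register method and the Options enum are hypothetical names invented for this illustration; only the two option names are taken from the exception message. It is a toy model of the consistency rule, not Lucene's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Lucene code): each field name is pinned to the first index options it was seen with. */
public class FieldOptionsRegistry {

    /** Option names copied from the exception message; the enum itself is illustrative. */
    public enum Options {
        DOCS_AND_FREQS_AND_POSITIONS,
        DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
    }

    private final Map<String, Options> seen = new HashMap<>();

    /** Registers a field; throws if the same field arrives with different options. */
    public void register(String field, Options options) {
        Options previous = seen.putIfAbsent(field, options);
        if (previous != null && previous != options) {
            throw new IllegalArgumentException(
                "cannot change field \"" + field + "\" from index options=" + previous
                + " to inconsistent index options=" + options);
        }
    }

    public static void main(String[] args) {
        FieldOptionsRegistry master = new FieldOptionsRegistry();
        // The master index saw the field with offsets first ...
        master.register("boe.search.wild_description",
                Options.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        try {
            // ... and a delta index later brings the same field without offsets.
            master.register("boe.search.wild_description",
                    Options.DOCS_AND_FREQS_AND_POSITIONS);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the real code the usual way to stay consistent is to define the field once (for example a single shared FieldType constant) and use that same definition when writing both the delta indexes and the master index. An index that was already written with the other definition would most likely need to be rebuilt.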

On Thu, May 29, 2025 at 11:03 PM Saha, Rajib <rajib.s...@sap.com.invalid> wrote:
>
> Dear Experts,
>
> Can somebody please help and guide me on the queries below?
> I have become a bit clueless now, after a good number of different tries.
>
> Regards
> Rajib
>
> -----Original Message-----
> From: Saha, Rajib
> Sent: 27 May 2025 11:52
> To: java-user@lucene.apache.org
> Subject: RE: Suggestion needed for a case of Lucene Migration with TokenStream
>
> Hi Uwe,
>
> Thanks for your suggestions so far. We have been able to make good progress.
> We are now stuck at a point where we need your expert suggestion.
>
> As per our design, on full content indexing,
> - in the first step, small Lucene index files are created with 5-6
>   documents each. We call them delta index files.
> - in the second step, we try to merge the delta index files into the
>   master index.
> Below is a snippet of the code:
> ============================
> IndexWriter masterIndexWriter = new IndexWriter(indexDir, config);
> FSDirectory[] deltaIndexDirs = new FSDirectory[deltaIndexDirList.size()];
> int j = 0;
> for (Iterator<FSDirectory> i = deltaIndexDirList.iterator(); i.hasNext(); j++) {
>     deltaIndexDirs[j] = i.next();
> }
> masterIndexWriter.addIndexes(deltaIndexDirs);
> ===========================
>
> But on doing so, we get the exception below.
> I tried several things but could not get past the problem.
> Do you suspect anything here? Can you please suggest a way out of
> the problem?
>
> =============================================
> CaughtException while Merging in LuceneIndexEngine cannot change field "boe.search.wild_description" from index options=DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS to inconsistent index options=DOCS_AND_FREQS_AND_POSITIONS
> java.lang.IllegalArgumentException: cannot change field "boe.search.wild_description" from index options=DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS to inconsistent index options=DOCS_AND_FREQS_AND_POSITIONS
>     at org.apache.lucene.index.FieldInfos$FieldNumbers.addOrGet(FieldInfos.java:308)
>     at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2984)
>     at com.sap.businessobjects.platform.search.lucene.index.engine.LuceneIndexEngine.merge(LuceneIndexEngine.java:981)
> =============================================
>
> Regards
> Rajib
>
>
> -----Original Message-----
> From: Uwe Schindler <u...@thetaphi.de>
> Sent: 30 April 2025 02:03
> To: java-user@lucene.apache.org
> Subject: Re: Suggestion needed for a case of Lucene Migration with TokenStream
>
> If this is Windows, the deletion may not work if there are still
> IndexReaders or Writers open by the same or other processes.
>
> On Linux I have no idea, need an exception message. It should clearly
> say why it fails.
>
> Uwe
>
> On 29.04.2025 at 13:44, Saha, Rajib wrote:
> > Hi Uwe,
> >
> > In our product we have different levels of indexing, such as
> > MetaData/FullContent information of the Reports.
> > So, Rebuild indexing deletes the existing Lucene index files and does a
> > fresh indexing of all the documents.
> >
> > When we physically go to the directory and delete the Lucene index files,
> > the Rebuild indexing works fine.
> > But when we select Rebuild indexing from the UI of the product,
> > indexing does not happen.
> >
> > I am debugging it further and will update you once I have a better
> > picture. As our code for this area involves multiple tasks and threads, it
> > is taking time to debug.
> >
> > I suspect there may be some lock on the Lucene index files, due
> > to which deletion of the Lucene index files is not working with stopping
> > the service. But this is a guess. Investigation is ongoing.
> > Do you suspect anything?
> >
> > Regards
> > Rajib
> >
> > -----Original Message-----
> > From: Uwe Schindler <u...@thetaphi.de>
> > Sent: 28 April 2025 17:59
> > To: java-user@lucene.apache.org
> > Subject: Re: Suggestion needed for a case of Lucene Migration with TokenStream
> >
> > Hi,
> >
> > what do you mean with: "But same content on rebuilding the index is not
> > working"?
> >
> > How do you rebuild the index? It is not enough to just read all
> > documents as stored fields and reindex them. You need the original
> > document data and basically run them through the same pipeline that you
> > already have (so the indexing should be done by the same code that
> > indexes new documents). So I'd write some code that reads the old data
> > (if possible from source) or reads the old index (if all information
> > that was indexed is available as stored fields), synthetically builds
> > input data for the new indexer and sends it to the API (or whatever you
> > have for indexing in your new system).
> >
> > If you just have incomplete Lucene Document instances from the older
> > Lucene index, I think you're lost. When you call
> > IndexReader/IndexSearcher.document(), you only get stored fields, but
> > that's not all information that was originally used for indexing.
> > Reading documents from IndexReader and passing them to IndexWriter does
> > not work. It works from the API point of view, but the data is different.
> >
> > Uwe
> >
> > On 28.04.2025 at 12:43, Saha, Rajib wrote:
> >> Hi Uwe,
> >>
> >> Thank you for your detailed input and valuable advice. I fully understand
> >> and agree that upgrading from such an old version of Lucene involves much
> >> more than just resolving compilation issues.
> >> Based on the latest Lucene version, we have redesigned our platform
> >> accordingly, going through the Lucene APIs used and replacing them
> >> with the latest ones.
> >>
> >> With these changes, fresh content indexing is working fine. Search results
> >> are also coming as expected.
> >> We greatly appreciate your expert guidance in helping us get to this
> >> point.
> >>
> >> But same content on rebuilding the index is not working.
> >> I am debugging this part now.
> >>
> >> Do you have any suggestion on the problem?
> >>
> >> Regards
> >> Rajib
> >>
> >> -----Original Message-----
> >> From: Uwe Schindler <u...@thetaphi.de>
> >> Sent: 25 April 2025 18:19
> >> To: java-user@lucene.apache.org
> >> Subject: Re: Suggestion needed for a case of Lucene Migration with TokenStream
> >>
> >> Hi,
> >>
> >> I'd like to mention the following: You are trying to upgrade Lucene from
> >> a really ancient version. Of course, basic concepts are still the same,
> >> but the search engine and its APIs have changed dramatically, so just
> >> trying to "compile code and fix random stuff until it compiles" will not
> >> bring you to a working product. On top, it may make the product worse
> >> than before the update.
> >>
> >> To do the upgrade correctly, it is recommended to have somebody
> >> available (ideally the person who wrote the code originally) and then go
> >> through it line by line and rewrite it. I am explicitly mentioning
> >> "rewrite" because that's what you should do! If you don't have a person
> >> that understands Lucene enough, I'd suggest getting help from outside.
> >> You need to understand every line of code when rewriting it. In addition
> >> there are many new features that make all those special cases like
> >> payloads on TokenStreams obsolete. I'd not recommend using something like
> >> payloads on terms nowadays.
> >>
> >> Uwe
> >>
> >> On 24.04.2025 at 12:29, Mikhail Khludnev wrote:
> >>> Right. TextField.TYPE_NOT_STORED should be used then.
> >>>
> >>> On Thu, Apr 24, 2025 at 10:37 AM Saha, Rajib <rajib.s...@sap.com.invalid>
> >>> wrote:
> >>>
> >>>> Thanks Mikhail for the suggestion.
> >>>> Now the previous exception has gone. But a new exception has come from
> >>>> Field.java.
> >>>> Here below are the exception details.
> >>>> ========
> >>>> java.lang.IllegalArgumentException: TokenStream fields cannot be stored
> >>>>     at org.apache.lucene.document.Field.<init>(Field.java:155)
> >>>> =========
> >>>>
> >>>> Can you please suggest here too?
> >>>>
> >>>> Regards
> >>>> Rajib
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Mikhail Khludnev <m...@apache.org>
> >>>> Sent: 24 April 2025 12:10
> >>>> To: java-user@lucene.apache.org
> >>>> Subject: Re: Suggestion needed for a case of Lucene Migration with TokenStream
> >>>>
> >>>> Hi
> >>>> Use TextField.TYPE_STORED as the third argument in new Field()
> >>>> see
> >>>>
> >>>> https://github.com/apache/lucene-solr/blob/e27f44e3d78dfcec230c97e0a1240e3751daeff9/lucene/core/src/java/org/apache/lucene/document/TextField.java#L35C33-L35C44
> >>>>
> >>>>
> >>>> On Thu, Apr 24, 2025 at 8:37 AM Saha, Rajib <rajib.s...@sap.com.invalid>
> >>>> wrote:
> >>>>
> >>>>> Hi Experts,
> >>>>>
> >>>>> We are migrating Lucene from 2.4.1 to 8.11.2.
> >>>>>
> >>>>> During migration of one part of the code, we are getting the exception
> >>>>> below in the 8.11.2-based changes (the lines highlighted in red).
> >>>>> =============
> >>>>> java.lang.IllegalArgumentException: TokenStream fields must be indexed and tokenized
> >>>>>     at org.apache.lucene.document.Field.<init>(Field.java:152)
> >>>>>
> >>>>> I tried a few options but was not able to resolve the error.
> >>>>> Can somebody please help me identify where it is going wrong?
> >>>>> We had code based on 2.4.1 as like below:
> >>>>> ===================================
> >>>>> Int currentVal = //some value
> >>>>> PayloadTokenStream tokenStream = new PayloadTokenStream();
> >>>>> tokenStream.setPayload(currentVal);
> >>>>> lucField = new Field(config.payloadUid().name, tokenStream);
> >>>>> doc.add(lucField);
> >>>>> ......
> >>>>> public class PayloadTokenStream extends TokenStream{
> >>>>>     public static String UID_PAYLOAD_START_VAL = "_UID_";
> >>>>>     private Token token = new Token(UID_PAYLOAD_START_VAL,0,0);
> >>>>>     private byte[] buffer = new byte[4];
> >>>>>     private boolean returnToken = false;
> >>>>>
> >>>>>     public void setPayload(int uid){
> >>>>>         buffer[0] = (byte)uid;
> >>>>>         buffer[1] = (byte)(uid>>8);
> >>>>>         buffer[2] = (byte)(uid>>16);
> >>>>>         buffer[3] = (byte)(uid>>24);
> >>>>>         token.setPayload(new Payload(buffer));
> >>>>>         returnToken = true;
> >>>>>     }
> >>>>>     public Token next() throws IOException{
> >>>>>         if (returnToken){
> >>>>>             returnToken = false; return token;
> >>>>>         }
> >>>>>         else { return null; }
> >>>>>
> >>>>>     }
> >>>>> }
> >>>>>
> >>>>>
> >>>>> We have made code based on 8.11.2 as like below:
> >>>>> ==========================================
> >>>>> PayloadTokenStream tokenStream = new PayloadTokenStream();
> >>>>> tokenStream.setPayload(currentVal);
> >>>>> FieldType fieldType = new FieldType();
> >>>>> lucField = new Field(config.payloadUid().name, tokenStream, fieldType);
> >>>>> doc.add(lucField);
> >>>>> ----
> >>>>> public class PayloadTokenStream extends TokenStream{
> >>>>>     public static String UID_PAYLOAD_START_VAL = "_UID_";
> >>>>>     private byte[] buffer = new byte[4];
> >>>>>     private boolean returnToken = false;
> >>>>>
> >>>>>     public void setPayload(int uid){
> >>>>>         buffer[0] = (byte)uid;
> >>>>>         buffer[1] = (byte)(uid>>8);
> >>>>>         buffer[2] = (byte)(uid>>16);
> >>>>>         buffer[3] = (byte)(uid>>24);
> >>>>>         PayloadAttributeImpl attributeImpl = new PayloadAttributeImpl(new BytesRef(buffer));
> >>>>>         addAttributeImpl(attributeImpl);
> >>>>>         returnToken = true;
> >>>>>     }
> >>>>>     public boolean incrementToken() throws IOException {
> >>>>>         if (returnToken){
> >>>>>             returnToken = false;
> >>>>>             return true;
> >>>>>         }
> >>>>>         else {
> >>>>>             return false;
> >>>>>         }
> >>>>>     }
> >>>>> }
> >>>>>
> >>>>> Regards
> >>>>> Rajib
> >>>>>
> >>>>>
> >>>> --
> >>>> Sincerely yours
> >>>> Mikhail Khludnev
> >>>>
> >> --
> >> Uwe Schindler
> >> Achterdiek 19, D-28357 Bremen
> >> https://www.thetaphi.de/
> >> eMail: u...@thetaphi.de
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org