Dear Experts,

Can somebody please help me with the queries below?
I am out of ideas after trying a good number of different approaches.

Regards
Rajib

-----Original Message-----
From: Saha, Rajib
Sent: 27 May 2025 11:52
To: java-user@lucene.apache.org
Subject: RE: Suggestion needed for a case of Lucene Migration with TokenStream

Hi Uwe,

Thanks for your suggestions so far. We have been able to make good progress.
We are now stuck at a point where we need your expert advice.

As per our design, full content indexing works in two steps:
- In the first step, small Lucene index files are created with 5-6
documents. We call these delta index files.
- In the second step, we try to merge the delta index files into the master index.
Below is a snippet of the code:
============================
IndexWriter masterIndexWriter = new IndexWriter(indexDir, config);
// Copy the list of delta directories into the array that addIndexes expects.
FSDirectory[] deltaIndexDirs = deltaIndexDirList.toArray(new FSDirectory[0]);
masterIndexWriter.addIndexes(deltaIndexDirs);
===========================

But on doing this, we get the exception below.
I have tried several things but could not get past the problem.
Do you suspect anything here? Can you please suggest a way out of
the problem?

=============================================
CaughtException while Merging in LuceneIndexEngine cannot change field "boe.search.wild_description" from index options=DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS to inconsistent index options=DOCS_AND_FREQS_AND_POSITIONS
java.lang.IllegalArgumentException: cannot change field "boe.search.wild_description" from index options=DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS to inconsistent index options=DOCS_AND_FREQS_AND_POSITIONS
        at org.apache.lucene.index.FieldInfos$FieldNumbers.addOrGet(FieldInfos.java:308)
        at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2984)
        at com.sap.businessobjects.platform.search.lucene.index.engine.LuceneIndexEngine.merge(LuceneIndexEngine.java:981)
=============================================
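
One reading of the message (an assumption, not confirmed) is that the field "boe.search.wild_description" was first registered with offsets and then arrived from another index without them. The toy model below is plain Java, not Lucene code; the class, enum and method are invented for illustration and only borrow their names from the message and stack trace above. It sketches a registry that pins each field name to one set of index options and rejects a later mismatch:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (NOT Lucene code): pins each field name to a single set of index options. */
final class FieldOptionsRegistry {
    /** Hypothetical enum; the constant names mirror the two options in the exception message. */
    enum Options { DOCS_AND_FREQS_AND_POSITIONS, DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS }

    private final Map<String, Options> byField = new HashMap<>();

    /** Registers the field, or rejects it if it was already seen with different options. */
    public void addOrGet(String field, Options options) {
        Options previous = byField.putIfAbsent(field, options);
        if (previous != null && previous != options) {
            throw new IllegalArgumentException("cannot change field \"" + field
                    + "\" from index options=" + previous
                    + " to inconsistent index options=" + options);
        }
    }

    public static void main(String[] args) {
        FieldOptionsRegistry master = new FieldOptionsRegistry();
        // The first index to arrive defines the options for the field.
        master.addOrGet("boe.search.wild_description",
                Options.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        try {
            // A second index that wrote the same field without offsets is rejected.
            master.addOrGet("boe.search.wild_description",
                    Options.DOCS_AND_FREQS_AND_POSITIONS);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that reading is right, the thing to check would be whether every writer (delta and master) builds this field with the same index options, and whether any index on disk was written with older settings.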

Regards
Rajib


-----Original Message-----
From: Uwe Schindler <u...@thetaphi.de>
Sent: 30 April 2025 02:03
To: java-user@lucene.apache.org
Subject: Re: Suggestion needed for a case of Lucene Migration with TokenStream

If this is Windows, the deletion may not work if there are still
IndexReaders or Writers held open by the same or other processes.

On Linux I have no idea; I would need an exception message. It should clearly
say why it fails.
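
One way to get such a message is sketched below in plain java.nio, with nothing Lucene-specific. The class and method names are made up for illustration, and it assumes the index lives in an ordinary directory on disk. It deletes a directory tree with Files.delete, which throws an IOException naming the path it could not remove, whereas File.delete() only returns false:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: removes an index directory and reports why a deletion fails. */
final class IndexDirCleaner {

    /** Deletes dir and everything below it; the first failure surfaces as an IOException. */
    static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            // Unlike File.delete(), this throws instead of returning false.
            Files.delete(p);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: IndexDirCleaner <index-directory>");
            return;
        }
        try {
            deleteRecursively(Paths.get(args[0]));
        } catch (IOException e) {
            // The exception type and message are what to post to the list.
            System.err.println("delete failed: " + e);
        }
    }
}
```

Logging the exception from the rebuild path in the same way should show whether a file is still held open.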

Uwe

Am 29.04.2025 um 13:44 schrieb Saha, Rajib:
> Hi Uwe,
>
> In our product we have different levels of indexing, such as MetaData and
> FullContent information of the reports.
> Rebuild indexing deletes the existing Lucene index files and does a fresh
> indexing of all the documents.
>
> When we go to the directory and delete the Lucene index files by hand,
> rebuild indexing works fine.
> But when we select rebuild indexing from the product UI, indexing
> does not happen.
>
> I am debugging this further and will update you once I have a better
> picture. Our code for this area runs multiple tasks and threads, so it is
> taking time to debug.
>
> I suspect there may be a lock on the Lucene index files, and because of
> it the deletion of the Lucene index files is not working with stopping the
> service. But this is a guess; the investigation is ongoing.
> Do you suspect anything?
>
> Regards
> Rajib
>
> -----Original Message-----
> From: Uwe Schindler <u...@thetaphi.de>
> Sent: 28 April 2025 17:59
> To: java-user@lucene.apache.org
> Subject: Re: Suggestion needed for a case of Lucene Migration with TokenStream
>
> Hi,
>
> what do you mean by: "But same content on rebuilding the index is not
> working"?
>
> How do you rebuild the index? It is not enough to just read all
> documents as stored fields and reindex them. You need the original
> document data and basically run it through the same pipeline that you
> already have (so the indexing should be done by the same code that
> indexes new documents). So I'd write some code that reads the old data
> (if possible from source) or reads the old index (if all information
> that was indexed is available as stored fields), synthetically builds
> input data for the new indexer and sends it to the API (or whatever you
> have for indexing in your new system).
>
> If you just have incomplete Lucene Document instances from the older
> Lucene index, I think you're lost. When you call
> IndexReader/IndexSearcher.document(), you only get stored fields, but
> that's not all the information that was originally used for indexing.
> Reading documents from IndexReader and passing them to IndexWriter does
> not work. It works from the API point of view, but the data is different.
>
> Uwe
>
> Am 28.04.2025 um 12:43 schrieb Saha, Rajib:
>> Hi Uwe,
>>
>> Thank you for your detailed input and valuable advice. I fully understand 
>> and agree that upgrading from such an old version of Lucene involves much 
>> more than just resolving compilation issues.
>> Based on the latest Lucene version, we have redesigned our platform,
>> going through the Lucene APIs we use and replacing them with their
>> latest equivalents.
>>
>> With these changes, fresh content indexing is working fine. Search results
>> are also coming back as expected.
>> We greatly appreciate your expert guidance, which helped bring us to this point.
>>
>> But same content on rebuilding the index is not working.
>> I am debugging this part now.
>>
>> Do you have any suggestions on the problem?
>>
>> Regards
>> Rajib
>>
>> -----Original Message-----
>> From: Uwe Schindler <u...@thetaphi.de>
>> Sent: 25 April 2025 18:19
>> To: java-user@lucene.apache.org
>> Subject: Re: Suggestion needed for a case of Lucene Migration with 
>> TokenStream
>>
>> Hi,
>>
>> I'd like to mention the following: You are trying to upgrade Lucene from
>> a really ancient version. Of course, basic concepts are still the same,
>> but the search engine and its APIs have changed dramatically, so just
>> trying to "compile code and fix random stuff until it compiles" will not
>> bring you to a working product. On top of that, it may make the product
>> worse than before the update.
>>
>> To do the upgrade correctly, it is recommended to have somebody
>> available (ideally the person who wrote the code originally) and then go
>> through it line by line and rewrite it. I am explicitly mentioning
>> "rewrite" because that's what you should do! If you don't have a person
>> who understands Lucene well enough, I'd suggest getting help from outside.
>> You need to understand every line of code when rewriting it. In addition,
>> there are many new features that make all those special cases like
>> payloads on TokenStreams obsolete. I'd not recommend using something
>> like payloads on terms nowadays.
>>
>> Uwe
>>
>> Am 24.04.2025 um 12:29 schrieb Mikhail Khludnev:
>>> Right. TextField.TYPE_NOT_STORED should be used then.
>>>
>>> On Thu, Apr 24, 2025 at 10:37 AM Saha, Rajib <rajib.s...@sap.com.invalid>
>>> wrote:
>>>
>>>> Thanks, Mikhail, for the suggestion.
>>>> The previous exception is gone now, but a new exception is thrown from
>>>> Field.java.
>>>> Here are the exception details.
>>>> ========
>>>> java.lang.IllegalArgumentException: TokenStream fields cannot be stored
>>>>            at org.apache.lucene.document.Field.<init>(Field.java:155)
>>>> =========
>>>>
>>>> Can you please advise on this one too?
>>>>
>>>> Regards
>>>> Rajib
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Mikhail Khludnev <m...@apache.org>
>>>> Sent: 24 April 2025 12:10
>>>> To: java-user@lucene.apache.org
>>>> Subject: Re: Suggestion needed for a case of Lucene Migration with
>>>> TokenStream
>>>>
>>>> Hi
>>>> Use TextField.TYPE_STORED as the third argument in new Field()
>>>> see
>>>>
>>>> https://github.com/apache/lucene-solr/blob/e27f44e3d78dfcec230c97e0a1240e3751daeff9/lucene/core/src/java/org/apache/lucene/document/TextField.java#L35C33-L35C44
>>>>
>>>>
>>>> On Thu, Apr 24, 2025 at 8:37 AM Saha, Rajib <rajib.s...@sap.com.invalid>
>>>> wrote:
>>>>
>>>>> Hi Experts,
>>>>>
>>>>> We are migrating Lucene from 2.4.1 to 8.11.2.
>>>>>
>>>>> During migration, one part of the code throws the exception below in the
>>>>> 8.11.2-based changes (the lines colored red).
>>>>> =============
>>>>> java.lang.IllegalArgumentException: TokenStream fields must be indexed and tokenized
>>>>> at org.apache.lucene.document.Field.<init>(Field.java:152)
>>>>>
>>>>> I tried a few options but was not able to resolve the error.
>>>>> Can somebody please help me identify where it is going wrong?
>>>>> We had code based on 2.4.1 like below:
>>>>> ===================================
>>>>> int currentVal = //some value
>>>>> PayloadTokenStream tokenStream = new PayloadTokenStream();
>>>>> tokenStream.setPayload(currentVal);
>>>>> lucField = new Field(config.payloadUid().name, tokenStream);
>>>>> doc.add(lucField);
>>>>> ......
>>>>> public class PayloadTokenStream extends TokenStream {
>>>>>     public static String UID_PAYLOAD_START_VAL = "_UID_";
>>>>>     private Token token = new Token(UID_PAYLOAD_START_VAL,0,0);
>>>>>     private byte[] buffer = new byte[4];
>>>>>     private boolean returnToken = false;
>>>>>
>>>>>     public void setPayload(int uid){
>>>>>         buffer[0] = (byte)uid;
>>>>>         buffer[1] = (byte)(uid>>8);
>>>>>         buffer[2] = (byte)(uid>>16);
>>>>>         buffer[3] = (byte)(uid>>24);
>>>>>         token.setPayload(new Payload(buffer));
>>>>>         returnToken = true;
>>>>>     }
>>>>>     public Token next() throws IOException{
>>>>>         if (returnToken){
>>>>>             returnToken = false; return token;
>>>>>         }
>>>>>         else { return null; }
>>>>>
>>>>>     }
>>>>> }
>>>>>
>>>>>
>>>>> We have written code based on 8.11.2 like below:
>>>>> ==========================================
>>>>> PayloadTokenStream tokenStream = new PayloadTokenStream();
>>>>> tokenStream.setPayload(currentVal);
>>>>> FieldType fieldType = new FieldType();
>>>>> lucField = new Field(config.payloadUid().name, tokenStream, fieldType);
>>>>> doc.add(lucField);
>>>>> ----
>>>>> public class PayloadTokenStream extends TokenStream {
>>>>>     public static String UID_PAYLOAD_START_VAL = "_UID_";
>>>>>     private byte[] buffer = new byte[4];
>>>>>     private boolean returnToken = false;
>>>>>
>>>>>     public void setPayload(int uid){
>>>>>         buffer[0] = (byte)uid;
>>>>>         buffer[1] = (byte)(uid>>8);
>>>>>         buffer[2] = (byte)(uid>>16);
>>>>>         buffer[3] = (byte)(uid>>24);
>>>>>         PayloadAttributeImpl attributeImpl = new PayloadAttributeImpl(new BytesRef(buffer));
>>>>>         addAttributeImpl(attributeImpl);
>>>>>         returnToken = true;
>>>>>     }
>>>>>     public boolean incrementToken() throws IOException {
>>>>>         if (returnToken){
>>>>>             returnToken = false;
>>>>>             return true;
>>>>>         }
>>>>>         else {
>>>>>             return false;
>>>>>         }
>>>>>     }
>>>>> }
>>>>>
>>>>> Regards
>>>>> Rajib
>>>>>
>>>>>
>>>> --
>>>> Sincerely yours
>>>> Mikhail Khludnev
>>>>
>> --
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de/
>> eMail: u...@thetaphi.de
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de/
> eMail: u...@thetaphi.de
>
>
>
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de/
eMail: u...@thetaphi.de

