Hi,

What do you mean by "But same content on rebuilding the index is not working"?

How do you rebuild the index? It is not enough to just read all documents as stored fields and reindex them. You need the original document data and basically have to run it through the same pipeline that you already have (so the indexing should be done by the same code that indexes new documents). So I'd write some code that reads the old data (if possible from the source) or reads the old index (if all information that was indexed is available as stored fields), synthetically builds input data for the new indexer, and sends it to the API (or whatever you have for indexing in your new system).

If you just have incomplete Lucene Document instances from the older Lucene index, I think you're lost. When you call IndexReader/IndexSearcher.document(), you only get stored fields, but that's not all the information that was originally used for indexing. Reading documents from IndexReader and passing them to IndexWriter does not work. It works from the API point of view, but the data is different.
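
To make the "same pipeline" point concrete, here is a minimal sketch in plain Java. All names (SourceRecord, IndexingPipeline, RebuildDemo) are hypothetical, and the "index" is only an in-memory map standing in for Lucene. It illustrates just one thing: a rebuild should feed the original source data through the same entry point that indexes new documents, rather than copying what an old index happens to return.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical original record, as it comes from the source system. */
final class SourceRecord {
    final int uid;
    final String body;

    SourceRecord(int uid, String body) {
        this.uid = uid;
        this.body = body;
    }
}

/** Hypothetical single entry point used for new documents and rebuilds alike. */
final class IndexingPipeline {
    // Stand-in for the real index: term -> uids of the records containing it.
    final Map<String, List<Integer>> postings = new LinkedHashMap<>();

    /** The one and only place where analysis and indexing happen. */
    void index(SourceRecord record) {
        for (String term : record.body.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(term, t -> new ArrayList<>()).add(record.uid);
        }
    }

    /** A rebuild clears the index and replays the ORIGINAL data through index(). */
    void rebuild(Iterable<SourceRecord> originals) {
        postings.clear();
        for (SourceRecord record : originals) {
            index(record);
        }
    }
}

final class RebuildDemo {
    public static void main(String[] args) {
        List<SourceRecord> source = List.of(
                new SourceRecord(1, "Hello Lucene"),
                new SourceRecord(2, "Hello payloads"));

        IndexingPipeline pipeline = new IndexingPipeline();
        for (SourceRecord record : source) {
            pipeline.index(record);          // fresh indexing
        }
        Map<String, List<Integer>> fresh = new LinkedHashMap<>(pipeline.postings);

        pipeline.rebuild(source);            // rebuild from the same source data
        System.out.println(fresh.equals(pipeline.postings)); // prints "true"
    }
}
```

Because both paths go through index(), a rebuild cannot silently diverge from fresh indexing, which is the failure mode to expect when stored fields are copied from a reader into a writer.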

Uwe

On 28.04.2025 at 12:43, Saha, Rajib wrote:
Hi Uwe,

Thank you for your detailed input and valuable advice. I fully understand and
agree that upgrading from such an old version of Lucene involves much more than
just resolving compilation issues.
Based on the latest Lucene version, we have redesigned our platform accordingly,
going through the Lucene APIs we used and replacing them with their current
equivalents.

With these changes, indexing of fresh content is working fine. Search results
are also coming back as expected.
I greatly appreciate your expert guidance, which helped us get to this point.

But same content on rebuilding the index is not working.
I am debugging this part now.

Do you have any suggestions on this problem?

Regards
Rajib

-----Original Message-----
From: Uwe Schindler <u...@thetaphi.de>
Sent: 25 April 2025 18:19
To: java-user@lucene.apache.org
Subject: Re: Suggestion needed for a case of Lucene Migration with TokenStream

Hi,

I'd like to mention the following: You are trying to upgrade Lucene from
a really ancient version. Of course, basic concepts are still the same,
but the search engine and its APIs have changed dramatically, so just
trying to "compile code and fix random stuff until it compiles" will not
bring you to a working product. On top of that, it may make the product
worse than before the update.

To do the upgrade correctly, it is recommended to have somebody
available (ideally the person who wrote the code originally) and then go
through it line by line and rewrite it. I am explicitly mentioning
"rewrite" because that's what you should do! If you don't have a person
who understands Lucene well enough, I'd suggest getting help from outside.
You need to understand every line of code when rewriting it. In addition,
there are many new features that make special cases like payloads on
TokenStreams obsolete. I'd not recommend using something like payloads
on terms nowadays.

Uwe

On 24.04.2025 at 12:29, Mikhail Khludnev wrote:
Right. TextField.TYPE_NOT_STORED should be used then.
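
For readers landing on this thread, a minimal sketch of what this suggestion amounts to, assuming lucene-core 8.11.2 is on the classpath. The class names UidPayloadTokenStream and TokenStreamFieldDemo and the field name "payloadUid" are hypothetical; this is not the poster's actual code. The sketch also produces the term and payload through attributes registered with addAttribute, which is how the attribute-based TokenStream API expects tokens to be emitted.

```java
import java.io.IOException;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.util.BytesRef;

/** Hypothetical single-token stream that carries a 4-byte UID as payload. */
final class UidPayloadTokenStream extends TokenStream {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private final PayloadAttribute payloadAtt = addAttribute(PayloadAttribute.class);
    private final byte[] buffer = new byte[4];
    private boolean pending = false;

    /** Packs the UID least significant byte first and arms the stream for one token. */
    void setPayload(int uid) {
        buffer[0] = (byte) uid;
        buffer[1] = (byte) (uid >> 8);
        buffer[2] = (byte) (uid >> 16);
        buffer[3] = (byte) (uid >> 24);
        pending = true;
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (!pending) {
            return false;
        }
        clearAttributes();
        termAtt.append("_UID_");                      // the marker term
        payloadAtt.setPayload(new BytesRef(buffer));  // the payload for that term
        pending = false;
        return true;
    }
}

final class TokenStreamFieldDemo {
    public static void main(String[] args) {
        UidPayloadTokenStream tokenStream = new UidPayloadTokenStream();
        tokenStream.setPayload(42);

        // Indexed, tokenized, not stored: accepted for a TokenStream-valued field.
        Document doc = new Document();
        doc.add(new Field("payloadUid", tokenStream, TextField.TYPE_NOT_STORED));

        // A stored type is rejected, which is the second exception in this thread.
        try {
            new Field("payloadUid", new UidPayloadTokenStream(), TextField.TYPE_STORED);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

TextField.TYPE_NOT_STORED indexes positions, which payloads need. Whether payloads are still the right tool at all is a separate question.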

On Thu, Apr 24, 2025 at 10:37 AM Saha, Rajib <rajib.s...@sap.com.invalid>
wrote:

Thanks, Mikhail, for the suggestion.
The previous exception is gone, but a new exception is now thrown from
Field.java.
The exception details are below.
========
java.lang.IllegalArgumentException: TokenStream fields cannot be stored
          at org.apache.lucene.document.Field.<init>(Field.java:155)
=========

Can you please suggest here too?

Regards
Rajib


-----Original Message-----
From: Mikhail Khludnev <m...@apache.org>
Sent: 24 April 2025 12:10
To: java-user@lucene.apache.org
Subject: Re: Suggestion needed for a case of Lucene Migration with
TokenStream

Hi
Use TextField.TYPE_STORED as the third argument in new Field()
see

https://github.com/apache/lucene-solr/blob/e27f44e3d78dfcec230c97e0a1240e3751daeff9/lucene/core/src/java/org/apache/lucene/document/TextField.java#L35C33-L35C44


On Thu, Apr 24, 2025 at 8:37 AM Saha, Rajib <rajib.s...@sap.com.invalid>
wrote:

Hi Experts,

We are migrating Lucene from 2.4.1 to 8.11.2.

During the migration of one part of the code, we are getting the exception
below from the line highlighted in red in our 8.11.2-based changes.
=============
java.lang.IllegalArgumentException: TokenStream fields must be indexed and tokenized
at org.apache.lucene.document.Field.<init>(Field.java:152)

I tried a few options, but could not resolve the error.
Can somebody please help me identify where it is going wrong?
Our code based on 2.4.1 looked like this:
===================================
int currentVal = //some value
PayloadTokenStream tokenStream = new PayloadTokenStream();
tokenStream.setPayload(currentVal);
lucField = new Field(config.payloadUid().name, tokenStream);
doc.add(lucField);
......
public class PayloadTokenStream extends TokenStream {
    public static String UID_PAYLOAD_START_VAL = "_UID_";
    private Token token = new Token(UID_PAYLOAD_START_VAL, 0, 0);
    private byte[] buffer = new byte[4];
    private boolean returnToken = false;

    public void setPayload(int uid) {
        buffer[0] = (byte) uid;
        buffer[1] = (byte) (uid >> 8);
        buffer[2] = (byte) (uid >> 16);
        buffer[3] = (byte) (uid >> 24);
        token.setPayload(new Payload(buffer));
        returnToken = true;
    }

    public Token next() throws IOException {
        if (returnToken) {
            returnToken = false;
            return token;
        } else {
            return null;
        }
    }
}
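
As a side note on the setPayload method above: it packs the int UID into four bytes, least significant byte first. The following is a small sketch of that packing and of the matching decode step, assuming the shifts are by 8, 16 and 24 bits as the listing suggests. The class name UidPayloadCodec and the decode method are hypothetical and not part of the code posted here.

```java
/** Hypothetical helper mirroring the 4-byte UID layout used by setPayload. */
final class UidPayloadCodec {

    /** Packs an int into 4 bytes, least significant byte first. */
    static byte[] encode(int uid) {
        return new byte[] {
            (byte) uid,
            (byte) (uid >> 8),
            (byte) (uid >> 16),
            (byte) (uid >> 24)
        };
    }

    /** Reverses encode(); the & 0xFF masks undo Java's sign extension of bytes. */
    static int decode(byte[] bytes, int offset) {
        return (bytes[offset] & 0xFF)
                | ((bytes[offset + 1] & 0xFF) << 8)
                | ((bytes[offset + 2] & 0xFF) << 16)
                | ((bytes[offset + 3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] payload = encode(123456789);
        System.out.println(decode(payload, 0)); // prints 123456789
    }
}
```

The offset parameter is there because payload bytes read back from an index usually arrive as a slice of a larger array rather than as a standalone 4-byte array.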


We have made code based on 8.11.2 as like below:
==========================================
PayloadTokenStream tokenStream = new PayloadTokenStream();
tokenStream.setPayload(currentVal);
FieldType fieldType = new FieldType();
lucField = new Field(config.payloadUid().name, tokenStream, fieldType);
doc.add(lucField);
----
public class PayloadTokenStream extends TokenStream {
    public static String UID_PAYLOAD_START_VAL = "_UID_";
    private byte[] buffer = new byte[4];
    private boolean returnToken = false;

    public void setPayload(int uid) {
        buffer[0] = (byte) uid;
        buffer[1] = (byte) (uid >> 8);
        buffer[2] = (byte) (uid >> 16);
        buffer[3] = (byte) (uid >> 24);
        PayloadAttributeImpl attributeImpl = new PayloadAttributeImpl(new BytesRef(buffer));
        addAttributeImpl(attributeImpl);
        returnToken = true;
    }

    public boolean incrementToken() throws IOException {
        if (returnToken) {
            returnToken = false;
            return true;
        } else {
            return false;
        }
    }
}

Regards
Rajib


--
Sincerely yours
Mikhail Khludnev

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de/
eMail: u...@thetaphi.de


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

