Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Gautam Worah
I'm confused as to what could be happening. Google led me to this StackOverflow link: https://stackoverflow.com/questions/36402235/lucene-stringfield-gets-tokenized-when-doc-is-retrieved-and-stored-again which references some long-standing issues about fields changing their "types" and so on. Th

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
Hi, thank you for the reply, and apologies for being somewhat "all over the place". Regarding "tokenization" - should it happen if I use StringField? When the document is created (before writing) I see in the debugger that it's not tokenized and is of type StringField: ``` doc = {Document@4830} "Documen

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Gautam Worah
Hey, I don't think I understand the email well but I'll try my best. In your printed docs, I see that the flag data is still tokenized. See the string that you printed: DOCS stored,indexed,tokenized,omitNorms. What does your code for adding the doc look like? Are you using StringField for adding

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
Addendum, output is: ``` maxDoc: 3 maxDoc (after second flag): 3 Document stored,indexed,tokenized,omitNorms,indexOptions=DOCS stored,indexed,tokenized,omitNorms,indexOptions=DOCS stored> Document stored,indexed,tokenized,omitNorms,indexOptions=DOCS stored,indexed,tokenized,omitNorms,indexOpti

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
Thank you Gautam! This works. Now I went back to Lucene and I'm hitting the wall. In James they set the document with "id" being constructed as "flag--" (e.g. ""). I run the code that updates the documents with flags and afterwards check the result. The simple code I use new reader from the wri

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Gautam Worah
Hey, use a StringField instead of a TextField for the title and your test will pass. Tokenization, which is enabled for TextFields, is breaking your fancy title into tokens split on spaces, so the term you pass to the update no longer matches your docs. https://lucene.apache.org/core/9_11_0/core/org/apache/lucene/documen
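
To make the point above concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene: `ToyIndex` and its methods (`add`, `deleteByTerm`, `update`, `liveDocs`) are hypothetical names invented for illustration, and lowercasing plus splitting on whitespace is only a rough stand-in for what an analyzer does to a TextField. It models the fact that an update first deletes the documents containing the exact term and then adds the new document, so if the field was tokenized the whole-title term matches nothing and the old document survives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy single-field inverted index. NOT Lucene; it only models exact-term matching. */
final class ToyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> stored = new ArrayList<>(); // stored value per doc id
    private final Set<Integer> deleted = new HashSet<>();

    /** Adds a document. tokenized=true roughly mimics TextField, false mimics StringField. */
    void add(String value, boolean tokenized) {
        int docId = stored.size();
        stored.add(value);
        // Assumption: "analysis" here is just lowercase + split on whitespace.
        String[] terms = tokenized
                ? value.toLowerCase(Locale.ROOT).split("\\s+")
                : new String[] {value};
        for (String term : terms) {
            postings.computeIfAbsent(term, t -> new HashSet<>()).add(docId);
        }
    }

    /** Marks every live document indexed under exactly this term as deleted. */
    int deleteByTerm(String term) {
        int count = 0;
        for (int docId : postings.getOrDefault(term, Collections.<Integer>emptySet())) {
            if (deleted.add(docId)) {
                count++;
            }
        }
        return count;
    }

    /** Delete-then-add, returning how many documents the delete step removed. */
    int update(String term, String newValue, boolean tokenized) {
        int removed = deleteByTerm(term);
        add(newValue, tokenized);
        return removed;
    }

    int liveDocs() {
        return stored.size() - deleted.size();
    }

    public static void main(String[] args) {
        ToyIndex asText = new ToyIndex();
        asText.add("fancy title", true);
        int removedText = asText.update("fancy title", "new title", true);
        // Indexed terms were "fancy" and "title", so nothing matched the full string.
        System.out.println(removedText + " removed, " + asText.liveDocs() + " live"); // 0 removed, 2 live

        ToyIndex asString = new ToyIndex();
        asString.add("fancy title", false);
        int removedString = asString.update("fancy title", "new title", false);
        // The whole value is a single term, so the old document is replaced.
        System.out.println(removedString + " removed, " + asString.liveDocs() + " live"); // 1 removed, 1 live
    }
}
```

In the tokenized case the update silently degrades into a plain add, which would be consistent with seeing an extra document after the update rather than a replaced one.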

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
Hi Froh, thank you for the information. I updated the code and re-opened the reader - it seems that the update is reflected and a search for the old document doesn't yield anything, but the search for the new term fails. I output all documents (there are 2) and the second one has the new title but when searching