Thank you, Michael! I solved this requirement by setting the tokenStream at the field level instead of leaving tokenization to the analyzer. This gives me control over altering the full text with custom methods before it is tokenized. The approach has some memory overhead, which I handle by writing the documents one at a time, as opposed to my earlier approach of writing all documents in one go.
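For readers following the thread, here is a minimal sketch of what a field-level token stream could look like. It is not the code used in the original solution: the class name, the field name "body", and the `alterText` helper are hypothetical, and it assumes a recent Lucene release (8.x or later) with `StandardAnalyzer` and `ByteBuffersDirectory` on the classpath. Only the general idea is taken from the description above: alter the text, build the TokenStream yourself, hand it to the field, and add documents one at a time.

```java
import java.io.IOException;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;

public class FieldLevelTokenStreamSketch {

    // Hypothetical pre-tokenization hook: replaces underscores with spaces
    // so that "foo_bar" is tokenized as two terms instead of one.
    static String alterText(String raw) {
        return raw.replace('_', ' ');
    }

    // Adds one document per input string. Each TokenStream is created and
    // consumed by addDocument() before the next one is requested, so only
    // one stream is open at any time.
    static void indexOneAtATime(Directory dir, List<String> texts) throws IOException {
        try (Analyzer analyzer = new StandardAnalyzer();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
            for (String raw : texts) {
                // Tokenize the altered text ourselves instead of passing the
                // raw string to the field.
                TokenStream stream = analyzer.tokenStream("body", alterText(raw));
                Document doc = new Document();
                // A TextField built from a TokenStream is indexed but not stored.
                doc.add(new TextField("body", stream));
                writer.addDocument(doc);
            }
        }
    }

    // Counts the documents whose "body" field contains the exact term.
    static int countMatches(Directory dir, String term) throws IOException {
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            return new IndexSearcher(reader).count(new TermQuery(new Term("body", term)));
        }
    }
}
```

One thing to keep in mind with this pattern: a stream obtained from `Analyzer.tokenStream` reuses per-thread components, so it should be fully consumed (here by `addDocument`) before the same analyzer is asked for another stream on that thread.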
If required, I can share code snippets to show what I did.

Regards,
Amitesh

--
Sent from: https://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html