Hello,
I am using Lucene 6.3.0 and trying to index file names so that they can be
searched.
The problem is that StandardAnalyzer does not produce the tokens I
expected.
Input:
           mkt-4-elltvs-101_electrical_load_list.pdf

Expected output: 
           mkt
           4
           elltvs
           101
           electrical
           load
           list
           pdf

Actual output: 
           mkt
           4
           elltvs
           101_electrical_load_list.pdf


So basically I want StandardAnalyzer to treat underscores (_) and periods (.)
as delimiters as well. I may also need to add more delimiters later,
depending on what my testing turns up.
Which class do I need to edit, extend, or rewrite to achieve this? Or is
there an option to supply a list of delimiters?
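
To make the desired behaviour concrete, here is a minimal plain-Java sketch of
the splitting rule I am after. It is not Lucene code: the class name
FileNameSplitter, the DELIMITERS string, and the lowercasing step are my own
assumptions for illustration. My guess is that the same character test could
live in the isTokenChar method of a CharTokenizer subclass, but I have not
verified that against 6.3.0.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: shows the desired tokens, independent of Lucene.
class FileNameSplitter {
    // Hypothetical delimiter list; more characters can be appended later.
    private static final String DELIMITERS = "-_. ";

    // True when the code point should end the current token.
    static boolean isDelimiter(int codePoint) {
        return DELIMITERS.indexOf(codePoint) >= 0;
    }

    // Splits the input on every delimiter and lowercases what remains.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (isDelimiter(cp)) {
                // Close the token in progress; runs of delimiters yield nothing.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(Character.toLowerCase(cp));
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [mkt, 4, elltvs, 101, electrical, load, list, pdf]
        System.out.println(split("mkt-4-elltvs-101_electrical_load_list.pdf"));
    }
}
```

Keeping the delimiters in a single string means that adding a new one is a
one-character change, which is the kind of flexibility I am hoping to get
from the analyzer.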

I have also tried other analyzers (Classic, Shingle, Whitespace, and Simple),
but none of them came close.


