Hi there,

I have a document titled "20111213_solr_apache conference report".
When I use the analysis web interface to see exactly which tokens Solr produces, I get the following result:

term text   20111213_solr   apache       conference   report
term type   <NUM>           <ALPHANUM>   <ALPHANUM>   <ALPHANUM>

Why is 20111213_solr tokenized as a single <NUM> token, and why isn't the "_" character removed? (I've added "_" as a stop word in stopwords.txt.)

I ran another test with "20111213_solr_apache conference_report". The only difference is that I added an underscore between conference and report. The analysis of this string gives:

term text   20111213_solr   apache       conference   report
term type   <NUM>           <ALPHANUM>   <ALPHANUM>   <ALPHANUM>

This time the underscore between conference and report is removed! Why? How can I make Solr remove the underscore character and behave consistently?

Please help with this. Thanks in advance.

Floyd