Chris Hostetter-3 wrote:
>
> CharFilters and TokenFilters have different purposes though...
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#When_To_use_a_CharFilter_vs_a_TokenFilter
>
> (ie: If you use MappingCharFilter, you can't then tokenize on some of the
> characters you filtered away)
>
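
To make the quoted point concrete, here is a toy sketch in Python. It is not Lucene or Solr code: the function names, the tokenizer rule, and the hyphen entry in the mapping are all made up for illustration. It only shows that a mapping applied before tokenization can remove a character the tokenizer would otherwise split on, while folding applied after tokenization cannot.

```python
import re
import unicodedata

# Hypothetical mapping, loosely in the spirit of a mapping file.
# The "-" entry is an assumption added to show the ordering effect.
CHAR_MAP = {"\u00e9": "e", "-": ""}


def char_filter(text):
    """Stand-in for a char filter: rewrites raw text before tokenization."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in text)


def tokenize(text):
    """Toy tokenizer: splits on whitespace and hyphens."""
    return [tok for tok in re.split(r"[\s\-]+", text) if tok]


def fold(token):
    """Strips combining marks after Unicode decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def token_filter(tokens):
    """Stand-in for a token filter: rewrites tokens after tokenization."""
    return [fold(tok) for tok in tokens]


text = "caf\u00e9-bar"

# Mapping first: the hyphen is gone before the tokenizer ever sees it.
mapped_then_tokenized = tokenize(char_filter(text))

# Tokenizing first: the hyphen still splits, then each token is folded.
tokenized_then_folded = token_filter(tokenize(text))
```

With these made-up rules the first pipeline yields one token and the second yields two. For mappings that only touch accents the two orders give the same tokens, since this toy tokenizer never splits on an accented letter.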
Right, but it’s hard to imagine wanting to tokenize on an accent character or some other modification specified in these particular mapping files.

Steven A Rowe wrote:
>
> AFAIK, ISOLatin1AccentFilter was deprecated because ASCIIFoldingFilter
> provides a superset of its mappings.
>

*If* that is the case, then this file should also be removed:
solr/example/solr/conf/mapping-ISOLatin1Accent.txt

Steven A Rowe wrote:
>
> I haven't done any benchmarking, but I'm pretty sure that
> ASCIIFoldingFilter can achieve a significantly higher throughput rate than
> MappingCharFilter, and given that, it probably makes sense to keep both,
> to allow people to make the choice about the tradeoff between the
> flexibility provided by the human-readable (and editable) mapping file and
> the speed provided by ASCIIFoldingFilter.
>

I'm skeptical that the difference, whatever it is, matters in the scheme of things. The cost of keeping both is confusion for users and more code to maintain.

~ David Smiley

-----
Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/Should-ASCIIFoldingFilter-be-deprecated-tp2448919p2451504.html
Sent from the Solr - Dev mailing list archive at Nabble.com.