Re: Should ASCIIFoldingFilter be deprecated?

2011-02-08 Thread Robert Muir
On Mon, Feb 7, 2011 at 10:51 PM, Steven A Rowe sar...@syr.edu wrote: I haven't done any benchmarking, but I'm pretty sure that ASCIIFoldingFilter can achieve a significantly higher throughput rate than MappingCharFilter, and given that, it probably makes sense to keep both, to allow people to

RE: Should ASCIIFoldingFilter be deprecated?

2011-02-08 Thread David Smiley (@MITRE.org)
Chris Hostetter-3 wrote: CharFilters and TokenFilters have different purposes though... http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#When_To_use_a_CharFilter_vs_a_TokenFilter (i.e., if you use MappingCharFilter, you can't then tokenize on some of the characters you

Re: Should ASCIIFoldingFilter be deprecated?

2011-02-08 Thread Robert Muir
On Tue, Feb 8, 2011 at 9:12 AM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: I'm skeptical that whatever the difference is is relevant in the scheme of things. The cost of keeping it is confusion for users and more code to maintain. It's pretty significant. CharFilters are

Re: Should ASCIIFoldingFilter be deprecated?

2011-02-08 Thread David Smiley (@MITRE.org)
Robert Muir wrote: On Tue, Feb 8, 2011 at 9:12 AM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: I'm skeptical that whatever the difference is is relevant in the scheme of things. The cost of keeping it is confusion for users and more code to maintain. It's pretty

Re: Should ASCIIFoldingFilter be deprecated?

2011-02-08 Thread Robert Muir
On Tue, Feb 8, 2011 at 10:05 AM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: Well then I see a path forward to speed up MappingCharFilter substantially. There's your LUCENE-2788, and then you could easily add the same no-op optimization for the smallest char value in the HashMap. only
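For readers following along, here is a rough sketch of what a "no-op below the smallest mapped char" shortcut could look like. It is not Lucene's MappingCharFilter and does not reflect what LUCENE-2788 actually changes; the class name MinKeyCharMapper and its methods are hypothetical, and it only handles single-char keys, which is an assumption made to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical illustration only; not Lucene's MappingCharFilter.
 * Maps single chars to replacement strings and skips the HashMap lookup
 * for any char smaller than the smallest mapped key.
 */
final class MinKeyCharMapper {
    private final Map<Character, String> mappings = new HashMap<>();
    // Smallest key seen so far; chars below it can never have a mapping.
    private char minKey = Character.MAX_VALUE;

    void add(char from, String to) {
        mappings.put(from, to);
        if (from < minKey) {
            minKey = from;
        }
    }

    String map(CharSequence input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < minKey) {
                // Fast path: no lookup (and no boxing) for unmapped low chars.
                out.append(c);
                continue;
            }
            String replacement = mappings.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

If every mapped key is above U+007F, as would be the case for an accent-folding table, plain ASCII input never touches the map in this sketch. Whether that closes the gap with ASCIIFoldingFilter is exactly the open question in this thread; nobody here has posted benchmarks.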

Re: Should ASCIIFoldingFilter be deprecated?

2011-02-08 Thread Robert Zotter
unsubscribe On 2/8/11 7:05 AM, David Smiley (@MITRE.org) wrote: Robert Muir wrote: On Tue, Feb 8, 2011 at 9:12 AM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: I'm skeptical that whatever the difference is is relevant in the scheme of things. The cost to keeping it is introducing

RE: Should ASCIIFoldingFilter be deprecated?

2011-02-07 Thread Steven A Rowe
AFAIK, ISOLatin1AccentFilter was deprecated because ASCIIFoldingFilter provides a superset of its mappings. I haven't done any benchmarking, but I'm pretty sure that ASCIIFoldingFilter can achieve a significantly higher throughput rate than MappingCharFilter, and given that, it probably makes

Re: Should ASCIIFoldingFilter be deprecated?

2011-02-07 Thread Chris Hostetter
: ISOLatin1AccentFilter is deprecated, presumably because you can (and should) use MappingCharFilter configured with mapping-ISOLatin1Accent.txt. By that same reasoning, shouldn't ASCIIFoldingFilter be deprecated in favor of using mapping-FoldToASCII.txt?
CharFilters and TokenFilters