On Tue, Nov 2, 2010 at 10:08 PM, Mark Miller <[email protected]> wrote:
> On 11/2/10 9:57 PM, Robert Muir wrote:
>> On Tue, Nov 2, 2010 at 9:50 PM, Lance Norskog <[email protected]> wrote:
>>> I just used One Fish Two Fish Red Fish Blue Fish but I think that has
>>> license problems.
>>> Also, the sample should include multi-word left-hand values because they
>>> work.
>>>
>>
>> I don't think we should do this... i suggest only using single word
>> synonyms in the example for performance reasons!
>>
>> it doesnt really matter how rare they are: even "the quick brown fox"
>> => something is terrible, because its going to invoke SynonymFilter's
>> "slow path" for every single instance of "the".
>>
>> i know some insist its just an "example" and not defaults, but this
>> isn't true, else why did this email thread even come up? its used as
>> "defaults", and we should keep it very fast.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
> We have discussed this before - there is always nasty compromise when it
> comes to example vs default. Good for one is often not good for the
> other. But like it or not, our example pretty much is the defacto
> default as you say.
>
> As a reminder, in the past we have talked about doing both an example
> with all the bells and whistles, and a performance config that you
> should really start from. But we have not gotten there obviously ;) Adds
> some dev/maint overhead as well.
>
> No real points, just chiming in with that.
>
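To make the "slow path" concern quoted above a bit more concrete, here is a toy model in Python. It is not Lucene's SynonymFilter code, only a sketch under one assumption: whenever a token equals the first word of some multi-word left-hand side, the matcher has to look ahead before it can decide. The function name `count_lookaheads` and the sample sentence are made up for illustration.

```python
def count_lookaheads(tokens, rules):
    """Count tokens that would force a lookahead in this toy model.

    `rules` maps a left-hand side (a tuple of words) to its replacement.
    """
    # First words of multi-word left-hand sides: when one of these shows up,
    # the current token alone is not enough to decide whether a rule matches.
    multiword_starts = {lhs[0] for lhs in rules if len(lhs) > 1}
    return sum(1 for tok in tokens if tok in multiword_starts)


text = "the cat saw the dog chase the quick brown fox".split()

# Single-word rules only: no token ever needs lookahead in this model.
single_word_rules = {("barbeque",): "barbecue", ("blonde",): "blond"}

# Add one multi-word rule that starts with a very common word.
multi_word_rules = dict(single_word_rules)
multi_word_rules[("the", "quick", "brown", "fox")] = "something"

single_count = count_lookaheads(text, single_word_rules)
multi_count = count_lookaheads(text, multi_word_rules)
print(single_count, multi_count)
```

In this model the rule matches only once, yet every occurrence of "the" pays for the lookahead, which is the cost being argued about.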
another idea i started for textTight, happy to try and wrap it up /
contribute if there is interest. but this is really only applicable to
'textTight', since its stemming etc isn't insane like 'text'

I generated the following with a mix of automatic and manual methods
from 2+2lemma.txt (http://wordlist.sourceforge.net/ public domain/BSD)

i'm sure other people must suffer with similar tuning like this...
here's just some examples

sample synonyms for textTight, built from only variant spellings
(mostly brit <-> us):

barbeque => barbecue
blonde => blond
conventionalising => conventionalizing
convertor => converter
conveyers => conveyors
...

sample stemmer corrections for textTight, the plural-only stemmer (via
StemmerOverrideFilter):

errata erratum
news news
radii radius
cavalrymen cavalryman
...
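A rough sketch of how lines in the two formats above could be produced from word pairs, in Python. The helper names are hypothetical, the pairs are just the samples from this mail, and the tab separator for the stemmer dictionary is an assumption about what the override filter's factory expects, so check it against your version. Multi-word left-hand sides are skipped, in line with the performance point earlier in the thread.

```python
def format_synonym_rules(pairs):
    """Render (variant, canonical) pairs as one-way 'variant => canonical' rules."""
    lines = []
    for variant, canonical in pairs:
        # Keep only single-word left-hand sides.
        if len(variant.split()) != 1:
            continue
        lines.append(f"{variant} => {canonical}")
    return lines


def format_stem_overrides(pairs, sep="\t"):
    """Render (word, stem) pairs one per line; the separator is an assumption."""
    return [f"{word}{sep}{stem}" for word, stem in pairs]


spelling_variants = [
    ("barbeque", "barbecue"),
    ("blonde", "blond"),
    ("conventionalising", "conventionalizing"),
    ("convertor", "converter"),
    ("conveyers", "conveyors"),
]

stem_corrections = [
    ("errata", "erratum"),
    ("news", "news"),
    ("radii", "radius"),
    ("cavalrymen", "cavalryman"),
]

synonym_lines = format_synonym_rules(spelling_variants)
override_lines = format_stem_overrides(stem_corrections)

print("\n".join(synonym_lines))
print("\n".join(override_lines))
```

The real lists came from a mix of automatic and manual filtering of 2+2lemma.txt; this only covers the last step of writing the pairs out.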
