Hi, I see a lot of threads about the apostrophe not being treated as a separator, and a lot of French people complaining about that (I complain too, since I am French ;) ).
My question is: what is the status of http://tinyurl.com/ynskw3 ?

I think the patch given in that thread will work for both English and French without disturbing the filtering of English words such as O'Reilly, since it only looks at "m', t', s', n', l', d'" at the start of a token, which I do not think happens in any English construction.

So what is planned?
1. Having a FrenchStandardFilter and an EnglishStandardFilter, and removing StandardFilter
2. Including the patch in StandardFilter
3. Having a EuropeanStandardFilter (with the most common rules of English, French, German, Spanish, Italian, ...)
4. Doing nothing

Personally, I'd like a EuropeanStandardFilter (i.e. the 3rd point), which would handle most cases. I often find myself indexing French and English documents (as well as some Spanish and Italian ones), and I do not mind losing some terms (for example, when a document is English but a word has been lost because of a very common Italian rule).

Thanks for the time you will spend answering my question.

chris
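
P.S. To make the rule I am describing concrete, here is a rough sketch in plain Java. It is not the patch from the thread and it does not use any Lucene class; the names ElisionSketch and stripElision are made up for illustration. It assumes the token has already been split on whitespace and only handles the ASCII apostrophe.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: not part of Lucene and not the patch from the thread.
final class ElisionSketch {
    // The elided first letters listed in the message: m', t', s', n', l', d'.
    private static final Set<String> ELIDED =
            new HashSet<>(Arrays.asList("m", "t", "s", "n", "l", "d"));

    // Drops a leading elided article or pronoun; returns other tokens unchanged.
    static String stripElision(String token) {
        int apos = token.indexOf('\'');
        // Only a single leading letter followed by an apostrophe qualifies,
        // and something must remain once the prefix is removed.
        if (apos != 1 || token.length() <= 2) {
            return token;
        }
        String prefix = token.substring(0, 1).toLowerCase(Locale.ROOT);
        return ELIDED.contains(prefix) ? token.substring(2) : token;
    }

    public static void main(String[] args) {
        for (String t : new String[] {"l'avion", "d'accord", "O'Reilly", "don't"}) {
            System.out.println(t + " -> " + stripElision(t));
        }
    }
}
```

With this rule "l'avion" becomes "avion", while "O'Reilly" and "don't" are left alone. One case I had not considered above: a name such as "D'Angelo" would also lose its prefix, so the rule is not entirely free for English text.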
