Hello,
I need to develop a "French" parser. Google indexes French documents by mapping the "é" (HTML: &eacute;) and "è" characters to "e". I think there is already a French parser for Lucene, so this is not really a problem.
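To illustrate the kind of folding I mean, here is a minimal sketch using only the JDK's java.text.Normalizer. The class and method names (AccentFolder, fold) are hypothetical and are not part of Lucene or Nutch; a real plugin would probably do this inside a token filter instead.

```java
import java.text.Normalizer;

public class AccentFolder {
    // Hypothetical helper for illustration only; not a Lucene or Nutch API.
    static String fold(String input) {
        // NFD decomposition splits an accented letter into its base letter
        // plus a combining mark, e.g. U+00E9 becomes 'e' + U+0301.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove the combining marks so only the base letters remain.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "\u00e9l\u00e8ve" is the word "eleve" written with e-acute and e-grave.
        System.out.println(fold("\u00e9l\u00e8ve")); // prints "eleve"
    }
}
```

Note that this only strips diacritics; ligatures such as "œ" do not decompose under NFD and would need separate handling.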
The problem is: can it be created as a Nutch plugin? Where should I put it? Is there any project already started on this?
Thanks
Christophe.
