Hi Jérôme,

I find your idea very interesting, and I would like to contribute to the Parse Plugins Framework. I have developed a similar one using Lucene. The project is called Lius.
If you are interested, please let me know.

On 4/7/06, Jérôme Charron <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> While chatting with Chris Mattmann, it became evident to us that there
> is a need for a new sub-project within Lucene.
>
> For now, Lucene's sub-projects used in Nutch are:
> 1. Lucene-java - The basis for search technology
> 2. Hadoop - The distributed computing platform
> 3. Nutch - The search engine that relies on Lucene and Hadoop.
>
> Since Nutch contains some value-added pieces of code that focus on
> content analysis, we think it would be a good idea to split Nutch into
> a new sub-project based on content analysis manipulation. The
> components we have identified are:
>
> 1. MimeType Repository
> 2. Language Identifier
> 3. Content Signature (MD5Signature / TextProfileSignature / ...)
> (4. Generic Meta Data Infrastructure)
> (5. Charset Detector)
> (6. Parse Plugins Framework)
>
> The idea is to expose these pieces of code as a standalone lib, since
> we are convinced they could be useful in many projects other than Nutch.
> The benefit would be to have this code more widely used / tested /
> contributed to.
> If this proposal is accepted, we have a candidate name for this new
> project: Tika (comes from my son ;-) )
>
> Any comment is welcome.
>
> Jérôme
