Hi Jérôme,

I found your idea very interesting. I will be interested to contribute to
the Parse Plugins Framework. I have developed similar one using Lucene. The
project name is Lius.

If you are interested please let me know.



On 4/7/06, Jérôme Charron <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> While chatting with Chris Mattmann, it seems to be evident to us that
> there
> is a need for a new sub-project within Lucene.
>
> For now, Lucene's sub-projects used in Nutch are :
> 1. Lucene-java - The basis for search technology
> 2. Hadoop - The distributed computing platform
> 3. Nutch - The search engine that relies on Lucene and Hadoop.
>
> Since Nutch contains some value added pieces of code that focus on content
> analysis,
> we think it would be a good idea to split Nutch into a new sub-project
> based
> on content analysis
> manipulation. The components we have identified are :
>
> 1. MimeType Repository
> 2. Language Identifier
> 3. Content Signature (MD5Signature / TextProfileSignature / ...)
> (4. Generic Meta Data Infrastructure)
> (5. Charset Detector)
> (6. Parse Plugins Framework)
>
> The idea is to expose these pieces of codes into a standalone lib, since
> we
> are convinced they could be usefull
> in many other projects than Nutch.
> The benefits will be to have some code more widely used / tested /
> contributed.
> If this proposal is accepted, we have a candidate name for this new
> project:
> Tika (comes from my son  ;-) )
>
> Any comment is welcome.
>
> Jérôme
>
>

Reply via email to