Re: Lius into apache incubator

2007-02-28 Thread Otis Gospodnetic
Wednesday, February 28, 2007 1:06:16 PM Subject: Re: Lius into apache incubator Hi Rida, I've been talking with Jukka Zitting (involved in Nutch) about parsing/Tika and we started to sketch out some project objectives on the Wiki over there which may be of interest: http://code.google.com/p

Re: Lius into apache incubator

2007-02-28 Thread mark harwood
Sent: Wednesday, 28 February, 2007 4:46:36 PM Subject: Re: Lius into apache incubator Hi Otis, Many thanks for your comments, I'm so sorry for this late answer. I will add lius as lucene contrib and I will change the licence to ASL. There are some developers contributing to Lius but there ar

Re: Lius into apache incubator

2007-02-28 Thread Rida Benjelloun
Hi Otis, Many thanks for your comments, I'm so sorry for this late answer. I will add lius as lucene contrib and I will change the licence to ASL. There are some developers contributing to Lius, but they are not very active. As for the question: this is a Laval University project, right? But you wo

Re: Lius into apache incubator

2007-01-31 Thread Erik Hatcher
I'll echo what both Otis and Mark have said. Lius does look useful, but there are many non-ASL'd dependencies (from a quick glance at your lib directory) that would be very difficult to resolve with the codebase here at the ASF. Erik On Jan 31, 2007, at 5:19 AM, markharw00d wrote:

Re: Lius into apache incubator

2007-01-31 Thread markharw00d
I would prefer to see a good open-source framework that pulls together a collection of document parsers but isn't tied directly to Lucene (that binding would be via *another* project). If the parser framework extracted document text in a standard document-and-application-neutral form (XML/Jav

Re: Lius into apache incubator

2007-01-30 Thread Otis Gospodnetic
Hi Rida, Some comments in no particular order: - Looks useful - This looks like an expanded version of what Erik and I wrote for LIA, and I know people often ask about and use that code, so I know there is a need for a framework that knows how to parse various document formats - Nutch has some