Re: LIUS/Fulltext indexing

2007-06-12 Thread Bertrand Delacretaz

On 6/12/07, Yonik Seeley wrote:


... I think Tika will be the way forward (some of the code for Tika is
coming from LIUS)...


Work has indeed started to incorporate the LIUS code into Tika, see
https://issues.apache.org/jira/browse/TIKA-7 and
http://incubator.apache.org/projects/tika.html

-Bertrand


Re: LIUS/Fulltext indexing

2007-06-12 Thread Vish D.

Sounds interesting. I can't seem to find any clear dates on the project
website. Do you know? ...V1 shipping date?

Thanks!
On 6/12/07, Bertrand Delacretaz wrote:


On 6/12/07, Yonik Seeley wrote:

... I think Tika will be the way forward (some of the code for Tika is
 coming from LIUS)...

Work has indeed started to incorporate the LIUS code into Tika, see
https://issues.apache.org/jira/browse/TIKA-7 and
http://incubator.apache.org/projects/tika.html

-Bertrand



Re: LIUS/Fulltext indexing

2007-06-12 Thread Bertrand Delacretaz

On 6/12/07, Vish D. wrote:

...Sounds interesting. I can't seem to find any clear dates on the project
website. Do you know? ...V1 shipping date?...


Not at the moment. Tika just entered incubation and it's impossible to
predict what will happen.

But help is welcome, of course ;-)

-Bertrand


Re: LIUS/Fulltext indexing

2007-06-12 Thread Vish D.

Wonder if TOM could be useful to integrate?
http://tom.library.upenn.edu/convert/sofar.html

On 6/12/07, Bertrand Delacretaz wrote:


On 6/12/07, Vish D. wrote:
...Sounds interesting. I can't seem to find any clear dates on the project
website. Do you know? ...V1 shipping date?...

Not at the moment. Tika just entered incubation and it's impossible to
predict what will happen.

But help is welcome, of course ;-)

-Bertrand



Re: LIUS/Fulltext indexing

2007-06-11 Thread Yonik Seeley

On 6/11/07, Vish D. wrote:

Anyone have experience working with LIUS (
http://sourceforge.net/projects/lius/)? I can't seem to find any real
documentation on it, even though it seems 'active' @ sourceforge. I need a
way to index various types of fulltext, and LIUS seems very promising at
first glance. What do you guys think? Is there a similar implementation you
recommend, even something that might provide the simple text extraction
functionality for the various types? I figure I would need to do that
anyway, and massage the text into Solr-type docs.


I think Tika will be the way forward (some of the code for Tika is
coming from LIUS)

-Yonik