Re: LIUS/Fulltext indexing
On 6/12/07, Yonik Seeley [EMAIL PROTECTED] wrote:
> ... I think Tika will be the way forward (some of the code for Tika is coming from LIUS)...

Work has indeed started to incorporate the LIUS code into Tika, see https://issues.apache.org/jira/browse/TIKA-7 and http://incubator.apache.org/projects/tika.html

-Bertrand
Re: LIUS/Fulltext indexing
Sounds interesting. I can't seem to find any clear dates on the project website. Do you know? ...V1 shipping date? Thanks!

On 6/12/07, Bertrand Delacretaz [EMAIL PROTECTED] wrote:
> On 6/12/07, Yonik Seeley [EMAIL PROTECTED] wrote:
>> ... I think Tika will be the way forward (some of the code for Tika is coming from LIUS)...
>
> Work has indeed started to incorporate the LIUS code into Tika, see https://issues.apache.org/jira/browse/TIKA-7 and http://incubator.apache.org/projects/tika.html
>
> -Bertrand
Re: LIUS/Fulltext indexing
On 6/12/07, Vish D. [EMAIL PROTECTED] wrote:
> ...Sounds interesting. I can't seem to find any clear dates on the project website. Do you know? ...V1 shipping date?...

Not at the moment; Tika just entered incubation and it's impossible to predict what will happen. But help is welcome, of course ;-)

-Bertrand
Re: LIUS/Fulltext indexing
Wonder if TOM could be useful to integrate? http://tom.library.upenn.edu/convert/sofar.html

On 6/12/07, Bertrand Delacretaz [EMAIL PROTECTED] wrote:
> On 6/12/07, Vish D. [EMAIL PROTECTED] wrote:
>> ...Sounds interesting. I can't seem to find any clear dates on the project website. Do you know? ...V1 shipping date?...
>
> Not at the moment; Tika just entered incubation and it's impossible to predict what will happen. But help is welcome, of course ;-)
>
> -Bertrand
Re: LIUS/Fulltext indexing
On 6/11/07, Vish D. [EMAIL PROTECTED] wrote:
> Anyone have experience working with LIUS (http://sourceforge.net/projects/lius/)? I can't seem to find any real documentation on it, even though it seems 'active' @ sourceforge. I need a way to index various types of fulltext, and LIUS seems very promising at first glance. What do you guys think? Is there a similar implementation you recommend, even something that might provide the simple text extraction functionality for the various types? I figure, I would need to do that anyways, and massage the text into Solr-type docs.

I think Tika will be the way forward (some of the code for Tika is coming from LIUS)

-Yonik
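
The "extract the text, then massage it into Solr-type docs" step from the original question could look roughly like the sketch below. It is an illustration only and was not posted by anyone in this thread. It assumes the plain text has already been extracted by some other tool (LIUS, Tika, or anything else) and only shows wrapping that text in Solr's XML `<add>` message. The function name `to_solr_add_xml` and the field names `id` and `text` are hypothetical; the real field names depend on your Solr schema.

```python
import xml.etree.ElementTree as ET


def to_solr_add_xml(doc_id, text, id_field="id", text_field="text"):
    """Wrap already-extracted plain text in a Solr XML <add> message.

    The field names are assumptions; use whatever your schema defines.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # One <field name="..."> element per value; ElementTree escapes
    # characters such as & and < in the text for us.
    ET.SubElement(doc, "field", name=id_field).text = doc_id
    ET.SubElement(doc, "field", name=text_field).text = text
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Stand-in for text that an extractor pulled out of a PDF, Word file, etc.
    extracted = "Quarterly report: revenue & costs <draft>"
    print(to_solr_add_xml("report-1", extracted))
```

The resulting string is the kind of message that can be sent to Solr's update handler. Keeping extraction and document building separate means the extractor can be swapped later without touching the indexing code.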