Have a look at Aperture: http://aperture.sourceforge.net/
It provides components for crawling and for extracting text and metadata.
It's still in alpha, though. The development code in CVS has
already improved a lot since the last official alpha release.
Chris
--
James liu wrote:
Thanks, Cohen and Lin.
2006/9/6, Doron Cohen <[EMAIL PROTECTED]>:
I think that Nutch would crawl and search all three of these types. I'm not
sure that Nutch provides the framework you seem to be looking for, but it may
be worth a look: http://lucene.apache.org/nutch/
"James liu" <[EMAIL PROTECTED]> wrote on 05/09/2006 23:10:16:
I want to find a framework that can index XML, Word, Excel, PDF, and so on, not just one format.
I just want to know whether anyone knows of a framework like that.
2006/9/6, yueyu lin <[EMAIL PROTECTED]>:
First, Lucene is just an indexing toolkit; you have to USE it to implement your
application.
If you want to index something, you must know how to extract
information from it and which keys need to be set for it.
Then you can do what you want.
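The two steps described above (extract the text for each format, then decide
which keys to set) could be sketched roughly as follows. This is only a
minimal Python sketch and not Lucene code: the names EXTRACTORS and to_fields
and the keys "path", "type", and "contents" are hypothetical, and only plain
text and XML are handled, using the standard library. Word, Excel, and PDF
would each need their own parser plugged into the same table.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def extract_plain(data: bytes) -> str:
    """Treat the bytes as UTF-8 text."""
    return data.decode("utf-8", errors="replace")


def extract_xml(data: bytes) -> str:
    """Join all non-empty text nodes of an XML document with spaces."""
    root = ET.fromstring(data)
    return " ".join(t.strip() for t in root.itertext() if t.strip())


# Hypothetical registry: file extension -> extractor function.
# Word, Excel, and PDF need third-party parsers and are not shown here.
EXTRACTORS = {".txt": extract_plain, ".xml": extract_xml}


def to_fields(name: str, data: bytes) -> dict:
    """Build the key/value pairs that would be handed to an indexer."""
    suffix = Path(name).suffix.lower()
    extractor = EXTRACTORS.get(suffix)
    if extractor is None:
        raise ValueError(f"no extractor registered for {suffix!r}")
    return {
        "path": name,
        "type": suffix.lstrip("."),
        "contents": extractor(data),
    }
```

Supporting another format then means adding one entry to EXTRACTORS; the
indexing side only ever sees the resulting dictionary of keys.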
On 9/5/06, James liu <[EMAIL PROTECTED]> wrote:
I want to find a framework that can index XML, Word, Excel, PDF, and so on, not just one format.
2006/9/6, Doron Cohen <[EMAIL PROTECTED]>:
Lucene FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ - has a few
entries just for this:
How can I index HTML documents?
How can I index XML documents?
How can I index OpenOffice.org files?
How can I index MS-Word documents?
How can I index MS-Excel documents?
How can I index MS
I found that LIUS has many problems, so I want to give it up and find something new.
Can anyone recommend one?