I'll echo what both Otis and Mark have said.
Lius does look useful, but a quick glance at your lib directory shows
many non-ASL'd dependencies, which would be very difficult to
reconcile with the codebase here at the ASF.
Erik
On Jan 31, 2007, at 5:19 AM, markharw00d wrote:
I would prefer to see a good open-source framework pulling together
a collection of document parsers but which isn't tied directly to
Lucene (that binding would be via *another* project).
If the parser framework extracted document text in a standard
document-and-application-neutral form (XML/Java object?) this could
underpin *any* IR/IE project wanting to make use of the parser
functionality, e.g. the GATE framework. That would
ultimately make a much more valuable piece of functionality and is
the approach taken by Stellent (used by many search engines,
recently purchased by Oracle).
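
As a rough illustration of the idea above, here is a minimal sketch of
what an application-neutral parser contract might look like as plain
Java objects. All names (ParsedDocument, DocumentParser,
PlainTextParser) are hypothetical and exist only for this example; none
of them come from Lius, Lucene, GATE or Stellent. The point is that
nothing in it references an indexing API, so a Lucene binding could
live in a separate project.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical neutral result type: extracted text plus simple metadata.
// It deliberately has no dependency on Lucene or any other IR/IE library.
final class ParsedDocument {
    private final String text;
    private final Map<String, String> metadata;

    ParsedDocument(String text, Map<String, String> metadata) {
        this.text = text;
        // Defensive copy so callers cannot mutate the result afterwards.
        this.metadata = new LinkedHashMap<String, String>(metadata);
    }

    String getText() { return text; }

    Map<String, String> getMetadata() { return metadata; }
}

// Hypothetical parser contract: one implementation per document format.
interface DocumentParser {
    boolean supports(String mimeType);

    ParsedDocument parse(byte[] content) throws Exception;
}

// Trivial implementation for plain text (assumes UTF-8 input),
// included only to show the shape of an implementation.
final class PlainTextParser implements DocumentParser {
    public boolean supports(String mimeType) {
        return "text/plain".equals(mimeType);
    }

    public ParsedDocument parse(byte[] content) {
        Map<String, String> meta = new LinkedHashMap<String, String>();
        meta.put("content-type", "text/plain");
        return new ParsedDocument(new String(content, StandardCharsets.UTF_8), meta);
    }
}
```

A consumer such as an indexer or an extraction pipeline would then map
ParsedDocument onto its own types (for instance, Lucene fields) in its
own codebase.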
Cheers
Mark
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]