Greetings,
My name is Andrew Libby, and at Andrew C. Oliver's suggestion
I'm forwarding the message below. I initially hesitated
to post to the dev list because I'm not developing (at least not yet).
I have, however, been working with Lucene for a few days now and am
having a blast.
Thanks for the time.
Andy
----- Forwarded message from acoliver <[EMAIL PROTECTED]> -----
Date: Thu, 7 Feb 2002 06:07:55 -0800 (PST)
From: acoliver <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: Re: Proposal for Lucene
X-Mailer: E-mailanywhere V2.0 (Windows)
>On Thu, 7 Feb 2002 08:57:30 -0500 Andrew Libby <[EMAIL PROTECTED]> wrote.
>o Indexer vs Crawlers. I'm thinking that calling them AbstractCrawler, etc.
> will provide less ambiguity. Of course, it seems like you're
> consolidating the crawling and indexing behind an abstraction so there
> may be good reason. I think it'd be nice to avoid ambiguity with
> other index related classes. I suppose there are also namespace
> solutions to this.
Agreed. I can't remember why I decided to call that Indexer versus Crawler.
>o Filters. It would be nice if Lucene had filters to deal with
> many document types. I have been giving this some thought, and
> since Lucene (as it stands today) is so flexible, there is no
> standard interface to write filters to (so far as I can tell).
> It seems like you are suggesting a layer on top of Lucene which
> might be able to build up to such an interface. If so, having
> a collaborative effort to develop document filters would be
> of great value.
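
As a rough illustration of the kind of standard filter interface the point
above asks for, here is a minimal sketch. It makes no claims about Lucene's
own API: DocumentFilter, PlainTextFilter and FilterRegistry are hypothetical
names invented for this example, and choosing a filter by file extension is
just one assumed way to do the lookup. The idea is only that each document
type gets a filter that turns raw bytes into plain text, which a layer on
top of Lucene could then index.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical interface (not part of Lucene): one filter per document type.
// A filter reads a raw document and returns its text content for indexing.
interface DocumentFilter {
    String extractText(InputStream in);
}

// Simplest possible filter: treats the input bytes as UTF-8 plain text.
class PlainTextFilter implements DocumentFilter {
    @Override
    public String extractText(InputStream in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch's interface small.
            throw new UncheckedIOException(e);
        }
    }
}

// Hypothetical registry: picks a filter from the file name's extension.
class FilterRegistry {
    private final Map<String, DocumentFilter> byExtension = new HashMap<>();

    void register(String extension, DocumentFilter filter) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), filter);
    }

    // Returns null when no filter is registered for the file's extension.
    DocumentFilter forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.get(ext);
    }
}
```

A filter for another format (an XLS filter, say) would implement the same
interface and be registered under its own extension, so the indexing code
never needs to know which document types exist.
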
What got me interested in this is that the Jakarta project I founded, POI
(which will appear on jakarta as soon as Sam gets around to copying the
website files), provides a pure Java XLS (Excel) abstraction, and will soon provide
a DOC abstraction. I need these features to connect these filters to Lucene
efficiently and to have some idea
>I work for a company that is considering using Lucene to index a
>document repository. We're going to need filters, and I'm making a case
>that we contribute the filters we develop to Lucene.
>
That sounds like a great idea. If these filters happen to be based on the
OLE 2 Compound Document Format (Excel, PowerPoint, Word), you might suggest
they take a look at POI as well. Old site is http://poi.sourceforge.net,
new site will be up on jakarta soon.... You can find the new site in the
sources at:
http://jakarta.apache.org/site/cvsindex.html
module = jakarta-poi
I look forward to our continued collaborative open source development,
-Andy
>Andy
>
>
>On Thu, Feb 07, 2002 at 07:35:01AM -0500, Andrew C. Oliver wrote:
----- End forwarded message -----
--
--------------------------------------------------
Andrew Libby
CommNav, Inc
[EMAIL PROTECTED]