Is this open source?  APL'd?  Where can I look at it?

On Thu, 2002-02-07 at 22:00, Erik Hatcher wrote:
> I've developed something similar myself.  I've created an Ant task <index>
> that uses DocumentHandler interface implementing classes - one that can be
> used (<index class="...">) is a FileExtensionDocumentHandler.  At build-time
> I generate a Lucene index of static documents, and roll that into a web
> application.
>
> It's got some kinks, like how to deal with the documents because they contain
> relative hyperlinks... so these documents either should be copied into the
> WAR too (or somehow made accessible to the web app) or incorporated directly
> into a Lucene field ("rawcontents" is what I'm using now).  These issues are
> not tough to solve and having some additional parameters to my IndexTask
> could allow such things to be customized by the user.
>
> My task is still evolving, but my plan all along has been to donate it to
> lucene-dev for incorporation in some form or another.
>
> Let me know if you'd like it, and what package name you'd like to use.
>
>     Erik
>
>
> ----- Original Message -----
> From: "Kelvin Tan" <[EMAIL PROTECTED]>
> To: "Lucene Developers List" <[EMAIL PROTECTED]>
> Sent: Thursday, February 07, 2002 8:27 PM
> Subject: Re: Proposal for Lucene
>
>
> Great suggestions all around, and I'm pretty much in agreement with what's
> been said.
>
> In my app, I've built a mini-framework around the searching such that I'm
> able to map ContentHandlers (which index file contents) to file extensions.
> I've been wanting to clean it up and contribute it for a while, but haven't
> overcome the inertia to do so. Also introduced a DataSource (which can
> pretty much be anything, like a filesystem, a database, a URL, etc) from
> which to obtain the data to index, so I think it _could_ be in line with what
> some of you have in mind.
>
> I could also use a lot of feedback with what's been done too...
>
> So what's the plan to move forward?
>
> K
> ----- Original Message -----
> From: Mark Tucker
> To: Lucene Developers List
> Sent: Friday, February 08, 2002 4:03 AM
> Subject: RE: Proposal for Lucene
>
>
> I like what you included in your proposal and suggest doing all that (over
> time) and taking the following into consideration:
>
> Indexers/Crawlers
>
>   General Settings
>     SleeptimeBetweenCalls - can be used to avoid flooding a machine with too
>       many requests
>     IndexerTimeout - kill this crawler thread after long period of inactivity
>     IncludeFilter - include only items matching filter
>     ExcludeFilter - exclude items matching filter (can be used with
>       IncludeFilter)
>     MaxItems - stops indexing after x items
>     MaxMegs - stops indexing after x MB of data
>
>   File System Indexer
>     URLReplacePrefix - can crawl c:\ but expose URL as http://mysever/docs/
>
>   Web Indexer
>     HTTPUser
>     HTTPPassword
>     HTTPUserAgent
>     ProxyServer
>     ProxyUser
>     ProxyPassword
>     HTTPSCertificate
>     HTTPSPrivateKey
>
>   Other Possible Indexers
>     Microsoft Exchange 5.5/2000
>     Lotus Notes
>     Newsgroup (NNTP)
>     Documentum
>     ODBC/OLEDB
>     XML - index single XML that represents multiple documents
>
>
> Document Factory
>   General
>     The minimum properties for each document should be:
>       URL
>       Title
>       Abstract
>       Full Text
>       Score
>
>   HTML
>     Support for META tags including Dublin Core syntax
>
>   Other Possible Document Factories
>     Office Docs - DOC, XLS, PPT
>     PDF
>
>
> Thanks for the great proposal.
>
> Mark Tucker
>
>
> -----Original Message-----
> From: Andrew C. Oliver [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, February 07, 2002 5:35 AM
> To: Lucene Developers List
> Subject: Proposal for Lucene
>
>
> Hi All,
>
> This is just a few thoughts about Lucene.  Please send me your feedback,
> critiques and thoughts.
>
> If you folks would take a look:
>
> http://www.trilug.org/~acoliver/luceneplan.html
>
> if you'd like to submit patches:
>
> http://www.trilug.org/~acoliver/luceneplan.xml
>
> Once I've gotten feedback from the developer community I'll send this to
> the user community as well.
>
> Thanks,
>
> Andy
> --
> www.superlinksoftware.com
> www.sourceforge.net/projects/poi - port of Excel format to java
> http://developer.java.sun.com/developer/bugParade/bugs/4487555.html
>     - fix java generics!
>
>
> The avalanche has already started.  It is too late for the pebbles to
> vote.
> -Ambassador Kosh
>
>
> --
> To unsubscribe, e-mail:
> <mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail:
> <mailto:[EMAIL PROTECTED]>
>
--
www.superlinksoftware.com
www.sourceforge.net/projects/poi - port of Excel format to java
http://developer.java.sun.com/developer/bugParade/bugs/4487555.html
    - fix java generics!
The avalanche has already started.  It is too late for the pebbles to vote.
-Ambassador Kosh
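
For anyone following the thread, the idea raised in the quoted messages (mapping file extensions to handlers that turn a file into index fields) might look roughly like the Java sketch below. This is only an illustration under stated assumptions: the DocumentHandler interface here is a hypothetical stand-in and not Erik's actual class or any Lucene API, and ExtensionHandlerRegistry, toFields and the "url" field are invented for the example. Only the "rawcontents" field name comes from the message above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical interface: turns a file's text into named index fields.
// It stands in for whatever handler contract the real task uses.
interface DocumentHandler {
    Map<String, String> toFields(String path, String contents);
}

// Hypothetical registry that picks a handler by file extension.
final class ExtensionHandlerRegistry {
    private final Map<String, DocumentHandler> handlers = new HashMap<>();

    // Accepts "txt" or ".txt"; extensions are matched case-insensitively.
    void register(String extension, DocumentHandler handler) {
        handlers.put(normalize(extension), handler);
    }

    // Returns the handler for the path's extension, or empty if the path
    // has no extension or nothing is registered for it.
    Optional<DocumentHandler> handlerFor(String path) {
        int dot = path.lastIndexOf('.');
        if (dot < 0 || dot == path.length() - 1) {
            return Optional.empty();
        }
        return Optional.ofNullable(handlers.get(normalize(path.substring(dot + 1))));
    }

    private static String normalize(String extension) {
        String ext = extension.startsWith(".") ? extension.substring(1) : extension;
        return ext.toLowerCase(Locale.ROOT);
    }
}

final class HandlerDemo {
    public static void main(String[] args) {
        ExtensionHandlerRegistry registry = new ExtensionHandlerRegistry();

        // A plain-text handler that stores the whole file in "rawcontents".
        registry.register("txt", (path, contents) -> {
            Map<String, String> fields = new HashMap<>();
            fields.put("url", path);
            fields.put("rawcontents", contents);
            return fields;
        });

        // Files with no registered handler are simply skipped.
        registry.handlerFor("docs/README.TXT")
                .map(handler -> handler.toFields("docs/README.TXT", "hello"))
                .ifPresent(System.out::println);
    }
}
```

A real indexer would presumably build a Lucene document from the returned fields instead of printing them; that step is left out here so the sketch stays free of any library dependency.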