Otis,

Thanks for cranking and getting all this stuff rolling.

--Peter

On 5/4/02 7:09 AM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:

> Woohoo, I can feel the energy all the way here in NYC! :)
>
> Clemens' contribution is in jakarta-lucene-sandbox now, so go ahead,
> look and play.
>
> I will send a separate note about this contribution now.
>
> Otis
>
> --- "Andrew C. Oliver" <[EMAIL PROTECTED]> wrote:
>> cool dude, let's put it all in the same place and refactor it together.
>> I didn't even repackage this yet. I figured we'd put it in, get it
>> building and then pick, choose and enhance.
>>
>> (like I want to rip jdom out and use commons-logging for this bad boy
>> because log4j has bitten me cleanly in the rump too many times with
>> its forever changing interfaces)
>>
>> I'm not wanting to do this alone. Let's work together. My interest is
>> in getting some interfaces and pluggable architecture in here and you
>> had some great ideas on that IIRC.
>>
>> I've even got the build and all set up. When these guys say go I'll
>> hit the button and we can go to town.
>>
>> -Andy
>>
>> On Fri, 2002-05-03 at 22:48, Otis Gospodnetic wrote:
>>> Note that I will also be putting some web crawler code in the
>>> sandbox soon. The code is from Clemens, who posted a few messages
>>> recently.
>>>
>>> Good, let's see some refactoring!
>>>
>>> Otis
>>>
>>> --- "Andrew C. Oliver" <[EMAIL PROTECTED]> wrote:
>>>> Hi Manfred/Kelvin (whose name I saw on a lot of this),
>>>>
>>>> I'm back on the on cycle and I was about to commit this stuff so we
>>>> could start refactoring. I've got it building and all set up and
>>>> ready. But I wanted to make sure that you're still okay with it.
>>>>
>>>> Once I get it in lucene-sandbox we can start refactoring it and
>>>> adding the new features.
>>>>
>>>> Are we good to go? Let me know and then we can watch the CVS commit
>>>> messages fly into lucene-sandbox...
>>>>
>>>> Thanks,
>>>>
>>>> -Andy
>>>>
>>>> On Fri, 2002-02-08 at 05:26, Manfred Schäfer wrote:
>>>>> Hi,
>>>>>
>>>>> I would suggest two sub-projects:
>>>>>
>>>>> 1. Crawler - retrieving docs, wherever they are.
>>>>>
>>>>> 2. DocumentHandler - extract text, create appropriate fields, etc.
>>>>>
>>>>> The second is a layer on top of Lucene. The first is an autonomous
>>>>> package, which should be nicely integrated with
>>>>> Lucene/Document-Handler, but should also be usable for other
>>>>> projects.
>>>>>
>>>>> I've included my code to show you what I've done. It isn't too
>>>>> useful yet, because it is integrated in our product, but you can
>>>>> get the idea. Actually I've written two things:
>>>>>
>>>>> 1: A robot for crawling a remote server via HTTP and writing all
>>>>> the data to the local filesystem, then importing it into our db and
>>>>> (at the same time) replacing all links with internal links. So we
>>>>> could emulate a website from this crawled data!
>>>>> [com.synformation.script.utilities.importtool]
>>>>>
>>>>> 2: (I've rewritten some of the code from 1 for that, so this is
>>>>> much cleaner) A customer needs a tool for importing local
>>>>> mini-websites on the filesystem via an applet, sending them to the
>>>>> web server and importing them as described in point 1. I've tried
>>>>> to write it in a way that it could include the functionality of
>>>>> point 1 (retrieving via HTTP), but that is mostly untested.
>>>>> [com.synformation.script.utilities.fileimport]
>>>>>
>>>>> I don't say that you (we) should use this. But I think it's time to
>>>>> come to more concrete plans. I'm interested in helping on that for
>>>>> the crawler.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> manfred
>>>>>
>>>>> --
>>>>> To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
>>>>> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
>>>> --
>>>> http://www.superlinksoftware.com
>>>> http://jakarta.apache.org/poi - port of Excel/Word/OLE 2 Compound
>>>> Document format to java
>>>> http://developer.java.sun.com/developer/bugParade/bugs/4487555.html
>>>> - fix java generics!
>>>> The avalanche has already started. It is too late for the pebbles
>>>> to vote.
>>>> -Ambassador Kosh