I suggest you look at: http://www.manageability.org/blog/stuff/open-source-web-crawlers-java
From what I know of Nutch, it's meant as the basis for a competitor to the big search engines (i.e. Google). For a small web site, it might be overkill, especially if it requires you to build from CVS (unless there are distributions).

Note: I've got the book "Programming Spiders, Bots and Aggregators in Java". It describes spiders using a project called j-spider:
http://sourceforge.net/projects/j-spider/

It could probably be adapted for your needs.

HTH,
sv

On Wed, 28 Apr 2004, Kelvin Tan wrote:

> As far as I know, LARM is defunct. I read somewhere, perhaps apocryphal, that
> Clemens got a job which wasn't supportive of his continued development on LARM.
> AFAIK there aren't any other active developers of LARM (at least at the time it
> branched off to SF).
>
> Otis recently posted a suggestion to use Nutch instead of LARM.
>
> Kelvin
>
> On 28 Apr 2004 09:44:04 +0800, Sebastian Ho said:
> > Hi
> >
> > I have looked at the LARM websites and I get different results.
> >
> > http://nagoya.apache.org/wiki/apachewiki.cgi?LuceneLARMPages
> > It says that development has stopped for this project.
> >
> > LARM hosted on sourceforge.
> > The last message was dated 2003 in the mailing list. Is it still
> > supported and active?
> >
> > LARM hosted on apache.
> > It says the project is moved to sourceforge.
> >
> > Can anyone here who is active in LARM comment on the status?
> >
> > Regards
> >
> > Sebastian Ho

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
