lucene webcrawler/dbms indexing framework

Woolly Mammoth Thu, 01 Apr 2004 14:59:35 -0800

Hi All,
        I have seen some discussion in the past around LARM & other web
crawler indexing code, but not much output. I have started a project on
SF, http://sourceforge.net/projects/knine, and have committed some
initial framework code to CVS (despite the front page saying there are
no commits...). I haven't done a release yet, mainly because I need to
check licensing & am also having some trouble getting PDFBox to extract
all fields from docs. If anyone has time to help/review, that would be
great. I wanted to try to license it Apache-style for contributors & GPL
for others. Does anyone know about this?


The real goal of this is an easy-to-deploy Lucene implementation that is
also scalable & flexible for customisation.
I will be putting all the currently hardcoded indexing rules into
config files asap, then hopefully adding a mgmt interface over the
files & indexing process.

Thanks,
Dave



