Re: [Nutch-dev] Have anybody thought of replacing CrawlDb with any kind of Rational DB?

Andrzej Bialecki Fri, 13 Apr 2007 12:19:48 -0700

Howie Wang wrote:
> I definitely don't expect people to write it just because it happens
> to be useful to me :-) Call me crazy, but I'm thinking of
> implementing this when I get some free time (whenever that will be).
> It seems that I would just need to implement IWebDBWriter and
> IWebDBReader, and then add a command line option to the tools
> (something like -mysql) to specify the type of db to instantiate. It
> would affect about 15 files, but the tools changes would be simple
> -- a few if statements here and there. Does that sound right?
>
> Howie


You are talking about the codebase from branch 0.7. This branch is not 
under active development. The current codebase is very different - it 
uses the MapReduce framework to process data in a distributed fashion.

So, there is no single interface for writing the CrawlDb. There is one
class for reading the CrawlDb, but the data in the DB is usually not
used standalone; it serves as one of many inputs to a map-reduce job.
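
To illustrate what "one of many inputs" means, here is a rough sketch in
plain Java, with no Hadoop involved. Every name in it (CrawlDbUpdateSketch,
Tagged, the "crawldb" / "fetch" / "link" tags, the status strings) is made
up for the example and is not a Nutch or Hadoop class; the real jobs are
considerably more involved. It only shows the shape of the computation:
records from several inputs are grouped by URL, and a reduce step decides
the new entry for each URL.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain Java standing in for a MapReduce job.
// None of these names are Nutch or Hadoop classes.
class CrawlDbUpdateSketch {

    // A status value tagged with the input it came from.
    static final class Tagged {
        final String source; // "crawldb", "fetch" or "link" (hypothetical tags)
        final String status;

        Tagged(String source, String status) {
            this.source = source;
            this.status = status;
        }
    }

    // "Map" side: emit (url, tagged value) pairs from one input into the grouping.
    static void emit(Map<String, List<Tagged>> grouped, String source,
                     Map<String, String> input) {
        for (Map.Entry<String, String> e : input.entrySet()) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                   .add(new Tagged(source, e.getValue()));
        }
    }

    // "Reduce" side: decide the new status for one URL from all values seen for it.
    static String reduce(List<Tagged> values) {
        String fromDb = null;
        String fromFetch = null;
        for (Tagged t : values) {
            if (t.source.equals("crawldb")) {
                fromDb = t.status;
            } else if (t.source.equals("fetch")) {
                fromFetch = t.status;
            }
        }
        if (fromFetch != null) return fromFetch; // a fresh fetch result wins
        if (fromDb != null) return fromDb;       // otherwise keep the old entry
        return "unfetched";                      // URL known only from a link
    }

    // The old DB is just one of three inputs; the output is a whole new DB.
    static Map<String, String> update(Map<String, String> oldDb,
                                      Map<String, String> fetchResults,
                                      List<String> discoveredLinks) {
        Map<String, List<Tagged>> grouped = new TreeMap<>();
        emit(grouped, "crawldb", oldDb);
        emit(grouped, "fetch", fetchResults);
        Map<String, String> links = new HashMap<>();
        for (String url : discoveredLinks) {
            links.put(url, "linked");
        }
        emit(grouped, "link", links);

        Map<String, String> newDb = new TreeMap<>();
        for (Map.Entry<String, List<Tagged>> e : grouped.entrySet()) {
            newDb.put(e.getKey(), reduce(e.getValue()));
        }
        return newDb;
    }

    public static void main(String[] args) {
        Map<String, String> oldDb = new HashMap<>();
        oldDb.put("http://a.example/", "unfetched");
        oldDb.put("http://b.example/", "fetched");

        Map<String, String> fetched = new HashMap<>();
        fetched.put("http://a.example/", "fetched");

        List<String> links = new ArrayList<>();
        links.add("http://b.example/");
        links.add("http://c.example/");

        System.out.println(update(oldDb, fetched, links));
    }
}
```

Note that nothing in the sketch writes to the DB record by record: the
whole old DB goes in alongside the other inputs and a whole new DB comes
out, which is why a pair of reader/writer interfaces does not map onto it
directly.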

To summarize - I think it would be very difficult to do this with the 
current codebase.

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

