Byron Miller wrote:
Is anyone using this right now? Any measure of
performance/overhead when distributing across multiple
systems?

I still have a "BIG BOX" server doing all of the webdb
and it can take 2-3 days to analyze a single iteration
of my webdb - would the noted distributed webdb offer
any gain?

I see a lot of the ndfs switches/command line args
enabled across the board, but the docs don't
reference distributed webdb as being fully
integrated.

First of all, welcome back! It's nice to see that Mozdex is up and running again.


AFAIK, NDFS has been integrated into all tools that deal with segments and WebDB, as an abstraction layer above the real filesystem. So you can use NDFS to distribute the processing of any tool, with the notable exception of tools that use Lucene indexes, because there is no NDFS-aware version of the Lucene Directory (yet).

Regarding the DistributedWebDB... I haven't tried it yet, but from my reading of the code it looks like it will happily use NDFS, too.

--
Best regards,
Andrzej Bialecki
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
