3. Distributed crawling (i.e. give a bunch of links to it and some
distributed compute resources and have it go to town)

If you mean Hadoop-style, no. However, you can start several droids on
different systems.


I'm looking into the same thing (ideally without Hadoop).

I have not had time to look at implementing it, but design-wise, (I *think*) it would just take a distributed Queue implementation. Perhaps this could be done with JMX, or with something akin to Amazon Simple Queue Service. That is, a bunch of workers can read and write "tasks" from a central queue.
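
To make the idea concrete, here is a rough sketch of the "workers sharing a central queue" pattern. It is not Droids code: the TaskQueue interface, the InMemoryTaskQueue class and the crawl() helper are all hypothetical names, and the in-memory BlockingQueue only stands in for whatever networked queue service would actually be used. Workers here are threads in one JVM; in a distributed setup each would be a separate process talking to the same queue.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class CentralQueueSketch {

    /** Hypothetical queue abstraction; a real one would wrap a remote queue service. */
    interface TaskQueue {
        void put(String url) throws InterruptedException;
        /** Returns null if no task arrives within the timeout. */
        String poll(long timeoutMillis) throws InterruptedException;
    }

    /** In-memory stand-in (an assumption for this sketch) for the central queue. */
    static final class InMemoryTaskQueue implements TaskQueue {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        @Override
        public void put(String url) throws InterruptedException {
            queue.put(url);
        }

        @Override
        public String poll(long timeoutMillis) throws InterruptedException {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    /**
     * Starts the given number of workers. Each worker reads a URL from the
     * queue, asks fetchLinks for its outgoing links, and writes unseen links
     * back as new tasks. A worker stops once the queue has been idle for
     * 200 ms, which is a crude termination rule chosen for brevity.
     */
    static Set<String> crawl(List<String> seeds,
                             Function<String, List<String>> fetchLinks,
                             int workers) throws InterruptedException {
        TaskQueue queue = new InMemoryTaskQueue();
        Set<String> seen = ConcurrentHashMap.newKeySet();
        for (String seed : seeds) {
            if (seen.add(seed)) {
                queue.put(seed);
            }
        }

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    String url;
                    while ((url = queue.poll(200)) != null) {
                        for (String link : fetchLinks.apply(url)) {
                            // seen.add is atomic, so each URL is queued once.
                            if (seen.add(link)) {
                                queue.put(link);
                            }
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        // Tiny fake link graph instead of real HTTP fetches.
        Map<String, List<String>> graph = Map.of(
                "a", List.of("b", "c"),
                "b", List.of("d"));
        Set<String> visited = crawl(List.of("a"),
                url -> graph.getOrDefault(url, List.of()), 3);
        System.out.println(new TreeSet<>(visited));
    }
}
```

The "seen" set is shared in memory here; with workers on different machines it would have to live alongside the queue (or the queue would need to de-duplicate), which is probably the harder part of the design.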

... but I have not (yet) had time to investigate whether this is a relatively easy option.

ryan


