dayz...@gmail.com wrote:
Hi,
If I want to run several parsers on a single quad-core machine
simultaneously, would I still need to have Hadoop set up as a
single-node cluster?
I think that the fetcher is currently the only component that can take
advantage of multiple cores when running in "local" mode. We should
perhaps address that at some point: it is not that hard to
parallelize at least some of the processing inside individual tools, so
that single-machine users could benefit from multiple cores.
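
To illustrate the kind of in-tool parallelism meant here, a minimal
sketch in plain Java follows. It is not Nutch code: the class name, the
parse() method and the word-count "parsing" are hypothetical stand-ins
for whatever per-document work a tool does. It only shows independent
documents being spread over a fixed thread pool sized to the number of
cores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelParseSketch {

    // Hypothetical stand-in for per-document parsing: counts whitespace-separated tokens.
    static int parse(String content) {
        return content.trim().split("\\s+").length;
    }

    // Runs parse() for every document on a fixed-size thread pool.
    // Results come back in the same order as the input list.
    static List<Integer> parseAll(List<String> docs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (final String doc : docs) {
                Callable<Integer> task = () -> parse(doc);
                pending.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pending) {
                results.add(f.get()); // blocks until that document is done
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // One worker per available core (four on a quad-core machine).
        int cores = Runtime.getRuntime().availableProcessors();
        List<String> docs = Arrays.asList("one two three", "four five", "six");
        System.out.println(parseAll(docs, cores));
    }
}
```

This only works this simply because each document is processed
independently; steps that update shared state would need more care.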
I am not sure, but I think the only way to do it properly is to run a
jobtracker and a tasktracker on that machine and configure appropriate
block sizes and numbers of map and reduce tasks.
Can several updatedbs be run simultaneously? I believe not, since the
db seems to be locked when it's being updated.
Locking prevents multiple applications from accessing the crawl db
simultaneously (the same applies to the linkdb).
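
As a rough illustration of why a second updatedb would be refused, here
is a sketch of lock-file style exclusion in plain Java. This is an
assumption about the general technique, not the actual Nutch
implementation: the class, the method names and the "db.lock" file name
are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DbLockSketch {

    // Hypothetical lock file name, chosen for this example only.
    static final String LOCK_NAME = "db.lock";

    // Tries to take the lock by atomically creating a marker file in dbDir.
    // Returns false if the marker already exists, i.e. another job holds the lock.
    static boolean tryLock(Path dbDir) throws IOException {
        try {
            Files.createFile(dbDir.resolve(LOCK_NAME));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    // Releases the lock by removing the marker file.
    static void unlock(Path dbDir) throws IOException {
        Files.deleteIfExists(dbDir.resolve(LOCK_NAME));
    }

    public static void main(String[] args) throws IOException {
        Path dbDir = Files.createTempDirectory("crawldb-sketch");
        System.out.println("first job got lock:  " + tryLock(dbDir));
        System.out.println("second job got lock: " + tryLock(dbDir));
        unlock(dbDir);
        Files.delete(dbDir); // clean up the temporary directory
    }
}
```

In this sketch the second job simply gets a refusal; it would have to
wait and retry once the first job has released the lock.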
--
Sami Siren