Gal Nitzan wrote:
Hi,

Just installed Nutch 0.8 with Hadoop from trunk.

I noticed all tasktrackers are participating in the fetch.

I have only one site in the injected seed file.

I have 5 tasktrackers; all except one access the same site.

I am using Nutch 0.8-dev with Hadoop.

Please, any idea?

Hadoop doesn't have any mechanism for coordinating simultaneous access to resources across the cluster (global locking). I described the problem on the hadoop-dev list, no comments yet...

(FYI: if you wonder how it was working before, the trick was to generate just 1 split for the fetch job, which then led to just one task being created for any input fetchlist. This was a hack that apparently stopped working after Hadoop was moved to its own codebase; the proper solution is to implement a global lock manager.)
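
To illustrate the single-split trick described above, here is a minimal sketch in plain Python. It is not Nutch or Hadoop code: `make_splits` and `tasks_per_host` are hypothetical helper names, the URLs are made up, and the only assumption carried over is that one task is created per split.

```python
from collections import Counter
from urllib.parse import urlparse


def make_splits(fetchlist, num_splits):
    """Distribute URLs round-robin over num_splits chunks; drop empty chunks."""
    splits = [[] for _ in range(num_splits)]
    for i, url in enumerate(fetchlist):
        splits[i % num_splits].append(url)
    return [s for s in splits if s]


def tasks_per_host(splits):
    """Count how many tasks touch each host, assuming one task per split."""
    counts = Counter()
    for split in splits:
        # Each task contributes at most once per host it fetches from.
        for host in {urlparse(url).netloc for url in split}:
            counts[host] += 1
    return counts


# Hypothetical fetchlist: ten pages, all on the same site.
fetchlist = [f"http://example.com/page{i}" for i in range(10)]

many = tasks_per_host(make_splits(fetchlist, 5))
one = tasks_per_host(make_splits(fetchlist, 1))

print(many["example.com"])  # 5 tasks hit the same host at once
print(one["example.com"])   # 1 task, so no concurrent access to the host
```

With five splits, five tasks end up fetching from the same host; with one split, only a single task does. That is why the hack worked without any cross-task coordination, and also why it limits the fetch to a single task.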

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

