Gal Nitzan wrote:
I noticed all tasktrackers are participating in the fetch.
I have only one site in the injected seed file.
I have 5 tasktrackers; all except one access the same site.
I just fixed a bug related to this. Please try updating.
The problem was that MapReduce recently started supporting speculative
execution: if some tasks appear to be executing slowly and there are
idle nodes, those tasks are automatically run in parallel on another
node, and the results of whichever copy finishes first are used. That
is not appropriate for fetching, since duplicate fetch tasks would hit
the same sites more than once. So I just added a mechanism to Hadoop
to disable it per-job and then disabled it in the Fetcher.
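For the curious, the per-job switch looks roughly like the sketch
below. This is from memory, not the actual Fetcher code; the property
name mapred.speculative.execution is what Hadoop uses for the switch
(later builds also expose a JobConf.setSpeculativeExecution()
convenience method):

  import org.apache.hadoop.mapred.JobConf;

  public class NoSpeculation {
    public static void main(String[] args) {
      JobConf job = new JobConf();
      job.setJobName("fetch");
      // Tell the jobtracker never to launch a second, speculative
      // attempt of any task in this job, even if one looks slow.
      job.set("mapred.speculative.execution", "false");
      System.out.println("speculative execution = "
          + job.get("mapred.speculative.execution"));
    }
  }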
Note also that the slaves file is now located in the conf/ directory,
as is a new file named hadoop-env.sh. The latter contains all relevant
environment variables, so that we no longer have to rely on ssh's
environment-passing feature.
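To give an idea of what goes in hadoop-env.sh (the values below are
illustrative examples, not defaults to copy):

  # hadoop-env.sh -- sourced on each node by the start-up scripts.
  # The JVM to use; this one is required.
  export JAVA_HOME=/usr/lib/jvm/java
  # Optional: maximum heap, in MB, for the Hadoop daemons.
  export HADOOP_HEAPSIZE=1000
  # Optional: where the daemons write their log files.
  export HADOOP_LOG_DIR=${HADOOP_HOME}/logs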
Doug