Thanks for the prompt reply. I have updated Fetcher.java and hadoop.jar from trunk but I still get the aforementioned behavior.
On Wed, 2006-02-15 at 15:02 -0800, Doug Cutting wrote:
> Gal Nitzan wrote:
> > I noticed all tasktrackers are participating in the fetch.
> >
> > I have only one site in the injected seed file.
> >
> > I have 5 tasktrackers; all except one access the same site.
>
> I just fixed a bug related to this. Please try updating.
>
> The problem was that MapReduce recently started supporting speculative
> execution, where, if some tasks appear to be executing slowly and there
> are idle nodes, then these tasks are automatically run in parallel on
> another node, and the results of the first that finishes are used. But
> this is not appropriate for fetching. So I just added a mechanism to
> Hadoop to disable it and then disabled it in the Fetcher.
>
> Note also that the slaves file is now located in the conf/ directory, as
> is a new file named hadoop-env.sh. This contains all relevant
> environment variables, so that we no longer have to rely on ssh's
> environment passing feature.
>
> Doug
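For anyone hitting similar duplicate-task behavior with other jobs: speculative execution in Hadoop of this era could also be turned off globally through the job configuration. A minimal sketch, assuming the `mapred.speculative.execution` property name and a conf/hadoop-site.xml override file (check your Hadoop version's hadoop-default.xml for the exact property and its default):

```xml
<?xml version="1.0"?>
<!-- conf/hadoop-site.xml: site-specific overrides of hadoop-default.xml -->
<configuration>
  <property>
    <!-- Assumed property name; disables running backup copies of slow tasks -->
    <name>mapred.speculative.execution</name>
    <value>false</value>
  </property>
</configuration>
```

The Fetcher fix Doug describes does the equivalent per-job in code, which is preferable for fetching since other MapReduce jobs (e.g. indexing) can still benefit from speculation.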
