Thanks for the prompt reply. I have updated Fetcher.java and hadoop.jar from trunk but I still get the aforementioned behavior.
On Wed, 2006-02-15 at 15:02 -0800, Doug Cutting wrote:
> Gal Nitzan wrote:
> > I noticed all tasktrackers are participating in the fetch.
> >
> > I have only one site in the injected seed file.
> >
> > I have 5 tasktrackers; all except one access the same site.
>
> I just fixed a bug related to this. Please try updating.
>
> The problem was that MapReduce recently started supporting speculative
> execution, where, if some tasks appear to be executing slowly and there
> are idle nodes, then these tasks are automatically run in parallel on
> another node, and the results of the first that finishes are used. But
> this is not appropriate for fetching. So I just added a mechanism to
> Hadoop to disable it and then disabled it in the Fetcher.
>
> Note also that the slaves file is now located in the conf/ directory, as
> is a new file named hadoop-env.sh. This contains all relevant
> environment variables, so that we no longer have to rely on ssh's
> environment passing feature.
>
> Doug
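For anyone hitting similar duplicate-task behavior with other jobs: speculative execution in Hadoop of this era could also be turned off globally through the job configuration. A minimal sketch, assuming the `mapred.speculative.execution` property name and a conf/hadoop-site.xml override file (check your Hadoop version's hadoop-default.xml for the exact property and its default):

```xml
<?xml version="1.0"?>
<!-- conf/hadoop-site.xml: site-specific overrides of hadoop-default.xml -->
<configuration>
  <property>
    <!-- Assumed property name; disables running backup copies of slow tasks -->
    <name>mapred.speculative.execution</name>
    <value>false</value>
  </property>
</configuration>
```

The Fetcher fix Doug describes does the equivalent per-job in code, which is preferable for fetching since other MapReduce jobs (e.g. indexing) can still benefit from speculation.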
