Re: Map-Reduce Slow Down

Aaron Kimball Tue, 14 Apr 2009 10:53:52 -0700

Are there any error messages in the log files on those nodes?
- Aaron
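
A minimal sketch of such a check, in Python: it scans a set of log files for error-level lines. It is not part of any Hadoop tooling. The helper name `find_error_lines`, the `logs` directory, and the choice of keywords (ERROR, FATAL, Exception) are all assumptions to adapt to your install.

```python
import re
from pathlib import Path

# Assumed markers of trouble in a log line; adjust as needed.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL)\b|Exception")


def find_error_lines(log_paths):
    """Return (path, line_number, text) for every suspicious log line."""
    hits = []
    for path in map(Path, log_paths):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if ERROR_PATTERN.search(line):
                    hits.append((str(path), number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical location; point this at the logs directory on each slave.
    for hit in find_error_lines(sorted(Path("logs").glob("*.log"))):
        print(*hit, sep=": ")
```

Running it on each slave and comparing the output with the master's logs should show quickly whether the DataNode or TaskTracker is failing to reach the master.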

On Tue, Apr 14, 2009 at 9:03 AM, Mithila Nagendra <mnage...@asu.edu> wrote:


> I've drawn a blank here! I can't figure out what's wrong with the ports. I can
> ssh between the nodes but can't access the DFS from the slaves; it says "Bad
> connection to DFS". The master seems to be fine.
> Mithila
>
> On Tue, Apr 14, 2009 at 4:28 AM, Mithila Nagendra <mnage...@asu.edu>
> wrote:
>
> > Yes, I can.
> >
> >
> > On Mon, Apr 13, 2009 at 5:12 PM, Jim Twensky <jim.twen...@gmail.com> wrote:
> >
> >> Can you ssh between the nodes?
> >>
> >> -jim
> >>
> >> On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra <mnage...@asu.edu>
> >> wrote:
> >>
> >> > Thanks Aaron.
> >> > Jim: The three clusters I set up had Ubuntu running on them, and the DFS
> >> > was accessed at port 54310. The new cluster which I've set up has Red Hat
> >> > Linux release 7.2 (Enigma) running on it. Now when I try to access the DFS
> >> > from one of the slaves, I get the following response: dfs cannot be
> >> > accessed. When I access the DFS through the master there's no problem. So
> >> > I feel there's a problem with the port. Any ideas? I did check the list of
> >> > slaves; it looks fine to me.
> >> >
> >> > Mithila
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Apr 13, 2009 at 2:58 PM, Jim Twensky <jim.twen...@gmail.com>
> >> > wrote:
> >> >
> >> > > Mithila,
> >> > >
> >> > > You said all the slaves were being utilized in the 3 node cluster. Which
> >> > > application did you run to test that and what was your input size? If
> >> > > you tried the word count application on a 516 MB input file on both
> >> > > cluster setups, then some of your nodes in the 15 node cluster may not
> >> > > be running at all. Generally, one map task is assigned to each input
> >> > > split, and if you are running your cluster with the defaults, the splits
> >> > > are 64 MB each. I got confused when you said the Namenode seemed to do
> >> > > all the work. Can you check conf/slaves and make sure you put the names
> >> > > of all task trackers there? I also suggest comparing both clusters with
> >> > > a larger input size, say at least 5 GB, to really see a difference.
> >> > >
> >> > > Jim
> >> > >
> >> > > On Mon, Apr 13, 2009 at 4:17 PM, Aaron Kimball <aa...@cloudera.com>
> >> > > wrote:
> >> > >
> >> > > > In hadoop-*-examples.jar, use "randomwriter" to generate the data and
> >> > > > "sort" to sort it.
> >> > > > - Aaron
> >> > > >
> >> > > > On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi <forpan...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Your data is too small, I guess, for 15 nodes. So the overhead of
> >> > > > > the extra nodes might be making your MR jobs take more time overall.
> >> > > > > I guess you will have to try with a larger set of data.
> >> > > > >
> >> > > > > Pankil
> >> > > > > On Sun, Apr 12, 2009 at 6:54 PM, Mithila Nagendra <mnage...@asu.edu>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Aaron
> >> > > > > >
> >> > > > > > That could be the issue; my data is just 516 MB. Wouldn't this see a
> >> > > > > > bit of speedup?
> >> > > > > > Could you guide me to the example? I'll run my cluster on it and see
> >> > > > > > what I get. Also, for my program I had a Java timer running to record
> >> > > > > > the time taken to complete execution. Does Hadoop have an inbuilt
> >> > > > > > timer?
> >> > > > > >
> >> > > > > > Mithila
> >> > > > > >
> >> > > > > > On Mon, Apr 13, 2009 at 1:13 AM, Aaron Kimball <aa...@cloudera.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Virtually none of the examples that ship with Hadoop are designed
> >> > > > > > > to showcase its speed. Hadoop's speedup comes from its ability to
> >> > > > > > > process very large volumes of data (starting around, say, tens of
> >> > > > > > > GB per job, and going up in orders of magnitude from there). So if
> >> > > > > > > you are timing the pi calculator (or something like that), its
> >> > > > > > > results won't necessarily be very consistent. If a job doesn't
> >> > > > > > > have enough fragments of data to allocate one to each node, some
> >> > > > > > > of the nodes will also just go unused.
> >> > > > > > >
> >> > > > > > > The best example for you to run is to use randomwriter to fill up
> >> > > > > > > your cluster with several GB of random data and then run the sort
> >> > > > > > > program. If that doesn't scale up performance from 3 nodes to 15,
> >> > > > > > > then you've definitely got something strange going on.
> >> > > > > > >
> >> > > > > > > - Aaron
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Sun, Apr 12, 2009 at 8:39 AM, Mithila Nagendra <mnage...@asu.edu>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hey all
> >> > > > > > > > I recently set up a three node Hadoop cluster and ran an example
> >> > > > > > > > on it. It was pretty fast, and all three nodes were being used
> >> > > > > > > > (I checked the log files to make sure that the slaves are
> >> > > > > > > > utilized).
> >> > > > > > > >
> >> > > > > > > > Now I've set up another cluster consisting of 15 nodes. I ran the
> >> > > > > > > > same example, but instead of speeding up, the map-reduce task
> >> > > > > > > > seems to take forever! The slaves are not being used for some
> >> > > > > > > > reason. This second cluster has lower per-node processing power,
> >> > > > > > > > but should that make any difference?
> >> > > > > > > > How can I ensure that the data is being mapped to all the nodes?
> >> > > > > > > > Presently, the only node that seems to be doing all the work is
> >> > > > > > > > the master node.
> >> > > > > > > >
> >> > > > > > > > Do 15 nodes in a cluster increase the network cost? What can I do
> >> > > > > > > > to set up the cluster to function more efficiently?
> >> > > > > > > >
> >> > > > > > > > Thanks!
> >> > > > > > > > Mithila Nagendra
> >> > > > > > > > Arizona State University
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
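
The split arithmetic discussed in the thread can be worked through in a few lines. This is an illustrative Python sketch, not Hadoop code. It assumes one map task per input split and the 64 MB default split size mentioned in the thread; the function names `estimate_map_tasks` and `max_busy_nodes` are made up for this example. Hadoop's real split computation has extra rules, so the exact count may differ slightly, and the scheduler decides where tasks actually run.

```python
import math


def estimate_map_tasks(input_mb, split_mb=64):
    """Estimate the number of map tasks, assuming one task per input split."""
    return math.ceil(input_mb / split_mb)


def max_busy_nodes(input_mb, nodes, split_mb=64):
    """Upper bound on the nodes that can each be running a map task."""
    return min(nodes, estimate_map_tasks(input_mb, split_mb))


if __name__ == "__main__":
    # 516 MB is the input size from the thread; 5 GB is the suggested test size.
    for size_mb in (516, 5 * 1024):
        tasks = estimate_map_tasks(size_mb)
        print(f"{size_mb} MB -> {tasks} map tasks, "
              f"at most {max_busy_nodes(size_mb, 15)} of 15 nodes busy")
```

Under this estimate a 516 MB input gives about 9 map tasks, so several of the 15 nodes would receive no map work at all, while a 5 GB input gives 80 tasks and can keep every node busy.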

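For the "Bad connection to DFS" symptom, a quick first check is whether a slave can open a TCP connection to the NameNode port at all. The following Python sketch is not from the thread. `can_connect` is a hypothetical helper, and the host name used in the usage note is a placeholder; 54310 is the port mentioned in the thread. A successful connection only shows that the port is reachable (no firewall block, NameNode listening on that address), not that HDFS itself is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from a slave, `can_connect("master", 54310)` (with `master` replaced by the real NameNode host name) returning False would point to a firewall rule or to the NameNode being bound to a different address, rather than to a Hadoop configuration problem on the slave.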