You can get the script from the Hadoop codebase at http://svn.apache.org/viewcvs.cgi/hadoop/common/trunk/
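For readers without the source tree handy: bin/start-balancer.sh in the Hadoop 1.x line is only a thin wrapper around hadoop-daemon.sh. A sketch follows; paths assume a stock tarball layout, and exact contents vary by release:

```shell
#!/usr/bin/env bash
# Sketch of bin/start-balancer.sh (Hadoop 1.x-era): resolve the script's
# own directory, source the common config, and launch the balancer as a
# daemon. Extra arguments (e.g. -threshold 10) are passed straight through.
bin=$(dirname "$0")
bin=$(cd "$bin" && pwd)

. "$bin"/hadoop-config.sh

"$bin"/hadoop-daemon.sh --config "$HADOOP_CONF_DIR" start balancer "$@"
```

Which is why running `hadoop balancer -threshold 10` in the foreground, as done below, achieves the same result, minus the daemonization.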
On Fri, Nov 2, 2012 at 12:41 AM, Kartashov, Andy <[email protected]> wrote:

> People,
>
> While I did not find the start-balancer.sh script on my machine, I
> successfully utilized the following command:
>
>     $ hadoop balancer -threshold 10
>
> and achieved the exact same result.
>
> One issue remains: controlling the start/stop of the slaves' daemons
> through the master. Somehow I don't have the start-dfs.sh/stop-dfs.sh or
> start-all.sh scripts on my machine either. For now, I am starting the DFS
> and MapReduce daemons on each slave manually and individually.
>
> Can someone post the content of the start-all.sh script so I could
> utilize it for my environment?
>
> Thanks,
> AK47
>
> -----Original Message-----
> From: Kartashov, Andy
> Sent: Friday, October 26, 2012 3:56 PM
> To: [email protected]
> Subject: RE: cluster set-up / a few quick questions - SOLVED
>
> Hadoopers,
>
> The problem was in EC2 security. While I could passwordlessly ssh into
> another node and back, I could not telnet to it due to the EC2 firewall.
> I needed to open ports for the NN and JT. :)
>
> Now I can see 2 DNs when running "hadoop fsck" and can also -ls into the
> NN from the slave. Sweet!!!
>
> Is it possible to balance data over the DNs without copying it with the
> "hadoop fs -put" command? I read about bin/start-balancer.sh somewhere
> but cannot find it in my current Hadoop installation.
> Besides, is balancing data over the DNs going to improve the performance
> of an MR job?
>
> Cheers,
> Happy Hadooping.
>
> -----Original Message-----
> From: Nitin Pawar [mailto:[email protected]]
> Sent: Friday, October 26, 2012 3:18 PM
> To: [email protected]
> Subject: Re: cluster set-up / a few quick questions
>
> Questions:
>
> 1) Have you set up passwordless ssh between both hosts for the user who
> owns the Hadoop processes (or root)?
> 2) If the answer to question 1 is yes, how did you start the NN, JT, DN
> and TT?
> 3) If you started them one by one, there is no reason running a command
> on one node will execute it on the other.
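The two fixes discussed above, passwordless ssh and the EC2 firewall, can be sketched as follows. foo1/foo2 are the thread's hostnames; the user name and ports are assumptions (8020 and 8021 are common NN/JT RPC defaults, but check fs.default.name and mapred.job.tracker in your own conf):

```shell
# 1) Passwordless ssh: generate a key pair on the master (skip if
#    ~/.ssh/id_rsa already exists) and install the public key on each slave.
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id hadoop@foo2        # 'hadoop' is a placeholder user name

# 2) Firewall: from a slave, verify the NameNode and JobTracker ports on
#    the master are actually reachable, not just ssh.
nc -zv foo1 8020   # NameNode RPC (assumed port)
nc -zv foo1 8021   # JobTracker RPC (assumed port)
```

If the nc checks fail on EC2, open those ports in the instance's security group via the AWS console; passwordless ssh alone is not enough for the daemons to talk to each other.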
> On Sat, Oct 27, 2012 at 12:17 AM, Kartashov, Andy <[email protected]> wrote:
> > Andy, many thanks.
> >
> > I am stuck here now, so please point me in the right direction.
> >
> > I successfully ran a job on a cluster on foo1 in pseudo-distributed
> > mode and am now trying a fully-distributed one.
> >
> > a. I created another instance, foo2, on EC2, installed Hadoop on it
> > and copied the conf/ folder from foo1 to foo2. I created the
> > /hadoop/dfs/data folder on the local Linux filesystem on foo2.
> >
> > b. On foo1 I created the file conf/slaves and added:
> >
> >     localhost
> >     <hostname-of-foo2>
> >
> > At this point I cannot find an answer on what to do next.
> >
> > I started the NN, DN, SNN, JT and TT on foo1. After I ran "hadoop fsck
> > /user/bar -files -blocks -locations", it showed the number of
> > datanodes as 1. I was expecting the DN and TT on foo2 to be started by
> > foo1, but that didn't happen, so I started them myself and tried the
> > command again. Still one DN.
> > I realise that foo2 has no data at this point, but I could not find
> > the bin/start-balancer.sh script to help me balance data from foo1
> > over to foo2.
> >
> > What do I do next?
> >
> > Thanks,
> > AK
> >
> > -----Original Message-----
> > From: Andy Isaacson [mailto:[email protected]]
> > Sent: Friday, October 26, 2012 2:21 PM
> > To: [email protected]
> > Subject: Re: cluster set-up / a few quick questions
> >
> > On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <[email protected]> wrote:
> >> Gents,
> >
> > We're not all male here. :) I prefer "Hadoopers" or "hi all,".
> >
> >> 1.
> >> - Do you put the master node's <hostname> under fs.default.name in
> >> core-site.xml on the slave machines, or the slaves' hostnames?
> >
> > Master. I have a 4-node cluster, named foo1 - foo4. My fs.default.name
> > is hdfs://foo1.domain.com.
> >
> >> - Do you need to run "sudo -u hdfs hadoop namenode -format" and
> >> create the /tmp and /var folders on the HDFS of the slave machines
> >> that will be running only a DN and TT, or not?
> >> Do you still need to create the hadoop/dfs/name folder on the slaves?
> >
> > (The following is the simple answer, for non-HA, non-federated HDFS.
> > You'll want to get the simple example working before trying the
> > complicated ones.)
> >
> > No. A cluster has one namenode, running on the machine known as the
> > master, and the admin must run "hadoop namenode -format" on that
> > machine only.
> >
> > In my example, I ran "hadoop namenode -format" on foo1.
> >
> >> 2.
> >> In hdfs-site.xml, for the dfs.name.dir & dfs.data.dir properties we
> >> specify /hadoop/dfs/name and /hadoop/dfs/data, these being local
> >> Linux directories created by running "mkdir -p /hadoop/dfs/data",
> >> but the mapred.system.dir property is to point to HDFS and not the
> >> local filesystem, since we are running "sudo -u hdfs hadoop fs -mkdir
> >> /tmp/mapred/system"??
> >> If so, and since it is exactly the same format (/far/boo/baz), how
> >> does Hadoop know which directory is local and which is on HDFS?
> >
> > This is very confusing, to be sure! There are a few places where paths
> > are implicitly known to be on HDFS rather than being Linux filesystem
> > paths. mapred.system.dir is one of those. This does mean that, given a
> > string that starts with "/tmp/", you can't necessarily know whether
> > it's a Linux path or an HDFS path without looking at the larger
> > context.
> >
> > In the case of mapred.system.dir, the docs are the place to check;
> > according to cluster_setup.html, mapred.system.dir is the "Path on the
> > HDFS where the Map/Reduce framework stores system files".
> >
> > http://hadoop.apache.org/docs/r1.0.3/cluster_setup.html
> >
> > Hope this helps,
> > -andy
> >
> > NOTICE: This e-mail message and any attachments are confidential,
> > subject to copyright and may be privileged. Any unauthorized use,
> > copying or disclosure is prohibited. If you are not the intended
> > recipient, please delete and contact the sender immediately. Please
> > consider the environment before printing this e-mail.
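To make the local-vs-HDFS distinction above concrete, here is a minimal config sketch consistent with the examples in this thread; the hostnames, the port, and the paths are illustrative assumptions, not prescriptive values:

```xml
<!-- core-site.xml: identical on master and slaves; names the master's NN -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://foo1.domain.com:8020</value> <!-- 8020 is an assumed port -->
</property>

<!-- hdfs-site.xml: local Linux paths, created with "mkdir -p" -->
<property>
  <name>dfs.name.dir</name>
  <value>/hadoop/dfs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/hadoop/dfs/data</value>
</property>

<!-- mapred-site.xml: an HDFS path, created with "hadoop fs -mkdir" -->
<property>
  <name>mapred.system.dir</name>
  <value>/tmp/mapred/system</value>
</property>
```

In other words, it is the property, not the path's spelling, that decides: dfs.name.dir and dfs.data.dir are always interpreted as local paths, while mapred.system.dir is resolved against the filesystem named by fs.default.name.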
> --
> Nitin Pawar

-- 
Nitin Pawar
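Since the thread never actually posts it, here is a sketch of bin/start-all.sh as shipped in the Hadoop 1.x line; exact contents vary slightly by release, and some packaged installs omit these wrapper scripts entirely:

```shell
#!/usr/bin/env bash
# Sketch of bin/start-all.sh: it simply chains the HDFS and MapReduce
# start scripts. Each of those ssh-es into every host listed in
# conf/slaves and runs hadoop-daemon.sh there -- which is why passwordless
# ssh from the master is a prerequisite.
bin=$(dirname "$0")
bin=$(cd "$bin" && pwd)

. "$bin"/hadoop-config.sh

# NN locally, SNN on the hosts in conf/masters, DNs on conf/slaves.
"$bin"/start-dfs.sh --config "$HADOOP_CONF_DIR"

# JT locally, TTs on conf/slaves.
"$bin"/start-mapred.sh --config "$HADOOP_CONF_DIR"
```

If the wrappers are missing, the manual per-slave equivalent is running `hadoop-daemon.sh start datanode` and `hadoop-daemon.sh start tasktracker` on each slave.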
