A few questions:
1) Have you set up passwordless ssh between both hosts for the user who owns the Hadoop processes (or root)?
2) If the answer to question 1 is yes, how did you start the NN, JT, DN and TT?
3) If you started them one by one, there is no reason that running a command on one node would execute it on the other.
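For what it's worth, here is a minimal sketch of that setup (this assumes a Hadoop 1.x tarball-style install with the start scripts under $HADOOP_HOME/bin, and <hostname-of-foo2> standing in for the slave as in your mail; package-based installs start the daemons through service scripts instead):

  # On the master (foo1), as the user that owns the Hadoop processes:
  ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa    # key pair with an empty passphrase
  ssh-copy-id <hostname-of-foo2>              # push the public key to the slave
  ssh <hostname-of-foo2> hostname             # should print foo2's hostname without prompting

  # With foo2 listed in conf/slaves, the start scripts ssh out to every slave,
  # so one command on the master brings up the remote daemons too:
  $HADOOP_HOME/bin/start-dfs.sh       # NN here, a DN on every host in conf/slaves
  $HADOOP_HOME/bin/start-mapred.sh    # JT here, a TT on every host in conf/slaves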
On Sat, Oct 27, 2012 at 12:17 AM, Kartashov, Andy <[email protected]> wrote:
> Andy, many thanks.
>
> I am stuck here now, so please point me in the right direction.
>
> I successfully ran a job on a cluster on foo1 in pseudo-distributed mode and
> am now trying a fully-distributed one.
>
> a. I created another instance, foo2, on EC2, installed Hadoop on it and copied
> the conf/ folder from foo1 to foo2. I created the /hadoop/dfs/data folder on the
> local Linux filesystem on foo2.
>
> b. On foo1 I created the conf/slaves file and added:
> localhost
> <hostname-of-foo2>
>
> At this point I cannot find an answer as to what to do next.
>
> I started the NN, DN, SNN, JT and TT on foo1. After I ran "hadoop fsck /user/bar
> -files -blocks -locations", it showed the # of datanodes as 1. I was expecting the DN
> and TT on foo2 to be started by foo1. Since that didn't happen, I started them
> myself and tried the command again. Still one DN.
> I realise that foo2 has no data at this point, but I could not find the
> bin/start-balancer.sh script to help me balance data over from foo1 to foo2.
>
> What do I do next?
>
> Thanks
> AK
>
> -----Original Message-----
> From: Andy Isaacson [mailto:[email protected]]
> Sent: Friday, October 26, 2012 2:21 PM
> To: [email protected]
> Subject: Re: cluster set-up / a few quick questions
>
> On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <[email protected]>
> wrote:
>> Gents,
>
> We're not all male here. :) I prefer "Hadoopers" or "hi all,".
>
>> 1.
>> - do you put the master node's <hostname> under fs.default.name in core-site.xml
>> on the slave machines, or the slaves' hostnames?
>
> Master. I have a 4-node cluster, named foo1 - foo4. My fs.default.name is
> hdfs://foo1.domain.com.
>
>> - do you need to run "sudo -u hdfs hadoop namenode -format" and create the /tmp
>> and /var folders on the HDFS of the slave machines that will be running only DN
>> and TT, or not? Do you still need to create the hadoop/dfs/name folder on the
>> slaves?
>
> (The following is the simple answer, for non-HA non-federated HDFS.
> You'll want to get the simple example working before trying the complicated
> ones.)
>
> No. A cluster has one namenode, running on the machine known as the master,
> and the admin must run "hadoop namenode -format" on that machine only.
>
> In my example, I ran "hadoop namenode -format" on foo1.
>
>> 2.
>> In hdfs-site.xml, for the dfs.name.dir & dfs.data.dir properties we specify
>> /hadoop/dfs/name and /hadoop/dfs/data, which are local Linux directories created
>> by running "mkdir -p /hadoop/dfs/data",
>> but the mapred.system.dir property is to point to HDFS and not the local
>> filesystem, since we are running "sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system"??
>> If so, and since it is exactly the same format /far/boo/baz, how does Hadoop
>> know which directory is local and which is on HDFS?
>
> This is very confusing, to be sure! There are a few places where paths are
> implicitly known to be on HDFS rather than a Linux filesystem path.
> mapred.system.dir is one of those. This does mean that given a string that
> starts with "/tmp/" you can't necessarily know whether it's a Linux path or an
> HDFS path without looking at the larger context.
>
> In the case of mapred.system.dir, the docs are the place to check; according
> to cluster_setup.html, mapred.system.dir is "Path on the HDFS where where the
> Map/Reduce framework stores system files".
>
> http://hadoop.apache.org/docs/r1.0.3/cluster_setup.html
>
> Hope this helps,
> -andy

--
Nitin Pawar
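For reference, a quick way to verify that the second datanode has joined, and where the balancer lives — a minimal sketch assuming the same Hadoop 1.x tarball layout as above (scripts under $HADOOP_HOME/bin; package installs may put them elsewhere):

  # Run on the master (foo1) once the DN and TT on foo2 are up:
  hadoop dfsadmin -report                            # should now list two datanodes
  hadoop fsck /user/bar -files -blocks -locations    # shows where each block of /user/bar lives

  # The balancer ships alongside the other start scripts in a tarball install:
  $HADOOP_HOME/bin/start-balancer.sh -threshold 10   # rebalance until nodes are within 10% of mean usage

Note that the balancer only moves existing blocks; anything written after foo2's datanode registers will be spread across both nodes anyway.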
