Try restarting the daemons. Are they running? Are there errors in the log files in /tmp?

On Wed, Nov 2, 2011 at 5:34 PM, Paolo Castagna <[email protected]> wrote:

> Hi Andrei,
> this cluster is still running, I am running a distcp job to copy my
> data from S3 to HDFS.
>
> The NameNode (via the Web UI) is still reporting:
>
> Live Nodes : 8
> Dead Nodes : 0
> Decommissioning Nodes : 0
>
> I do not see errors in the logs.
>
> I can try to connect to one of the machines which did not join the cluster,
> but I am not sure what to do to make it join the cluster once I am connected
> to it.
>
> Paolo
>
> On 2 November 2011 15:29, Andrei Savu <[email protected]> wrote:
> > Are you seeing any errors in the logs? Can you check one of the machines
> > that failed to join the cluster?
> > Are you sure they've tried to join the rest of the cluster? Maybe you have
> > to wait a bit more.
> >
> > -- Andrei Savu
> >
> > On Wed, Nov 2, 2011 at 5:25 PM, Paolo Castagna
> > <[email protected]> wrote:
> >>
> >> Hi
> >>
> >> On 2 November 2011 14:59, Paolo Castagna <[email protected]>
> >> wrote:
> >> > Hi Andrei,
> >> > I've just tried again, the only difference in the recipe:
> >> > whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,16 hadoop-datanode+hadoop-tasktracker
> >> > I saw the same exception, but now I can connect to the web UIs as usual.
> >>
> >> Well, I spoke too soon.
> >>
> >> The very same cluster had 17 instances, I can see all of them running
> >> via the Amazon console (i.e. I am paying for them), however the
> >> NameNode and the JobTracker see only 8 nodes. :-(
> >>
> >> Paolo
