Got it resolved. Actually not a Hadoop problem but an HBase client-side config problem (didn't suspect it... sorry!)
(HBase client config is slightly mysterious sometimes...)

Thanks,
Henning

On Fri, 2010-11-05 at 13:56 +0100, Henning Blohm wrote:
> Hi Michael,
>
> Just tried. The client is now also in /etc/hosts on all nodes.
> No change, unfortunately.
>
> But I just noted that my netstat-derived assumption is actually wrong,
> and the connectivity problem must indeed be elsewhere. Will keep
> trying. Sorry for the misleading info.
>
> Thanks,
> Henning
>
> On Friday, 2010-11-05 at 07:13 -0500, Michael Segel wrote:
>
> > Well...
> >
> > 0.0.0.0 means that it's listening on all networks; in your case, eth0
> > and 127.0.0.1.
> >
> > I'd try adding your client to the /etc/hosts on the machines.
> >
> > > Subject: RE: namenode and jobtracker remote access problem
> > > From: [email protected]
> > > To: [email protected]
> > > Date: Fri, 5 Nov 2010 13:04:19 +0100
> > >
> > > Hi Mike,
> > >
> > > 1) Yes. My client can ssh into any of the nodes.
> > > 2) No, unfortunately not (hosted machines, no domain yet, just IP
> > > addresses). My client is not in /etc/hosts on any of the nodes. Why?
> > > Would they do reverse lookups?
> > > 3) Looking at ifconfig's output, there are only eth0 and lo. So I
> > > assume that is a yes to your question.
> > >
> > > My wild guess is that the namenode (and jobtracker) code by default
> > > tries to resolve the host name specified in fs.default.name and
> > > mapred.job.tracker, respectively, and uses the resulting IP to open
> > > the server socket (or channel), rather than 0.0.0.0.
> > >
> > > But if that were the case, many, really many people would have the
> > > same problem...
> > >
> > > Thanks,
> > > Henning
> > >
> > > On Friday, 2010-11-05 at 06:55 -0500, Michael Segel wrote:
> > >
> > > > Hi,
> > > >
> > > > First things to check...
> > > >
> > > > 1) Can you ping the machines from an external client machine?
> > > > 2) /etc/hosts? No centralized DNS server? Is your client also in
> > > > your /etc/hosts?
> > > > 3) Do you have only one active NIC?
> > > >
> > > > And of course I'm assuming that when you say you have the cloud
> > > > up, you can launch jobs on the namenode and they run on all of
> > > > the nodes?
> > > >
> > > > -Mike
> > > >
> > > > > Subject: namenode and jobtracker remote access problem
> > > > > From: [email protected]
> > > > > To: [email protected]
> > > > > Date: Fri, 5 Nov 2010 12:23:30 +0100
> > > > >
> > > > > Hi,
> > > > >
> > > > > I have problems making the namenode and jobtracker remotely
> > > > > accessible.
> > > > >
> > > > > It seems several people have had this problem before, but I was
> > > > > unfortunately not able to find a solution yet.
> > > > >
> > > > > I have a hadoop 0.20.6 cluster set up. All nodes have static IP
> > > > > addresses, all wired up via short names (data0, data1, data2,
> > > > > master) in /etc/hosts.
> > > > >
> > > > > The master node hosts the namenode as well as the jobtracker.
> > > > > Both listen only for connections from the master node and will
> > > > > not accept remote connections:
> > > > >
> > > > > > netstat -nltp
> > > > >
> > > > > Proto Recv-Q Send-Q Local Address    Foreign Address  State   PID/Program name
> > > > > tcp        0      0 127.0.0.1:3306   0.0.0.0:*        LISTEN  -
> > > > > tcp        0      0 0.0.0.0:10000    0.0.0.0:*        LISTEN  -
> > > > > tcp        0      0 0.0.0.0:22       0.0.0.0:*        LISTEN  -
> > > > > tcp6       0      0 a.b.c.d:60000    :::*             LISTEN  19800/java
> > > > > tcp6       0      0 :::52038         :::*             LISTEN  19235/java
> > > > > tcp6       0      0 a.b.c.d:9000     :::*             LISTEN  19235/java
> > > > > tcp6       0      0 a.b.c.d:9001     :::*             LISTEN  19507/java
> > > > > tcp6       0      0 :::60010         :::*             LISTEN  19800/java
> > > > > tcp6       0      0 :::50090         :::*             LISTEN  19409/java
> > > > > tcp6       0      0 :::56429         :::*             LISTEN  19507/java
> > > > > tcp6       0      0 :::2222          :::*             LISTEN  19717/java
> > > > > tcp6       0      0 :::50030         :::*             LISTEN  19507/java
> > > > > tcp6       0      0 :::38126         :::*             LISTEN  19409/java
> > > > > tcp6       0      0 :::80            :::*             LISTEN  -
> > > > > tcp6       0      0 :::21            :::*             LISTEN  -
> > > > > tcp6       0      0 :::50070         :::*             LISTEN  19235/java
> > > > > tcp6       0      0 :::22            :::*             LISTEN  -
> > > > >
> > > > > (I changed the real IP address to a.b.c.d.)
> > > > >
> > > > > My hadoop/conf/core-site.xml looks like this:
> > > > >
> > > > > <?xml version="1.0"?>
> > > > > <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> > > > > <!-- Put site-specific property overrides in this file. -->
> > > > > <configuration>
> > > > >   <property>
> > > > >     <name>fs.default.name</name>
> > > > >     <value>hdfs://master:9000</value>
> > > > >   </property>
> > > > >   <property>
> > > > >     <name>hadoop.tmp.dir</name>
> > > > >     <value>/home/hadoop/data</value>
> > > > >   </property>
> > > > > </configuration>
> > > > >
> > > > > and hadoop/conf/mapred-site.xml like this:
> > > > >
> > > > > <?xml version="1.0"?>
> > > > > <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> > > > > <!-- Put site-specific property overrides in this file. -->
> > > > > <configuration>
> > > > >   <property>
> > > > >     <name>mapred.job.tracker</name>
> > > > >     <value>master:9001</value>
> > > > >   </property>
> > > > > </configuration>
> > > > >
> > > > > Using IP addresses rather than host names in core-site.xml or
> > > > > hdfs-site.xml didn't change anything (contrary to what other
> > > > > mailing list submissions suggest).
> > > > >
> > > > > Otherwise, the cluster starts up fine, all processes are
> > > > > running, the web interfaces are reachable and report nothing
> > > > > unusual.
> > > > >
> > > > > Any idea? I am blocked :-(
> > > > >
> > > > > Thanks,
> > > > > Henning
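Henning's guess earlier in the thread matches how Hadoop of that era typically behaves: the namenode binds its RPC socket to whatever address the host named in fs.default.name resolves to, not to 0.0.0.0. A minimal sketch of how one might check this on the master node (the hostname "master" and port 9000 are the ones from this thread's configuration; the fallback messages are added so the commands don't abort when run elsewhere):

```shell
# Check which address "master" resolves to on this box. If /etc/hosts
# maps it to a loopback address, the namenode's socket on port 9000
# would be unreachable from remote clients.
getent hosts master || echo "master does not resolve here"

# Compare against what is actually listening on the RPC port:
netstat -nlt 2>/dev/null | grep ':9000' || echo "nothing listening on 9000"
```

In this thread's netstat output the namenode does bind a.b.c.d:9000 (a concrete address, not 0.0.0.0), which is consistent with that resolution-based binding.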

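Since the fix reported at the top of the thread was on the HBase client side, and the actual change is not shown, a sketch of what a client-side hbase-site.xml of that era commonly needs may be useful. The value below is an assumption for illustration, not the fix from the thread: the client mainly has to be able to find the ZooKeeper quorum.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Illustrative only: "master" stands in for the actual
       ZooKeeper quorum host(s) of the cluster. -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master</value>
  </property>
</configuration>
```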