OK, not sure what I did (restarting the firewall, perhaps?), but I now have ports 8020 and 8021 listening and no more errors in my logs. Wooo! Only problem is I still can't get any hadoop stuff to work from a remote client:

hadoop fs -ls /
2012-01-09 17:53:53.559 java[13396:1903] Unable to load realm info from SCDynamicStore
12/01/09 17:53:55 INFO ipc.Client: Retrying connect to server: *my_server/my_ip*:8020. Already tried 0 time(s).
12/01/09 17:53:56 INFO ipc.Client: Retrying connect to server: *my_server/my_ip*:8020. Already tried 1 time(s).
12/01/09 17:53:57 INFO ipc.Client: Retrying connect to server: *my_server/my_ip*:8020. Already tried 2 time(s).
...

I feel like I'm almost there. Might this be because core-site.xml and mapred-site.xml specify localhost for ports 8020 and 8021, so the daemons are not listening for any outside connections?
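
One way to test that theory is to read the configured addresses and flag any that point at a loopback name. This is only a minimal sketch: the helper names, the loopback list, and the inline sample XML are assumptions for illustration, and it only inspects the config text, not what the daemons actually bind to.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Names treated as loopback here (an assumption, extend as needed).
LOOPBACK_HOSTS = {"localhost", "localhost.localdomain", "127.0.0.1", "::1"}

def property_value(xml_text, name):
    """Return the <value> of the named <property> in a *-site.xml, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None

def host_of(value):
    """Extract the host from 'hdfs://host:port' or from 'host:port'."""
    if "://" in value:
        return urlparse(value).hostname
    return value.rsplit(":", 1)[0]

def is_loopback_only(value):
    """True when the configured address names a loopback host."""
    return host_of(value) in LOOPBACK_HOSTS

# Sample config text; in practice read core-site.xml from the conf directory.
core_site = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>"""

value = property_value(core_site, "fs.default.name")
print(value, "-> loopback only:", is_loopback_only(value))
```

The same `is_loopback_only` check works on the `localhost:8021` form used for mapred.job.tracker.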

Thanks for all the help so far, everyone!

Eli

On 1/9/12 5:43 PM, alo.alt wrote:
Firewall online?
Also be sure that in /etc/hosts 127.0.0.1 is linked ONLY to localhost, nothing
like YOURHOSTNAME.YOURDOMAIN (Red Hat kudzu bug).
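
A quick way to check the /etc/hosts point is to list every name on the 127.0.0.1 line and report anything that is not a localhost name. This is a rough sketch only: the function names and the sample hosts text are made up for illustration, and treating localhost.localdomain as acceptable is an assumption.

```python
def loopback_aliases(hosts_text):
    """Return every name that a hosts file maps to 127.0.0.1."""
    names = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *aliases = line.split()
        if address == "127.0.0.1":
            names.extend(aliases)
    return names

def unexpected_loopback_names(hosts_text):
    """Names on the 127.0.0.1 line other than the usual localhost entries."""
    allowed = {"localhost", "localhost.localdomain"}  # assumption
    return [n for n in loopback_aliases(hosts_text) if n not in allowed]

# Hypothetical file contents; in practice use open("/etc/hosts").read().
sample = """
# loopback
127.0.0.1   localhost.localdomain localhost myhost.example.com myhost
192.168.1.10 other.example.com other
"""
print(unexpected_loopback_names(sample))
```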

- Alex

--
Alexander Lorenz
http://mapredit.blogspot.com

On Jan 9, 2012, at 2:39 PM, Eli Finkelshteyn wrote:

Good call! netstat -anl gives me:
tcp        0      0 ::ffff:127.0.0.1:8020       :::*                        LISTEN

Now it just looks like nothing is running on 8021. And now I'm really confused 
about why I get no communication over 8020 from the datanode.
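
The local address in that netstat line is worth a closer look: a socket bound to ::ffff:127.0.0.1 (the IPv4-mapped form of 127.0.0.1) only accepts connections from the same machine. A small sketch of that check follows; the function names are hypothetical, and it assumes the Linux `netstat -anl` column layout shown above.

```python
import ipaddress

def listen_address(netstat_line):
    """Split the local-address column of a netstat TCP line into (host, port)."""
    fields = netstat_line.split()
    host, _, port = fields[3].rpartition(":")
    return host, int(port)

def bound_to_loopback_only(host):
    """True when the listening address is a loopback address."""
    if host == "*":  # some netstat versions print the wildcard this way
        return False
    addr = ipaddress.ip_address(host)
    # Unwrap IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return addr.is_loopback

line = "tcp        0      0 ::ffff:127.0.0.1:8020       :::*       LISTEN"
host, port = listen_address(line)
print(port, "loopback only:", bound_to_loopback_only(host))
```

A wildcard bind would show up as 0.0.0.0:8020 or :::8020 instead, and the function returns False for those.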

Just to reiterate, this definitely is not the firewall, running iptables -nvL 
gives:

...
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:50070
    1    64 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:50030
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:8021
    1    64 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:8020
...

On 1/9/12 5:08 PM, alo.alt wrote:
What happens when you try a "telnet localhost 8020"?
The output of netstat -anl would also be useful.
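
If telnet is not installed, the same connection test can be approximated with a plain TCP connect. This is a minimal sketch, with `can_connect` being a name made up here; it only tells you whether something accepts connections on the port, not whether it is the NameNode.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name lookup failed
        return False

if __name__ == "__main__":
    # Ports taken from this thread: 8020 (NameNode) and 8021 (JobTracker).
    for port in (8020, 8021):
        state = "open" if can_connect("localhost", port) else "closed"
        print("localhost:%d is %s" % (port, state))
```

Running it once on the server against localhost and once from the remote client against the server name helps separate "nothing is listening" from "listening, but not reachable from outside".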

best,
  Alex

--
Alexander Lorenz
http://mapredit.blogspot.com

On Jan 9, 2012, at 2:02 PM, Eli Finkelshteyn wrote:

A bit more info:

When I start up only the namenode by itself, I'm not seeing any errors, but 
what I am seeing that's really odd is:

   2012-01-09 16:48:45,530 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
   2012-01-09 16:48:45,531 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=8020
   2012-01-09 16:48:45,532 INFO org.apache.hadoop.ipc.metrics.RpcDetailedMetrics: Initializing RPC Metrics with hostName=NameNode, port=8020
   2012-01-09 16:48:45,541 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost.localdomain/127.0.0.1:8020

That's despite the fact that doing netstat -a | grep 8020 still returns 
nothing.  To me, that makes absolutely no sense. I feel like I should be 
getting an error telling me Namenode did not in fact go up on 8020, but I'm not 
getting that at all.

Eli

On 1/9/12 3:22 PM, Idris Ali wrote:
Hi,

Looks like a problem starting DFS and MR. Can you run 'jps' and see if NN,
DN, SNN, JT and TT are running?

Also make sure that, for pseudo-distributed mode, the following entries are
present:

1. In core-site.xml
  <property>
     <name>fs.default.name</name>
     <value>hdfs://localhost:8020</value>
   </property>

   <property>
      <name>hadoop.tmp.dir</name>
      <value><SOME TMP dir with Read/Write access, not system temp></value>
   </property>

2.  In hdfs-site.xml
   <property>
     <name>dfs.replication</name>
     <value>1</value>
   </property>
   <property>
     <name>dfs.permissions</name>
     <value>false</value>
   </property>
   <property>
     <!-- specify this so that running 'hadoop namenode -format' formats the right dir -->
     <name>dfs.name.dir</name>
     <value>Local dir with Read/Write access</value>
   </property>

3. In mapred-site.xml
   <property>
     <name>mapred.job.tracker</name>
     <value>localhost:8021</value>
   </property>
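
For the jps check, the five daemons (NN, DN, SNN, JT, TT) show up in jps output under their class names. Here is a rough sketch that reports which ones are missing; the sample output and process ids are invented for illustration, and the function name is hypothetical.

```python
# Class names jps prints for NN, DN, SNN, JT and TT.
EXPECTED_DAEMONS = {
    "NameNode",
    "DataNode",
    "SecondaryNameNode",
    "JobTracker",
    "TaskTracker",
}

def missing_daemons(jps_output):
    """Return the expected Hadoop daemons that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # each line looks like "<pid> <class name>"
            running.add(parts[1])
    return sorted(EXPECTED_DAEMONS - running)

# Made-up jps output; in practice capture the output of the real command.
sample_jps = """4201 NameNode
4312 DataNode
4890 Jps
"""
print("not running:", missing_daemons(sample_jps))
```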

Thanks,
-Idris

On Tue, Jan 10, 2012 at 1:07 AM, Eli Finkelshteyn wrote:

Positive. Like I said before, netstat -a | grep 8020 gives me nothing.
Even if the firewall was the problem, that should still give me output that
the port is listening, but I'd just be unable to hit it from an outside box
(I tested this by blocking port 50070, at which point it still showed up in
netstat -a, but was inaccessible through http from a remote machine). This
problem is something else.


On 1/9/12 2:31 PM, zGreenfelder wrote:

On Mon, Jan 9, 2012 at 1:58 PM, Eli Finkelshteyn wrote:

More info:

In the DataNode log, I'm also seeing:

2012-01-09 13:06:27,751 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s).

Why would things just not load on port 8020? I feel like all the errors I'm
seeing are caused by this, but I can't see any errors about why this
occurred in the first place.

Are you sure there isn't a firewall in place blocking port 8020,
e.g. iptables on the local machines? If you do
telnet localhost 8020
do you make a connection? If you use lsof and/or netstat, can you see
the port open?
If you have root access, you can try turning off the firewall with
iptables -F to see if things work without firewall rules.

