Unable to access data from non hadoop application (Version mismatch in DataNode)
--------------------------------------------------------------------------------

                 Key: HADOOP-3781
                 URL: https://issues.apache.org/jira/browse/HADOOP-3781
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.17.1
         Environment: Client Eclipse, server windows/cygwin
            Reporter: Thibaut


Hi, I'm trying to access the HDFS of my Hadoop cluster from a non-Hadoop
application. Hadoop 0.17.1 is running on the standard ports (the same error also
occurred on earlier versions). The code fails, however, apparently because of a
version conflict.


This is the code I use:

FileSystem fileSystem = null;
String hdfsurl = "hdfs://localhost:50010";
fileSystem = new DistributedFileSystem();

try {
        fileSystem.initialize(new URI(hdfsurl), new Configuration());
} catch (Exception e) {
        e.printStackTrace();
        System.out.println("init error:");
        System.exit(1);
}


which fails with the exception:


java.net.SocketTimeoutException: timed out waiting for rpc response
        at org.apache.hadoop.ipc.Client.call(Client.java:559)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
        at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:313)
        at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:102)
        at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:178)
        at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:68)
        at com.iterend.spider.conf.Config.getRemoteFileSystem(Config.java:72)
        at tests.RemoteFileSystemTest.main(RemoteFileSystemTest.java:22)
init error:


The Hadoop log file contains the following error:
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = bluelu-PC/192.168.1.130
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.17.1
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.17 -r 669344; compiled by 'hadoopqa' on Thu Jun 19 01:18:25 UTC 2008
2008-07-10 23:05:47,840 INFO org.apache.hadoop.dfs.Storage: Storage directory \hadoop\tmp\hadoop-sshd_server\dfs\data is not formatted.
2008-07-10 23:05:47,840 INFO org.apache.hadoop.dfs.Storage: Formatting ...
2008-07-10 23:05:47,928 INFO org.apache.hadoop.dfs.DataNode: Registered FSDatasetStatusMBean
2008-07-10 23:05:47,929 INFO org.apache.hadoop.dfs.DataNode: Opened server at 50010
2008-07-10 23:05:47,933 INFO org.apache.hadoop.dfs.DataNode: Balancing bandwith is 1048576 bytes/s
2008-07-10 23:05:48,128 INFO org.mortbay.util.Credential: Checking Resource aliases
2008-07-10 23:05:48,344 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2008-07-10 23:05:48,346 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
2008-07-10 23:05:48,346 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
2008-07-10 23:05:49,047 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
2008-07-10 23:05:49,244 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2008-07-10 23:05:49,247 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50075
2008-07-10 23:05:49,247 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
2008-07-10 23:05:49,257 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2008-07-10 23:05:49,535 INFO org.apache.hadoop.dfs.DataNode: New storage id DS-2117780943-192.168.1.130-50010-1215723949510 is assigned to data-node 127.0.0.1:50010
2008-07-10 23:05:49,586 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010In DataNode.run, data = FSDataset{dirpath='c:\hadoop\tmp\hadoop-sshd_server\dfs\data\current'}
2008-07-10 23:05:49,586 INFO org.apache.hadoop.dfs.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 60000msec
2008-07-10 23:06:04,636 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 0 blocks got processed in 11 msecs
2008-07-10 23:19:54,512 ERROR org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010:DataXceiver: java.io.IOException: Version Mismatch
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:961)
        at java.lang.Thread.run(Thread.java:619)


When compiling my own jar from the 0.17.1 source, I see that the distributed
version has the revision number compiled into the version number instead of
using the one from the source code (26738 vs 9). Skipping this check triggers
another exception:

2008-07-17 17:28:51,268 ERROR org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010:DataXceiver: java.io.IOException: Unknown opcode 112 in data stream
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1002)
        at java.lang.Thread.run(Thread.java:619)
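
One detail worth checking: both errors are reported by the DataXceiver on
127.0.0.1:50010, which is the port the DataNode log says it opened and also the
port used in hdfsurl, while the client stack trace goes through
DFSClient.createRPCNamenode, so the client expects a NameNode RPC server at
that address. This may be the cause. Below is a minimal sketch of that sanity
check using only the JDK. The class and method names (PortCheck, dataNodePort,
targetsPort) are hypothetical helpers for illustration and are not part of
Hadoop.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: compares the port in a client URL
// with the port a DataNode log line reports.
class PortCheck {
    // Matches the "Opened server at <port>" line written by the DataNode.
    private static final Pattern OPENED =
            Pattern.compile("DataNode: Opened server at (\\d+)");

    // Returns the port found in the log line, or -1 if the line does not match.
    static int dataNodePort(String logLine) {
        Matcher m = OPENED.matcher(logLine);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // True if the URL's explicit port equals the given port.
    static boolean targetsPort(String url, int port) {
        return URI.create(url).getPort() == port;
    }

    public static void main(String[] args) {
        // Both values are copied from the report above.
        String logLine = "2008-07-10 23:05:47,929 INFO "
                + "org.apache.hadoop.dfs.DataNode: Opened server at 50010";
        String hdfsurl = "hdfs://localhost:50010";

        int dnPort = dataNodePort(logLine);
        if (dnPort != -1 && targetsPort(hdfsurl, dnPort)) {
            System.out.println(hdfsurl + " points at the DataNode port " + dnPort);
        }
    }
}
```

If the check fires, the URL passed to initialize() is addressing the DataNode
rather than the NameNode.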


What am I doing differently from a Hadoop application accessing HDFS?

