Hi,
Thanks for your reply. I am not familiar with Cascading. Should I just Google 
"cascading in hadoop"? What I was thinking of is implementing a file system that 
overrides the functions provided by Hadoop's fs.FileSystem interface. I have written 
some portions of this filesystem (for my external server), enough that it compiles 
successfully, but when I submit an MR job I get the following error:

13/03/26 06:09:10 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 0 time(s).
13/03/26 06:09:11 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 1 time(s).
13/03/26 06:09:12 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 2 time(s).
13/03/26 06:09:13 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 3 time(s).
13/03/26 06:09:14 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 4 time(s).
13/03/26 06:09:15 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 5 time(s).
13/03/26 06:09:16 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 6 time(s).
13/03/26 06:09:17 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 7 time(s).
13/03/26 06:09:18 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 8 time(s).
13/03/26 06:09:19 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54312. Already tried 9 time(s).
13/03/26 06:10:20 ERROR security.UserGroupInformation: PriviledgedActionException as:nikhil cause:java.net.ConnectException: Call to localhost/127.0.0.1:54312 failed on connection exception: java.net.ConnectException: Connection refused
java.net.ConnectException: Call to localhost/127.0.0.1:54312 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
    at org.apache.hadoop.ipc.Client.call(Client.java:1075)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at org.apache.hadoop.mapred.$Proxy2.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:480)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:474)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:457)
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:513)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:511)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:499)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
    at org.apache.hadoop.ipc.Client.call(Client.java:1050)
    ... 27 more

Basically, my job tracker is running at localhost:54312, and I have set the 
fs.default.name parameter to myexternalserver://ip:port and fs.myexternalserver.impl 
to the filesystem class that I wrote. I cannot figure out why this error occurs. 
Why is it trying to connect to localhost:54312? Please suggest where I am going wrong.
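
For clarity, this is roughly how the properties end up being wired together on the 
client side; the scheme, ip:port and class name below are placeholders rather than 
my real values, so treat this only as a sketch of the setup:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Default filesystem points at the custom scheme (placeholder value)
        conf.set("fs.default.name", "myexternalserver://ip:port");
        // Map the scheme to my FileSystem implementation (placeholder class name)
        conf.set("fs.myexternalserver.impl", "com.example.MyExternalServerFileSystem");
        // JobTracker address, as mentioned above
        conf.set("mapred.job.tracker", "localhost:54312");

        Job job = new Job(conf, "wordcount");
        // mapper/reducer classes and input/output paths omitted here
        job.waitForCompletion(true);
    }
}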

Also, if you feel Cascading would be a better fit for this, please do let me know.
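
In case it helps clarify what I mean by overriding the functions of fs.FileSystem, 
here is a rough skeleton of the kind of class I am writing; the class name is a 
placeholder and the method bodies are stubs, not my actual code:

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.util.Progressable;

// Placeholder class name; only a skeleton to show the shape of the approach.
public class MyExternalServerFileSystem extends FileSystem {

    private URI uri;

    @Override
    public void initialize(URI name, Configuration conf) throws IOException {
        super.initialize(name, conf);
        this.uri = name;
        // TODO: set up the connection to the external server here
    }

    @Override
    public URI getUri() { return uri; }

    @Override
    public FSDataInputStream open(Path f, int bufferSize) throws IOException {
        // TODO: return a stream that reads the file from the external server
        throw new UnsupportedOperationException("not implemented yet");
    }

    @Override
    public FSDataOutputStream create(Path f, FsPermission permission,
            boolean overwrite, int bufferSize, short replication, long blockSize,
            Progressable progress) throws IOException {
        throw new UnsupportedOperationException("not implemented yet");
    }

    @Override
    public FSDataOutputStream append(Path f, int bufferSize,
            Progressable progress) throws IOException {
        throw new UnsupportedOperationException("not implemented yet");
    }

    @Override
    public boolean rename(Path src, Path dst) throws IOException {
        throw new UnsupportedOperationException("not implemented yet");
    }

    @Override
    @Deprecated
    public boolean delete(Path f) throws IOException {
        return delete(f, true);
    }

    @Override
    public boolean delete(Path f, boolean recursive) throws IOException {
        throw new UnsupportedOperationException("not implemented yet");
    }

    @Override
    public FileStatus[] listStatus(Path f) throws IOException {
        // TODO: list the directory on the external server
        throw new UnsupportedOperationException("not implemented yet");
    }

    @Override
    public void setWorkingDirectory(Path newDir) { }

    @Override
    public Path getWorkingDirectory() {
        return new Path("/");
    }

    @Override
    public boolean mkdirs(Path f, FsPermission permission) throws IOException {
        throw new UnsupportedOperationException("not implemented yet");
    }

    @Override
    public FileStatus getFileStatus(Path f) throws IOException {
        throw new UnsupportedOperationException("not implemented yet");
    }
}

My understanding is that this class then has to be on the classpath of both the 
client and the cluster nodes so that fs.myexternalserver.impl can resolve to it.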

Thanks & Regards,
Nikhil

From: Agarwal, Nikhil
Sent: Tuesday, March 26, 2013 2:49 PM
To: '[email protected]'
Subject: How to tell my Hadoop cluster to read data from an external server

Hi,

I have a Hadoop cluster up and running. I want to submit an MR job to it, but the 
input data is kept on an external server (outside the Hadoop cluster). Can anyone 
please suggest how I can tell my Hadoop cluster to load the input data from the 
external server and then run MR on it?

Thanks & Regards,
Nikhil
