Dear users,
I'm new to Hadoop and Pig and I really feel I need some help...
I managed to set up a Hadoop cluster on two Ubuntu boxes. All Hadoop daemons
start without any problems.
I can also successfully copy files to the HDFS from the local filesystem.
The problem is that I can't run Pig in mapreduce mode. I can run it in local mode
though...
Every time I try to run an example script (from the Pig wiki examples), I get
this:
2009-12-30 20:50:08,872 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to
hadoop file system at: file:///
2009-12-30 20:50:09,037 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics -
Initializing JVM Metrics with processName=JobTracker, sessionId=
Furthermore, from the grunt shell I seem to be connected to the local filesystem:
grunt> ls shows me the local files, not the HDFS ones.
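My understanding is that in mapreduce mode Pig should pick up the same default
filesystem that Hadoop itself uses, i.e. the fs.default.name entry from the Hadoop
configuration, something along these lines (the host and port are the same ones I
use in the load statements further down):

<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
</property>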
I don't know how to make the settings in the file
<PIG_HOME>/conf/pig.properties (please note also that I created this file
manually). Which environment variables (PIG_CLASSPATH, HADOOPDIR, others?)
should I set there? Should it be this way:
<property><name>....</name><value>....</value></property>
or this way: export <variable_name>=value ?
Any example concerning this file would be highly appreciated, as I haven't found
any so far.
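To make the question concrete, these are the two forms I mean (PIG_CLASSPATH and
HADOOPDIR are simply the names I have seen mentioned, and /path/to/hadoop/conf is
a placeholder for wherever the Hadoop configuration directory lives; I don't know
which form, if either, pig.properties expects):

XML-style, as in the Hadoop *-site.xml files:
<property><name>HADOOPDIR</name><value>/path/to/hadoop/conf</value></property>

or shell-style, as in hadoop-env.sh:
export PIG_CLASSPATH=/path/to/hadoop/conf
export HADOOPDIR=/path/to/hadoop/conf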
I also tried to change Pig's execution mode using the command 'pig -x mapreduce',
but I got this message in bash: pig: invalid option -- 'x'
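For completeness, here is the exact exchange at the shell prompt (assuming the
'pig' being run is the launcher script from the Pig distribution's bin directory,
which is what I intend):

$ pig -x mapreduce
pig: invalid option -- 'x'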
You can find below the full error stack I got when I tried to access a file in
HDFS through the grunt shell.
My commands in the grunt shell:
grunt> A= load 'hdfs://master:54310/id.out';
grunt> dump A;
(I got the same error below using this command, which gives the full path to the
file in HDFS:
grunt> A= load 'hdfs://master:54310/user/hadoop/id.out';
grunt> dump A; )
The error stack:
2009-12-30 21:11:15,266 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics
- Cannot initialize JVM Metrics with processName=JobTracker, sessionId= -
already initialized
2009-12-30 21:11:15,268 [Thread-21] WARN org.apache.hadoop.mapred.JobClient -
Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2009-12-30 21:11:20,267 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-12-30 21:11:20,267 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Map reduce job failed
2009-12-30 21:11:20,267 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- java.io.IOException: Call failed on local exception
at org.apache.hadoop.ipc.Client.call(Client.java:718)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:103)
at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:173)
at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:67)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:189)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:499)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:441)
2009-12-30 21:11:20,268 [main] ERROR org.apache.pig.tools.grunt.GruntParser -
java.io.IOException: Unable to open iterator for alias: A [Job terminated with
anomalous status FAILED]
at org.apache.pig.PigServer.openIterator(PigServer.java:410)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:94)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
at org.apache.pig.Main.main(Main.java:282)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
... 6 more
2009-12-30 21:11:20,268 [main] ERROR org.apache.pig.tools.grunt.GruntParser -
Unable to open iterator for alias: A [Job terminated with anomalous status
FAILED]
2009-12-30 21:11:20,268 [main] ERROR org.apache.pig.tools.grunt.GruntParser -
java.io.IOException: Unable to open iterator for alias: A [Job terminated with
anomalous status FAILED]
If a more experienced user can figure out what the problem is, I would be
grateful!
Regards,
Anastasia Th.