Hi all,

I'm facing a weird problem and wondering if anyone has run into it before. I've been playing with PigServer to programmatically run some simple Pig scripts, and it does not seem to connect to HDFS when I pass in ExecType.MAPREDUCE. I am running in pseudo-distributed mode, with the tasktracker and namenode both on their default ports. When I run scripts with "pig script.pig" or from the grunt console, Pig connects to HDFS and works fine.

Do I need to specify some additional properties in the PigServer constructor, or construct a custom PigContext? I had assumed that by passing ExecType.MAPREDUCE and using the defaults, everything would just work.

I would really appreciate any insight, or anecdotes from others using PigServer and how they have it set up. Thanks a bunch!

-Zach

Here is the code I'm using:

    PigServer pigServer = new PigServer("mapreduce");
    pigServer.setBatchOn();
    pigServer.registerScript("/Users/zach/Desktop/test.pig");
    List<ExecJob> jobs = pigServer.executeBatch();

And
here is the log output:

    0    [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
    622  [main] INFO  org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for pages
    622  [main] INFO  org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for pages
    659  [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
    751  [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(file:///output:PigStorage) - 1-70 Operator Key: 1-70)
    789  [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
    790  [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
    815  [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
    822  [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
    822  [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
    2534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
    2582 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
    2582 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
    2590 [Thread-4] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    2746 [Thread-4] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
    2765 [Thread-4] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
    3083 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
    3084 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
    3084 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
    3085 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - There is no log file to write to.
    3085 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Backend error message during job submission
    org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///input
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:637)
    Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/input
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
        at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:258)
        ... 7 more
    3092 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:///output"
    3092 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
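P.S. In case it helps the discussion: if explicit configuration turns out to be the answer, here is a sketch of what I'd guess it looks like, using the PigServer(ExecType, Properties) constructor. The fs.default.name and mapred.job.tracker values are just the standard pseudo-distributed settings from the Hadoop quickstart; I haven't verified that this is the right fix, so please treat it as a guess rather than a working answer:

```java
import java.util.List;
import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecJob;

public class PigServerHdfsTest {
    public static void main(String[] args) throws Exception {
        // Guess: point Pig at the cluster explicitly instead of relying on
        // whatever hadoop-site/core-site config is (or isn't) on the classpath.
        // localhost:9000 / localhost:9001 are the stock pseudo-distributed
        // values from the Hadoop docs; substitute your actual ports.
        Properties props = new Properties();
        props.setProperty("fs.default.name", "hdfs://localhost:9000");
        props.setProperty("mapred.job.tracker", "localhost:9001");

        PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);
        pigServer.setBatchOn();
        pigServer.registerScript("/Users/zach/Desktop/test.pig");
        List<ExecJob> jobs = pigServer.executeBatch();
    }
}
```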