Folks-
I am running into issues passing parameters to a pig script from within a Java
program.
Details are below.
Any pointers are greatly appreciated.
Version:
Pig 0.11.0
Attempted:
Run a pig script from Java, in mapreduce mode, passing a script parameter map as
an argument to PigServer's registerScript method. One of the parameters is
the HDFS source data path.
Issue:
The parameters passed are not getting resolved; see the error message pasted
below. I am trying to understand whether there is an issue in my code or a bug
in PigServer.
Tested:
- Tested the pig script successfully from the CLI
- Tested the embedded pig program successfully without parameters (parameter
values hard-coded directly in the pig script)
Alternatives explored:
Tried passing the pig script parameters as part of the first argument to
registerScript; received the same error reported below.
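One workaround I am considering, in case the parameter map really is being
ignored, is doing the substitution myself before handing the script text to
PigServer. Below is a minimal sketch of that idea; ParamSub is a hypothetical
helper of mine, not part of the Pig API, and it only does plain string
replacement of $NAME / ${NAME} placeholders:

```java
import java.util.Map;

// Hypothetical helper (not part of the Pig API): substitute the parameter
// map into the script text ourselves, so the script handed to PigServer
// no longer contains any $NAME placeholders.
public class ParamSub {
    public static String substitute(String script, Map<String, String> params) {
        String result = script;
        for (Map.Entry<String, String> e : params.entrySet()) {
            // String.replace is literal (not regex), so '$' and '*' in
            // HDFS globs need no escaping. Handle ${NAME} before $NAME.
            result = result.replace("${" + e.getKey() + "}", e.getValue());
            result = result.replace("$" + e.getKey(), e.getValue());
        }
        return result;
    }
}
```

The substituted text could then be registered through the
registerScript(InputStream) overload instead of the file-plus-map one. Note
that plain replacement does not handle one parameter name being a prefix of
another (e.g. $INPUT vs $INPUT_FILE); substituting longer names first would
avoid that.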
Error:
ERROR mapReduceLayer.Launcher: Backend error message during job submission
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://xxxxx.xxxxx.net:8020/user/xxxxx/$INPUT_FILE
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1105)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1122)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:177)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:1021)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:974)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:974)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:582)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270)
    at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160)
    at java.lang.Thread.run(Thread.java:662)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://xxxxx.xxxxx.net:8020/user/xxxxx/$INPUT_FILE
Java program:
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.Tuple;

public class RunRecon {
    public static void main(String[] args) {
        PigServer pigServer;
        Map<String, String> params = new HashMap<String, String>();
        params.put("INPUT_FILE",
            "hdfs://xxxxx.xxxxx.net:8020/data/pi/opsbia/callanalytics/rawlogs/nuance/processed/*");
        params.put("SEARCH_STRING", ".*SWIclnd.*");
        try {
            pigServer = new PigServer(ExecType.MAPREDUCE);
            pigServer.registerScript("nuanceivranalytics/scripts/pig/reconUtil.pig", params);
            Iterator<Tuple> it = pigServer.openIterator("recordCountDS");
            while (it.hasNext()) {
                System.out.println("Record count=" + it.next().get(1));
            }
        } catch (ExecException ex) {
            Logger.getLogger(RunRecon.class.getName()).log(Level.SEVERE, null, ex);
        } catch (IOException ex) {
            Logger.getLogger(RunRecon.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}