Folks-
I am running into issues passing parameters to a Pig script from within a Java
program.
Details are below.
Any pointers are greatly appreciated.


Version:
Pig 0.11.0

Attempted:
Run a pig script from Java, in mapreduce mode, passing the script parameter map
as an argument to PigServer's registerScript method. One of the parameters is
the HDFS source data path.

Issue:
The parameters passed are not getting resolved: $INPUT_FILE reaches the backend
unsubstituted. See the error message pasted below. I am trying to understand
whether there is an issue in my code or a bug in PigServer.

Tested:
- Ran the pig script successfully from the CLI
- Ran the embedded Pig program successfully without parameters (parameter
values hard-coded directly in the pig script)
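
For reference, the successful CLI run was of the form below (host and paths are
placeholders, not the real cluster values; -param is Pig's standard
substitution flag). The command is built as a string and echoed here only so
its shape is visible without a cluster:

```shell
# -param NAME=value substitutes $NAME in the script before it runs.
# Host/path values below are placeholders for the real cluster paths.
PIG_CMD="pig -param INPUT_FILE=hdfs://host:8020/user/me/processed/* \
 -param SEARCH_STRING='.*SWIclnd.*' \
 nuanceivranalytics/scripts/pig/reconUtil.pig"
echo "$PIG_CMD"
```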

Alternatives explored:
Tried passing the pig script parameters as part of the first argument to
registerScript. Received the same error as reported below.

Error:
ERROR mapReduceLayer.Launcher: Backend error message during job submission
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://xxxxx.xxxxx.net:8020/user/xxxxx/$INPUT_FILE
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1105)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1122)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:177)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:1021)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:974)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:974)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:582)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270)
    at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160)
    at java.lang.Thread.run(Thread.java:662)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://xxxxx.xxxxx.net:8020/user/xxxxx/$INPUT_FILE

Java program:
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.Tuple;

public class RunRecon {

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("INPUT_FILE",
                "hdfs://xxxxx.xxxxx.net:8020/data/pi/opsbia/callanalytics/rawlogs/nuance/processed/*");
        params.put("SEARCH_STRING", ".*SWIclnd.*");

        try {
            PigServer pigServer = new PigServer(ExecType.MAPREDUCE);
            pigServer.registerScript("nuanceivranalytics/scripts/pig/reconUtil.pig", params);

            Iterator<Tuple> it = pigServer.openIterator("recordCountDS");
            while (it.hasNext()) {
                System.out.println("Record count=" + it.next().get(1));
            }
        } catch (ExecException ex) {
            Logger.getLogger(RunRecon.class.getName()).log(Level.SEVERE, null, ex);
        } catch (IOException ex) {
            Logger.getLogger(RunRecon.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
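
Workaround under consideration:
Do the parameter substitution manually before handing the script text to
PigServer (read the script, substitute, then register the result). A rough,
untested sketch of the substitution step only; it handles just the plain $NAME
form, not ${NAME}, %declare, or %default, and the class/method names here are
mine, not Pig API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: substitutes $NAME placeholders in Pig Latin text using
// the same params map that would have been passed to registerScript.
public class ParamSub {

    public static String substitute(String script, Map<String, String> params) {
        Matcher m = Pattern.compile("\\$(\\w+)").matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            // Leave unknown placeholders untouched rather than failing.
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("INPUT_FILE", "hdfs://host:8020/data/processed/*");
        String line = "raw = LOAD '$INPUT_FILE' USING PigStorage();";
        System.out.println(substitute(line, params));
    }
}
```

The substituted text could then be fed back through a registerScript overload
that takes the script contents rather than a file path.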
