[ 
https://issues.apache.org/jira/browse/PIG-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santhosh Srinivasan updated PIG-546:
------------------------------------

    Patch Info: [Patch Available]

> FilterFunc calls empty constructor when it should be calling parameterized 
> constructor
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-546
>                 URL: https://issues.apache.org/jira/browse/PIG-546
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Viraj Bhat
>             Fix For: types_branch
>
>         Attachments: FILTERFROMFILE.java, insetfilterfile, mydata.txt, 
> PIG-546.patch
>
>
> The following Pig script uses a custom UDF, FILTERFROMFILE, which extends 
> FilterFunc. It has two constructors: the mandatory empty constructor and a 
> parameterized constructor. The parameterized constructor takes an HDFS 
> filename, which the exec function uses to build a HashMap. The HashMap is 
> then used to filter records against the match criteria in the HDFS file.
> {code}
> register util.jar;
> --util.jar contains the FILTERFROMFILE class
> define FILTER_CRITERION util.FILTERFROMFILE('/user/viraj/insetfilterfile');
> RAW_LOGS = load 'mydata.txt' as (url:chararray, numvisits:int);
> FILTERED_LOGS = filter RAW_LOGS by FILTER_CRITERION(numvisits);
> dump FILTERED_LOGS;
> {code}
> Executing the above script results in a single map-only job with one map 
> task. The empty constructor appears to be called 5 times, which ultimately 
> causes the job to fail.
> ===========================================
> parameterized constructor: /user/viraj/insetfilterfile
> parameterized constructor: /user/viraj/insetfilterfile
> empty constructor
> empty constructor
> empty constructor
> empty constructor
> empty constructor
> ===========================================
> Error in the Hadoop backend
> ===========================================
> java.lang.IllegalArgumentException: Can not create a Path from an empty string
>       at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
>       at org.apache.hadoop.fs.Path.<init>(Path.java:90)
>       at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:199)
>       at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:130)
>       at org.apache.pig.impl.io.FileLocalizer.openDFSFile(FileLocalizer.java:164)
>       at util.FILTERFROMFILE.init(FILTERFROMFILE.java:70)
>       at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:89)
>       at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:52)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:179)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:217)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>       at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
> ===========================================
> Attaching the sample data and the filter function UDF.
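
The symptom described above can be reproduced outside Pig with a small, self-contained sketch. Everything in it is an assumption made for illustration: FilterFromFileSketch is a hypothetical stand-in for the attached FILTERFROMFILE class (it does not extend Pig's FilterFunc), the filter values are made up rather than read from HDFS, and the reflective instantiate helper only imitates the observable behavior, not how Pig actually creates UDF instances.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ConstructorDemo {

    /** Hypothetical stand-in for the attached UDF; not a real Pig FilterFunc. */
    public static class FilterFromFileSketch {
        private final String filterFile;
        private Set<Integer> allowed; // built lazily on the first exec call

        // The mandatory empty constructor: no filename is available.
        public FilterFromFileSketch() {
            this.filterFile = "";
        }

        // The parameterized constructor: remembers the filter file name.
        public FilterFromFileSketch(String filterFile) {
            this.filterFile = filterFile;
        }

        private void init() {
            if (filterFile.isEmpty()) {
                // Imitates the failure in the trace: an empty path cannot be opened.
                throw new IllegalArgumentException(
                        "Can not create a Path from an empty string");
            }
            // A real UDF would read the file; made-up values stand in here.
            allowed = new HashSet<>(Arrays.asList(10, 20, 30));
        }

        public boolean exec(int numVisits) {
            if (allowed == null) {
                init();
            }
            return allowed.contains(numVisits);
        }
    }

    /** Creates an instance reflectively; a null argument selects the empty constructor. */
    public static FilterFromFileSketch instantiate(String ctorArg) {
        try {
            Class<FilterFromFileSketch> cls = FilterFromFileSketch.class;
            if (ctorArg == null) {
                return cls.getDeclaredConstructor().newInstance();
            }
            Constructor<FilterFromFileSketch> ctor =
                    cls.getDeclaredConstructor(String.class);
            return ctor.newInstance(ctorArg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Instance built with the constructor argument: filtering works.
        FilterFromFileSketch good = instantiate("/user/viraj/insetfilterfile");
        System.out.println(good.exec(20));

        // Instance built with the empty constructor: the first exec call fails.
        FilterFromFileSketch bad = instantiate(null);
        try {
            bad.exec(20);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the instance that runs in the map task is created without the constructor argument, the filename is empty by the time exec runs, which is consistent with the IllegalArgumentException raised from FILTERFROMFILE.init in the trace.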

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
