[ https://issues.apache.org/jira/browse/PIG-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates reassigned PIG-546: ------------------------------ Assignee: Santhosh Srinivasan > FilterFunc calls empty constructor when it should be calling parameterized > constructor > -------------------------------------------------------------------------------------- > > Key: PIG-546 > URL: https://issues.apache.org/jira/browse/PIG-546 > Project: Pig > Issue Type: Bug > Components: impl > Affects Versions: 0.2.0 > Reporter: Viraj Bhat > Assignee: Santhosh Srinivasan > Fix For: 0.2.0 > > Attachments: FILTERFROMFILE.java, insetfilterfile, mydata.txt, > PIG-546.patch > > > The following piece of Pig Script uses a custom UDF known as FILTERFROMFILE > which extends the FilterFunc. It contains two constructors, an empty > constructor which is mandatory and the parameterized constructor. The > parameterized constructor passes the HDFS filename, which the exec function > uses to construct a HashMap. The HashMap is later used for filtering records > based on the match criteria in the HDFS file. > {code} > register util.jar; > --util.jar contains the FILTERFROMFILE class > define FILTER_CRITERION util.FILTERFROMFILE('/user/viraj/insetfilterfile'); > RAW_LOGS = load 'mydata.txt' as (url:chararray, numvisits:int); > FILTERED_LOGS = filter RAW_LOGS by FILTER_CRITERION(numvisits); > dump FILTERED_LOGS; > {code} > When you execute the above script, it results in a single Map only job with > 1 Map. It seems that the empty constructor is called 5 times, and ultimately > results in failure of the job. > =========================================== > parameterized constructor: /user/viraj/insetfilterfile > parameterized constructor: /user/viraj/insetfilterfile > empty constructor > empty constructor > empty constructor > empty constructor > empty constructor > =========================================== > Error in the Hadoop backend > =========================================== > java.lang.IllegalArgumentException: Can not create a Path from an empty string > at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82) > at org.apache.hadoop.fs.Path.(Path.java:90) > at > org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:199) > at > org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:130) > at > org.apache.pig.impl.io.FileLocalizer.openDFSFile(FileLocalizer.java:164) > at util.FILTERFROMFILE.init(FILTERFROMFILE.java:70) > at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:89) > at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:52) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:179) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:217) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209) > =========================================== > Attaching the sample data and the filter function UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.