The logic inside my exec() function differs from that of FILTERFROMFILE.java, but the rest of my class is nearly identical except that I have two parameters. The other difference, and the one that causes my FilterFunc implementation to fail, is the override of getArgToFuncMapping(). I don't really need it, so I've commented it out and everything works fine now. I'm not sure why the override was a problem, however.

 /* (non-Javadoc)
  * @see org.apache.pig.EvalFunc#getArgToFuncMapping()
  * This is needed to make sure that both bytearrays and chararrays can be passed as arguments
  */
 @Override
 public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
     List<FuncSpec> funcList = new ArrayList<FuncSpec>();
     funcList.add(new FuncSpec(this.getClass().getName(),
             new Schema(new Schema.FieldSchema(null, DataType.CHARARRAY))));

     return funcList;
 }

-Sean

Alan Gates wrote:
Can you include the load function from your script to show how you're using it? One issue is that you cannot define constructor arguments for your load function in DEFINE; you have to do it in LOAD, as in USING X(args go here). Also, the load function is called on the user's box with the arguments passed to it in the USING clause. It is then serialized and passed to the Hadoop machines, where it is deserialized. At this point the default constructor is called (because that's how Java deserializes objects). So if those constructor arguments are needed on the backend, they need to be cached when the function is constructed on the front end. You may need to add logic to explicitly store the filename so it's available at run time.

Alan.
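
For readers of the archive, the caching Alan describes can be sketched in plain Java. This is only an illustration under stated assumptions: CachedArgFilter, listPath, loadList and roundTrip are hypothetical names, the class does not extend any Pig class, and java.io serialization merely stands in for however Pig ships the function to the backend. The point is that the constructor argument lives in a serialized field, while the derived state is rebuilt lazily on first use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a filter UDF; it is NOT a Pig FilterFunc.
class CachedArgFilter implements Serializable {
    private static final long serialVersionUID = 1L;

    // Constructor argument cached in a serialized field so it is still
    // available after the object has been shipped and deserialized.
    private final String listPath;

    // Derived state is not shipped; it is rebuilt on first call to exec().
    private transient Set<String> entries;

    CachedArgFilter(String listPath) {
        this.listPath = listPath;
    }

    String getListPath() {
        return listPath;
    }

    // Accepts the input unless it appears in the (lazily loaded) list.
    boolean exec(String input) {
        if (entries == null) {
            entries = loadList(listPath);
        }
        return !entries.contains(input);
    }

    // Placeholder loader: a real UDF would read the file at 'path' here.
    private static Set<String> loadList(String path) {
        if (path == null) {
            throw new IllegalArgumentException("list path was not carried to the backend");
        }
        return new HashSet<String>();
    }

    // Simulates front end -> backend shipping with plain Java serialization.
    static CachedArgFilter roundTrip(CachedArgFilter filter) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(filter);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CachedArgFilter) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        CachedArgFilter restored = roundTrip(new CachedArgFilter("/tmp/list.txt"));
        System.out.println("path after round trip: " + restored.getListPath());
        System.out.println("exec result: " + restored.exec("some query"));
    }
}
```

If the path were kept only in a transient field, or set only by a constructor that is not re-run on the backend, loadList would receive null, which is the kind of failure the stack trace below shows.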

On Apr 20, 2009, at 2:27 PM, Sean Timm wrote:

PIG-546 indicates that it is now possible to pass arguments into a custom UDF filter function via a parameterized constructor. I'm using a TRUNK build from April 1 (svn rev. 761067) which appears to have the patch applied, but I'm getting the same errors that the patch describes. Should this work? Is there a better way to pass parameters/configuration into a UDF filter function?

The parameterized constructor is called 3 times, followed by the default constructor being called 4 times.

On the Hadoop backend:

2009-04-20 17:11:29,935 ERROR com.aol.search.pig.udf.ValidateQuery: default constructor
2009-04-20 17:11:30,034 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.lang.IllegalArgumentException: Can not create a Path from a null string
  at org.apache.hadoop.fs.Path.checkPathArg(Path.java:78)
  at org.apache.hadoop.fs.Path.<init>(Path.java:90)
  at com.aol.search.pig.udf.ValidateQuery.loadList(ValidateQuery.java:74)
  at com.aol.search.pig.udf.ValidateQuery.init(ValidateQuery.java:66)
  at com.aol.search.pig.udf.ValidateQuery.exec(ValidateQuery.java:91)
  at com.aol.search.pig.udf.ValidateQuery.exec(ValidateQuery.java:35)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:251)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:217)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:208)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
  at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)

Thanks,
Sean

