David Ciemiewicz commented on PIG-756:

BTW, there used to be a mechanism to do this in early versions of Pig that was 
lost in the transition to the new execution system.

> UDFs should have API for transparently opening and reading files from HDFS or 
> from local file system with only relative path
> ----------------------------------------------------------------------------------------------------------------------------
>                 Key: PIG-756
>                 URL: https://issues.apache.org/jira/browse/PIG-756
>             Project: Pig
>          Issue Type: Bug
>            Reporter: David Ciemiewicz
> I have a utility function util.INSETFROMFILE() that I pass a file name during 
> initialization.
> {code}
> define inQuerySet util.INSETFROMFILE('analysis/queries');
> A = load 'logs' using PigStorage() as ( date int, query chararray );
> B = filter A by inQuerySet(query);
> {code}
> This provides a computationally inexpensive way to effect map-side joins for 
> small sets. Functions of this style also make it possible to encapsulate 
> more complex matching rules.
> For rapid development and debugging purposes, I want this code to run without 
> modification on both my local file system when I do pig -exectype local and 
> on HDFS.
> Pig needs to provide an API for UDFs that allows them to either:
> 1) "know" when they are in local or HDFS mode and let them open and read 
> from files as appropriate
> 2) just provide a file name and read statements and have pig transparently 
> manage local or HDFS opens and reads for the UDF
> UDFs need to read configuration information off the filesystem and it 
> simplifies the process if one can just flip the switch of -exectype local.
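
As a rough illustration of option 2 above, here is a minimal sketch in plain 
Java. It is not Pig's API: FileOpener and InSetFromFile are hypothetical names, 
and the sketch only assumes that the runtime could hand the UDF some object 
that knows how to open a relative path in the current mode (local or HDFS).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical hook: the runtime decides how a relative path is opened. */
interface FileOpener {
    Reader open(String relativePath) throws IOException;
}

/** Sketch of a set-membership filter in the style of util.INSETFROMFILE. */
class InSetFromFile {
    private final String path;
    private final FileOpener opener;
    private Set<String> members; // loaded lazily on the first call to exec()

    InSetFromFile(String path, FileOpener opener) {
        this.path = path;
        this.opener = opener;
    }

    /** Returns true if value appears as a line in the file. */
    boolean exec(String value) {
        if (members == null) {
            members = load();
        }
        return value != null && members.contains(value);
    }

    /** Reads one entry per line, skipping blank lines and trimming whitespace. */
    private Set<String> load() {
        Set<String> set = new HashSet<>();
        try (BufferedReader in = new BufferedReader(opener.open(path))) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    set.add(trimmed);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return set;
    }
}
```

In local mode the opener could be as simple as 
p -> new java.io.FileReader(p); an HDFS-backed opener would wrap whatever 
filesystem handle the runtime exposes. The UDF code itself stays the same in 
both modes, which is the point of the request.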

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
