Gunther Hagleitner resolved PIG-757.

    Resolution: Duplicate

> Using schemes in load and store paths
> -------------------------------------
>                 Key: PIG-757
>                 URL: https://issues.apache.org/jira/browse/PIG-757
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
> As part of the multiquery optimization work there's a need to use absolute 
> paths for load and store operations (because the current directory changes 
> during the execution of the script). In order to do so, the suggestion is to 
> change the semantics of the location/filename string used in LoadFunc and 
> Slicer/Slice.
> The proposed change is:
>    * Load locations without a scheme part are expected to be hdfs (mapreduce 
> mode) or local (local mode) paths
>    * Any hdfs or local path will be translated to a fully qualified absolute 
> path before it is handed to either a LoadFunc or Slicer
>    * Any scheme other than file or hdfs will result in the load path being
> passed through to the LoadFunc or Slicer without any modification.
> Example:
> If you have a LoadFunc that reads from a database, right now the following 
> could be used:
> {{{
> a = load 'table' using DBLoader();
> }}}
> With the proposed changes, however, 'table' would be translated into an hdfs
> path ("hdfs://..../table"), which is probably not what the loader wants to
> see. To make this work, one would use:
> {{{
> a = load 'sql://table' using DBLoader();
> }}}
> Now the DBLoader would see the unchanged string "sql://table", and Pig will
> not treat the string as an hdfs location.
> This is an incompatible change, but hopefully only a few existing
> Slicers/Loaders are affected. This behavior is part of the multiquery
> work and can be turned off (reverted) by using the "no_multiquery" flag.
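
The three rules proposed in the description can be sketched roughly as
follows. This is only an illustration, not Pig's actual code: the function
name resolve_load_location, the DEFAULT_PREFIX table, and the namenode
address are all hypothetical, and globs or comma-separated path lists are
not handled.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical defaults for illustration only; a real deployment would take
# the filesystem prefix from its own configuration.
DEFAULT_PREFIX = {
    "mapreduce": "hdfs://namenode:8020",
    "local": "file://",
}


def resolve_load_location(location, mode, cwd):
    """Return the string a LoadFunc or Slicer would see under the proposal."""
    parts = urlsplit(location)

    # Rule 3: any scheme other than file or hdfs is passed through untouched.
    if parts.scheme not in ("", "file", "hdfs"):
        return location

    # Rule 1: no scheme means hdfs (mapreduce mode) or local (local mode).
    if parts.scheme:
        prefix = f"{parts.scheme}://{parts.netloc}"
    else:
        prefix = DEFAULT_PREFIX[mode]

    # Rule 2: make the path absolute relative to the current directory,
    # then fully qualify it with the filesystem prefix.
    path = posixpath.normpath(posixpath.join(cwd, parts.path))
    return prefix + path
```

With these assumptions, 'table' in mapreduce mode with a current directory
of /user/pig becomes an hdfs:// path ending in /user/pig/table, while
'sql://table' is returned exactly as written.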

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
