[ https://issues.apache.org/jira/browse/PIG-757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gunther Hagleitner resolved PIG-757. ------------------------------------ Resolution: Duplicate > Using schemes in load and store paths > ------------------------------------- > > Key: PIG-757 > URL: https://issues.apache.org/jira/browse/PIG-757 > Project: Pig > Issue Type: Bug > Reporter: Gunther Hagleitner > > As part of the multiquery optimization work there's a need to use absolute > paths for load and store operations (because the current directory changes > during the execution of the script). In order to do so, the suggestion is to > change the semantics of the location/filename string used in LoadFunc and > Slicer/Slice. > The proposed change is: > * Load locations without a scheme part are expected to be hdfs (mapreduce > mode) or local (local mode) paths > * Any hdfs or local path will be translated to a fully qualified absolute > path before it is handed to either a LoadFunc or Slicer > * Any scheme other than file or hdfs will result in the load path be > passed through to the LoadFunc or Slicer without any modification. > Example: > If you have a LoadFunc that reads from a database, right now the following > could be used: > {{{ > a = load 'table' using DBLoader(); > }}} > With the proposed changes table would be translated into an hdfs path though > ("hdfs://..../table"). Probably not what the loader wants to see. So in order > to make this work one would use: > {{{ > a = load 'sql://table' using DBLoader(); > }}} > Now the DBLoader would see the unchanged string "sql://table". And pig will > not use the string as an hdfs location. > This is an incompatible change but it's hopefully few existing > Slicers/Loaders that are affected. This behavior is part of the multiquery > work and can be turned off (reverted back) by using the "no_multiquery" flag. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.