Creation of output path should be done by storage function
----------------------------------------------------------

                 Key: PIG-1174
                 URL: https://issues.apache.org/jira/browse/PIG-1174
             Project: Pig
          Issue Type: Bug
            Reporter: Bill Graham


When executing a STORE command, Pig creates the output location before the 
storage function is called. This breaks storage functions that contain their 
own logic for determining the output location. See this thread:

http://www.mail-archive.com/pig-user%40hadoop.apache.org/msg01538.html

For example, when making a request like this:

STORE A INTO '/my/home/output' USING MultiStorage('/my/home/output','0', 
'none', '\t');

Pig creates a file at '/my/home/output', and an exception is then thrown when 
MultiStorage tries to create a directory under '/my/home/output'. The 
workaround is to specify a dummy location as the first path instead:

STORE A INTO '/my/home/output/temp' USING MultiStorage('/my/home/output','0', 
'none', '\t');
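
The failure mode can be reproduced in miniature on a local filesystem. The 
sketch below is only an analogy written in Python, not Pig or HDFS code. It 
assumes, as described above, that the output location already exists as a 
file when the storage function tries to create a directory beneath it. The 
function name and the 'output' and 'key1' path components are hypothetical.

```python
import os
import tempfile


def mkdir_under_precreated_file(base_dir):
    """Create 'output' as a file, then try to create a directory beneath it.

    Returns the exception raised by the directory creation, or None if the
    creation unexpectedly succeeds.
    """
    output = os.path.join(base_dir, "output")
    # Stand-in for Pig creating the output location before the storage
    # function runs (hypothetical analogy, not Pig's actual code path).
    with open(output, "w"):
        pass
    try:
        # Stand-in for MultiStorage creating a directory under the output.
        os.makedirs(os.path.join(output, "key1"))
    except OSError as exc:
        return exc
    return None


with tempfile.TemporaryDirectory() as tmp:
    error = mkdir_under_precreated_file(tmp)
    # The exact OSError subclass depends on the platform.
    print(type(error).__name__)
```

The dummy-location workaround avoids this because the pre-created path is 
then different from the directory the storage function writes under.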

Two changes should be made:
1. The path specified in the INTO clause should be made available to the 
storage function, so that it does not need to be repeated as an argument.
2. The creation of the output paths should be delegated to the storage function.
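
One possible shape for the two changes is sketched below in Python. This is an 
illustration under assumptions, not Pig's API: the class 
KeyedDirectoryStorage, its methods set_store_location and prepare_output, and 
the store driver are all hypothetical names. The driver only passes the INTO 
path along, and the storage function decides what to create.

```python
import os
import tempfile


class KeyedDirectoryStorage:
    """Hypothetical storage function that owns creation of its output."""

    def __init__(self):
        self.location = None

    def set_store_location(self, location):
        # Change 1: the caller hands over the INTO path, so it does not
        # have to be repeated as a constructor argument.
        self.location = location

    def prepare_output(self, keys):
        # Change 2: the storage function, not the caller, creates the
        # output layout (here, one directory per key).
        created = []
        for key in keys:
            path = os.path.join(self.location, key)
            os.makedirs(path, exist_ok=True)
            created.append(path)
        return created


def store(into_path, storage, keys):
    """Hypothetical driver: passes the path along and creates nothing itself."""
    storage.set_store_location(into_path)
    return storage.prepare_output(keys)


with tempfile.TemporaryDirectory() as tmp:
    paths = store(os.path.join(tmp, "output"), KeyedDirectoryStorage(), ["a", "b"])
    print(all(os.path.isdir(p) for p in paths))
```

Because the driver never touches the output location, no file is left in the 
way of the per-key directories.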

