[ https://issues.apache.org/jira/browse/PIG-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-1168.
---------------------------------

    Resolution: Won't Fix

This is by design. Dump is meant for interactive use, not batch mode, and as such it is executed right away and not as part of multiquery.

> Dump produces wrong results
> ---------------------------
>
>                 Key: PIG-1168
>                 URL: https://issues.apache.org/jira/browse/PIG-1168
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Ankur
>
> For a map-only job, dump re-executes every Pig Latin statement from the
> beginning, assuming that the statements will produce the same result each
> time. That assumption is not valid if UDFs are invoked. Consider the
> following script:
>
> raw = LOAD '$input' USING PigStorage() AS (text_string:chararray);
> DUMP raw;
> ccm = FOREACH raw GENERATE MyUDF(text_string);
> DUMP ccm;
> bug = FOREACH ccm GENERATE ccmObj;
> DUMP bug;
>
> The UDF MyUDF generates a tuple in which one of the fields is a randomly
> generated UUID. So even though one would expect relations 'ccm' and 'bug' to
> contain identical data, they differ because of re-execution from the
> beginning. This breaks the application logic.
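
The behaviour described in the issue can be illustrated outside Pig. The following is only a rough Python simulation, not Pig code and not how Pig is implemented: `my_udf`, `build_ccm`, and `build_bug` are hypothetical stand-ins for MyUDF and the two FOREACH statements, and the position of the UUID within the tuple is an assumption. It contrasts rebuilding each relation from the start for every dump with computing 'ccm' once and reusing it.

```python
import uuid


def my_udf(text_string):
    # Hypothetical stand-in for MyUDF: returns a tuple whose first field
    # (playing the role of ccmObj) is a freshly generated random UUID.
    return (str(uuid.uuid4()), text_string)


def build_ccm(raw):
    # Mirrors: ccm = FOREACH raw GENERATE MyUDF(text_string);
    return [my_udf(text) for text in raw]


def build_bug(ccm):
    # Mirrors: bug = FOREACH ccm GENERATE ccmObj;
    return [row[0] for row in ccm]


raw = ["a", "b", "c"]

# Re-execution model: each dump rebuilds its relation from the load onward,
# so the UDF runs again and produces new UUIDs for 'bug'.
dump_ccm = build_ccm(raw)
dump_bug = build_bug(build_ccm(raw))
reexecuted_match = [row[0] for row in dump_ccm] == dump_bug

# Single-execution model: 'ccm' is computed once and 'bug' is derived from it.
ccm_once = build_ccm(raw)
bug_once = build_bug(ccm_once)
single_run_match = [row[0] for row in ccm_once] == bug_once

print("re-executed relations agree:", reexecuted_match)
print("single-run relations agree:", single_run_match)
```

In this sketch the re-executed relations disagree because the UUIDs are regenerated, while the single-run relations agree. That matches the resolution above: output that must stay consistent across relations should not depend on separate dump executions of a non-deterministic UDF.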