Dump produces wrong results --------------------------- Key: PIG-1168 URL: https://issues.apache.org/jira/browse/PIG-1168 Project: Pig Issue Type: Bug Reporter: Ankur
For a map-only job, dump just re-executes every pig-latin statement from the begininng assuming that they would produce same result. the assumption is not valid if there are UDFs that are invoked. Consider the following script:- raw = LOAD '$input' USING PigStorage() AS (text_string:chararray); DUMP raw; ccm = FOREACH raw GENERATE MyUDF(text_string); DUMP ccm; bug = FOREACH ccm GENERATE ccmObj; DUMP bug; The UDF MyUDF generates a tuple with one of the fields being a randomly generated UUID. So even though one would expect relations 'ccm' and 'bug' to contain identical data, they are different because of re-execution from the begininng. This breaks the application logic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.