lineage tracking for casting should compare LoadCaster returned from LoadFunc instead of comparing the FuncSpec
----------------------------------------------------------------------------------------------------------------
                 Key: PIG-2023
                 URL: https://issues.apache.org/jira/browse/PIG-2023
             Project: Pig
          Issue Type: Improvement
            Reporter: Thejas M Nair
             Fix For: 0.10

When the lineage of a column is tracked to find the LoadCaster associated with that column, and the column turns out to have two possible sources, a LoadCaster is associated (through a LoadFunc) only if the FuncSpec of the LoadFunc is the same in both cases. But two LoadFuncs with different FuncSpecs may actually use the same LoadCaster (for example, the default, Utf8StorageConverter).

If the LoadFunc FuncSpecs don't match, the LoadCasters returned by the LoadFuncs should also be compared. If they are equal, that LoadCaster should be associated with the column. The LoadCaster implementation would need to override equals().

For example, in this case the columns in relation u use the same LoadCaster -
{code}
l1 = load 'x' using PigStorage(',') as (a,b);
l2 = load 'y' using PigStorage(':') as (a,b);
u = union l1,l2;
{code}
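The proposed check could be sketched roughly as follows. This is only an illustration, not Pig's actual lineage code: Caster, Utf8Caster, BinaryCaster, Loader and resolveCaster are hypothetical stand-ins for LoadCaster, Utf8StorageConverter, some other caster, a LoadFunc with its FuncSpec, and the lineage lookup. It assumes a stateless caster can treat every instance of its class as equal.

```java
public class LoadCasterLineageSketch {

    /** Hypothetical stand-in for Pig's LoadCaster interface. */
    interface Caster {}

    /** Stand-in for a stateless default caster: all instances compare equal. */
    static final class Utf8Caster implements Caster {
        @Override
        public boolean equals(Object o) {
            return o instanceof Utf8Caster;
        }

        @Override
        public int hashCode() {
            return Utf8Caster.class.hashCode();
        }
    }

    /** Stand-in for a different caster; it keeps the default identity equals(). */
    static final class BinaryCaster implements Caster {}

    /** Hypothetical stand-in for a load function: its func spec plus the caster it returns. */
    static final class Loader {
        final String funcSpec;
        final Caster caster;

        Loader(String funcSpec, Caster caster) {
            this.funcSpec = funcSpec;
            this.caster = caster;
        }
    }

    /**
     * Returns the caster to associate with a column that has two possible
     * sources, or null if the sources disagree.
     */
    static Caster resolveCaster(Loader a, Loader b) {
        // Existing behaviour described in the issue: identical func specs.
        if (a.funcSpec.equals(b.funcSpec)) {
            return a.caster;
        }
        // Proposed behaviour: fall back to comparing the casters themselves.
        if (a.caster != null && a.caster.equals(b.caster)) {
            return a.caster;
        }
        return null;
    }

    public static void main(String[] args) {
        Loader l1 = new Loader("PigStorage(',')", new Utf8Caster());
        Loader l2 = new Loader("PigStorage(':')", new Utf8Caster());
        Loader l3 = new Loader("BinStorage()", new BinaryCaster());

        System.out.println(resolveCaster(l1, l2) != null); // true: different func specs, equal casters
        System.out.println(resolveCaster(l1, l3) != null); // false: casters differ
    }
}
```

With the union example from the description, l1 and l2 have different func specs but equal casters, so the column keeps its caster. Without the equals() override the fallback comparison would use object identity and never match.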