[jira] Updated: (PIG-1272) Column pruner causes wrong results
[ https://issues.apache.org/jira/browse/PIG-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1272:

Status: Patch Available (was: Reopened)

Column pruner causes wrong results
--
Key: PIG-1272
URL: https://issues.apache.org/jira/browse/PIG-1272
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Daniel Dai
Fix For: 0.7.0
Attachments: PIG-1272-1.patch, PIG-1272-2.patch

For a simple script the column pruner optimization removes certain columns from the original relation, which results in wrong results.

Input file kv contains the following columns (tab separated)
{code}
a	1
a	2
a	3
b	4
c	5
c	6
b	7
d	8
{code}

Now running this script in Pig 0.6 produces
{code}
kv = load 'kv' as (k,v);
keys = foreach kv generate k;
keys = distinct keys;
keys = limit keys 2;
rejoin = join keys by k, kv by k;
dump rejoin;
{code}
(a,a)
(a,a)
(a,a)
(b,b)
(b,b)

Running this in Pig 0.5 version without column pruner results in:
(a,a,1)
(a,a,2)
(a,a,3)
(b,b,4)
(b,b,7)

When we disable the ColumnPruner optimization it gives right results.

Viraj

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
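To make the expected result in the report concrete, here is a minimal plain-Python sketch of what the script is meant to compute. It models only the script's semantics and says nothing about Pig's internals or the column pruner. The function name `rejoin` and taking the first two distinct keys in sorted order are assumptions, chosen to match the keys `a` and `b` in the reported output (Pig's `limit` without an `order by` does not promise a particular pair).

```python
# Plain-Python model of the script's intended semantics (not Pig internals).
kv = [("a", 1), ("a", 2), ("a", 3), ("b", 4),
      ("c", 5), ("c", 6), ("b", 7), ("d", 8)]


def rejoin(rows, limit=2):
    # keys = foreach kv generate k; keys = distinct keys;
    keys = sorted({k for k, _ in rows})
    # keys = limit keys 2;  (assumption: sorted order, to match the report)
    keys = keys[:limit]
    # rejoin = join keys by k, kv by k;
    # Each output row carries keys.k plus both columns of kv (k and v).
    return [(k, k2, v) for k in keys for (k2, v) in rows if k == k2]


for row in rejoin(kv):
    print(row)
```

Every joined row has three fields because `kv` contributes both `k` and `v`. The two-field rows in the Pig 0.6 output are the same rows with `v` missing.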
[jira] Updated: (PIG-1272) Column pruner causes wrong results
[ https://issues.apache.org/jira/browse/PIG-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1272:

Resolution: Fixed
Status: Resolved (was: Patch Available)

Manual unit test pass.

Column pruner causes wrong results
--
Key: PIG-1272
URL: https://issues.apache.org/jira/browse/PIG-1272
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Daniel Dai
Fix For: 0.7.0
Attachments: PIG-1272-1.patch, PIG-1272-2.patch
[jira] Updated: (PIG-1272) Column pruner causes wrong results
[ https://issues.apache.org/jira/browse/PIG-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1272:

Attachment: PIG-1272-1.patch

Column pruner causes wrong results
--
Key: PIG-1272
URL: https://issues.apache.org/jira/browse/PIG-1272
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Daniel Dai
Fix For: 0.7.0
Attachments: PIG-1272-1.patch