[
https://issues.apache.org/jira/browse/PIG-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279967#comment-13279967
]
Bejoy KS commented on PIG-2713:
-------------------------------
The following script throws the above-mentioned error:
{code}
use_data = LOAD '/userdata/bejoy/samples/pigissue/input1'
as (
clmn_1:int,
clmn_2:int,
clmn_3:chararray,
clmn_4:chararray,
unique_id:chararray,
clmn_6:chararray,
clmn_7:chararray,
clmn_8:chararray,
clmn_9:chararray,
clmn_10:chararray,
clmn_11:chararray,
clmn_12:int,
num_sessions:int,
clmn_14:int,
clmn_15:int
);
good_users = LOAD '/userdata/bejoy/samples/pigissue/input2'
as (
unique_id
);
good_use_data = join use_data by unique_id, good_users by unique_id USING 'merge';
top_grouping = group good_use_data all;
top_users = foreach top_grouping generate TOP($TOP_COUNT, 2, good_use_data);
user_lines = foreach top_users generate flatten($0);
top_data = foreach user_lines generate use_data.unique_id, num_sessions;
store top_data into '/userdata/bejoy/samples/pigissue/output/top_users';
{code}
If the same script is modified so that the join column (unique_id) has a
different name in each relation, it runs without the error.
Modified script:
{code}
.
.
.
good_users = LOAD '/userdata/bejoy/samples/pigissue/input2'
as (
data_2_unique_id
);
good_use_data = join use_data by unique_id, good_users by data_2_unique_id USING 'merge';
top_grouping = group good_use_data all;
top_users = foreach top_grouping generate TOP($TOP_COUNT, 2, good_use_data);
user_lines = foreach top_users generate flatten($0);
top_data = foreach user_lines generate unique_id, num_sessions;
store top_data into '/userdata/bejoy/samples/pigissue/output/top_users';
{code}
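For reference, Pig Latin's disambiguation operator (::) is the usual way to name a field that exists in both inputs of a join. The fragment below is only a sketch and has not been run against the affected versions, so whether it avoids ERROR 1103 is an open question. The aliases joined and projected are made up for illustration.
{code}
-- After the join, both inputs contribute a field called unique_id,
-- so each copy is addressed through its source alias with '::'.
joined = join use_data by unique_id, good_users by unique_id USING 'merge';
-- Keep only the left-hand copy of the key and give it an unambiguous name.
projected = foreach joined generate
    use_data::unique_id as unique_id,
    use_data::num_sessions as num_sessions;
{code}
The first script above writes use_data.unique_id with a dot, which is a different construct from the :: form shown here.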
> Pig query planner throwing parse error on Joins
> ------------------------------------------------
>
> Key: PIG-2713
> URL: https://issues.apache.org/jira/browse/PIG-2713
> Project: Pig
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.8.1, 0.9.2
> Environment: CentOS 6
> Reporter: Bejoy KS
>
> The Pig parser throws an exception when two columns in a table have the same
> name and are used in a projection operation after a join.
> Error message:
> ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, filter and Load
> as its predecessor. Found :
> The error is thrown for a common join as well.