Dipankar created PIG-4541:
-----------------------------
Summary: Skewed full outer join does not return records if any
relation is empty.
Key: PIG-4541
URL: https://issues.apache.org/jira/browse/PIG-4541
Project: Pig
Issue Type: Bug
Components: build
Affects Versions: 0.14.0
Environment: HDP 2.2.4
Reporter: Dipankar
Test1:
Perform full join on two relation with left relation being blank and right
containing records
empty_relation = FILTER a_relation by (join_column=='eliminate everything');
Test_output = JOIN empty_relation by (join_column) FULL , non_empty_relation by
(join_column);
Result : Zero records returned.
Test2:
Perform full join on two relation with left relation being blank and right
containing records using skewed
Test_output = JOIN empty_relation by (join_column) FULL , non_empty_relation by
(join_column) using ‘skewed’;
Result : Zero records returned.
Test3:
Perform full join on two relation with left relation being blank and right
containing records using parallel
Test_output = JOIN empty_relation by (join_column) FULL , non_empty_relation by
(join_column) PARALLEL 10;
Result : Zero records returned.
Test4:
Perform full join on two relation with left relation being non empty and right
not containing records using parallel
Test_output = JOIN , non_empty_relation by (join_column) FULL , empty_relation
by (join_column) PARALLEL 10;
Result : valid records returned.
Observation:
1) If the either relation is blank , skewed full outer join does not return
anything
2) If the non empty relation is kept on left, everything works except skewed
3) FULL OUTER will only work if the left relation is not empty
4) Skewed will only work if both relation is non empty.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)