[jira] Commented: (HIVE-1213) sort-merge join does not work for sub-queries

Namit Jain (JIRA) Fri, 05 Mar 2010 18:07:53 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842153#action_12842153
 ]


Namit Jain commented on HIVE-1213:
----------------------------------


set hive.enforce.bucketing = true;
set hive.enforce.sorting = true;

drop table bucketmapjoin6_1;
CREATE TABLE bucketmapjoin6_1(key int, value string) CLUSTERED BY (key) SORTED 
BY (key) INTO 2 BUCKETS;

drop table bucketmapjoin6_2;
CREATE TABLE bucketmapjoin6_2(key int, value string) CLUSTERED BY (key) SORTED 
BY (key) INTO 2 BUCKETS;

drop table bucketmapjoin6_3;
CREATE TABLE bucketmapjoin6_3(key int, value string) CLUSTERED BY (key) SORTED 
BY (key) INTO 2 BUCKETS;

insert overwrite table bucketmapjoin6_1
select * from src where key < 100 and key != 44 and key != 28 and key != 15;

select * from bucketmapjoin6_1;


insert overwrite table bucketmapjoin6_2
select * from src where key > 20 and key < 50;

select * from bucketmapjoin6_2;

insert overwrite table bucketmapjoin6_3
select * from src where key > 10 and key < 30;

select * from bucketmapjoin6_3;

explain
select * from
(
select
  CASE
    WHEN M.key is not null THEN M.key
    ELSE S.key
  END as key,
  M.value as Mvalue, S.value as Svalue from
(select * from bucketmapjoin6_1 where (key <= 20 or key >= 50) ) M full outer 
join bucketmapjoin6_2 S on
M.key = S.key
) subq2
full outer join
bucketmapjoin6_3 L
on subq2.key = L.key;


set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;

explain
select /*+ MAPJOIN(L) */ * from
(
select /*+ MAPJOIN(S) */
  CASE
    WHEN M.key is not null THEN M.key
    ELSE S.key
  END as key,
  M.value as Mvalue, S.value as Svalue from
(select * from bucketmapjoin6_1 where (key <= 20 or key >= 50) ) M full outer 
join bucketmapjoin6_2 S on
M.key = S.key
) subq2
full outer join
bucketmapjoin6_3 L
on subq2.key = L.key;

set hive.optimize.bucketmapjoin = false;
select * from
(
select
  CASE
    WHEN M.key is not null THEN M.key
    ELSE S.key
  END as key,
  M.value as Mvalue, S.value as Svalue from
(select * from bucketmapjoin6_1 where (key <= 20 or key >= 50) ) M full outer 
join bucketmapjoin6_2 S on
M.key = S.key
) subq2
full outer join
bucketmapjoin6_3 L
on subq2.key = L.key;

---

The mapjoin query should be executed as a single map-only job

> sort-merge join does not work for sub-queries
> ---------------------------------------------
>
>                 Key: HIVE-1213
>                 URL: https://issues.apache.org/jira/browse/HIVE-1213
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>
>
> A query like:
> select count(1) from (select /*+ MAPJOIN(x) */ from x join y ON ... ) subq;
> does not work - since there is no mapping between the join operator and the 
> corresponding source

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1213) sort-merge join does not work for sub-queries

Reply via email to