Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19430 )

Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table
......................................................................


Patch Set 13:

> Patch Set 13:
>
> (1 comment)
>
> > Patch Set 13:
> >
> > (1 comment)

The problem is that it would not be practical to check the block locations for 
potential relocations during query planning. Given N blocks in a bucket for one 
table and M blocks for the second table, it would take O(N+M) time to decide 
which distribution method to use. This cost would add up with the number of 
joins in the query. We really want to 'pin' the block locations, but AFAIK 
HDFS does not allow us to do that. Other systems that do bucket joins, such as 
MemSQL, don't have to worry about this since their data is memory resident.
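
To make the cost concrete, here is a rough sketch of what such a planning-time 
check might look like. It is illustrative Python, not Impala planner code: the 
Block model and the function names are hypothetical, and it assumes each block 
is described only by the set of hosts that currently hold a replica.

```python
from typing import List, Set

# Hypothetical model: a block is the set of hosts holding one of its replicas.
Block = Set[str]


def bucket_is_local(blocks: List[Block], host: str) -> bool:
    """Return True if every block of the bucket still has a replica on `host`."""
    return all(host in replicas for replicas in blocks)


def can_use_bucket_shuffle(left: List[Block], right: List[Block], host: str) -> bool:
    """Check one pair of matching buckets.

    In the worst case this visits all N left blocks and all M right blocks,
    i.e. O(N + M) work for a single bucket pair of a single join.
    """
    return bucket_is_local(left, host) and bucket_is_local(right, host)
```

The check itself is trivial. The concern is that it would have to run against 
fresh block locations for every bucket pair of every join being planned.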


--
To view, visit http://gerrit.cloudera.org:8080/19430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316
Gerrit-Change-Number: 19430
Gerrit-PatchSet: 13
Gerrit-Owner: Baike Xia <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Baike Xia <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 22 Feb 2023 08:21:43 +0000
Gerrit-HasComments: No
