Re: Review Request 64688: HIVE-18218

Deepak Jaiswal Fri, 09 Feb 2018 20:33:53 -0800


> On Feb. 10, 2018, 2:44 a.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java
> > Line 548 (original), 579 (patched)
> > <https://reviews.apache.org/r/64688/diff/3/?file=1955847#file1955847line580>
> >
> >     If a bucket file is missing in the list of files, then bucketNum < 
> > numBuckets .. so this will trigger the fallback loop below?


Yes. It is part of the fallback logic, which is essentially the existing logic 
and works with existing user data.

If a file is missing, the join is incorrect, just as it is today. With the 
current naming convention it is not possible to identify a file by its 
name.
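
To illustrate why the file name format matters here, below is a minimal sketch 
of recovering a bucket id from names like 000000_0 or 000001_0_copy_1. It is 
not the code in the patch: the class name, method name, and regular expression 
are assumptions made up for this example. A name that does not match yields no 
bucket id, which is the case where only a positional fallback is possible.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Hive or of this patch.
final class BucketNameSketch {

    // Assumed format: <bucket digits>_<attempt digits>, optionally followed by
    // _copy_<n>. Examples from the review description: 000000_0, 000001_0_copy_1.
    private static final Pattern BUCKET_FILE =
            Pattern.compile("^(\\d+)_(\\d+)(_copy_\\d+)?$");

    // Returns the bucket id encoded in the file name, or empty if the name
    // does not follow the assumed format.
    static OptionalInt bucketIdFromName(String fileName) {
        Matcher m = BUCKET_FILE.matcher(fileName);
        if (!m.matches()) {
            return OptionalInt.empty();
        }
        try {
            // Leading zeros are accepted by parseInt: "000001" parses to 1.
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            // The digit run is too long to fit in an int.
            return OptionalInt.empty();
        }
    }
}
```

With names in this form, a missing bucket file can be detected as a gap in the 
parsed ids instead of silently shifting every later file by one position.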


- Deepak


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64688/#review197216
-----------------------------------------------------------


On Feb. 10, 2018, 12:41 a.m., Deepak Jaiswal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64688/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2018, 12:41 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Bucket based Join : Handle buckets with no splits.
> 
> The current logic in CustomPartitionVertex assumes that there is a split for 
> each bucket, whereas in Tez there can be no splits for empty buckets.
> It also falls back to a reduce-side join if the small table has more buckets 
> than the big table.
> 
> Disallow loading files in bucketed tables if the file name format is not like 
> 000000_0, 000001_0_copy_1 etc.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
> 26afe90faa 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java 
> ef5e7edcd6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 9885038588 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
> dc698c8de8 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 54f5bab6de 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_16.q 8216b538c2 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_16.q.out 
> 91408df129 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_16.q.out_spark 
> 91408df129 
> 
> 
> Diff: https://reviews.apache.org/r/64688/diff/4/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>
