-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63427/#review189657
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Line 1417 (original), 1417 (patched)
<https://reviews.apache.org/r/63427/#comment266862>

    If we're going to use this setting for tuning, it is worth considering how 
people will reason about it. I think a more intuitive framing is the 
selectivity we want from the semijoin reduction before we decide it is worth 
keeping, i.e. a float value between 0 and 1. For example, you would set the 
config to 0.5, whereas with what you have now I think you would set it to 2.0.
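
    To illustrate the two framings, here is a minimal sketch. It assumes the 
patched check keeps a semijoin branch when nDVs * factor < nDVsOfTS; the class 
and method names are hypothetical and are not taken from the patch.

```java
// Hypothetical sketch: none of these names come from TezCompiler.
class SemijoinThresholdSketch {

    // Assumed form of the current patch: a multiplicative factor such as 2.0.
    static boolean keepByFactor(long nDVs, long nDVsOfTS, double factor) {
        return nDVs * factor < nDVsOfTS;
    }

    // Suggested form: the maximum selectivity (between 0 and 1) the semijoin
    // reduction may have and still be considered worth keeping.
    static boolean keepBySelectivity(long nDVs, long nDVsOfTS, double maxSelectivity) {
        return nDVs < nDVsOfTS * maxSelectivity;
    }

    public static void main(String[] args) {
        long nDVsOfTS = 100;
        for (long nDVs : new long[] {40, 50, 60}) {
            System.out.println(nDVs + " of " + nDVsOfTS
                + ": factor 2.0 keeps=" + keepByFactor(nDVs, nDVsOfTS, 2.0)
                + ", selectivity 0.5 keeps=" + keepBySelectivity(nDVs, nDVsOfTS, 0.5));
        }
    }
}
```

    Under that assumption, a factor of 2.0 and a selectivity of 0.5 make the 
same decision; only the way the setting is expressed changes.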



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Lines 1419 (patched)
<https://reviews.apache.org/r/63427/#comment266860>

    Some concerns about the long-to-float conversions going on here: nDVs is 
cast to float and then multiplied by 1.0, and nDVsOfTS is also converted to 
float during this comparison. This could affect the comparison results. Maybe 
cast nDVsFactored to long when doing the comparison, to be safe?
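
    A minimal sketch of the concern follows. The method names and the direction 
of the comparison are assumptions for illustration, not the patched code. In 
Java, comparing a float with a long promotes the long to float, and float can 
only represent integers exactly up to 2^24.

```java
// Hypothetical sketch: names and comparison direction are assumptions.
class NdvComparisonSketch {

    // Both operands end up as float: nDVs * factor is a float, and nDVsOfTS
    // is promoted to float by the comparison operator.
    static boolean lessAsFloat(long nDVs, float factor, long nDVsOfTS) {
        float nDVsFactored = nDVs * factor;
        return nDVsFactored < nDVsOfTS;
    }

    // Casting the factored value back to long keeps nDVsOfTS exact, so the
    // comparison itself is done on longs.
    static boolean lessAsLong(long nDVs, float factor, long nDVsOfTS) {
        long nDVsFactored = (long) (nDVs * factor);
        return nDVsFactored < nDVsOfTS;
    }

    public static void main(String[] args) {
        long nDVs = 16_777_216L;      // 2^24, exactly representable as float
        long nDVsOfTS = 16_777_217L;  // 2^24 + 1, rounds to 2^24 as float
        System.out.println("float comparison: " + lessAsFloat(nDVs, 1.0f, nDVsOfTS));
        System.out.println("long comparison:  " + lessAsLong(nDVs, 1.0f, nDVsOfTS));
    }
}
```

    Note that the cast only removes the second conversion; the multiplication 
itself is still done in float, so very large nDVs values can lose precision 
there as well.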



ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
Lines 1421 (patched)
<https://reviews.apache.org/r/63427/#comment266863>

    If you are logging here, mention that setShouldRemove is being set. That 
would be more useful for someone looking at the logs.
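
    Something along these lines, as a rough sketch; the helper name, the 
branch label and the message wording are all hypothetical.

```java
// Hypothetical sketch: the helper and its parameters are illustrative only.
class RemovalLogSketch {

    // Builds a log message that states the action taken (setShouldRemove)
    // together with the values that led to it.
    static String removalMessage(String branchLabel, long nDVs, long nDVsOfTS) {
        return String.format(
            "Calling setShouldRemove(true) for semijoin branch %s: nDVs=%d, nDVsOfTS=%d",
            branchLabel, nDVs, nDVsOfTS);
    }

    public static void main(String[] args) {
        System.out.println(removalMessage("RS_12", 95L, 100L));
    }
}
```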


- Jason Dere


On Oct. 30, 2017, 7:09 p.m., Deepak Jaiswal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63427/
> -----------------------------------------------------------
> 
> (Updated Oct. 30, 2017, 7:09 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin 
> branches
> 
> In method markSemiJoinForDPP (HIVE-17399), the nDVs comparison should not 
> include equality: the values can be the same on both sides, and the branch 
> is then still marked as good when it shouldn't be.
> Add a configurable factor to see how useful this is if the nDVs on the 
> smaller side are only slightly less than those on the TS side.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 6631a6e45d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java da30c3b642 
>   ql/src/test/queries/clientpositive/dynamic_semijoin_reduction.q 6cc0a7f7a9 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 
> 1a1a4d9b2d 
> 
> 
> Diff: https://reviews.apache.org/r/63427/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>
