> On Oct. 30, 2017, 11:19 p.m., Jason Dere wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java > > Line 1417 (original), 1417 (patched) > > <https://reviews.apache.org/r/63427/diff/1/?file=1872832#file1872832line1417> > > > > Trying to think about how to think about this setting if we're going to > > use this for tuning. I think a better way of being able to think about this > > setting is, what kind of selectivity we want from the semijoin reduction > > before we decide it is worth keeping. For me this setting might be a bit > > more intuitive (basically float value between 0-1) - for example setting > > config to 0.5, compared to what you have now, where I think you would set > > it to 2.0.
Thanks. I will go with this approach. > On Oct. 30, 2017, 11:19 p.m., Jason Dere wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java > > Lines 1419 (patched) > > <https://reviews.apache.org/r/63427/diff/1/?file=1872832#file1872832line1419> > > > > Some concerns about long-to-float conversion going on here .. nDVs is > > cast to float then multiplied to 1.0, and nDVsOfTS is also converted to > > float during this comparison. This could affect the comparisoin results. > > Maybe cast nDVsFactored to long when doing the comparision to be safe? Will cast to long. > On Oct. 30, 2017, 11:19 p.m., Jason Dere wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java > > Lines 1421 (patched) > > <https://reviews.apache.org/r/63427/diff/1/?file=1872832#file1872832line1421> > > > > If you are logging here, mention that setShouldRemove is being set. > > Would be more useful for someone looking at the logs. Thanks for the suggestion. - Deepak ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/63427/#review189657 ----------------------------------------------------------- On Oct. 30, 2017, 7:09 p.m., Deepak Jaiswal wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/63427/ > ----------------------------------------------------------- > > (Updated Oct. 30, 2017, 7:09 p.m.) > > > Review request for hive, Ashutosh Chauhan and Jason Dere. > > > Repository: hive-git > > > Description > ------- > > Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin > branches > > In method markSemiJoinForDPP (HIVE-17399), the nDVs comparison should not > have equality as there is a chance that the values are same on both sides and > the branch is still marked as good when it shouldn't be. > Add a configurable factor to see how useful this is if nDVs on smaller side > are only slightly less than that on TS side. > > > Diffs > ----- > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 6631a6e45d > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java da30c3b642 > ql/src/test/queries/clientpositive/dynamic_semijoin_reduction.q 6cc0a7f7a9 > ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out > 1a1a4d9b2d > > > Diff: https://reviews.apache.org/r/63427/diff/1/ > > > Testing > ------- > > > Thanks, > > Deepak Jaiswal > >