> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java
> > Lines 24 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951914#file1951914line24>
> >
> > Add: Intended only for compilation phase.

added


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java
> > Lines 259 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951916#file1951916line259>
> >
> > Is this needed?

unfortunately yes; the following makes it needed:

* we are currently using the mapreduce "old" api
* [MapRunner#map](https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapRunner.java#L54) only passes the OutputCollector if there is *at least* one input
* [ExecMapper](https://github.com/apache/hive/blob/f33db1f68c68b552b9888988f818c03879749461/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java#L144) sets the OC correctly as soon as the first record arrives
* note: it's interesting that the ReduceSink needs the OutputCollector to pass the output...but it can "silently" ignore records if the OC is unset [here](https://github.com/apache/hive/blob/f33db1f68c68b552b9888988f818c03879749461/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L500) - not sure whether this has already hidden bugs or not...
* ExecMapRunner only adds the ability to set the OC in case there are 0 rows - and it enables closeOp()-s to emit records if they have to


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java
> > Lines 29 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951917#file1951917line29>
> >
> > Why do we need this class?

see previous comment


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
> > Line 247 (original), 246-248 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951919#file1951919line247>
> >
> > This may result in extra memory allocation. If this change is not
> > necessary, can we leave it as is?

this was some leftover from an earlier version of the patch...removed


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java
> > Lines 453 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951920#file1951920line453>
> >
> > Please add comment.

added: in case the empty grouping set is present, but no output has been produced, the "summary row" still needs to be emitted


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
> > Lines 580 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951921#file1951921line581>
> >
> > Add comment on need for this.

I've added:
If there are no inputs, the execution engine skips the operator tree.
To prevent this from happening, an opaque ZeroRows input is added here when needed.


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java
> > Lines 213 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951925#file1951925line219>
> >
> > Do we need this? Reducers are always launched when there is no
> > mapper. So, this seems unnecessary.

I can remove it...but I think IConfigureJobConf can be used as a general way to make modifications like this - in the current case this setting has effectively no effect on the reducers...they are being run anyway

I think that after this patch the code from PlanUtils.configureJobConf could be moved to FileSinkOperator by using the new interface

I feel that core constructs like ReduceWork should depend less on explicit operator implementations and more on interfaces if possible.


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65479/#review196840
-----------------------------------------------------------


On Feb. 2, 2018, 12:23 p.m., Zoltan Haindrich wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65479/
> -----------------------------------------------------------
>
> (Updated Feb. 2, 2018, 12:23 p.m.)
>
>
> Review request for hive, Ashutosh Chauhan and Prasanth_J.
>
>
> Bugs: HIVE-18523
>     https://issues.apache.org/jira/browse/HIVE-18523
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> * ensure that mapper operators are started up - but only if empty grouping is present
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 6a0f0de2a5e84770c6446af41710d972d813c7bc
>   ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7b3e4b2fd3ee1a8e2795095a6c55442de2b38e0
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 976b537033abda5d5ab8b77a7e7d6fb9c84e5a19
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 150382a8d58fd4ba44e4d9b78a80173ab984e776
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 98f4bc01c8526422348a38f8d8632e0899d695ee
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 45d809a1820fcb6ea5e1e5c15aee7de91a4c36c8
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java e4dfc009d95f4302bd1fcdff2276e11bed68d2e0
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java c3b846c4d2fee8691b4952b9f6cf4dd1d8bd632f
>   ql/src/java/org/apache/hadoop/hive/ql/io/NullRowsInputFormat.java 6a372a3f47e3ac2ae2b2e583541b3a19e5d525f3
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java f2b2fc57a03b368707968eb503139e51218008ca
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java ecfb118b41bfa5b7d593b7e801a37f0a7b5b0b5e
>   ql/src/test/queries/clientpositive/groupby_rollup_empty.q 432d8c448a05f51db9ecf9940bce599dfd598a70
>   ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 7359140e29fc63eebbab42ab385187be6bfc66e1
>   ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out d2b57455a3640387d8bc5f2d415a7af25eb55341
>
>
> Diff: https://reviews.apache.org/r/65479/diff/1/
>
>
> Testing
> -------
>
> added new testcase for union
>
>
> Thanks,
>
> Zoltan Haindrich
>
>
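
For reference, the zero-row problem from the ExecDriver reply above can be illustrated with a small standalone sketch. This is not the code from the patch and it does not use the Hadoop classes: `Collector`, `CountingMapper`, `runPlain` and `runWithEagerCollector` are hypothetical stand-ins for OutputCollector, ExecMapper, the old-API MapRunner loop and an ExecMapRunner-style runner. It only shows why a mapper that learns about its collector through map() cannot emit a summary row on close when the input is empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ZeroRowSketch {
    /** Hypothetical, simplified stand-in for an output collector. */
    interface Collector {
        void collect(String row);
    }

    /** Hypothetical mapper: counts rows and emits one summary row on close. */
    static class CountingMapper {
        private Collector oc; // stays null until a collector is handed over
        private long count = 0;

        void setCollector(Collector oc) {
            this.oc = oc;
        }

        void map(String row, Collector oc) {
            if (this.oc == null) {
                this.oc = oc; // the first record is what wires up the collector
            }
            count++;
        }

        void close() {
            // Output is silently dropped when no collector was ever set.
            if (oc != null) {
                oc.collect("count=" + count);
            }
        }
    }

    /** Plain runner: the collector reaches the mapper only through map(). */
    static List<String> runPlain(Iterator<String> input) {
        List<String> out = new ArrayList<>();
        CountingMapper mapper = new CountingMapper();
        while (input.hasNext()) {
            mapper.map(input.next(), out::add);
        }
        mapper.close();
        return out;
    }

    /** Eager runner: sets the collector before reading any input. */
    static List<String> runWithEagerCollector(Iterator<String> input) {
        List<String> out = new ArrayList<>();
        CountingMapper mapper = new CountingMapper();
        mapper.setCollector(out::add);
        while (input.hasNext()) {
            mapper.map(input.next(), out::add);
        }
        mapper.close();
        return out;
    }

    public static void main(String[] args) {
        List<String> none = new ArrayList<>();
        System.out.println(runPlain(none.iterator()));              // prints []
        System.out.println(runWithEagerCollector(none.iterator())); // prints [count=0]
        System.out.println(runPlain(Arrays.asList("a", "b").iterator())); // prints [count=2]
    }
}
```

With at least one input row both runners behave the same; they differ only for empty input, which is the case the empty grouping set needs.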