> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java
> > Lines 24 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951914#file1951914line24>
> >
> >     Add: Intended only for compilation phase.

added


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java
> > Lines 259 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951916#file1951916line259>
> >
> >     Is this needed?

Unfortunately yes; the following makes it necessary:

* we are currently using the "old" mapreduce API
* [MapRunner#map](https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapRunner.java#L54)
 only passes the OutputCollector to the mapper if there is *at least* one input record
* [ExecMapper](https://github.com/apache/hive/blob/f33db1f68c68b552b9888988f818c03879749461/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java#L144)
 sets the OC correctly, but only at the moment the first record arrives
* note: it's interesting that the ReduceSink needs the OutputCollector to pass 
on its output...yet it can "silently" ignore a record if the OC is unset 
[here](https://github.com/apache/hive/blob/f33db1f68c68b552b9888988f818c03879749461/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L500)
 - not sure whether this has already hidden bugs or not...
* ExecMapRunner only adds the ability to set the OC when there are 0 rows, 
which enables closeOp() calls to emit records if they have to
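
To illustrate the points above, here is a minimal sketch of the zero-row problem. All types in it (`Collector`, `SummaryMapper`, the two `run...` methods) are hypothetical stand-ins written for this example; they are not the Hadoop MapRunner or the Hive ExecMapper/ExecMapRunner, and only mimic the behaviour described in the list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins only; NOT the real Hadoop or Hive classes.
class ZeroRowRunnerSketch {

  /** Stand-in for an output collector. */
  interface Collector {
    void collect(String record);
  }

  /** Mimics an old-API mapper that learns its collector through map(). */
  static class SummaryMapper {
    private Collector oc; // stays null until somebody provides it
    private long rows;

    void setCollector(Collector oc) {
      this.oc = oc;
    }

    void map(String row, Collector oc) {
      if (this.oc == null) {
        this.oc = oc; // first record: remember the collector
      }
      rows++;
    }

    /** Plays the role of a closeOp() that has to emit a summary record. */
    void close() {
      if (oc != null) {
        oc.collect("count=" + rows);
      }
      // With an unset collector the record is silently dropped.
    }
  }

  /** Default-style runner: the collector is only handed over inside map(). */
  static List<String> runDefault(Iterator<String> input) {
    List<String> out = new ArrayList<>();
    SummaryMapper mapper = new SummaryMapper();
    while (input.hasNext()) {
      mapper.map(input.next(), out::add);
    }
    mapper.close();
    return out;
  }

  /** Zero-row-aware runner: sets the collector before reading any input. */
  static List<String> runZeroRowAware(Iterator<String> input) {
    List<String> out = new ArrayList<>();
    SummaryMapper mapper = new SummaryMapper();
    mapper.setCollector(out::add);
    while (input.hasNext()) {
      mapper.map(input.next(), out::add);
    }
    mapper.close();
    return out;
  }
}
```

With an empty iterator, `runDefault` returns an empty list because the summary record is lost, while `runZeroRowAware` returns a single `count=0` record. With one or more rows both runners behave the same.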


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java
> > Lines 29 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951917#file1951917line29>
> >
> >     Why do we need this class?

see previous comment


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
> > Line 247 (original), 246-248 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951919#file1951919line247>
> >
> >     This may result in extra memory allocation. If this change is not 
> > necessary, can we leave it as is?

this was some leftover from an earlier version of the patch...removed


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java
> > Lines 453 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951920#file1951920line453>
> >
> >     Please add comment.

added:

       in case the empty grouping set is present, but no output has been
       produced, the "summary row" still needs to be emitted
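
As a rough illustration of that rule, the sketch below counts rows per key and emits a grand-total row at close time when nothing else was produced. The class, the method and the `<all>` marker are made up for this example; this is not the VectorGroupByOperator code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the VectorGroupByOperator implementation.
class SummaryRowSketch {

  // Made-up marker for the empty grouping set "()" (the grand total).
  static final String GRAND_TOTAL = "<all>";

  static List<String> countByKey(List<String> keys, boolean emptyGroupingSet) {
    Map<String, Long> counts = new LinkedHashMap<>();
    for (String key : keys) {
      counts.merge(key, 1L, Long::sum);
      if (emptyGroupingSet) {
        counts.merge(GRAND_TOTAL, 1L, Long::sum);
      }
    }

    // Flush whatever was aggregated.
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Long> e : counts.entrySet()) {
      out.add(e.getKey() + "=" + e.getValue());
    }

    // Close-time rule: with the empty grouping set present and no output
    // so far, the summary row still has to be emitted.
    if (emptyGroupingSet && out.isEmpty()) {
      out.add(GRAND_TOTAL + "=0");
    }
    return out;
  }
}
```

For an empty input the method returns one `<all>=0` row when the empty grouping set is requested, and no rows otherwise.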


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
> > Lines 580 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951921#file1951921line581>
> >
> >     Add comment on need for this.

I've added:

If there are no inputs, the execution engine skips the operator tree.
To prevent that from happening, an opaque ZeroRows input is added here when 
needed.
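
The idea can be sketched roughly as follows. The `Split` class, the `planSplits` method and the `<zero-rows>` path are assumptions made for this illustration only; the actual change lives in CombineHiveInputFormat and is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the CombineHiveInputFormat code.
class ZeroRowsSplitSketch {

  /** Made-up stand-in for an input split. */
  static final class Split {
    final String path;
    final long rows;

    Split(String path, long rows) {
      this.path = path;
      this.rows = rows;
    }
  }

  /**
   * Returns the real splits, plus one placeholder split carrying zero rows
   * when there is no input but the operator tree still has to run.
   */
  static List<Split> planSplits(List<Split> realSplits, boolean operatorTreeMustRun) {
    List<Split> result = new ArrayList<>(realSplits);
    if (result.isEmpty() && operatorTreeMustRun) {
      result.add(new Split("<zero-rows>", 0L));
    }
    return result;
  }
}
```

The placeholder gives the engine one task to start, so the operators are initialized and closed even though they never see a row.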


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java
> > Lines 213 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951925#file1951925line219>
> >
> >     Do we need this? Reducers are always launched when there is no 
> > mapper. So, this seems unnecessary.

I can remove it, but I think IConfigureJobConf can serve as a general way to 
make modifications like this. In the current case the setting has effectively 
no effect on the reducers, since they are run anyway.

I think that after this patch the code from PlanUtils.configureJobConf could be 
moved to FileSinkOperator by using the new interface.
I feel that core constructs like ReduceWork should depend less on explicit 
operator implementations and more on interfaces where possible.
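
A rough sketch of what I mean by depending on an interface rather than on concrete operators. Every name here (`ConfiguresJob`, `FileSinkLike`, `SelectLike`, the `sketch.output.needed` key) is invented for the example, and a plain `Map` stands in for the job configuration; this is not the actual IConfigureJobConf signature.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the names below are not Hive APIs.
class JobConfHookSketch {

  /** Stand-in for an "I want to adjust the job configuration" interface. */
  interface ConfiguresJob {
    void configureJob(Map<String, String> conf);
  }

  /** Stand-in for an operator in the plan. */
  interface Operator {
    String name();
  }

  /** An operator that opts in to configuring the job. */
  static class FileSinkLike implements Operator, ConfiguresJob {
    public String name() {
      return "FS";
    }

    public void configureJob(Map<String, String> conf) {
      conf.put("sketch.output.needed", "true"); // made-up key
    }
  }

  /** An operator with nothing to configure. */
  static class SelectLike implements Operator {
    public String name() {
      return "SEL";
    }
  }

  /** The plan node only knows the interface, not the operator classes. */
  static Map<String, String> configure(List<Operator> operators) {
    Map<String, String> conf = new HashMap<>();
    for (Operator op : operators) {
      if (op instanceof ConfiguresJob) {
        ((ConfiguresJob) op).configureJob(conf);
      }
    }
    return conf;
  }
}
```

The walker never names `FileSinkLike`, so adding another operator that needs job-level settings would not require touching the plan class.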


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65479/#review196840
-----------------------------------------------------------


On Feb. 2, 2018, 12:23 p.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65479/
> -----------------------------------------------------------
> 
> (Updated Feb. 2, 2018, 12:23 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Prasanth_J.
> 
> 
> Bugs: HIVE-18523
>     https://issues.apache.org/jira/browse/HIVE-18523
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> * ensure that mapper operators are started up - but only if empty grouping is 
> present
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 
> 6a0f0de2a5e84770c6446af41710d972d813c7bc 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> d7b3e4b2fd3ee1a8e2795095a6c55442de2b38e0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
> 976b537033abda5d5ab8b77a7e7d6fb9c84e5a19 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 
> 150382a8d58fd4ba44e4d9b78a80173ab984e776 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
> 98f4bc01c8526422348a38f8d8632e0899d695ee 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
> 45d809a1820fcb6ea5e1e5c15aee7de91a4c36c8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
> e4dfc009d95f4302bd1fcdff2276e11bed68d2e0 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
> c3b846c4d2fee8691b4952b9f6cf4dd1d8bd632f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/NullRowsInputFormat.java 
> 6a372a3f47e3ac2ae2b2e583541b3a19e5d525f3 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 
> f2b2fc57a03b368707968eb503139e51218008ca 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java 
> ecfb118b41bfa5b7d593b7e801a37f0a7b5b0b5e 
>   ql/src/test/queries/clientpositive/groupby_rollup_empty.q 
> 432d8c448a05f51db9ecf9940bce599dfd598a70 
>   ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 
> 7359140e29fc63eebbab42ab385187be6bfc66e1 
>   ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out 
> d2b57455a3640387d8bc5f2d415a7af25eb55341 
> 
> 
> Diff: https://reviews.apache.org/r/65479/diff/1/
> 
> 
> Testing
> -------
> 
> added new testcase for union
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>
