Github user heary-cao commented on the issue:
https://github.com/apache/spark/pull/18725
@viirya @baibaichen
Thank you for reviewing it.
I made a comparison test:
```
select k,k,sum(id) from (select d004 as id, floor(c010 * 10000) as k,
ceil(c010) as cceila from XXX_table) a
group by k,k;
```
WholeStageCodegen subtrees:
```
*HashAggregate(keys=[k#2L], functions=[partial_sum(cast(id#1 as bigint))], output=[k#2L, sum#399L])
+- *Project [d004#206 AS id#1, FLOOR((c010#215 * 10000.0)) AS k#2L]
   +- HiveTableScan [d004#206, c010#215], MetastoreRelation XXX_database, XXX_Table
== Subtree 2 / 2 ==
*HashAggregate(keys=[k#2L], functions=[sum(cast(id#1 as bigint))], output=[k#2L, k#2L, sum(id)#396L])
+- Exchange hashpartitioning(k#2L, 200)
   +- *HashAggregate(keys=[k#2L], functions=[partial_sum(cast(id#1 as bigint))], output=[k#2L, sum#399L])
      +- *Project [d004#206 AS id#1, FLOOR((c010#215 * 10000.0)) AS k#2L]
         +- HiveTableScan [d004#206, c010#215], MetastoreRelation XXX_database, XXX_Table
```
`ReaderImpl: Reading ORC rows from
hdfs://opena:8020/.../p_date=2017-05-25/p_hour=10/part-00009 with {include:
[true, false, true, false, false, false, false, false, false, false, false,
true, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false, false,
false, false, false, false, false, false, false, false, false, false], offset:
0, length: 9223372036854775807}`
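The `include` array in that log line is long, so a small helper makes it easier to see which entries are actually `true`. The following is only a sketch and was not part of the original test: the function name `included_column_ids` and the shortened `sample` line are hypothetical, and reading index 0 as the ORC root struct (rather than a data column) is my assumption about how the reader numbers columns.

```python
import re
from typing import List


def included_column_ids(log_line: str) -> List[int]:
    """Return the positions flagged `true` in an ORC reader `include: [...]` list."""
    match = re.search(r"include:\s*\[([^\]]*)\]", log_line)
    if match is None:
        raise ValueError("no include list found in log line")
    # Split on commas and drop any whitespace left over from line wrapping.
    flags = [tok.strip() for tok in match.group(1).split(",") if tok.strip()]
    return [i for i, flag in enumerate(flags) if flag == "true"]


# Hypothetical, shortened stand-in for the log line above (same leading flags).
sample = (
    "Reading ORC rows from <path> with {include: [true, false, true, "
    "false, false, false, false, false, false, false, false, true, false], "
    "offset: 0, length: 9223372036854775807}"
)
print(included_column_ids(sample))  # [0, 2, 11]
```

If the assumption about index 0 holds, this would mean only two data columns are read, which matches the two attributes (`d004`, `c010`) listed in the `HiveTableScan` node.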
And the performance comparison: 557s vs. 5997s.
Currently, I am trying to handle this particular scenario by splitting it into two
Projects.