GitHub user cenyuhai commented on the issue:
https://github.com/apache/spark/pull/19301
My case:
```sql
select dt,
       geohash_of_latlng,
       sum(mt_cnt),
       sum(ele_cnt),
       round(sum(mt_cnt) * 1.0 * 100 / sum(mt_cnt_all), 2),
       round(sum(ele_cnt) * 1.0 * 100 / sum(ele_cnt_all), 2)
from temp.test_geohash_match_parquet
group by dt, geohash_of_latlng
order by dt, geohash_of_latlng
limit 10;
```
Before your fix:
```text
TakeOrderedAndProject(limit=10, orderBy=[dt#502 ASC NULLS FIRST,geohash_of_latlng#507 ASC NULLS FIRST], output=[dt#502,geohash_of_latlng#507,sum(mt_cnt)#521L,sum(ele_cnt)#522L,round((CAST((CAST((CAST(CAST(sum(CAST(mt_cnt AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(21,1)) * CAST(1.0 AS DECIMAL(21,1))) AS DECIMAL(23,1)) * CAST(CAST(100 AS DECIMAL(23,1)) AS DECIMAL(23,1))) AS DECIMAL(38,2)) / CAST(CAST(sum(CAST(mt_cnt_all AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(38,2))), 2)#523,round((CAST((CAST((CAST(CAST(sum(CAST(ele_cnt AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(21,1)) * CAST(1.0 AS DECIMAL(21,1))) AS DECIMAL(23,1)) * CAST(CAST(100 AS DECIMAL(23,1)) AS DECIMAL(23,1))) AS DECIMAL(38,2)) / CAST(CAST(sum(CAST(ele_cnt_all AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(38,2))), 2)#524])
+- *HashAggregate(keys=[dt#502, geohash_of_latlng#507], functions=[sum(cast(mt_cnt#511 as bigint)), sum(cast(ele_cnt#512 as bigint)), sum(cast(mt_cnt#511 as bigint)), sum(cast(mt_cnt_all#513 as bigint)), sum(cast(ele_cnt#512 as bigint)), sum(cast(ele_cnt_all#514 as bigint))])
   +- Exchange(coordinator id: 148401229) hashpartitioning(dt#502, geohash_of_latlng#507, 1000), coordinator[target post-shuffle partition size: 2000000]
      +- *HashAggregate(keys=[dt#502, geohash_of_latlng#507], functions=[partial_sum(cast(mt_cnt#511 as bigint)), partial_sum(cast(ele_cnt#512 as bigint)), partial_sum(cast(mt_cnt#511 as bigint)), partial_sum(cast(mt_cnt_all#513 as bigint)), partial_sum(cast(ele_cnt#512 as bigint)), partial_sum(cast(ele_cnt_all#514 as bigint))])
         +- HiveTableScan [geohash_of_latlng#507, mt_cnt#511, ele_cnt#512, mt_cnt_all#513, ele_cnt_all#514, dt#502], MetastoreRelation temp, test_geohash_match_parquet
```
After your fix:
```text
TakeOrderedAndProject(limit=10, orderBy=[dt#467 ASC NULLS FIRST,geohash_of_latlng#472 ASC NULLS FIRST], output=[dt#467,geohash_of_latlng#472,sum(mt_cnt)#486L,sum(ele_cnt)#487L,round((CAST((CAST((CAST(CAST(sum(CAST(mt_cnt AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(21,1)) * CAST(1.0 AS DECIMAL(21,1))) AS DECIMAL(23,1)) * CAST(CAST(100 AS DECIMAL(23,1)) AS DECIMAL(23,1))) AS DECIMAL(38,2)) / CAST(CAST(sum(CAST(mt_cnt_all AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(38,2))), 2)#488,round((CAST((CAST((CAST(CAST(sum(CAST(ele_cnt AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(21,1)) * CAST(1.0 AS DECIMAL(21,1))) AS DECIMAL(23,1)) * CAST(CAST(100 AS DECIMAL(23,1)) AS DECIMAL(23,1))) AS DECIMAL(38,2)) / CAST(CAST(sum(CAST(ele_cnt_all AS BIGINT)) AS DECIMAL(20,0)) AS DECIMAL(38,2))), 2)#489])
+- *HashAggregate(keys=[dt#467, geohash_of_latlng#472], functions=[sum(cast(mt_cnt#476 as bigint)), sum(cast(ele_cnt#477 as bigint)), sum(cast(mt_cnt#476 as bigint)), sum(cast(mt_cnt_all#478 as bigint)), sum(cast(ele_cnt#477 as bigint)), sum(cast(ele_cnt_all#479 as bigint))])
   +- Exchange(coordinator id: 227998366) hashpartitioning(dt#467, geohash_of_latlng#472, 1000), coordinator[target post-shuffle partition size: 2000000]
      +- *HashAggregate(keys=[dt#467, geohash_of_latlng#472], functions=[partial_sum(cast(mt_cnt#476 as bigint)), partial_sum(cast(ele_cnt#477 as bigint)), partial_sum(cast(mt_cnt#476 as bigint)), partial_sum(cast(mt_cnt_all#478 as bigint)), partial_sum(cast(ele_cnt#477 as bigint)), partial_sum(cast(ele_cnt_all#479 as bigint))])
         +- HiveTableScan [geohash_of_latlng#472, mt_cnt#476, ele_cnt#477, mt_cnt_all#478, ele_cnt_all#479, dt#467], MetastoreRelation temp, test_geohash_match_parquet
```
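For reference, here is a minimal sketch of how to print these plans, assuming a Hive-enabled SparkSession and that the table `temp.test_geohash_match_parquet` already exists in the metastore:

```scala
// Minimal sketch: print the physical plan for the query above.
// Assumes Spark was built with Hive support and that
// temp.test_geohash_match_parquet exists in the metastore.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .enableHiveSupport()
  .getOrCreate()

val df = spark.sql("""
  select dt,
         geohash_of_latlng,
         sum(mt_cnt),
         sum(ele_cnt),
         round(sum(mt_cnt) * 1.0 * 100 / sum(mt_cnt_all), 2),
         round(sum(ele_cnt) * 1.0 * 100 / sum(ele_cnt_all), 2)
  from temp.test_geohash_match_parquet
  group by dt, geohash_of_latlng
  order by dt, geohash_of_latlng
  limit 10
""")

// explain() prints the physical plan, i.e. the output pasted above.
df.explain()
```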