lgbo-ustc opened a new issue, #6045:
URL: https://github.com/apache/incubator-gluten/issues/6045
### Description
Found an interesting case:
```sql
explain select n_regionkey, n_nationkey, n_nationkey * 2 as x from
tpch_pq.nation order by n_nationkey;
```
The physical plan from `gluten` is
```
CHNativeColumnarToRow
+- ^(16) SortExecTransformer [n_nationkey#0L ASC NULLS FIRST], true, 0
   +- ^(16) InputIteratorTransformer[n_regionkey#2L, n_nationkey#0L, x#64L]
      +- ColumnarExchange rangepartitioning(n_nationkey#0L ASC NULLS FIRST, 5), ENSURE_REQUIREMENTS, [plan_id=529], [id=#529], [OUTPUT] List(n_regionkey:LongType, n_nationkey:LongType, x:LongType)
         +- ^(15) ProjectExecTransformer [n_regionkey#2L, n_nationkey#0L, (n_nationkey#0L * cast(2 as bigint)) AS x#64L]
            +- ^(15) NativeFileScan parquet tpch_pq.nation[n_nationkey#0L,n_regionkey#2L] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/home/liangjiabiao/workspace/docker/local_gluten/tpch_pq_data/nat..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<n_nationkey:bigint,n_regionkey:bigint>
```
The projection for `n_nationkey * 2 as x` is executed before the `sort`.
Let's look at a similar case in `CH`, where the column `a + 1` is generated
lazily, after the sort.
```
f2386dc7dd0d :) explain pipeline header=1 select key, a, a + 1 from tt1 order by key
EXPLAIN PIPELINE header = 1
SELECT
key,
a,
a + 1
FROM tt1
ORDER BY key ASC
Query id: a47b482c-3b15-4718-a7ce-1b63403ee192
    ┌─explain───────────────────────────────────────────────────────────┐
 1. │ (Expression)                                                      │
 2. │ ExpressionTransform                                               │
 3. │ Header: key UInt32: key UInt32 UInt32(size = 0)                   │
 4. │         a UInt32: a UInt32 UInt32(size = 0)                       │
 5. │         plus(a, 1) UInt64: plus(a, 1) UInt64 UInt64(size = 0)     │
 6. │ (Sorting)                                                         │
 7. │ (Expression)                                                      │
 8. │ ExpressionTransform                                               │
 9. │ Header: __table1.key UInt32: __table1.key UInt32 UInt32(size = 0) │
10. │         a UInt32: a UInt32 UInt32(size = 0)                       │
11. │ (ReadFromMergeTree)                                               │
12. │ MergeTreeSelect(pool: ReadPoolInOrder, algorithm: InOrder) 0 → 1  │
13. │ Header: key UInt32: key UInt32 UInt32(size = 0)                   │
14. │         a UInt32: a UInt32 UInt32(size = 0)                       │
    └───────────────────────────────────────────────────────────────────┘
14 rows in set. Elapsed: 0.001 sec.
```
In column-based sorting, every column in the block has to be reordered, so
the more columns are sorted, the worse the performance. If the new columns are
not used as sort keys, generating them after the `sort` should be a good idea.
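
To make the idea concrete, here is a minimal Python sketch using a toy column store (plain lists, not Gluten or ClickHouse code). The helper names `sort_then_project` and `project_then_sort` and the `projections` mapping are hypothetical, introduced only for illustration. It shows that deferring the projection gives the same result while the sort only has to permute the original columns, and that the deferral is rejected when a derived column is itself the sort key.

```python
# Toy column store: each column is a plain Python list of equal length.
def sort_then_project(columns, sort_key, projections):
    """Sort by `sort_key`, then compute derived columns on the sorted data.

    `projections` maps a new column name to (input column name, function).
    Hypothetical helper for illustration; not a Gluten or ClickHouse API.
    """
    # Deferral is only valid when no derived column is used as a sort key.
    if sort_key in projections:
        raise ValueError("sort key is a derived column; project it before sorting")

    # Compute the row permutation once from the key column.
    n = len(columns[sort_key])
    perm = sorted(range(n), key=lambda i: columns[sort_key][i])

    # Permute only the columns that exist before the projection.
    out = {name: [col[i] for i in perm] for name, col in columns.items()}

    # Derive the new columns from the already-sorted inputs.
    for name, (src, fn) in projections.items():
        out[name] = [fn(v) for v in out[src]]
    return out


def project_then_sort(columns, sort_key, projections):
    """Baseline: materialize derived columns first, then permute all of them."""
    wide = dict(columns)
    for name, (src, fn) in projections.items():
        wide[name] = [fn(v) for v in columns[src]]
    n = len(wide[sort_key])
    perm = sorted(range(n), key=lambda i: wide[sort_key][i])
    return {name: [col[i] for i in perm] for name, col in wide.items()}


# Same shape as the query above: x = n_nationkey * 2, ordered by n_nationkey.
nation = {"n_regionkey": [1, 0, 1], "n_nationkey": [2, 0, 1]}
proj = {"x": ("n_nationkey", lambda v: v * 2)}
late = sort_then_project(nation, "n_nationkey", proj)
early = project_then_sort(nation, "n_nationkey", proj)
```

Both variants return the same columns. In the deferred variant the sort step reorders two columns instead of three, which is the saving described above. A real planner rule would also have to check that the expression is deterministic; that check is left out of this sketch.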