beliefer commented on issue #11980:
URL: https://github.com/apache/gluten/issues/11980#issuecomment-4396090101
Take for example.
```
set spark.gluten.sql.columnar.forceShuffledHashJoin=false;
set spark.gluten.sql.columnar.shuffledHashJoin=false;
SELECT /*+ MERGE(htmp1) */ htmp1.c2 AS 1c2, htmp1.c3 AS 1c3, htmp2.c2 AS 2c2
FROM htmp1 JOIN htmp2 ON htmp1.c1 = htmp2.c1;
```
Execution plan structure (confirmed)
```
SortMergeJoinExecTransformer → Output 0 rows, 0.0 B
:- SortExecTransformer (left) → 100 rows, 2.3 KiB
: +- InputIteratorTransformer
: +- RowToVeloxColumnar ← htmp1 Follow the vanilla path
: +- Scan hive htmp1 ← 100 rows
+- SortExecTransformer (right) → 100 rows, 4.2 KiB
+- InputIteratorTransformer
+- NativeScan hive htmp2 ← htmp2 takes the Velox native path,
100 rows
```
100 rows on each side are sent to `SortExec`, 100 rows on each side are sent
to SMJ, and SMJ outputs 0 rows. The data is fine, the problem lies in comparing
the join keys.
---
Root cause: Different internal representations of DECIMAL generated by two
read paths
It's not an issue with `RowToVeloxColumnar` itself, it's just a format
bridge. The real root cause lies in the fact that two paths generate
semantically equivalent but internally encoded DECIMAL values.
Left side (Scan hive → RowToVeloxColumnar):
1. Hive Metastore record c1 column type is DECIMAL(20, 0)
2. However, the type written in the footer of the ORC file is decimal (38,
18) (which is the schema inconsistency you observed)
3. Scan hive (vanilla Spark ORC reader) to read using Metastore's type
DECIMAL (20,0), and the values read are 1, 2, 3 .. (integer)
4. RowToVeloxColumnar converts Spark's DECIMAL (20,0) to Velox's int128 with
an accuracy of 20 scale 0, Internal value example: 1 → 0x00000001
Right side (NativeScan hive):
1. Velox's ORC reader directly reads the type in the footer of the ORC file,
which is decimal (38, 18)
2. Under the encoding of scale=18, the internal value of int128 is 1 × 10 ^
18=10000000000000000
3. Namely: 1 → 0 x0DE0B6B3A7640000
SMJ performs key comparison in the Velox layer:
Left key: int128=1 (scale=0)
Right key: int128=10 ^ 18 (scale=18)
Comparison result: Unequal → No matching rows
Therefore, no match can be found in the Cartesian space of 100 × 100, and 0
rows are output.
---
`RowToVeloxColumnar`'s role
`RowToVeloxColumnar` is not the root cause, but the key diagnostic clue is:
-Its appearance indicates that the htmp1 side did not go through NativeScan,
but rather went through it vanilla Scan hive
-This is precisely the manifestation of asymmetry: different ORC readers are
used on both sides, and different readers interpret the type of the DECIMAL
column of the same ORC file differently (one using Metastore type, the other
using file footer type)
-If both sides are NativeScan (or both are Scan hive+`RowToVeloxColumnar`),
the internal representation of DECIMAL will be consistent, and join will match
correctly
It can be understood that RowToVeloxColumnar is the symptom, and the
asymmetric scan path is the cause.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]