Mryange opened a new pull request, #61678:
URL: https://github.com/apache/doris/pull/61678
### What problem does this PR solve?
```
------------------------------------------------------------------------------------
Benchmark                                             Time          CPU   Iterations
------------------------------------------------------------------------------------
Handwritten_Unary_Plain                             326 ns       326 ns      2151459
ColumnView_Unary_Plain                              326 ns       326 ns      2146584
Handwritten_Unary_Nullable                         2067 ns      2067 ns       342110
ColumnView_Unary_Nullable                          2061 ns      2061 ns       341236
Handwritten_Binary_Plain_Plain                      680 ns       680 ns      1028990
ColumnView_Binary_Plain_Plain                       679 ns       679 ns      1025809
Handwritten_Binary_Plain_Const                      277 ns       277 ns      2534313
ColumnView_Binary_Plain_Const                       282 ns       282 ns      2484547
Handwritten_Binary_Plain_Nullable                   776 ns       776 ns       881182
ColumnView_Binary_Plain_Nullable                    779 ns       779 ns       897644
Handwritten_Binary_Nullable_Nullable               3233 ns      3233 ns       217793
ColumnView_Binary_Nullable_Nullable                4469 ns      4469 ns       157379
Handwritten_Ternary_Plain_Plain_Plain              1016 ns      1016 ns       688153
ColumnView_Ternary_Plain_Plain_Plain               1017 ns      1017 ns       685327
Handwritten_Ternary_Const_Const_Plain               278 ns       278 ns      2506171
ColumnView_Ternary_Const_Const_Plain                285 ns       285 ns      2456870
Handwritten_Ternary_Plain_Const_Plain               678 ns       678 ns      1026683
ColumnView_Ternary_Plain_Const_Plain                681 ns       681 ns      1027665
Handwritten_Ternary_Nullable_Nullable_Nullable     4729 ns      4729 ns       149026
ColumnView_Ternary_Nullable_Nullable_Nullable      8608 ns      8608 ns        82746
```
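The regression factors for the all-nullable binary and ternary cases can be recomputed from the table. The snippet below is only a worked calculation: the timings are copied from the benchmark rows above, and the names `slowdown`, `kBinarySlowdown`, and `kTernarySlowdown` are made up for illustration.

```cpp
// Ratio of the ColumnView timing to the hand-written timing for one benchmark pair.
// A value of 1.0 means no overhead; larger values mean ColumnView is slower.
constexpr double slowdown(double column_view_ns, double handwritten_ns) {
    return column_view_ns / handwritten_ns;
}

// Binary_Nullable_Nullable: 4469 ns vs 3233 ns, roughly 1.38x.
constexpr double kBinarySlowdown = slowdown(4469.0, 3233.0);

// Ternary_Nullable_Nullable_Nullable: 8608 ns vs 4729 ns, roughly 1.82x.
constexpr double kTernarySlowdown = slowdown(8608.0, 4729.0);
```

Every other pair in the table stays within about 3% of the hand-written timing.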
1. Expensive per-element operations (e.g. geo functions, complex string ops):
   use ColumnView freely. Its overhead is negligible relative to the work.
2. Cheap per-element operations that the compiler can inline (e.g. simple arithmetic):
   a) Inputs are NOT nullable (e.g. the function framework already strips nullable):
      safe to use. The compiler optimizes the is_const branch into code equivalent
      to hand-written direct array access (verified via assembly and benchmarks).
   b) Inputs involve nullable columns:
      - Unary operations: safe to use; the compiler still optimizes effectively.
      - Binary / ternary operations: the combined is_null_at checks across multiple
        columns inhibit compiler vectorization and branch optimization, causing a
        significant regression (~1.4x for binary, ~1.8x for ternary in benchmarks).
        In this case, hand-written column access is recommended for best performance.

In summary, ColumnView is designed to eliminate the combinatorial explosion of
handling 4 column forms. It is suitable for the vast majority of use cases.
Only the specific combination of "cheap computation + nullable + multi-column"
requires weighing whether to hand-write the access code.
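To make case 2(b) concrete, here is a minimal, self-contained sketch of the two access styles for a binary add over nullable inputs. It is not the Doris `ColumnView` API or the benchmark code from this PR: `Int32View`, `add_via_view`, and `add_handwritten_nullable` are hypothetical names, and the hand-written variant shows just one common pattern (compute every row unconditionally, then OR the null maps) that keeps both loops branch-free.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column view (NOT the real Doris ColumnView).
// data holds a single element when is_const is true, otherwise one per row.
// null_map is nullptr for non-nullable input; otherwise a non-zero byte marks NULL.
struct Int32View {
    const int32_t* data;
    const uint8_t* null_map;
    bool is_const;

    int32_t value_at(size_t i) const { return data[is_const ? 0 : i]; }
    bool is_null_at(size_t i) const {
        return null_map != nullptr && null_map[is_const ? 0 : i] != 0;
    }
};

// Generic path: one loop handles every column form, but each row checks
// is_null_at on both inputs before computing, which adds a branch per row.
void add_via_view(const Int32View& a, const Int32View& b, size_t rows,
                  std::vector<int64_t>& out, std::vector<uint8_t>& out_null) {
    out.assign(rows, 0);
    out_null.assign(rows, 0);
    for (size_t i = 0; i < rows; ++i) {
        if (a.is_null_at(i) || b.is_null_at(i)) {
            out_null[i] = 1;
            continue;
        }
        out[i] = static_cast<int64_t>(a.value_at(i)) + b.value_at(i);
    }
}

// Hand-written path for the nullable + nullable form only (assumed pattern):
// two straight-line loops with no per-row branch. Values at NULL rows are
// computed anyway and must be ignored by the caller.
void add_handwritten_nullable(const int32_t* a, const uint8_t* a_null,
                              const int32_t* b, const uint8_t* b_null, size_t rows,
                              std::vector<int64_t>& out, std::vector<uint8_t>& out_null) {
    out.resize(rows);
    out_null.resize(rows);
    for (size_t i = 0; i < rows; ++i) {
        out[i] = static_cast<int64_t>(a[i]) + b[i];
    }
    for (size_t i = 0; i < rows; ++i) {
        out_null[i] = static_cast<uint8_t>((a_null[i] | b_null[i]) != 0);
    }
}
```

Both functions produce the same null map and the same values at non-NULL rows. The trade-off is that the hand-written version covers only one column form, so each additional form (plain, const, and so on) needs its own loop.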
### Release note
None
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [x] No need to test or manual test. Explain why:
        - [x] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]