Kev1n8 commented on code in PR #12383:
URL: https://github.com/apache/datafusion/pull/12383#discussion_r1750633083
##########
datafusion/functions/src/unicode/substr.rs:
##########
@@ -153,20 +175,15 @@ fn make_and_append_view(
start: u32,
) {
let substr_len = substr.len();
- if substr_len == 0 {
- null_builder.append_null();
- views_buffer.push(0);
+ let sub_view = if substr_len > 12 {
+ let view = ByteView::from(*raw);
+ make_view(substr.as_bytes(), view.buffer_index, view.offset + start)
} else {
- let sub_view = if substr_len > 12 {
- let view = ByteView::from(*raw);
- make_view(substr.as_bytes(), view.buffer_index, view.offset +
start)
- } else {
- // inline value does not need block id or offset
- make_view(substr.as_bytes(), 0, 0)
- };
- views_buffer.push(sub_view);
- null_builder.append_non_null();
- }
+ // inline value does not need block id or offset
+ make_view(substr.as_bytes(), 0, 0)
Review Comment:
Oh I see what you mean. I think `ByteVIew` is not equal to `inlined_view`.
The former one indicates the `str` whose size > 12, while the inlined `str` is
represented by a simple `u128`.
> Maybe we can just let it run as same as the > 12 branch?
It might actually work since `from<u128> for ByteView` would load
meaningless(which are actually the `str`) `prefix, buffer_index, offset`, which
does no harm. But I think it would be ambiguous. It's a good suggestion tho.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]