felipecrv commented on code in PR #35036:
URL: https://github.com/apache/arrow/pull/35036#discussion_r1165585465
##########
cpp/src/arrow/array/data.cc:
##########
@@ -305,8 +305,12 @@ void ArraySpan::FillFromScalar(const Scalar& value) {
Type::type type_id = value.type->id();
// Populate null count and validity bitmap (only for non-union/null types)
Review Comment:
stale comment
##########
cpp/src/arrow/compute/kernels/scalar_validity.cc:
##########
@@ -102,6 +169,15 @@ Status IsNullExec(KernelContext* ctx, const ExecSpan&
batch, ExecResult* out) {
bit_util::SetBitsTo(out_bitmap, out_span->offset, out_span->length, false);
}
+ const auto t = arr.type->id();
+ if (t == Type::SPARSE_UNION) {
+ SetSparseUnionLogicalNullBits(arr, out_bitmap, out_span->offset);
+ } else if (t == Type::DENSE_UNION) {
+ SetDenseUnionLogicalNullBits(arr, out_bitmap, out_span->offset);
+ } else if (t == Type::RUN_END_ENCODED) {
+ SetREELogicalNullBits(arr, out_bitmap, out_span->offset);
+ }
+
Review Comment:
The `// Input has no nulls => output is entirely false.` also needs to be
more nuanced after that move.
##########
cpp/src/arrow/array/data.cc:
##########
@@ -422,6 +426,13 @@ void ArraySpan::FillFromScalar(const Scalar& value) {
this->child_data[i].FillFromScalar(*scalar.value[i]);
}
}
+ } else if (type_id == Type::RUN_END_ENCODED) {
+ const auto& scalar = checked_cast<const RunEndEncodedScalar&>(value);
+ this->child_data.resize(2);
+ auto run_end_type = scalar.run_end_type();
Review Comment:
`auto&` to elide one refcount increment
##########
cpp/src/arrow/array/data.cc:
##########
@@ -304,9 +304,13 @@ void ArraySpan::FillFromScalar(const Scalar& value) {
Type::type type_id = value.type->id();
- // Populate null count and validity bitmap (only for non-union/null types)
- this->null_count = value.is_valid ? 0 : 1;
- if (!is_union(type_id) && type_id != Type::NA) {
+ // Populate null count and validity bitmap (only for non-union/ree/null
types)
Review Comment:
this comment can now be removed as the `if` checks wrapping the `null_count`
assignments document the context of each assignment
##########
cpp/src/arrow/compute/kernels/scalar_validity.cc:
##########
@@ -102,6 +169,15 @@ Status IsNullExec(KernelContext* ctx, const ExecSpan&
batch, ExecResult* out) {
bit_util::SetBitsTo(out_bitmap, out_span->offset, out_span->length, false);
}
+ const auto t = arr.type->id();
+ if (t == Type::SPARSE_UNION) {
+ SetSparseUnionLogicalNullBits(arr, out_bitmap, out_span->offset);
+ } else if (t == Type::DENSE_UNION) {
+ SetDenseUnionLogicalNullBits(arr, out_bitmap, out_span->offset);
+ } else if (t == Type::RUN_END_ENCODED) {
+ SetREELogicalNullBits(arr, out_bitmap, out_span->offset);
+ }
+
Review Comment:
This whole block can be moved into the `else` above to make it more clear
that this only gets called *after* a complete zeroing of `out_bitmap`.
Now we have to know about a logical implication that is not explicit in the
code: `t \in {UNION, REE} \implies arr.GetNullCount() == 0`. That is what in
turn guarantees `out_bitmap` is zeroed before the `Set*LogicalNullBits`. Too
much logic. :sweat_smile:
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]