felipecrv commented on code in PR #41021:
URL: https://github.com/apache/arrow/pull/41021#discussion_r1553637741


##########
cpp/src/arrow/array/builder_nested.h:
##########
@@ -181,14 +181,13 @@ class ARROW_EXPORT VarLengthListLikeBuilder : public 
ArrayBuilder {
     if constexpr (is_list_view(TYPE::type_id)) {
       sizes = array.GetValues<offset_type>(2);
     }
+    const bool has_validity = array.buffers[0].data != NULLPTR;
     const bool all_valid = !array.MayHaveLogicalNulls();
-    const uint8_t* validity = array.HasValidityBitmap() ? 
array.buffers[0].data : NULLPTR;
     ARROW_RETURN_NOT_OK(Reserve(length));
     for (int64_t row = offset; row < offset + length; row++) {
-      const bool is_valid =
-          all_valid || (validity && bit_util::GetBit(validity, array.offset + 
row)) ||
-          array.IsValid(row);

Review Comment:
   `IsValid` gets inlined. The assumptions means when `IsValid` is expanded 
here, `if (this->buffers[0].data != NULLPTR) {` becomes `if (has_validity)`. As 
the user of the `IsValid` function I don't have to write code that manually 
inlines its contents, all I have to do is given enough information to the 
compiler and the inliner will do that for me. 
   
   ```diff
     inline bool IsValid(int64_t i) const {
   -    if (this->buffers[0].data != NULLPTR) {
   +    if (has_validity) {
         return bit_util::GetBit(this->buffers[0].data, i + this->offset);
       } else {
         const auto type = this->type->id();
         if (type == Type::SPARSE_UNION) {
           return !IsNullSparseUnion(i);
         }
         if (type == Type::DENSE_UNION) {
           return !IsNullDenseUnion(i);
         }
         if (type == Type::RUN_END_ENCODED) {
           return !IsNullRunEndEncoded(i);
         }
         return this->null_count != this->length;
       }
     }
   ```
   
   The place with the biggest impact (in another PR) is where I specialize the 
entire loop based on the array having a null bitmap from the beggining:
   
   ```cpp
   ARROW_COMPILER_ASSUME(array.buffers[0].data != nullptr);
   if (array.IsValid(i)) {
   ...
   ```
   
   `IsValid` gets inlined and much smaller:
   
   ```diff
      inline bool IsValid(int64_t i) const {
   -    if (this->buffers[0].data != NULLPTR) {
          return bit_util::GetBit(this->buffers[0].data, i + this->offset);
   -    } else {
   -      const auto type = this->type->id();
   -      if (type == Type::SPARSE_UNION) {
   -        return !IsNullSparseUnion(i);
   -      }
   -      if (type == Type::DENSE_UNION) {
   -        return !IsNullDenseUnion(i);
   -      }
   -      if (type == Type::RUN_END_ENCODED) {
   -        return !IsNullRunEndEncoded(i);
   -      }
   -      return this->null_count != this->length;
   -    }
   -  }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to