bkietz commented on code in PR #46271:
URL: https://github.com/apache/arrow/pull/46271#discussion_r2070632065


##########
cpp/src/arrow/array/data.h:
##########
@@ -271,43 +313,62 @@ struct ARROW_EXPORT ArrayData {
     }
   }
 
+  /// \brief Access a buffer's data as a mutable typed C pointer
+  ///
+  /// \param i the buffer index
+  ///
+  /// Like `GetValues(i)`, but allows mutating buffer contents.
+  /// This should only be used when initially populating the ArrayData, before
+  /// it is attached to a Array instance.
   template <typename T>
   inline T* GetMutableValues(int i) {
     return GetMutableValues<T>(i, offset);
   }
 
   /// \brief Construct a zero-copy slice of the data with the given offset and 
length
   ///
-  /// The associated `ArrayStatistics` is always discarded in a sliced
-  /// `ArrayData`. Because `ArrayStatistics` in the original
-  /// `ArrayData` may be invalid in a sliced `ArrayData`. If you want
-  /// to reuse statistics in the original `ArrayData`, you need to do
-  /// it by yourself.
-  ///
-  /// If the specified slice range has the same range as the original
-  /// `ArrayData`, we can reuse statistics in the original
-  /// `ArrayData`. Because it has the same data as the original
-  /// `ArrayData`. But the associated `ArrayStatistics` is discarded
-  /// in this case too. Use `Copy()` instead for the case.
+  /// This method applies the given slice to this ArrayData, taking into 
account
+  /// its existing offset and length.
+  /// If the given `length` is too large, the slice length is clamped so as not
+  /// to go past the offset end.
+  /// If the given `often` is too large, or if either `offset` or `length` is 
negative,
+  /// behavior is undefined.
+  ///
+  /// The associated ArrayStatistics is always discarded in a sliced
+  /// ArrayData, even if the slice is trivially equal to the original 
ArrayData.
+  /// If you want to reuse the statistics from the original ArrayData, you need
+  /// to do it by yourself.

Review Comment:
   ```suggestion
     /// If you want to reuse the statistics from the original ArrayData, you 
must
     /// explicitly reattach them.
   ```



##########
cpp/src/arrow/array/data.h:
##########
@@ -230,7 +234,19 @@ struct ARROW_EXPORT ArrayData {
     return null_count.load() != length;
   }
 
-  // Access a buffer's data as a typed C pointer
+  /// \brief Access a buffer's data as a typed C pointer
+  ///
+  /// \param i the buffer index
+  /// \param absolute_offset the offset into the buffer
+  ///
+  /// If `absolute_offset` is non-zero, the type `T` must match the
+  /// layout of buffer number `i` for the array's data type; otherwise
+  /// offset computation would be incorrect.
+  ///
+  /// If the given buffer is bit-packed (such as a validity bitmap, or
+  /// the data buffer of a boolean array), then `absolute_offset` must be
+  /// zero for correct results, and any bit offset must be applied manually
+  /// by the caller.

Review Comment:
   Perhaps instead we should avoid use of GetValues* for bit packed buffers 
altogether
   ```suggestion
     /// This method is not suitable for accessing the data of a buffer which
     /// is bit-packed (such as a validity bitmap, or the data buffer of a 
boolean array).
     /// To access bits, use `buffers[i]->data()` and apply the bit offset 
manually.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to