[ 
https://issues.apache.org/jira/browse/ARROW-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349279#comment-16349279
 ] 

ASF GitHub Bot commented on ARROW-1757:
---------------------------------------

cpcloud commented on a change in pull request #1535: ARROW-1757: [C++] Add 
DictionaryArray::FromArrays alternate ctor that can check or sanitized 
"untrusted" indices
URL: https://github.com/apache/arrow/pull/1535#discussion_r165488820
 
 

 ##########
 File path: cpp/src/arrow/array.h
 ##########
 @@ -734,6 +745,27 @@ class ARROW_EXPORT DictionaryArray : public Array {
  private:
   void SetData(const std::shared_ptr<ArrayData>& data);
 
+  /// \brief Check if all indices are within valid range
+  ///
+  /// \param[in] indices dictionary indices
+  /// \param[in] range valid range of indices (0 <= index < range)
+  template <typename ArrowType>
+    static bool SanityCheck(const std::shared_ptr<Array>& indices, const int64_t range) {
+    using ArrayType = typename TypeTraits<ArrowType>::ArrayType;
+    std::shared_ptr<ArrayType> array = std::static_pointer_cast<ArrayType>(indices);
+    const typename ArrowType::c_type* data = array->raw_values();
+    const int64_t size = sizeof(data) / sizeof(data[0]);
 
 Review comment:
   @xuepanchen Note that `sizeof` is a compile-time construct, so it cannot 
return the number of valid elements in a chunk of dynamically allocated 
memory. Here `data` is a pointer, so `sizeof(data)` is the size of the 
pointer itself, not of the buffer it points to. The `sizeof(a) / sizeof(a[0])` 
trick only works on statically sized C arrays, because the number of bytes 
in them is known at compile time.
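
To make the point above concrete, here is a minimal standalone sketch that does not depend on Arrow. The names `AllIndicesInRange`, `PointerSizeofCount`, `kStatic`, and `kStaticCount` are hypothetical and exist only for illustration; they are not part of the pull request. The sketch assumes the element count is passed in explicitly, which in the PR's context would presumably come from the array's own length (for example `indices->length()`).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// sizeof on a real C array: the element count is known at compile time.
constexpr int32_t kStatic[] = {0, 1, 2, 3, 4, 5, 6, 7};
constexpr std::size_t kStaticCount = sizeof(kStatic) / sizeof(kStatic[0]);
static_assert(kStaticCount == 8, "array extent is part of the type");

// sizeof on a pointer: this is sizeof(const int32_t*) / sizeof(int32_t),
// a constant that says nothing about how many elements the buffer holds.
inline std::size_t PointerSizeofCount(const int32_t* data) {
  return sizeof(data) / sizeof(data[0]);
}

// Hypothetical helper (not Arrow API): returns true if every index lies in
// [0, range). The length must be supplied by the caller because it cannot
// be recovered from the raw pointer.
template <typename T>
bool AllIndicesInRange(const T* data, int64_t length, int64_t range) {
  for (int64_t i = 0; i < length; ++i) {
    const int64_t value = static_cast<int64_t>(data[i]);
    if (value < 0 || value >= range) {
      return false;
    }
  }
  return true;
}
```

With a `std::vector<int32_t> idx = {0, 3, 7, 2}`, calling `AllIndicesInRange(idx.data(), idx.size(), 8)` inspects all four elements, whereas `PointerSizeofCount(idx.data())` returns the same value no matter how long the vector is.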



> [C++] Add DictionaryArray::FromArrays alternate ctor that can check or 
> sanitized "untrusted" indices
> ----------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-1757
>                 URL: https://issues.apache.org/jira/browse/ARROW-1757
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Panchen Xue
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Related to ARROW-1658. This is related to the offset sanitization in 
> {{ListArray::FromArrays}}


