[ 
https://issues.apache.org/jira/browse/ARROW-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351078#comment-16351078
 ] 

ASF GitHub Bot commented on ARROW-1757:
---------------------------------------

xuepanchen commented on a change in pull request #1535: ARROW-1757: [C++] Add 
DictionaryArray::FromArrays alternate ctor that can check or sanitized 
"untrusted" indices
URL: https://github.com/apache/arrow/pull/1535#discussion_r165789910
 
 

 ##########
 File path: cpp/src/arrow/array.h
 ##########
 @@ -734,6 +745,27 @@ class ARROW_EXPORT DictionaryArray : public Array {
  private:
   void SetData(const std::shared_ptr<ArrayData>& data);
 
+  /// \brief Check if all indices are within valid range
+  ///
+  /// \param[in] indices dictionary indices
+  /// \param[in] range valid range of indices (0 <= index < range)
+  template <typename ArrowType>
+    static bool SanityCheck(const std::shared_ptr<Array>& indices, const int64_t range) {
+    using ArrayType = typename TypeTraits<ArrowType>::ArrayType;
+    std::shared_ptr<ArrayType> array = std::static_pointer_cast<ArrayType>(indices);
+    const typename ArrowType::c_type* data = array->raw_values();
+    const int64_t size = sizeof(data) / sizeof(data[0]);
+
+    for (int64_t idx = 0; idx < size; ++idx) {
+      if (!array->IsNull(idx)) {
+        if (data[idx] < 0 || data[idx] >= range) {
+         return false;
+        }
+      }
+    }
+    return true;
+  }
 
 Review comment:
   Do you mean moving SanityCheck() out of the DictionaryArray class and making it a standalone template function in the arrow namespace?
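
   For illustration only, a standalone version of such a check might look like the sketch below. It is not the code from the pull request: the namespace `sketch`, the name `IndicesInRange`, and the `std::vector<bool>` validity mask (a stand-in for Arrow's null bitmap) are all hypothetical, chosen so the example compiles without Arrow. The element count is passed in explicitly, because `sizeof(data) / sizeof(data[0])` applied to a pointer yields the pointer size divided by the element size, not the number of elements.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical namespace, standing in for `arrow`

// Returns true if every non-null index satisfies 0 <= index < upper_bound.
// `validity` is a hypothetical stand-in for a null bitmap: an empty vector
// means "no nulls"; otherwise validity[i] == false marks slot i as null.
template <typename IndexType>
bool IndicesInRange(const IndexType* data, int64_t length,
                    const std::vector<bool>& validity, int64_t upper_bound) {
  for (int64_t i = 0; i < length; ++i) {
    const bool is_null =
        !validity.empty() && !validity[static_cast<std::size_t>(i)];
    if (is_null) {
      continue;  // null slots are not checked, matching the IsNull() guard
    }
    const int64_t value = static_cast<int64_t>(data[i]);
    if (value < 0 || value >= upper_bound) {
      return false;
    }
  }
  return true;
}

}  // namespace sketch
```

   As a free function template it depends only on the index buffer, its length, and the validity information, so it could be reused by other callers without access to DictionaryArray's private members.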

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++] Add DictionaryArray::FromArrays alternate ctor that can check or 
> sanitized "untrusted" indices
> ----------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-1757
>                 URL: https://issues.apache.org/jira/browse/ARROW-1757
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Panchen Xue
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Related to ARROW-1658. This is related to the offset sanitization in 
> {{ListArray::FromArrays}}


