[
https://issues.apache.org/jira/browse/ARROW-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349255#comment-16349255
]
ASF GitHub Bot commented on ARROW-1757:
---------------------------------------
wesm commented on a change in pull request #1535: ARROW-1757: [C++] Add
DictionaryArray::FromArrays alternate ctor that can check or sanitized
"untrusted" indices
URL: https://github.com/apache/arrow/pull/1535#discussion_r165483534
##########
File path: cpp/src/arrow/array.h
##########
@@ -734,6 +745,27 @@ class ARROW_EXPORT DictionaryArray : public Array {
private:
void SetData(const std::shared_ptr<ArrayData>& data);
+ /// \brief Check if all indices are within valid range
+ ///
+ /// \param[in] indices dictionary indices
+ /// \param[in] range valid range of indices (0 <= index < range)
+ template <typename ArrowType>
+ static bool SanityCheck(const std::shared_ptr<Array>& indices, const int64_t range) {
+ using ArrayType = typename TypeTraits<ArrowType>::ArrayType;
+ std::shared_ptr<ArrayType> array = std::static_pointer_cast<ArrayType>(indices);
+ const typename ArrowType::c_type* data = array->raw_values();
+ const int64_t size = array->length();
+
+ for (int64_t idx = 0; idx < size; ++idx) {
+ if (!array->IsNull(idx)) {
+ if (data[idx] < 0 || data[idx] >= range) {
+ return false;
+ }
+ }
+ }
+ return true;
+ }
Review comment:
Not sure if we need a new member function; this can go in the compilation
unit (the .cc file) as a file-local helper instead
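
A minimal sketch of what "put this in the compilation unit" could look like: the check becomes a free function in an anonymous namespace, so it stays out of the class declaration in the header. The name `IndicesInRange` and the `std::vector` parameters are hypothetical stand-ins chosen so the example compiles on its own; they are not Arrow types or the code that was merged.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace {

// Hypothetical file-local helper: visible only inside this translation unit,
// so nothing needs to be added to the public class declaration.
// `values` models the raw index buffer and `is_valid` the validity flags;
// both are assumed to have the same length.
template <typename IndexType>
bool IndicesInRange(const std::vector<IndexType>& values,
                    const std::vector<bool>& is_valid, int64_t range) {
  // The element count is taken from the container itself.
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (!is_valid[i]) continue;  // null slots are skipped
    const int64_t index = static_cast<int64_t>(values[i]);
    if (index < 0 || index >= range) return false;  // require 0 <= index < range
  }
  return true;
}

}  // namespace
```

A caller in the same .cc file would dispatch on the index type and call the helper directly, for example `IndicesInRange<int32_t>(values, is_valid, dictionary_length)`, where `dictionary_length` is again an illustrative name.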
----------------------------------------------------------------
> [C++] Add DictionaryArray::FromArrays alternate ctor that can check or
> sanitized "untrusted" indices
> ----------------------------------------------------------------------------------------------------
>
> Key: ARROW-1757
> URL: https://issues.apache.org/jira/browse/ARROW-1757
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Wes McKinney
> Assignee: Panchen Xue
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Related to ARROW-1658. This is related to the offset sanitization in
> {{ListArray::FromArrays}}