[GitHub] [arrow] pitrou commented on a change in pull request #9395: ARROW-11470: [C++] Detect overflow on computation of tensor strides

GitBox Wed, 03 Feb 2021 05:48:36 -0800


pitrou commented on a change in pull request #9395:
URL: https://github.com/apache/arrow/pull/9395#discussion_r569422060




##########
File path: cpp/src/arrow/tensor.cc
##########
@@ -127,14 +163,25 @@ Status CheckTensorStridesValidity(const std::shared_ptr<Buffer>& data,
     return Status::OK();
   }
 
-  std::vector<int64_t> last_index(shape);
-  const int64_t n = static_cast<int64_t>(shape.size());
-  for (int64_t i = 0; i < n; ++i) {
-    --last_index[i];
+  // Check the largest offset can be computed without overflow
+  const auto ndim = shape.size();
+  int64_t largest_offset = 0;
+  for (auto i = decltype(ndim){0}; i < ndim; ++i) {
+    if (strides[i] <= 0) continue;
+
+    int64_t dim_offset;
+    if (!internal::MultiplyWithOverflow(shape[i] - 1, strides[i], &dim_offset)) {
+      if (!internal::AddWithOverflow(largest_offset, dim_offset, &largest_offset)) {
+        continue;
+      }
+    }
+
+    return Status::Invalid(
+        "too large number given in strides to compute the item offset");

Review comment:
       I would suggest something more explicit, such as "offsets computed from shape and strides would not fit in a 64-bit integer".
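
       For readers following along outside the Arrow tree, here is a minimal standalone sketch of the check under discussion. It is not the PR's code: `LargestOffsetFits` is a hypothetical name, the GCC/Clang builtins `__builtin_mul_overflow` and `__builtin_add_overflow` stand in for Arrow's `internal::MultiplyWithOverflow` and `internal::AddWithOverflow`, and every dimension is assumed to hold at least one element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Arrow. Returns false when the largest byte
// offset implied by `shape` and `strides` would not fit in a 64-bit integer.
// Assumes shape[i] >= 1 for every dimension.
bool LargestOffsetFits(const std::vector<int64_t>& shape,
                       const std::vector<int64_t>& strides,
                       int64_t* largest_offset) {
  assert(shape.size() == strides.size());
  int64_t total = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    // Mirrors the diff: non-positive strides cannot raise the largest offset.
    if (strides[i] <= 0) continue;
    int64_t dim_offset;
    // Both builtins return true when the exact result does not fit.
    if (__builtin_mul_overflow(shape[i] - 1, strides[i], &dim_offset) ||
        __builtin_add_overflow(total, dim_offset, &total)) {
      return false;
    }
  }
  *largest_offset = total;
  return true;
}
```

       A caller could map a `false` return to the more explicit error message suggested above.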

##########
File path: cpp/src/arrow/tensor.cc
##########
@@ -40,40 +41,69 @@ using internal::checked_cast;
 
 namespace internal {
 
-void ComputeRowMajorStrides(const FixedWidthType& type, const std::vector<int64_t>& shape,
-                            std::vector<int64_t>* strides) {
+Status ComputeRowMajorStrides(const FixedWidthType& type,
+                              const std::vector<int64_t>& shape,
+                              std::vector<int64_t>* strides) {
   const int byte_width = GetByteWidth(type);
-  int64_t remaining = byte_width;
-  for (int64_t dimsize : shape) {
-    remaining *= dimsize;
+  const auto ndim = shape.size();

Review comment:
       Note this is `size_t`.
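
       The note does not say which consequence is meant. One common pitfall with an unsigned index is a reverse loop written as `i >= 0`, which never terminates. The sketch below shows one way to count down safely while computing row-major strides with overflow detection. `RowMajorStrides` is a hypothetical name, not the function in this PR; the GCC/Clang builtin `__builtin_mul_overflow` is an assumed stand-in for Arrow's helper, and each dimension is assumed to be at least 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not Arrow's ComputeRowMajorStrides. Fills `strides`
// with row-major byte strides and returns false if any product, including
// the total byte size, does not fit in int64_t. Assumes shape[i] >= 1.
bool RowMajorStrides(int64_t byte_width, const std::vector<int64_t>& shape,
                     std::vector<int64_t>* strides) {
  assert(byte_width > 0);
  const std::size_t ndim = shape.size();
  strides->assign(ndim, 0);
  int64_t stride = byte_width;
  // `i >= 0` is always true for an unsigned type, so test before decrementing.
  for (std::size_t i = ndim; i-- > 0;) {
    (*strides)[i] = stride;
    if (__builtin_mul_overflow(stride, shape[i], &stride)) return false;
  }
  return true;
}
```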

##########
File path: cpp/src/arrow/tensor.cc
##########
@@ -127,14 +163,25 @@ Status CheckTensorStridesValidity(const std::shared_ptr<Buffer>& data,
     return Status::OK();
   }
 
-  std::vector<int64_t> last_index(shape);
-  const int64_t n = static_cast<int64_t>(shape.size());
-  for (int64_t i = 0; i < n; ++i) {
-    --last_index[i];
+  // Check the largest offset can be computed without overflow
+  const auto ndim = shape.size();
+  int64_t largest_offset = 0;
+  for (auto i = decltype(ndim){0}; i < ndim; ++i) {
+    if (strides[i] <= 0) continue;

Review comment:
       Are negative strides possible? If so, we shouldn't ignore them.
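
       If negative strides turn out to be allowed (the question is left open here), one option is to track the smallest and largest offsets separately so that both directions are checked. The following is only a sketch under that assumption: `OffsetRangeFits` is a hypothetical name, the GCC/Clang builtins replace Arrow's overflow helpers, and each dimension is assumed to hold at least one element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Arrow. Computes the smallest and largest
// byte offsets reachable from `shape` and `strides`, and returns false if
// either bound does not fit in int64_t. Assumes shape[i] >= 1.
bool OffsetRangeFits(const std::vector<int64_t>& shape,
                     const std::vector<int64_t>& strides,
                     int64_t* min_offset, int64_t* max_offset) {
  assert(shape.size() == strides.size());
  int64_t lo = 0;
  int64_t hi = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    int64_t extent;
    if (__builtin_mul_overflow(shape[i] - 1, strides[i], &extent)) return false;
    // A negative extent lowers the minimum; a positive one raises the maximum.
    int64_t* bound = (extent < 0) ? &lo : &hi;
    if (__builtin_add_overflow(*bound, extent, bound)) return false;
  }
  *min_offset = lo;
  *max_offset = hi;
  return true;
}
```

       With both bounds in hand, a validity check could compare them against the buffer size instead of skipping non-positive strides.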



