Re: [PR] fix: out-of-bounds read in `StringViewToDLDataType_` with non-null-terminated strings [tvm-ffi]

via GitHub Tue, 06 Jan 2026 19:40:00 -0800


gemini-code-assist[bot] commented on code in PR #387:
URL: https://github.com/apache/tvm-ffi/pull/387#discussion_r2666961011



##########
src/ffi/dtype.cc:
##########
@@ -293,19 +306,23 @@ inline DLDataType 
StringViewToDLDataType_(std::string_view str) {
     scan = str.data();
     TVM_FFI_THROW(ValueError) << "unknown dtype `" << str << '`';
   }
-  char* xdelim;  // emulate sscanf("%ux%u", bits, lanes)
-  uint8_t bits = static_cast<uint8_t>(strtoul(scan, &xdelim, 10));
+  // Parse bits manually to handle non-null-terminated string_view
+  const char* xdelim = scan;
+  uint8_t bits = static_cast<uint8_t>(parse_digits(xdelim, str_end));
   if (bits != 0) dtype.bits = bits;

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The parsed value for `bits` is cast to `uint8_t` without checking for 
overflow. A value like `int257` would be silently truncated to `1`, which is 
incorrect. Adding a check to ensure the value fits within `uint8_t` would make 
the parsing more robust and prevent unexpected behavior.
   
   ```c
     uint32_t bits_val = parse_digits(xdelim, str_end);
     if (bits_val > UINT8_MAX) {
       TVM_FFI_THROW(ValueError) << "Bits value " << bits_val << " is out of 
range for uint8_t in dtype '"
                                 << str << "'";
     }
     uint8_t bits = static_cast<uint8_t>(bits_val);
     if (bits != 0) dtype.bits = bits;
   ```



##########
src/ffi/dtype.cc:
##########
@@ -206,17 +206,30 @@ inline DLDataType 
StringViewToDLDataType_(std::string_view str) {
   dtype.bits = 32;
   dtype.lanes = 1;
   const char* scan;
+  const char* str_end = str.data() + str.length();
+
+  // Helper lambda to parse decimal digits from a bounded string_view
+  // Returns the parsed value and updates ptr to point past the last digit
+  auto parse_digits = [](const char*& ptr, const char* end) -> uint32_t {
+    uint32_t value = 0;
+    while (ptr < end && *ptr >= '0' && *ptr <= '9') {
+      value = value * 10 + (*ptr - '0');
+      ptr++;
+    }
+    return value;
+  };

Review Comment:
   ![high](https://www.gstatic.com/codereviewagent/high-priority.svg)
   
   The `parse_digits` lambda is vulnerable to integer overflow if a parsed 
number exceeds `UINT32_MAX`. While the original `strtoul` also had overflow 
issues that weren't checked in the previous code, this new implementation can 
be made more robust by explicitly detecting and handling overflow. A good 
approach is to use a `uint64_t` for accumulation and then check if the value 
exceeds `UINT32_MAX` before returning.
   
   ```c
     auto parse_digits = [](const char*& ptr, const char* end) -> uint32_t {
       uint64_t value = 0;
       const char* start_ptr = ptr;
       while (ptr < end && *ptr >= '0' && *ptr <= '9') {
         value = value * 10 + (*ptr - '0');
         ptr++;
       }
       if (value > UINT32_MAX) {
         TVM_FFI_THROW(ValueError) << "Integer value in dtype string '"
                                   << std::string_view(start_ptr, ptr - 
start_ptr)
                                   << "' is out of range for uint32_t";
       }
       return static_cast<uint32_t>(value);
     };
   ```



##########
src/ffi/dtype.cc:
##########
@@ -293,19 +306,23 @@ inline DLDataType 
StringViewToDLDataType_(std::string_view str) {
     scan = str.data();
     TVM_FFI_THROW(ValueError) << "unknown dtype `" << str << '`';
   }
-  char* xdelim;  // emulate sscanf("%ux%u", bits, lanes)
-  uint8_t bits = static_cast<uint8_t>(strtoul(scan, &xdelim, 10));
+  // Parse bits manually to handle non-null-terminated string_view
+  const char* xdelim = scan;
+  uint8_t bits = static_cast<uint8_t>(parse_digits(xdelim, str_end));
   if (bits != 0) dtype.bits = bits;
   int scalable_multiplier = 1;
-  if (strncmp(xdelim, "xvscale", 7) == 0) {
+  // Check bounds before dereferencing xdelim
+  if (xdelim < str_end && strncmp(xdelim, "xvscale", 7) == 0) {
     scalable_multiplier = -1;
     xdelim += 7;
   }
-  char* endpt = xdelim;
-  if (*xdelim == 'x') {
-    dtype.lanes = static_cast<uint16_t>(scalable_multiplier * strtoul(xdelim + 
1, &endpt, 10));
+  const char* endpt = xdelim;
+  // Check bounds before dereferencing xdelim
+  if (xdelim < str_end && *xdelim == 'x') {
+    endpt = xdelim + 1;
+    dtype.lanes = static_cast<uint16_t>(scalable_multiplier * 
parse_digits(endpt, str_end));
   }

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   Similar to the `parse_float` case, the number of lanes is not validated 
here. An empty or zero value for lanes after `x` should probably be an error. 
Also, overflow against `uint16_t` is not checked. This could lead to silent 
truncation or misconfiguration.
   
   ```c
     if (xdelim < str_end && *xdelim == 'x') {
       endpt = xdelim + 1;
       const char* digits_start = endpt;
       uint32_t lanes_val = parse_digits(endpt, str_end);
       if (endpt == digits_start || lanes_val == 0) {
         TVM_FFI_THROW(ValueError) << "Invalid lanes specification in dtype '" 
<< str
                                   << "'. Lanes must be a positive integer.";
       }
       if (lanes_val > UINT16_MAX) {
         TVM_FFI_THROW(ValueError) << "Lanes value " << lanes_val
                                   << " is out of range for uint16_t in dtype 
'" << str << "'";
       }
       dtype.lanes = static_cast<uint16_t>(scalable_multiplier * lanes_val);
     }
   ```



##########
src/ffi/dtype.cc:
##########
@@ -206,17 +206,30 @@ inline DLDataType 
StringViewToDLDataType_(std::string_view str) {
   dtype.bits = 32;
   dtype.lanes = 1;
   const char* scan;
+  const char* str_end = str.data() + str.length();
+
+  // Helper lambda to parse decimal digits from a bounded string_view
+  // Returns the parsed value and updates ptr to point past the last digit
+  auto parse_digits = [](const char*& ptr, const char* end) -> uint32_t {
+    uint32_t value = 0;
+    while (ptr < end && *ptr >= '0' && *ptr <= '9') {
+      value = value * 10 + (*ptr - '0');
+      ptr++;
+    }
+    return value;
+  };
 
   auto parse_float = [&](const std::string_view& str, int offset, int code, 
int bits) {
     dtype.code = static_cast<uint8_t>(code);
     dtype.bits = static_cast<uint8_t>(bits);
     scan = str.data() + offset;
-    char* endpt = nullptr;
-    if (*scan == 'x') {
-      dtype.lanes = static_cast<uint16_t>(strtoul(scan + 1, &endpt, 10));
+    const char* endpt = scan;
+    if (scan < str_end && *scan == 'x') {
+      endpt = scan + 1;
+      dtype.lanes = static_cast<uint16_t>(parse_digits(endpt, str_end));
       scan = endpt;
     }

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The current implementation doesn't validate the parsed number of lanes. It's 
possible to have a string like `float32x` (with no digits after `x`) or 
`float32x0`, which would result in `lanes=0`. This is likely incorrect, as the 
number of lanes should be a positive integer. Additionally, a value larger than 
`UINT16_MAX` would be silently truncated. It would be more robust to validate 
the parsed value.
   
   ```c
       if (scan < str_end && *scan == 'x') {
         endpt = scan + 1;
         const char* digits_start = endpt;
         uint32_t lanes_val = parse_digits(endpt, str_end);
         if (endpt == digits_start || lanes_val == 0) {
           TVM_FFI_THROW(ValueError) << "Invalid lanes specification in dtype 
'" << str
                                     << "'. Lanes must be a positive integer.";
         }
         if (lanes_val > UINT16_MAX) {
           TVM_FFI_THROW(ValueError) << "Lanes value " << lanes_val
                                     << " is out of range for uint16_t in dtype 
'" << str << "'";
         }
         dtype.lanes = static_cast<uint16_t>(lanes_val);
         scan = endpt;
       }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] fix: out-of-bounds read in `StringViewToDLDataType_` with non-null-terminated strings [tvm-ffi]

Reply via email to