Copilot commented on code in PR #50157:
URL: https://github.com/apache/arrow/pull/50157#discussion_r3394364143


##########
cpp/src/parquet/page_index.cc:
##########
@@ -973,6 +979,9 @@ std::unique_ptr<ColumnIndex> ColumnIndex::Make(const ColumnDescriptor& descr,
     // Guard against UB when moving column_index
     throw ParquetException("Invalid ColumnIndex boundary_order");
   }
+  if (!CanTrustPageIndexMinMax(descr)) {
+    return nullptr;
+  }

Review Comment:
   `ColumnIndex::Make` still deserializes the index and validates `boundary_order` before 
the new trust guard runs. For columns with UNKNOWN/UNDEFINED column order (or 
UNKNOWN sort order), the page index is supposed to be ignored. Throwing on a 
malformed index in that case defeats the guard: it can cause unnecessary read 
failures and does extra work parsing untrusted data. Move the 
`CanTrustPageIndexMinMax` check to the top of the function, before 
deserialization/validation, and remove the later check.
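
A minimal, self-contained sketch of the suggested ordering follows. It does not use the real Parquet classes: `FakeDescriptor`, `FakeSerializedIndex`, `FakeColumnIndex`, `CanTrustMinMax`, and `MakeColumnIndex` are hypothetical stand-ins, and the rule that only a type-defined column order is trusted is an assumption made for illustration. The only point is that the trust check runs before any validation that can throw.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for the real Parquet types (illustration only).
enum class ColumnOrderKind { kTypeDefined, kUndefined, kUnknown };
enum class BoundaryOrderKind { kUnordered = 0, kAscending = 1, kDescending = 2 };

struct FakeDescriptor {
  ColumnOrderKind column_order;
};

// Raw boundary_order value as it might arrive from an untrusted file.
struct FakeSerializedIndex {
  int boundary_order;
};

struct FakeColumnIndex {
  BoundaryOrderKind boundary_order;
};

// Assumed trust rule: only a type-defined column order makes min/max usable.
bool CanTrustMinMax(const FakeDescriptor& descr) {
  return descr.column_order == ColumnOrderKind::kTypeDefined;
}

std::unique_ptr<FakeColumnIndex> MakeColumnIndex(const FakeDescriptor& descr,
                                                 const FakeSerializedIndex& raw) {
  // Trust guard first: an index that will be ignored is never validated,
  // so a malformed one cannot turn into a read failure.
  if (!CanTrustMinMax(descr)) {
    return nullptr;
  }
  // Validation happens only for indexes the caller will actually use.
  if (raw.boundary_order < 0 || raw.boundary_order > 2) {
    throw std::runtime_error("Invalid ColumnIndex boundary_order");
  }
  return std::make_unique<FakeColumnIndex>(
      FakeColumnIndex{static_cast<BoundaryOrderKind>(raw.boundary_order)});
}
```

With this ordering, a malformed index on an untrusted column yields `nullptr` instead of an exception, while a malformed index on a trusted column still throws.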



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
