pitrou commented on code in PR #50183:
URL: https://github.com/apache/arrow/pull/50183#discussion_r3419407178


##########
cpp/src/parquet/statistics.cc:
##########
@@ -357,7 +389,9 @@ CleanStatistic(std::pair<T, T> min_max, 
LogicalType::Type::type) {
     return ::std::nullopt;
   }
 
-  if (min == std::numeric_limits<T>::max() && max == 
std::numeric_limits<T>::lowest()) {

Review Comment:
   This was a logical AND, but `IsInvalidMinMax` will return true if either 
condition holds. This looks like a behavior change, does it matter in practice?



##########
cpp/src/parquet/statistics_test.cc:
##########
@@ -1559,6 +1560,30 @@ void TestFloatStatistics<T>::TestNaNs() {
                   valid_bitmap_no_nans);
 }
 
+// GH-50182: an all-infinity column must report that infinity as its min/max,
+// rather than the largest finite value (a finite seed is never displaced by an
+// infinity of the same sign). A NaN interspersed in such a column must be
+// ignored without corrupting the infinite min/max.
+template <typename T>
+void TestFloatStatistics<T>::TestInfinities() {
+  NodePtr node = this->MakeNode("f", Repetition::REQUIRED);
+  ColumnDescriptor descr(node, 0, 0);
+
+  constexpr c_type inf = std::numeric_limits<c_type>::infinity();
+  constexpr c_type nan = std::numeric_limits<c_type>::quiet_NaN();
+  std::array<c_type, 3> all_pos_inf{inf, inf, inf};

Review Comment:
   You can just use `std::vector` as a container instead of explicitly 
declaring array sizes like this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to