github-actions[bot] commented on code in PR #63472:
URL: https://github.com/apache/doris/pull/63472#discussion_r3283799587


##########
be/src/util/percentile_util.h:
##########
@@ -54,16 +54,39 @@ class Counts {
     }
 
     void increment(Ty key, uint32_t i) {
+        if constexpr (std::is_floating_point_v<Ty>) {
+            if (std::isnan(key)) {
+                return;
+            }
+        }
         auto old_size = _nums.size();
         _nums.resize(_nums.size() + i);
         for (uint32_t j = 0; j < i; ++j) {
             _nums[old_size + j] = key;
         }
     }
 
-    void increment(Ty key) { _nums.push_back(key); }
+    void increment(Ty key) {
+        if constexpr (std::is_floating_point_v<Ty>) {
+            if (std::isnan(key)) {
+                return;
+            }
+        }
+        _nums.push_back(key);
+    }
 
-    void increment_batch(const PaddedPODArray<Ty>& keys) { 
_nums.insert(keys.begin(), keys.end()); }
+    void increment_batch(const PaddedPODArray<Ty>& keys) {
+        if constexpr (std::is_floating_point_v<Ty>) {
+            _nums.reserve(_nums.size() + keys.size());
+            for (auto key : keys) {
+                if (!std::isnan(key)) {
+                    _nums.push_back(key);

Review Comment:
   This skips NaNs for the legacy `Counts` batch path, but an all-NaN input now 
leaves the `Counts` object initialized with both `_nums` and `_sorted_nums_vec` 
empty. If this state is finalized locally instead of being serialized, 
`PercentileState::get()` / `insert_result_into()` still call 
`Counts::terminate()`, whose empty branch returns `0.0`, so the legacy 
`percentile` / `percentile_array` compatibility path can return `0` or `[0, 
...]` for all-NaN floating input instead of the PR's intended `NaN` result. 
This is distinct from the existing serialization-recursion thread because it 
happens without aggregation-state serialization. Please update the empty 
`Counts` finalization semantics for floating types and add a legacy all-NaN 
finalization test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to