Re: [PR] [exec](feat) Dynamic streaming hash table min reduction based on CPU cache [doris]

via GitHub Tue, 24 Mar 2026 01:25:29 -0700


github-actions[bot] commented on code in PR #61326:
URL: https://github.com/apache/doris/pull/61326#discussion_r2979831391



##########
be/src/util/cpu_info.h:
##########
@@ -148,6 +148,56 @@ class CpuInfo {
 
     static std::string debug_string();
 
+    static long get_cache_size(CacheLevel level) {
+        long cache_sizes[NUM_CACHE_LEVELS];
+        long cache_line_sizes[NUM_CACHE_LEVELS];
+        _get_cache_info(cache_sizes, cache_line_sizes);
+        return cache_sizes[level];
+    }
+
+    static long get_cache_line_size(CacheLevel level) {
+        long cache_sizes[NUM_CACHE_LEVELS];
+        long cache_line_sizes[NUM_CACHE_LEVELS];
+        _get_cache_info(cache_sizes, cache_line_sizes);
+        return cache_line_sizes[level];
+    }
+
+    struct StreamingHtMinReductionEntry {
+        long min_ht_mem;
+        double streaming_ht_min_reduction;
+    };
+
+    static const std::vector<StreamingHtMinReductionEntry>& 
get_streaming_ht_min_reduction() {

Review Comment:
   **[Duplicate Type / ODR Concern]** `StreamingHtMinReductionEntry` is already 
defined in `be/src/exec/operator/streaming_agg_min_reduction.h` with `int 
min_ht_mem`, but this new definition uses `long min_ht_mem`. Both headers are 
included in the same translation units (`streaming_aggregation_operator.cpp` 
and `distinct_streaming_aggregation_operator.cpp`), which would cause a 
compilation error or ODR violation.
   
   Also, this aggregation-specific struct does not belong in a general-purpose 
`CpuInfo` class. `CpuInfo` should only expose cache size information; the 
streaming aggregation reduction logic belongs near the aggregation operators.



##########
be/src/exec/operator/distinct_streaming_aggregation_operator.cpp:
##########
@@ -25,6 +25,7 @@
 #include "common/compiler_util.h" // IWYU pragma: keep
 #include "exec/operator/streaming_agg_min_reduction.h"
 #include "exprs/vectorized_agg_fn.h"
+#include "util/cpu_info.h"

Review Comment:
   **[Dead Code]** This `#include "util/cpu_info.h"` is added but never used — 
`_should_expand_preagg_hash_tables()` at line 96-98 still uses the old static 
`STREAMING_HT_MIN_REDUCTION` / `SINGLE_BE_STREAMING_HT_MIN_REDUCTION` arrays 
from `streaming_agg_min_reduction.h`. The new 
`CpuInfo::get_streaming_ht_min_reduction()` is never called.



##########
be/src/exec/operator/streaming_aggregation_operator.cpp:
##########
@@ -24,18 +24,14 @@
 
 #include "common/cast_set.h"
 #include "common/compiler_util.h" // IWYU pragma: keep
-#include "exec/operator/operator.h"
 #include "exec/operator/streaming_agg_min_reduction.h"
 #include "exprs/aggregate/aggregate_function_simple_factory.h"
 #include "exprs/vectorized_agg_fn.h"
 #include "exprs/vslot_ref.h"

Review Comment:
   **[Dead Code]** Same as the other file: `#include "util/cpu_info.h"` is 
added but `CpuInfo::get_streaming_ht_min_reduction()` is never called. The code 
at line 194-196 still uses the old static `STREAMING_HT_MIN_REDUCTION` arrays. 
The PR has no functional effect.



##########
be/src/util/cpu_info.h:
##########
@@ -148,6 +148,56 @@ class CpuInfo {
 
     static std::string debug_string();
 
+    static long get_cache_size(CacheLevel level) {
+        long cache_sizes[NUM_CACHE_LEVELS];
+        long cache_line_sizes[NUM_CACHE_LEVELS];
+        _get_cache_info(cache_sizes, cache_line_sizes);
+        return cache_sizes[level];
+    }
+
+    static long get_cache_line_size(CacheLevel level) {
+        long cache_sizes[NUM_CACHE_LEVELS];
+        long cache_line_sizes[NUM_CACHE_LEVELS];
+        _get_cache_info(cache_sizes, cache_line_sizes);
+        return cache_line_sizes[level];
+    }
+
+    struct StreamingHtMinReductionEntry {
+        long min_ht_mem;
+        double streaming_ht_min_reduction;
+    };
+
+    static const std::vector<StreamingHtMinReductionEntry>& 
get_streaming_ht_min_reduction() {
+        static std::vector<StreamingHtMinReductionEntry> entries;
+        static bool initialized = false;
+
+        if (!initialized) {
+            long l2_cache_size = CpuInfo::get_cache_size(CpuInfo::L2_CACHE);

Review Comment:
   **[Thread Safety]** This lazy initialization pattern is **not thread-safe**. 
While C++11 guarantees thread-safe initialization of `static` local variables, 
that only applies to the initial construction. Here, `entries` is 
default-constructed (empty vector), and then mutated via `push_back()` 
conditionally on a separate `static bool initialized`. Two threads could both 
observe `initialized == false` simultaneously, leading to a data race on 
`entries`.
   
   The idiomatic fix is to use a lambda for initialization:
   ```cpp
   static const auto entries = []() {
       std::vector<StreamingHtMinReductionEntry> v;
       long l2 = CpuInfo::get_cache_size(CpuInfo::L2_CACHE);
       long l3 = CpuInfo::get_cache_size(CpuInfo::L3_CACHE);
       v.push_back({0, 0.0});
       // ... rest of logic ...
       return v;
   }();
   return entries;
   ```
   This leverages C++11's guarantee that the lambda runs exactly once, 
thread-safely.



##########
be/src/util/cpu_info.h:
##########
@@ -148,6 +148,56 @@ class CpuInfo {
 
     static std::string debug_string();
 
+    static long get_cache_size(CacheLevel level) {

Review Comment:
   **[Performance]** Each call to `get_cache_size()` invokes 
`_get_cache_info()` which performs syscalls (`sysconf` on Linux, `sysctlbyname` 
on macOS) to read all 3 cache levels. Cache sizes are hardware constants that 
don't change at runtime. Consider reading them once during `CpuInfo::init()` 
and storing them as static members, similar to how `num_cores_`, 
`hardware_flags_`, etc. are handled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [exec](feat) Dynamic streaming hash table min reduction based on CPU cache [doris]

Reply via email to