github-actions[bot] commented on code in PR #61326:
URL: https://github.com/apache/doris/pull/61326#discussion_r2979831391
##########
be/src/util/cpu_info.h:
##########
@@ -148,6 +148,56 @@ class CpuInfo {
static std::string debug_string();
+ static long get_cache_size(CacheLevel level) {
+ long cache_sizes[NUM_CACHE_LEVELS];
+ long cache_line_sizes[NUM_CACHE_LEVELS];
+ _get_cache_info(cache_sizes, cache_line_sizes);
+ return cache_sizes[level];
+ }
+
+ static long get_cache_line_size(CacheLevel level) {
+ long cache_sizes[NUM_CACHE_LEVELS];
+ long cache_line_sizes[NUM_CACHE_LEVELS];
+ _get_cache_info(cache_sizes, cache_line_sizes);
+ return cache_line_sizes[level];
+ }
+
+ struct StreamingHtMinReductionEntry {
+ long min_ht_mem;
+ double streaming_ht_min_reduction;
+ };
+
+ static const std::vector<StreamingHtMinReductionEntry>&
get_streaming_ht_min_reduction() {
Review Comment:
**[Duplicate Type / ODR Concern]** `StreamingHtMinReductionEntry` is already
defined in `be/src/exec/operator/streaming_agg_min_reduction.h` with `int
min_ht_mem`, but this new definition uses `long min_ht_mem`. Both headers are
included in the same translation units (`streaming_aggregation_operator.cpp`
and `distinct_streaming_aggregation_operator.cpp`), which would cause a
compilation error or ODR violation.
Also, this aggregation-specific struct does not belong in a general-purpose
`CpuInfo` class. `CpuInfo` should only expose cache size information; the
streaming aggregation reduction logic belongs near the aggregation operators.
##########
be/src/exec/operator/distinct_streaming_aggregation_operator.cpp:
##########
@@ -25,6 +25,7 @@
#include "common/compiler_util.h" // IWYU pragma: keep
#include "exec/operator/streaming_agg_min_reduction.h"
#include "exprs/vectorized_agg_fn.h"
+#include "util/cpu_info.h"
Review Comment:
**[Dead Code]** This `#include "util/cpu_info.h"` is added but never used —
`_should_expand_preagg_hash_tables()` at line 96-98 still uses the old static
`STREAMING_HT_MIN_REDUCTION` / `SINGLE_BE_STREAMING_HT_MIN_REDUCTION` arrays
from `streaming_agg_min_reduction.h`. The new
`CpuInfo::get_streaming_ht_min_reduction()` is never called.
##########
be/src/exec/operator/streaming_aggregation_operator.cpp:
##########
@@ -24,18 +24,14 @@
#include "common/cast_set.h"
#include "common/compiler_util.h" // IWYU pragma: keep
-#include "exec/operator/operator.h"
#include "exec/operator/streaming_agg_min_reduction.h"
#include "exprs/aggregate/aggregate_function_simple_factory.h"
#include "exprs/vectorized_agg_fn.h"
#include "exprs/vslot_ref.h"
Review Comment:
**[Dead Code]** Same as the other file: `#include "util/cpu_info.h"` is
added but `CpuInfo::get_streaming_ht_min_reduction()` is never called. The code
at line 194-196 still uses the old static `STREAMING_HT_MIN_REDUCTION` arrays.
The PR has no functional effect.
##########
be/src/util/cpu_info.h:
##########
@@ -148,6 +148,56 @@ class CpuInfo {
static std::string debug_string();
+ static long get_cache_size(CacheLevel level) {
+ long cache_sizes[NUM_CACHE_LEVELS];
+ long cache_line_sizes[NUM_CACHE_LEVELS];
+ _get_cache_info(cache_sizes, cache_line_sizes);
+ return cache_sizes[level];
+ }
+
+ static long get_cache_line_size(CacheLevel level) {
+ long cache_sizes[NUM_CACHE_LEVELS];
+ long cache_line_sizes[NUM_CACHE_LEVELS];
+ _get_cache_info(cache_sizes, cache_line_sizes);
+ return cache_line_sizes[level];
+ }
+
+ struct StreamingHtMinReductionEntry {
+ long min_ht_mem;
+ double streaming_ht_min_reduction;
+ };
+
+ static const std::vector<StreamingHtMinReductionEntry>&
get_streaming_ht_min_reduction() {
+ static std::vector<StreamingHtMinReductionEntry> entries;
+ static bool initialized = false;
+
+ if (!initialized) {
+ long l2_cache_size = CpuInfo::get_cache_size(CpuInfo::L2_CACHE);
Review Comment:
**[Thread Safety]** This lazy initialization pattern is **not thread-safe**.
While C++11 guarantees thread-safe initialization of `static` local variables,
that only applies to the initial construction. Here, `entries` is
default-constructed (empty vector), and then mutated via `push_back()`
conditionally on a separate `static bool initialized`. Two threads could both
observe `initialized == false` simultaneously, leading to a data race on
`entries`.
The idiomatic fix is to use a lambda for initialization:
```cpp
static const auto entries = []() {
std::vector<StreamingHtMinReductionEntry> v;
long l2 = CpuInfo::get_cache_size(CpuInfo::L2_CACHE);
long l3 = CpuInfo::get_cache_size(CpuInfo::L3_CACHE);
v.push_back({0, 0.0});
// ... rest of logic ...
return v;
}();
return entries;
```
This leverages C++11's guarantee that the lambda runs exactly once,
thread-safely.
##########
be/src/util/cpu_info.h:
##########
@@ -148,6 +148,56 @@ class CpuInfo {
static std::string debug_string();
+ static long get_cache_size(CacheLevel level) {
Review Comment:
**[Performance]** Each call to `get_cache_size()` invokes
`_get_cache_info()` which performs syscalls (`sysconf` on Linux, `sysctlbyname`
on macOS) to read all 3 cache levels. Cache sizes are hardware constants that
don't change at runtime. Consider reading them once during `CpuInfo::init()`
and storing them as static members, similar to how `num_cores_`,
`hardware_flags_`, etc. are handled.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]