Copilot commented on code in PR #13320:
URL: https://github.com/apache/trafficserver/pull/13320#discussion_r3470801724
##########
tools/benchmark/CMakeLists.txt:
##########
@@ -33,17 +33,5 @@ target_link_libraries(benchmark_ProxyAllocator PRIVATE
Catch2::Catch2WithMain ts
add_executable(benchmark_SharedMutex benchmark_SharedMutex.cc)
target_link_libraries(benchmark_SharedMutex PRIVATE Catch2::Catch2 ts::tscore
libswoc::libswoc)
-add_executable(benchmark_Random benchmark_Random.cc)
-target_link_libraries(benchmark_Random PRIVATE Catch2::Catch2WithMain
ts::tscore)
-
-add_executable(benchmark_HostDB benchmark_HostDB.cc)
-target_link_libraries(
- benchmark_HostDB
- PRIVATE ts::tscore
- ts::tsutil
- ts::inkevent
- ts::http
- ts::http_remap
- ts::inkcache
- ts::inkhostdb
-)
+add_executable(benchmark_ascii_tolower benchmark_ascii_tolower.cc)
+target_link_libraries(benchmark_ascii_tolower PRIVATE Catch2::Catch2WithMain
ts::tscore)
Review Comment:
benchmark_Random and benchmark_HostDB source files still exist in
tools/benchmark/, but their targets were removed from this CMakeLists. This
leaves dead benchmark sources and likely wasn’t intended; either restore the
targets or delete the unused sources.
##########
src/tscore/unit_tests/test_ink_base64.cc:
##########
@@ -30,209 +32,174 @@
#include <tscore/ink_base64.h>
+#include "../ink_base64_scalar.h" // scalar oracle: encode_scalar /
decode_scalar / count_alphabet_prefix
+
#include <catch2/catch_test_macros.hpp>
-#include <algorithm>
#include <cstdint>
-#include <cstring>
#include <random>
#include <string>
#include <vector>
Review Comment:
This test now uses std::max and std::copy but no longer includes
<algorithm>. Some standard libraries won’t provide those declarations
indirectly, so this can fail to compile.
##########
src/tscore/CMakeLists.txt:
##########
@@ -159,6 +170,8 @@ if(BUILD_TESTING)
unit_tests/test_Tokenizer.cc
unit_tests/test_arena.cc
unit_tests/test_ink_base64.cc
+ unit_tests/test_ink_ascii_tolower.cc
+ unit_tests/test_ink_base64.cc
unit_tests/test_ink_inet.cc
unit_tests/test_ink_memory.cc
Review Comment:
test_tscore lists unit_tests/test_ink_base64.cc twice, which can lead to
duplicate compilation / duplicate symbols depending on generator behavior. It
should only be listed once.
##########
tests/fuzzing/fuzz_base64.cc:
##########
@@ -0,0 +1,115 @@
+/** @file
+
+ fuzzing tscore base64 (ats_base64_encode / ats_base64_decode)
+
+ Treats the fuzz input as untrusted base64 and decodes it, then round-trips
+ arbitrary bytes through encode/decode. Every operation is cross-checked
+ against the scalar reference primitives, so any divergence of the public
+ (SIMD, when ENABLE_HIGHWAY_DISPATCH is on) path from the scalar path aborts.
+ Run under AddressSanitizer to catch any out-of-bounds access on the
+ untrusted-input decode path.
+
+ @section license License
+
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+#include <cstddef> // size_t, used by ink_base64.h
+
+#include <tscore/ink_base64.h>
+
+#include "ink_base64_scalar.h" // scalar reference (added to include path by
CMake)
+
+#include <cstdint>
+#include <cstdlib>
+#include <cstring>
+#include <string>
+#include <vector>
+
Review Comment:
std::fprintf is used but <cstdio> is not included. Relying on indirect
includes is non-portable and can break compilation on some platforms/toolchains.
##########
tools/benchmark/benchmark_ascii_tolower.cc:
##########
@@ -0,0 +1,138 @@
+/** @file
+
+ Micro benchmark for ts::ascii::tolower_copy against a byte-at-a-time
+ scalar loop equivalent to the prior URL.cc::memcpy_tolower definition.
+
+ @section license License
+
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ */
+
+#define CATCH_CONFIG_ENABLE_BENCHMARKING
+
+#include <catch2/catch_test_macros.hpp>
+#include <catch2/benchmark/catch_benchmark.hpp>
+#include <catch2/benchmark/catch_optimizer.hpp>
+
+#include "tscore/ink_ascii_tolower.h"
+#include "tscore/ParseRules.h"
+
+#include <array>
+#include <cstdint>
+#include <cstring>
+#include <iostream>
+#include <random>
+#include <string>
+#include <vector>
+
+namespace
+{
+
+// Sizes chosen to mirror the URL.cc hot path:
+// 4-8B - common HTTP scheme strings ("http", "https")
+// 16-32B - typical host names
+// 64-256B - long host names / cache-key segments
+// 1024B - stress the inner loop
+constexpr std::array<std::size_t, 8> kSizes{4, 8, 16, 24, 32, 64, 256, 1024};
+
+// Same character distribution we expect from URL/host input: ASCII letters
+// (mixed case), digits, and the small set of non-alpha bytes that legitimately
+// appear in URLs.
+std::vector<char>
+make_mixed_case_ascii(std::size_t n, std::uint64_t seed = 0xABCDEFULL)
+{
+ std::mt19937_64 rng(seed);
+ std::vector<char> v(n);
+ for (std::size_t i = 0; i < n; ++i) {
+ auto r = static_cast<unsigned>(rng() & 0x3FU);
+ if (r < 26U) {
+ v[i] = static_cast<char>('A' + r);
+ } else if (r < 52U) {
+ v[i] = static_cast<char>('a' + (r - 26U));
+ } else {
+ static constexpr char kNonAlpha[] = "0123456789-_./:";
+ v[i] = kNonAlpha[r % (sizeof(kNonAlpha) -
1U)];
+ }
+ }
+ return v;
+}
+
+// Mirror of the prior static inline memcpy_tolower() from URL.cc, kept here
+// as the baseline the SIMD path is expected to beat.
+inline void
+tolower_scalar(char *d, const char *s, std::size_t n) noexcept
+{
+ while (n--) {
+ *d = ParseRules::ink_tolower(*s);
+ ++s;
+ ++d;
+ }
+}
+
+} // namespace
+
+TEST_CASE("active SIMD configuration", "[tolower][config]")
+{
+ // Print the configuration so the benchmark output makes the selected
+ // implementation obvious.
+ std::cout << "ts::ascii::tolower_copy implementation: ";
+#if TS_HAS_HIGHWAY_DISPATCH
+ std::cout << "Highway runtime dispatch (selects best available target at
startup)";
+#elif defined(__AVX512BW__)
+ std::cout << "compile-time cascade — AVX-512BW (64B body + masked tail,
gated at n>=64) + AVX2 + SSE2";
+#elif defined(__AVX2__)
+ std::cout << "compile-time cascade — AVX2 (32B body) + SSE2 (16B drain)";
+#elif defined(__SSE2__)
+ std::cout << "compile-time cascade — SSE2 (16B body)";
+#elif defined(__ARM_NEON) || defined(__aarch64__)
+ std::cout << "compile-time cascade — NEON (16B body)";
+#else
+ std::cout << "compile-time cascade — scalar only";
+#endif
+ std::cout << '\n';
+ SUCCEED();
+}
+
+TEST_CASE("tolower throughput", "[bench][tolower]")
+{
+ for (std::size_t sz : kSizes) {
+ auto input = make_mixed_case_ascii(sz);
+ std::vector<char> output_scalar(sz);
+ std::vector<char> output_simd(sz);
+
+ // Catch::Benchmark::keep_memory clobbers the buffer in the compiler's
+ // model, forcing it to materialize every byte we wrote. Without this an
+ // optimizing compiler can shrink or DCE the inline body's stores past
+ // the first element we observed.
+
+ std::string scalar_name = "scalar " + std::to_string(sz) + "B";
+ BENCHMARK(scalar_name.c_str())
+ {
+ tolower_scalar(output_scalar.data(), input.data(), sz);
+ Catch::Benchmark::keep_memory(output_scalar.data());
+ return output_scalar[0];
+ };
+
+ std::string simd_name = "ts::atc " + std::to_string(sz) + "B";
Review Comment:
The benchmark label "ts::atc" is unclear/likely a typo; it makes the output
harder to interpret. Consider using a name that matches the API being measured
(ts::ascii::tolower_copy).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]