iemejia commented on code in PR #12395:
URL: https://github.com/apache/gluten/pull/12395#discussion_r3493627404
##########
cpp/velox/compute/delta/DeltaDeletionVectorReader.cpp:
##########
@@ -183,15 +183,29 @@ void
DeltaDeletionVectorReader::applyDeletionFilter(uint64_t baseReadOffset, uin
auto* rawBitmap = deleteBitmap->asMutable<uint64_t>();
std::memset(rawBitmap, 0, bits::nbytes(size));
+ // Use an iterator-based approach instead of per-row contains() lookups.
+ // This is O(deletions_in_range) rather than O(batch_size), which is
+ // significantly faster when deletions are sparse relative to batch size.
+ const uint64_t rangeEnd = baseReadOffset + size;
Review Comment:
Fixed — added saturating arithmetic so `rangeEnd` clamps to `UINT64_MAX`
instead of wrapping:
```cpp
const uint64_t rangeEnd = (size <= UINT64_MAX - baseReadOffset)
? baseReadOffset + size : UINT64_MAX;
```
Also added a unit test (`ApplyDeletionFilterOverflowProtection`) that
exercises `baseReadOffset = UINT64_MAX - 50` with `size = 100` to verify the
guard works correctly.
##########
cpp/velox/benchmarks/DeltaBitmapBenchmark.cc:
##########
@@ -276,4 +280,69 @@ BENCHMARK_CAPTURE(
->Args({1 << 18, 64})
->Unit(benchmark::kMillisecond);
+// Benchmark for applyDeletionFilter: measures the hot path where a batch of
+// rows is checked against the deletion vector bitmap.
+// deletionPercent: fraction of rows in the total file that are deleted.
+// batchSize: number of rows per batch (typical Velox batch size).
+void BM_ApplyDeletionFilter(benchmark::State& state, double deletionPercent) {
+ const auto batchSize = static_cast<uint64_t>(state.range(0));
+ const uint64_t totalFileRows = 1000000; // 1M row file
+ const auto numDeleted = static_cast<uint64_t>(totalFileRows *
deletionPercent / 100.0);
+
+ // Build a DV with deletions spread across the file.
+ RoaringBitmapArray bitmap;
+ const uint64_t stride = numDeleted > 0 ? totalFileRows / numDeleted : 0;
+ for (uint64_t i = 0; i < numDeleted; ++i) {
+ bitmap.addSafe(i * stride);
+ }
+ const auto payload = bitmap.serializeToString(true);
+
+ // Load the DV reader.
+ DeltaDeletionVectorReader reader;
+ reader.loadSerializedDeletionVector(std::string_view(payload.data(),
payload.size()));
+
+ // Allocate the output bitmap buffer.
+ memory::MemoryManager::testingSetInstance(memory::MemoryManager::Options{});
+ auto pool = memory::memoryManager()->addLeafPool();
Review Comment:
Good catch — moved `testingSetInstance` out of the benchmark body into a
custom `main()` that initializes once before running benchmarks. This mirrors
the pattern in `GenericBenchmark.cc:642`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]