[C++] Setting rowIndexStride to a small value increases query time

Xinyu Z Mon, 05 Sep 2022 01:15:02 -0700

Hi community,

I am using the ORC C++ library with filter pushdown (following an
approach similar to TestPredicatePushdown.cc). While varying
rowIndexStride, I found that for a low-selectivity query, where a
smaller rowIndexStride should let the reader skip more IO, the scan
time actually goes up. This typically happens when rowIndexStride is
below 1000.


Simple perf profiling shows that in an extreme case, with
rowIndexStride=100, the time is spent in loadStripeIndex(). Why is
that? Is it because of the cost of protobuf parsing for so many index
entries?
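
A back-of-envelope sketch of how the number of index entries could grow
as rowIndexStride shrinks. It assumes one row index entry per row group
per column, as described in the ORC format specification. The helper
names below are hypothetical and are not part of the ORC C++ API, and
any stripe size or column count passed in is illustrative, not a
measurement.

```cpp
#include <cstdint>

// Hypothetical helper (not an ORC API): number of row groups in a stripe.
// Assumes a stride of 0 means "row index disabled", so no row groups.
constexpr std::uint64_t rowGroupsPerStripe(std::uint64_t rowsInStripe,
                                           std::uint64_t rowIndexStride) {
  return rowIndexStride == 0
             ? 0
             : (rowsInStripe + rowIndexStride - 1) / rowIndexStride;
}

// Hypothetical helper (not an ORC API): total row index entries that a
// reader would have to parse for one stripe, assuming one entry per row
// group for each selected column.
constexpr std::uint64_t indexEntriesPerStripe(std::uint64_t rowsInStripe,
                                              std::uint64_t rowIndexStride,
                                              std::uint64_t selectedColumns) {
  return rowGroupsPerStripe(rowsInStripe, rowIndexStride) * selectedColumns;
}
```

For an illustrative stripe of 1,000,000 rows and 10 selected columns,
indexEntriesPerStripe(1000000, 10000, 10) gives 1,000 entries, while
indexEntriesPerStripe(1000000, 100, 10) gives 100,000. If parsing cost
is roughly proportional to the number of entries, that would be about
100x more index work per stripe at rowIndexStride=100.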

Thanks a lot,
Xinyu
