Hi community, I am using ORC C++ with filter pushdown (following an approach similar to the one in TestPredicatePushdown.cc). While varying rowIndexStride, I found that for a low-selectivity query, where a smaller rowIndexStride should eliminate more IO, the scan time actually goes up. This typically happens when rowIndexStride is below 1000.
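For context, here is a rough sketch of how the amount of index data grows as the stride shrinks. It assumes one row index entry per row group per column; the helper names are hypothetical and not part of the ORC C++ API, and the row and column counts in the note below are made up for illustration, not taken from my files.

```cpp
#include <cstdint>

// Hypothetical helper, not an ORC API call.
// Number of row groups in a stripe: ceil(rowsInStripe / rowIndexStride).
std::uint64_t rowGroupsPerStripe(std::uint64_t rowsInStripe,
                                 std::uint64_t rowIndexStride) {
    if (rowIndexStride == 0) {
        return 0;  // guard against division by zero
    }
    return (rowsInStripe + rowIndexStride - 1) / rowIndexStride;
}

// Hypothetical helper, not an ORC API call.
// Index entries to decode for one stripe, assuming one entry
// per row group per column.
std::uint64_t indexEntriesPerStripe(std::uint64_t rowsInStripe,
                                    std::uint64_t rowIndexStride,
                                    std::uint64_t columns) {
    return rowGroupsPerStripe(rowsInStripe, rowIndexStride) * columns;
}
```

Under these assumptions, a stripe of 1,000,000 rows with 10 columns has 1,000 index entries at rowIndexStride=10000 but 100,000 at rowIndexStride=100, so the index work per stripe scales inversely with the stride.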
Simple perf profiling shows that in an extreme case, with rowIndexStride=100, the extra time comes from loadStripeIndex(). Why is that? Is it the cost of protobuf parsing for a large number of index entries?

Thanks a lot,
Xinyu