Hi Xinyu,

With the row index stride set to 100, each stripe ends up with a very
large number of row groups, and each one contributes a protobuf object to
the stripe index. That is why loadStripeIndex() shows up as the most
expensive function in your profile.
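
To get a feel for the numbers, here is a back-of-the-envelope sketch in
C++. It assumes one index entry per row group per column; the function
names and the example figures (1,000,000 rows per stripe, 20 columns) are
made up for illustration and are not part of the ORC API.

```cpp
#include <cstdint>

// Number of row groups in one stripe: ceil(rowsInStripe / rowIndexStride).
// A stride of 0 is treated as "no row groups" in this sketch.
std::uint64_t rowGroupsPerStripe(std::uint64_t rowsInStripe,
                                 std::uint64_t rowIndexStride) {
  if (rowIndexStride == 0) {
    return 0;
  }
  return (rowsInStripe + rowIndexStride - 1) / rowIndexStride;
}

// Rough count of index entries to parse for one stripe, assuming one
// entry per row group per column (an assumption of this sketch).
std::uint64_t indexEntriesPerStripe(std::uint64_t rowsInStripe,
                                    std::uint64_t rowIndexStride,
                                    std::uint64_t numColumns) {
  return rowGroupsPerStripe(rowsInStripe, rowIndexStride) * numColumns;
}
```

For a hypothetical stripe of 1,000,000 rows and 20 columns, a stride of
10000 gives 100 row groups and 2,000 entries, while a stride of 100 gives
10,000 row groups and 200,000 entries to parse.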

I should also point out that smaller row groups may not reduce I/O,
because compression blocks are by design not aligned to row group
boundaries. For example, if one compression block contains 5 row groups
and only the 3rd row group survives PPD (predicate pushdown), we still
have to read the entire compressed block and decompress the two row
groups that come before the 3rd one.
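
The example above can be put into a toy model in C++. It assumes every
row group takes the same number of uncompressed bytes and every
compression block holds a fixed number of uncompressed bytes; real streams
are not that regular, and the function names are hypothetical rather than
ORC APIs.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Toy model: row group g covers uncompressed bytes
// [g * rowGroupBytes, (g + 1) * rowGroupBytes), and compression block b
// covers [b * blockBytes, (b + 1) * blockBytes).
// Returns the compression blocks that must be read for the selected
// row groups.
std::set<std::uint64_t> blocksToRead(
    const std::vector<std::uint64_t>& selectedRowGroups,
    std::uint64_t rowGroupBytes, std::uint64_t blockBytes) {
  std::set<std::uint64_t> blocks;
  if (rowGroupBytes == 0 || blockBytes == 0) {
    return blocks;
  }
  for (std::uint64_t rg : selectedRowGroups) {
    std::uint64_t firstByte = rg * rowGroupBytes;
    std::uint64_t lastByte = firstByte + rowGroupBytes - 1;
    for (std::uint64_t b = firstByte / blockBytes;
         b <= lastByte / blockBytes; ++b) {
      blocks.insert(b);
    }
  }
  return blocks;
}

// Uncompressed bytes inside the first touched block that have to be
// decompressed and thrown away before the row group's data starts.
std::uint64_t bytesDecodedBeforeRowGroup(std::uint64_t rowGroup,
                                         std::uint64_t rowGroupBytes,
                                         std::uint64_t blockBytes) {
  if (blockBytes == 0) {
    return 0;
  }
  return (rowGroup * rowGroupBytes) % blockBytes;
}
```

With 100-byte row groups and 500-byte blocks, selecting only the 3rd row
group (index 2) still needs all of block 0, and 200 bytes (two row groups)
are decompressed before the wanted data. Shrinking the row groups further
does not change which block has to be read.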

Hope my answer helps.

Best,
Gang

On Mon, Sep 5, 2022 at 4:15 PM Xinyu Z <xzen...@gmail.com> wrote:

> Hi community,
>
> I am using ORC C++ with filter pushdown (using similar approaches in
> TestPredicatePushdown.cc). By varying rowIndexStride, I found that for
> a low selectivity query, which means smaller rowIndexStride should
> eliminate more IO, the scan time even goes up. This typically happens
> when rowIndexStride is below 1000.
>
> A simple perf profiling shows that for an extreme case where I set
> rowIndexStride=100, the time cost is from loadStripeIndex(). I was
> wondering why? Is this because of the cost of protobuf parsing of a
> lot of indexes?
>
> Thanks a lot,
> Xinyu
>
