jp0317 commented on code in PR #39818:
URL: https://github.com/apache/arrow/pull/39818#discussion_r1468739708
##########
cpp/src/parquet/column_reader.cc:
##########
@@ -70,6 +70,8 @@ namespace {
// The minimum number of repetition/definition levels to decode at a time, for
// better vectorized performance when doing many smaller record reads
constexpr int64_t kMinLevelBatchSize = 1024;
+// The max buffer size of the validity bitmap for skipping buffered levels.
+constexpr int64_t kMaxSkipLevelBufferSize = 128;
Review Comment:
I tried a 1024-byte buffer and got similar results. The benefit basically
correlates with the number of levels being buffered. I'd prefer to keep the
buffer small to avoid using too much memory, since it is allocated per column.
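To make the trade-off concrete, here is a rough sketch of the arithmetic, assuming the bitmap stores one validity bit per level and that each column reader owns exactly one such buffer. The helper names LevelsPerBitmap and TotalBitmapBytes are hypothetical and are not part of the PR; only kMaxSkipLevelBufferSize comes from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Value taken from the diff in this PR.
constexpr int64_t kMaxSkipLevelBufferSize = 128;

// Assumption: one validity bit per level, so one byte covers 8 levels.
// Hypothetical helper, not part of the PR.
constexpr int64_t LevelsPerBitmap(int64_t bitmap_bytes) {
  return bitmap_bytes * 8;
}

// Assumption: each column reader owns exactly one bitmap buffer, so the
// total cost grows linearly with the number of columns being read.
// Hypothetical helper, not part of the PR.
inline int64_t TotalBitmapBytes(int64_t num_columns, int64_t bitmap_bytes) {
  assert(num_columns >= 0 && bitmap_bytes >= 0);
  return num_columns * bitmap_bytes;
}

// A 128-byte bitmap covers 1024 levels per skip iteration.
static_assert(LevelsPerBitmap(kMaxSkipLevelBufferSize) == 1024,
              "128 bytes * 8 bits per byte");
```

Under these assumptions, a 128-byte bitmap covers 1024 levels per pass, and a 1024-byte bitmap covers 8192. For a hypothetical file with 10,000 columns, the per-column buffers add up to about 1.28 MB at 128 bytes each versus about 10.24 MB at 1024 bytes each.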