fatemah created PARQUET-2175:
--------------------------------

             Summary: Skip method skips levels and not rows for repeated fields
                 Key: PARQUET-2175
                 URL: https://issues.apache.org/jira/browse/PARQUET-2175
             Project: Parquet
          Issue Type: Bug
          Components: parquet-cpp
            Reporter: fatemah


The implementation of the TypedColumnReader::Skip method, with signature:

virtual int64_t Skip(int64_t num_levels_to_skip) = 0;

skips levels for both repeated and non-repeated fields. For repeated fields we
want to be able to skip rows instead; skipping levels is not that useful,
because a single row can span several levels.

For example, for the following rows:

message M { repeated int32 b = 1 }

rows: {}, {[10, 10]}, {[20, 20, 20]}

values = {10, 10, 20, 20, 20};
def_levels = {0, 1, 1, 1, 1, 1};
rep_levels = {0, 0, 1, 0, 1, 1};

We want Skip(2) to skip the first two rows, so that the next value we read is
20. However, it skips the first two levels (the empty row and the first 10),
so the next value we read is 10.
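
To make the difference concrete, here is a minimal sketch of how a row count
could be translated into a level count and a value count, using the fact that
a repetition level of 0 marks the start of a new row. LevelsInRows and
ValuesInLevels are hypothetical helpers written for this illustration; they
are not part of parquet-cpp and do not reflect how the reader is or will be
implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a parquet-cpp API): returns the number of levels
// that make up the first `num_rows` rows. A new row starts at every
// repetition level equal to 0.
int64_t LevelsInRows(const std::vector<int16_t>& rep_levels, int64_t num_rows) {
  int64_t rows_seen = 0;
  for (std::size_t i = 0; i < rep_levels.size(); ++i) {
    if (rep_levels[i] == 0) {
      // Level i opens row number `rows_seen`; stop once enough rows passed.
      if (rows_seen == num_rows) return static_cast<int64_t>(i);
      ++rows_seen;
    }
  }
  // Fewer than `num_rows` + 1 rows are available: every level is covered.
  return static_cast<int64_t>(rep_levels.size());
}

// Hypothetical helper (not a parquet-cpp API): returns how many values are
// stored within the first `num_levels` levels. Only levels whose definition
// level equals `max_def_level` carry a value.
int64_t ValuesInLevels(const std::vector<int16_t>& def_levels,
                       int64_t num_levels, int16_t max_def_level) {
  int64_t num_values = 0;
  for (int64_t i = 0; i < num_levels; ++i) {
    if (def_levels[static_cast<std::size_t>(i)] == max_def_level) ++num_values;
  }
  return num_values;
}
```

With the levels from the example, LevelsInRows(rep_levels, 2) gives 3 levels,
and ValuesInLevels(def_levels, 3, 1) gives 2 values, so the next value read is
values[2] = 20. Skipping 2 levels directly covers only 1 value, which leaves
values[1] = 10 as the next value.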



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
