mapleFU commented on issue #34722:
URL: https://github.com/apache/arrow/issues/34722#issuecomment-1482995057

   The interface:
   
   ```c++
   // Abstract page iterator interface. This way, we can feed column pages to the
   // ColumnReader through whatever mechanism we choose
   class PARQUET_EXPORT PageReader {
     using DataPageFilter = std::function<bool(const DataPageStats&)>;
   
    public:
     virtual ~PageReader() = default;
   
     // @returns: shared_ptr<Page>(nullptr) on EOS, std::shared_ptr<Page>
     // containing new Page otherwise
     virtual std::shared_ptr<Page> NextPage() = 0;
   ```
   
   The actual implementation:
   
   ```c++
   // This subclass delimits pages appearing in a serialized stream, each preceded
   // by a serialized Thrift format::PageHeader indicating the type of each page
   // and the page metadata.
   class SerializedPageReader : public PageReader {
    public:
     SerializedPageReader(std::shared_ptr<ArrowInputStream> stream, int64_t total_num_values,
                          Compression::type codec, const ReaderProperties& properties,
                          const CryptoContext* crypto_ctx, bool always_compressed)
         : properties_(properties),
           stream_(std::move(stream)),
           decompression_buffer_(AllocateBuffer(properties_.memory_pool(), 0)),
           decryption_buffer_(AllocateBuffer(properties_.memory_pool(), 0)) {}
   ```
   
   It (implicitly) depends on `SerializedPageReader`'s buffer.
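
   One way to read the remark above, sketched with toy types: if a reader reuses a single internal buffer and hands out pages that only point into it, a page stays valid only until the next `NextPage()` call. `ToyPage`, `ToyPageReader`, and `ToString` are hypothetical stand-ins for illustration, not parquet-cpp classes, and whether `SerializedPageReader` behaves exactly like this is an assumption here.

   ```c++
   #include <algorithm>
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <memory>
   #include <string>
   #include <utility>
   #include <vector>

   // Hypothetical page type: a non-owning view into the reader's buffer.
   struct ToyPage {
     const uint8_t* data;
     size_t size;
   };

   // Hypothetical reader that reuses one buffer for every page it returns
   // (assumed to be analogous to decompression_buffer_ in the snippet above).
   class ToyPageReader {
    public:
     explicit ToyPageReader(std::vector<std::string> pages)
         : pages_(std::move(pages)) {
       // Size the buffer once so later copies never reallocate it.
       size_t max_size = 0;
       for (const auto& p : pages_) max_size = std::max(max_size, p.size());
       buffer_.resize(max_size);
     }

     // Returns nullptr at end of stream, mirroring the NextPage() contract.
     std::shared_ptr<ToyPage> NextPage() {
       if (next_ >= pages_.size()) return nullptr;
       const std::string& src = pages_[next_++];
       // Overwrites whatever the previously returned page pointed at.
       std::copy(src.begin(), src.end(), buffer_.begin());
       return std::make_shared<ToyPage>(ToyPage{buffer_.data(), src.size()});
     }

    private:
     std::vector<std::string> pages_;
     size_t next_ = 0;
     std::vector<uint8_t> buffer_;  // shared by all returned pages
   };

   // Copies the bytes a page currently points at into an owning string.
   std::string ToString(const ToyPage& page) {
     return std::string(reinterpret_cast<const char*>(page.data), page.size);
   }
   ```

   In this sketch a caller that needs an earlier page after advancing the reader has to copy its bytes first (for example with `ToString`), because the page itself does not own them.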


