ksuarez1423 commented on code in PR #14018:
URL: https://github.com/apache/arrow/pull/14018#discussion_r982681341

##########
docs/source/cpp/parquet.rst:
##########
@@ -32,6 +32,298 @@ is a space-efficient columnar storage format for complex data.
 The Parquet C++ implementation is part of the Apache Arrow project and benefits
 from tight integration with the Arrow C++ classes and facilities.
+Reading Parquet files
+=====================
+
+The :class:`arrow::FileReader` class reads data for an entire
+file or row group into an :class:`::arrow::Table`.
+
+The :class:`StreamReader` and :class:`StreamWriter` classes allow for
+data to be written using a C++ input/output streams approach to
+read/write fields column by column and row by row. This approach is
+offered for ease of use and type-safety. It is of course also useful
+when data must be streamed as files are read and written
+incrementally.
+
+Please note that the performance of the :class:`StreamReader` and
+:class:`StreamWriter` classes will not be as good due to the type
+checking and the fact that column values are processed one at a time.
+
+FileReader
+----------
+
+The Parquet :class:`arrow::FileReader` requires a
+:class:`::arrow::io::RandomAccessFile` instance representing the input
+file.
+
+.. literalinclude:: ../../../cpp/examples/arrow/parquet_read_write.cc
+   :language: cpp
+   :start-after: arrow::Status ReadFullFile(
+   :end-before: return arrow::Status::OK();
+   :emphasize-lines: 9-10,14
+   :dedent: 2
+
+Finer-grained options are available through the

Review Comment:
   The linearity expectation isn't as strong, but I wasn't quite able to realize each snippet was standalone without a fresh header.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
