[GitHub] [arrow] westonpace commented on issue #36608: [C++] Support reading Hadoop-snappy File Format Directly

via GitHub Thu, 13 Jul 2023 06:37:58 -0700


westonpace commented on issue #36608:
URL: https://github.com/apache/arrow/issues/36608#issuecomment-1634261236


   Are you reading CSV files?  If not, then I am not sure that a streaming view 
will be very helpful.  We generally rely on random access to metadata, which 
allows us to implement things like column skipping (projection pushdown).  This 
is why formats like the Arrow format and the Parquet format support 
buffer-level compression (instead of whole-file compression): it allows the 
metadata to stay uncompressed.
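
To illustrate the point about random access, here is a minimal sketch of a toy columnar layout. It is not the Arrow or Parquet on-disk format: the JSON footer, the 4-byte trailer, and the names `write_table` and `read_column` are all assumptions made up for this example, and `zlib` stands in for snappy so the sketch needs only the standard library. Each column buffer is compressed on its own and the index is left uncompressed, so a reader can seek straight to one column.

```python
import io
import json
import struct
import zlib


def write_table(columns):
    """Serialize {name: bytes}, compressing each column buffer separately."""
    out = io.BytesIO()
    index = {}
    for name, raw in columns.items():
        compressed = zlib.compress(raw)
        index[name] = {"offset": out.tell(), "length": len(compressed)}
        out.write(compressed)
    footer = json.dumps(index).encode("utf-8")
    out.write(footer)                          # metadata stays uncompressed
    out.write(struct.pack("<I", len(footer)))  # fixed-size trailer locates the footer
    return out.getvalue()


def read_column(f, name):
    """Read one column using seeks only; other column buffers are never read."""
    f.seek(-4, io.SEEK_END)
    (footer_len,) = struct.unpack("<I", f.read(4))
    f.seek(-4 - footer_len, io.SEEK_END)
    index = json.loads(f.read(footer_len))
    entry = index[name]
    f.seek(entry["offset"])
    return zlib.decompress(f.read(entry["length"]))


data = write_table({"id": b"1,2,3", "payload": b"x" * 100_000})
ids = read_column(io.BytesIO(data), "id")
```

If the whole file were compressed as a single stream instead, the footer could not be located without decompressing everything before it, and skipping the `payload` column would no longer be possible.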
   
   When you talk about blocks, are you talking about the snappy framing format?  
https://github.com/google/snappy/blob/main/framing_format.txt
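
For reference, a rough sketch of how chunks are laid out in that framing format, as I read the linked document: each chunk has a 1-byte type followed by a 3-byte little-endian length, and the stream opens with an identifier chunk whose payload is `sNaPpY`. The helper name `iter_chunks` and the hand-built sample stream are assumptions for illustration; the sketch does not decompress anything or verify checksums.

```python
STREAM_IDENTIFIER = b"\xff\x06\x00\x00sNaPpY"


def iter_chunks(stream):
    """Yield (chunk_type, payload) for each chunk in a framed byte string."""
    pos = 0
    while pos < len(stream):
        if pos + 4 > len(stream):
            raise ValueError("truncated chunk header")
        chunk_type = stream[pos]
        length = int.from_bytes(stream[pos + 1:pos + 4], "little")
        payload = stream[pos + 4:pos + 4 + length]
        if len(payload) != length:
            raise ValueError("truncated chunk body")
        yield chunk_type, payload
        pos += 4 + length


# Hand-built sample: stream identifier plus one uncompressed data chunk (type 0x01).
body = b"hello"
fake_crc = b"\x00\x00\x00\x00"  # placeholder only, NOT a valid masked CRC-32C
chunk = b"\x01" + (len(fake_crc) + len(body)).to_bytes(3, "little") + fake_crc + body
chunks = list(iter_chunks(STREAM_IDENTIFIER + chunk))
```

Note that the chunks can only be found by walking the headers from the start of the stream; there is no index, which is the sense in which this is a streaming layout rather than a random-access one.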
   


