[GitHub] [arrow-rs] sundy-li commented on pull request #4299: chore: export fn parquet_to_array_schema_and_fields

via GitHub Mon, 29 May 2023 04:51:39 -0700


sundy-li commented on PR #4299:
URL: https://github.com/apache/arrow-rs/pull/4299#issuecomment-1567035679


   Yes, `AsyncFileReader`'s `get_metadata` can work. 
   
   But:
   1. We don't store the whole metadata of the parquet files; we only store the 
`Vec<ColumnChunkMetaData>` of each leaf column. Since we only write one row 
group, the metadata is much simpler and smaller.
   2. `ParquetRecordBatchStream` runs its IO tasks in a dedicated async 
runtime and decodes in blocking threads. But we have completely separated the 
two steps: we first fetch the `Bytes` in a dedicated async runtime, then 
send the results to a thread pool that decodes them into arrays.
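
   To illustrate the separation described in point 2, here is a minimal sketch 
using only the standard library. It is not our actual code: a plain thread 
stands in for the dedicated async runtime, and `fetch_column_chunk`, 
`decode_chunk` and `fetch_then_decode` are hypothetical names. The "decoding" 
just turns little-endian bytes back into `i32` values, where real code would 
decode Parquet pages into Arrow arrays.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the fetch stage. In the setup described above this
/// would be an async read of a column chunk; here it only produces raw bytes.
fn fetch_column_chunk(values: &[i32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Hypothetical stand-in for the decode stage (CPU-bound work).
/// Real code would decode Parquet pages into Arrow arrays instead.
fn decode_chunk(bytes: &[u8]) -> Vec<i32> {
    bytes
        .chunks_exact(4)
        .map(|b| i32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}

/// Runs fetching and decoding as two separate stages connected by channels.
/// Results are returned in the same order as the input chunks.
fn fetch_then_decode(chunks: Vec<Vec<i32>>, workers: usize) -> Vec<Vec<i32>> {
    let n = chunks.len();
    let (bytes_tx, bytes_rx) = mpsc::channel::<(usize, Vec<u8>)>();
    let (out_tx, out_rx) = mpsc::channel::<(usize, Vec<i32>)>();
    // The receiver is shared by all decode workers.
    let bytes_rx = Arc::new(Mutex::new(bytes_rx));

    // IO stage: only fetches bytes and hands them off, never decodes.
    let io = thread::spawn(move || {
        for (i, chunk) in chunks.iter().enumerate() {
            bytes_tx.send((i, fetch_column_chunk(chunk))).unwrap();
        }
        // `bytes_tx` is dropped here, which closes the channel.
    });

    // Decode stage: a fixed pool of worker threads.
    let pool: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let rx = Arc::clone(&bytes_rx);
            let tx = out_tx.clone();
            thread::spawn(move || loop {
                // The lock guard is released at the end of this statement.
                let msg = rx.lock().unwrap().recv();
                match msg {
                    Ok((i, bytes)) => tx.send((i, decode_chunk(&bytes))).unwrap(),
                    Err(_) => break, // channel closed: no more work
                }
            })
        })
        .collect();
    drop(out_tx); // only the workers hold senders now

    let mut out = vec![Vec::new(); n];
    for (i, decoded) in out_rx {
        out[i] = decoded;
    }
    io.join().unwrap();
    for worker in pool {
        worker.join().unwrap();
    }
    out
}
```

   Calling `fetch_then_decode(chunks, 2)` fetches every chunk on the IO thread 
and decodes on two workers; the index sent with each message keeps the output 
in input order even though workers may finish out of order.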
   
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
