[GitHub] [arrow] ravwojdyla commented on issue #35393: High (resident) memory usage when fetching Parquet metadata/schema

via GitHub Fri, 05 May 2023 10:40:24 -0700


ravwojdyla commented on issue #35393:
URL: https://github.com/apache/arrow/issues/35393#issuecomment-1536574997


   Thanks to both of you for the suggestions. For existing data I obviously 
can't change any of that. We will look into changing those on the writing side 
for future datasets.
   
   > But I agree that it would be nice if we could get the schema of the file 
without having to parse the row group metadata.
   
   Huge +1 to that. We have a process that crawls existing metadata and looks 
into the schema of Parquet datasets, so this util would be amazing! Should I 
create a separate issue for this, or should we reuse this one?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

