alamb commented on PR #5605: URL: https://github.com/apache/arrow-rs/pull/5605#issuecomment-2059397523
Maybe we can add a way for the parquet arrow reader to override its choice of data type for certain columns, so that users can specify types in cases where the type is not clear from the parquet file itself. @mapleFU and I have been discussing the need for something similar for deciding which Array type to use when reading strings from Parquet files on https://github.com/apache/arrow-rs/issues/5530 (see https://github.com/apache/arrow-rs/issues/5530#issuecomment-2052223254).

If we had such an API, people using Spark-created parquet files could specify that the timestamp column should always be UTC (as suggested by @tustvold) without having to add an explicit cast afterwards.
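
To make the idea a bit more concrete, here is a rough, std-only sketch of what a per-column override could look like. Everything in it is hypothetical: `DataType` is a simplified stand-in for the Arrow type, and `TypeOverrides`, `with_override`, and `resolve` are illustrative names, not existing APIs in the parquet crate.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified stand-in for an Arrow data type.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Utf8View,
    /// Timestamp with an optional timezone, e.g. `Some("UTC")`.
    Timestamp(Option<String>),
}

/// Hypothetical per-column type overrides, keyed by column name.
#[derive(Debug, Default)]
struct TypeOverrides {
    by_column: HashMap<String, DataType>,
}

impl TypeOverrides {
    /// Register the type the user wants for `column` (builder style).
    fn with_override(mut self, column: &str, data_type: DataType) -> Self {
        self.by_column.insert(column.to_string(), data_type);
        self
    }

    /// Return the user-supplied type if there is one, otherwise fall back
    /// to the type the reader inferred from the file metadata.
    fn resolve(&self, column: &str, inferred: DataType) -> DataType {
        self.by_column.get(column).cloned().unwrap_or(inferred)
    }
}
```

With something along these lines, a user reading a Spark-written file could register `Timestamp(Some("UTC"))` for the timestamp column once, and every other column would keep whatever type the reader infers. A real implementation would also need to check that the requested type is compatible with the column's physical type, which this sketch does not attempt.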
