alamb opened a new pull request, #4427: URL: https://github.com/apache/arrow-datafusion/pull/4427
This is a reworked version of https://github.com/apache/arrow-datafusion/pull/3885

# Which issue does this PR close?

Closes https://github.com/apache/arrow-datafusion/issues/3821

This also helps towards #3887

# Rationale for this change

1. Making it easier for people to see which parquet config options are available makes it more likely that they are used
2. The more mechanisms through which configuration can be supplied, the more likely it is to confuse people

It turns out that options for reading parquet files could be set (and possibly overridden) by no fewer than three different structures! This is confusing, to say the least.

# What changes are included in this PR?

1. Move `metadata_size_hint`, `enable_pruning`, and `merge_schema_metadata` to new config options
2. Make the precedence of the parquet options passed down to the `ParquetExec` clear

# Are there any user-facing changes?

The main change is that all parquet reader settings are now visible session wide. Previously, depending on which of the APIs was used to create / register / run parquet, the settings might change when you changed the session config, or they might have been a snapshot taken when you registered the reader.
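
To make the precedence idea in the change list concrete, here is a minimal, self-contained sketch in plain Rust. It does not use the DataFusion API: `ParquetSettings`, `ReadOverrides`, and `resolve` are hypothetical names invented for illustration, and the rule shown (an explicitly supplied per-read value wins, otherwise the session-wide value is used) is an assumption about the intended precedence, not something this description spells out. The values in `main` are illustrative, not DataFusion defaults.

```rust
/// Hypothetical session-wide parquet settings. The field names mirror the
/// options named in this PR, but the struct itself is not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct ParquetSettings {
    enable_pruning: bool,
    metadata_size_hint: Option<usize>,
    merge_schema_metadata: bool,
}

/// Hypothetical per-read overrides. `None` means "not set by the caller,
/// fall back to the session-wide value".
#[derive(Debug, Clone, Default)]
struct ReadOverrides {
    enable_pruning: Option<bool>,
    metadata_size_hint: Option<usize>,
    merge_schema_metadata: Option<bool>,
}

/// Assumed precedence: an explicit override wins, otherwise the
/// session-wide setting applies.
fn resolve(session: &ParquetSettings, overrides: &ReadOverrides) -> ParquetSettings {
    ParquetSettings {
        enable_pruning: overrides.enable_pruning.unwrap_or(session.enable_pruning),
        metadata_size_hint: overrides.metadata_size_hint.or(session.metadata_size_hint),
        merge_schema_metadata: overrides
            .merge_schema_metadata
            .unwrap_or(session.merge_schema_metadata),
    }
}

fn main() {
    // Illustrative session-wide values only.
    let session = ParquetSettings {
        enable_pruning: true,
        metadata_size_hint: None,
        merge_schema_metadata: false,
    };

    // The caller overrides a single option; the rest come from the session.
    let overrides = ReadOverrides {
        metadata_size_hint: Some(64 * 1024),
        ..Default::default()
    };

    println!("{:?}", resolve(&session, &overrides));
}
```

With a single resolution point like this, there is one place to look when asking which value a given read will actually use, which is the kind of clarity the second change aims for.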
