rymarm commented on PR #2937: URL: https://github.com/apache/drill/pull/2937#issuecomment-2326078335
@paul-rogers, there is no new feature here. Drill has read the metadata of all Parquet files during the planning phase for a long time. Moreover, we even have a feature called "Parquet metadata cache", intended to address the downside of this approach: planning can take significant time when it has to read metadata from many distinct Parquet files.

> Parquet metadata caching is a feature that enables Drill to read a single metadata cache file instead of retrieving metadata from multiple Parquet files during the query-planning phase
> ...
> Metadata caching is useful when planning time is a significant percentage of the total elapsed time of the query

https://drill.apache.org/docs/optimizing-parquet-metadata-reading/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org