akevinge commented on issue #7036: URL: https://github.com/apache/arrow-datafusion/issues/7036#issuecomment-1646699105
> This is currently supported for CSV files but not Parquet. I think this would be good first issue for new contributors.

It is supported for CSV, but there's a bit of loopy logic for Parquet. The issue is that [manually specifying the schema for a Parquet file will error](https://github.com/apache/arrow-datafusion/blob/49c91b563ad894b2f368690d85402895bdeaa73a/datafusion/sql/src/statement.rs#L636), and so does [not specifying the schema when there's an ordering](https://github.com/apache/arrow-datafusion/blob/49c91b563ad894b2f368690d85402895bdeaa73a/datafusion/core/tests/sql/order.rs#L86). Checkmate: a Parquet table with an ordering fails either way. It works for CSV files because we allow schemas to be specified for them.

Is there a reason behind disallowing schemas for Parquet files? [The docs](https://arrow.apache.org/datafusion/user-guide/sql/ddl.html#create-external-table) say "It is not necessary to provide schema information for Parquet files," which makes it sound optional when the implementation actually disallows it. @ozankabak
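
To make the deadlock concrete, here is a toy model of the two checks described above. It is not DataFusion code: the function name, parameters, and error strings are hypothetical, and it only encodes the two rules stated in this comment (a schema on a Parquet table errors, and an ordering without a schema errors).

```python
from typing import Optional, Sequence


def check_create_external_table(
    file_type: str,
    columns: Sequence[str],
    order_by: Sequence[str],
) -> Optional[str]:
    """Return a (hypothetical) error message, or None if the statement passes.

    Toy model of the two rules described in the comment, not the real planner.
    """
    # Rule 1: an explicit column list is rejected for Parquet files.
    if file_type.upper() == "PARQUET" and columns:
        return "column definitions are not allowed for Parquet files"
    # Rule 2: an ordering is rejected when no explicit schema was given.
    if order_by and not columns:
        return "an ordering requires an explicitly declared schema"
    return None


if __name__ == "__main__":
    cases = [
        ("CSV", ["a"], ["a"]),      # schema + ordering on CSV
        ("PARQUET", ["a"], ["a"]),  # schema + ordering on Parquet
        ("PARQUET", [], ["a"]),     # ordering only on Parquet
        ("PARQUET", [], []),        # plain Parquet table
    ]
    for file_type, columns, order_by in cases:
        result = check_create_external_table(file_type, columns, order_by)
        print(file_type, columns, order_by, "->", result or "ok")
```

Under these two rules, every Parquet statement that carries an ordering is rejected, whether or not it declares columns, while the same statement for CSV passes.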
