tustvold commented on code in PR #2854:
URL: https://github.com/apache/arrow-datafusion/pull/2854#discussion_r916720045

##########
README.md:
##########
@@ -21,52 +21,70 @@
   <img src="docs/source/_static/images/DataFusion-Logo-Background-White.svg" width="256"/>

-DataFusion is an extensible query execution framework, written in
+DataFusion is an extensible query planning, optimization, and execution framework, written in
 Rust, that uses [Apache Arrow](https://arrow.apache.org) as its in-memory format.

-DataFusion supports both an SQL and a DataFrame API for building
-logical query plans as well as a query optimizer and execution engine
-capable of parallel execution against partitioned data sources (CSV
-and Parquet) using threads.
+## Features

-DataFusion also supports distributed query execution via the
-[Ballista](https://github.com/apache/arrow-ballista/) crate.
+- SQL query planner with support for multiple SQL dialects
+- DataFrame API
+- Parquet, CSV, JSON, and Avro file formats are supported natively. Custom
+  file formats can be supported by implementing a `TableProvider` trait.
+- Supports popular object stores, including AWS S3, Azure Blob

Review Comment:
   Yes, it also supports GCS
