alamb commented on issue #1329: URL: https://github.com/apache/arrow-datafusion/issues/1329#issuecomment-973284996
There is also a substantial list of "powered by" Arrow systems at https://arrow.apache.org/powered_by/

Something else that may be obvious, but that I want to make explicit: DataFusion doesn't have its own "native" storage format in the way that DuckDB and other database systems do. DataFusion is a query engine that can be used if you have your data in Arrow record batches (or want to load them into memory using `register_record_batches`).

If you are comparing DuckDB and DataFusion, another comparison might be to start with data in parquet files and compare the timings of:

1. The time to load the parquet into DuckDB plus the time to run the query (or just the time to run the query in DuckDB, if it supports external tables)
2. The time needed to run the query in DataFusion directly against parquet
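
To make the two measurements concrete, here is a minimal timing sketch that uses only the Python standard library. The three step functions are hypothetical placeholders, not DuckDB or DataFusion APIs; each would need to be filled in with the real engine calls for your data and query. The point is only that the DuckDB total includes the load step, while the DataFusion total is the query alone.

```python
import time
from typing import Callable, Dict, List, Tuple

# A step is a label plus a zero-argument callable to be timed.
Step = Tuple[str, Callable[[], object]]


def time_steps(steps: List[Step]) -> Dict[str, float]:
    """Run each step once, in order, and return its wall-clock time in seconds."""
    timings: Dict[str, float] = {}
    for name, fn in steps:
        start = time.perf_counter()
        fn()
        timings[name] = time.perf_counter() - start
    return timings


def total_seconds(timings: Dict[str, float]) -> float:
    """Sum the per-step timings into one end-to-end figure."""
    return sum(timings.values())


# Hypothetical placeholders: replace each body with the real engine calls.
def load_parquet_into_duckdb() -> None:
    pass  # load the parquet file into a DuckDB table


def run_query_in_duckdb() -> None:
    pass  # run the query against the loaded DuckDB table


def run_query_in_datafusion_on_parquet() -> None:
    pass  # run the same query in DataFusion directly against the parquet file


# Comparison 1: load + query. Comparison 2: query directly on parquet.
duckdb_timings = time_steps([
    ("load", load_parquet_into_duckdb),
    ("query", run_query_in_duckdb),
])
datafusion_timings = time_steps([
    ("query", run_query_in_datafusion_on_parquet),
])

print(f"DuckDB load + query:         {total_seconds(duckdb_timings):.6f} s")
print(f"DataFusion query on parquet: {total_seconds(datafusion_timings):.6f} s")
```

Each step runs only once here; for a real comparison you would probably want to repeat the runs and account for warm versus cold file caches.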
