kszlim commented on issue #12357:
URL: https://github.com/apache/datafusion/issues/12357#issuecomment-2345293094

   One vote here for the other use case. I'd like DataFusion to be usable as a 
single-node query engine (alongside a nice dataframe API). This is in the works 
in the datafusion-python bindings, but I'd personally love for this use 
case to get as much priority as DataFusion as a library for building other DB 
products on top of.
   
   I really think that with a combination of really strong Python bindings 
(ensuring that all extension points are also appropriately exposed to Python), 
https://github.com/apache/datafusion/issues/4285, and a lot of work on making 
the docs and the Python bindings as nice as those of polars, DataFusion could 
become *the* go-to solution for ETL/OLAP/ML/data engineering/etc. use cases.
   
   DataFusion has a lot of really excellent foundational engineering. The fact 
that so many downstream DB engines use it attests strongly to that. I think 
it's a real shame that it isn't quite as suitable for the role that 
pandas/dask/polars/duckdb currently occupy. This isn't due to anything 
lacking in the query engine; rather, the overall user experience for someone 
using it directly isn't quite as solid as it is for someone using it as a 
library.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

