sidshehria commented on issue #1032: URL: https://github.com/apache/datafusion-python/issues/1032#issuecomment-2674996952
@timsaucer Yes, I have a few possible solutions in mind. Kindly review them:

**1. Higher-Level Abstractions:**
- Introduce a DataFrame-like API that feels more intuitive, similar to Pandas/Polars.
- Expose simplified query execution methods, reducing the need for manual SQL queries.
- Provide a lazy evaluation mode to optimize performance in large-scale data operations.

**2. Better Integration with Pandas/Polars:**
- Implement direct conversion utilities between DataFusion and Pandas/Polars DataFrames.
- Improve data type compatibility to ensure smooth interoperability.
- Support efficient batch processing, leveraging Arrow’s memory format.

**3. Performance Optimizations in the FFI Layer:**
- Reduce overhead in Python-Rust interop using PyO3/maturin optimizations.
- Optimize data movement between Python and Rust to minimize serialization costs.
- Explore parallel execution to enhance computation speed for large datasets.
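
To make the lazy evaluation idea in point 1 a bit more concrete, here is a rough pure-Python sketch. It is only an illustration of the concept: `LazyFrame`, `filter`, `select`, and `collect` are hypothetical names chosen for this example and are not the datafusion-python API. The sketch assumes rows are plain dictionaries; each method only records a step, and nothing runs until `collect()` is called.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]
Step = Callable[[Iterator[Row]], Iterator[Row]]


@dataclass(frozen=True)
class LazyFrame:
    """Hypothetical toy frame: records operations and runs them on collect()."""

    source: Iterable[Row]
    plan: Tuple[Step, ...] = ()

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyFrame":
        # Record a filtering step; the predicate is not called here.
        def step(rows: Iterator[Row]) -> Iterator[Row]:
            return (row for row in rows if predicate(row))

        return LazyFrame(self.source, self.plan + (step,))

    def select(self, *columns: str) -> "LazyFrame":
        # Record a projection step that keeps only the named columns.
        def step(rows: Iterator[Row]) -> Iterator[Row]:
            return ({name: row[name] for name in columns} for row in rows)

        return LazyFrame(self.source, self.plan + (step,))

    def collect(self) -> List[Row]:
        # Chain the recorded steps as generators, then materialize the result.
        rows: Iterator[Row] = iter(self.source)
        for step in self.plan:
            rows = step(rows)
        return list(rows)


# Usage: track predicate calls to show that work is deferred until collect().
calls: List[int] = []


def is_expensive(row: Row) -> bool:
    calls.append(row["id"])
    return row["price"] > 10


data = [
    {"id": 1, "price": 5.0},
    {"id": 2, "price": 12.5},
    {"id": 3, "price": 20.0},
]

query = LazyFrame(data).filter(is_expensive).select("id")
calls_before_collect = list(calls)  # snapshot taken before any execution
result = query.collect()
```

Because the steps are chained as generators, each row flows through the filter and the projection in a single pass, and building `query` does no work on its own. A real engine would additionally optimize the recorded plan before running it, which this sketch does not attempt.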