Igosuki commented on issue #1916: URL: https://github.com/apache/arrow-datafusion/issues/1916#issuecomment-1066201917
DataFusion has a huge advantage, but the user story needs to be improved. Also, horizontal scalability and using it for distributed ML are still issues, because that requires being able to map partitions and run multi-stage jobs.

On Sun, 13 Mar 2022 at 22:51, Lin Ma ***@***.***> wrote:

> Also, as a standalone system, Ballista will compete with the heavy weights
> in the category (Spark, Presto..). That is an interesting but very
> ambitious goal 😄
>
> I feel some opportunities/differentiators for Ballista are the following:
>
> 1. non-JVM - this brings a lot of benefit such as lower footprint,
>    memory efficiency, and no GC cost (Rust specific)
> 2. A chance for more modern design principles - for example Spark was
>    originally architected to best deployed to bare metal, it is hard to make
>    some changes to be more cloud friendly
> 3. Utilize modern resource management and orchestration technologies -
>    reusing mature tools like k8s will simplify Ballista's implementation and
>    integrate easily with modern systems
> 4. Using Arrow as the backbone opens doors for more advanced use case
>    such as ML - it may be efficiently integrated with Pandas or Tensorflow
>    through Arrow.
>
> We heavily use systems like Spark for Analytics and ML, the above points
> are pain points that worth consider switching.
--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
