sunchao commented on issue #119: URL: https://github.com/apache/arrow-datafusion-comet/issues/119#issuecomment-1965745491
@viirya feel free to break up the join tasks whenever you think it is necessary, to improve parallelism :) (I'm not sure whether broadcast join needs some extra work atm).

@advancedxy you can also check the existing operators on the Spark side and see if there are gaps that we should fill.

There are also a bunch of tasks on the DataFusion side, in particular on aggregate and join performance. To name a few:
- https://github.com/apache/arrow-datafusion/issues/6937
- https://github.com/apache/arrow-datafusion/issues/7955
- add spilling for SMJ in DF (@viirya do we have an issue tracking this?)

I think implementing support for the operators is just the start. Getting good performance out of them will also become very important in the future.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
