viirya commented on issue #119:
URL: 
https://github.com/apache/arrow-datafusion-comet/issues/119#issuecomment-1965768331

   > feel free to break up the tasks for join when you think it is necessary, 
to improve the parallelism
   
   SortMergeJoin support in Comet is a single integral work item, like the 
other items we have finished or are working on, and it makes more sense to work 
on it as a whole (unless you want to break it out into serde code, the 
CometSortMergeJoinExec operator class, tests, etc. 😂).
   
   There were some prerequisite tasks, and they are finished, e.g., relaxing 
the join-on expression type and adding join filter support.
   
   Improving DataFusion SortMergeJoin could be a separate task, as it is 
orthogonal to the task of adding support in Comet. I am not sure where the 
performance bottleneck is yet, but in the benchmark I ran earlier against 
Spark, its performance was similar rather than better.
   
   SortMergeJoin spilling support is another separate task. I created a 
ticket for that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
