thinkharderdev commented on issue #12454: URL: https://github.com/apache/datafusion/issues/12454#issuecomment-2351814850
> It still not very clear for me what exactly required to be done on DF side

The basic idea (which I have implemented on our internal fork here https://github.com/coralogix/arrow-datafusion/pull/269) is to provide a hook in DataFusion which can be passed in through the `TaskContext`. The hook allows `HashJoinStream` to share state between partitions potentially running on different nodes. Specifically, it is sufficient to share just the bitmask of matched rows on the build side; the last partition to execute can then emit the unmatched rows.

> Referring to Parallel HJ concepts https://www.youtube.com/watch?v=QCTyOLvzR88 some external process needs to repartition(it can be hash or by specific columns) the data before sending it to the nodes. More on partitioning strategies is in https://www.youtube.com/watch?v=S40K8iGa8Ek

I agree that hash partitioning is the standard way to do this in general, but in practice it's just not possible to shuffle as much data as hash partitioning requires. By far the best-performing approach is a broadcast join, which is currently not possible. My personal opinion is that since DataFusion is being used by a number of organizations to build distributed query engines (even if DF itself is not one), it is reasonable to add hooks/extension points in DF to support that use case, as long as:

1. It doesn't add any measurable overhead to single-node query execution (e.g. it should be explicitly opt-in).
2. It doesn't add significant code complexity or maintenance burden.

If it makes sense to concretize this conversation around a particular code change, I'm happy to submit a PR here to frame things.
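To make the shape of the hook concrete, here is a minimal sketch of the shared-state idea. The names (`SharedJoinState`, `report_partition`) are illustrative, not actual DataFusion APIs: each partition ORs its local "matched build rows" bitmask into a shared one, and whichever partition reports last gets back the build-side rows that no partition matched, so it can emit them for an outer join.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical hook: in a real integration this would be injected via
// DataFusion's `TaskContext`; here it is a plain shared struct.
struct SharedJoinState {
    visited: Mutex<Vec<bool>>, // one flag per build-side row
    remaining_partitions: Mutex<usize>,
}

impl SharedJoinState {
    fn new(build_rows: usize, partitions: usize) -> Arc<Self> {
        Arc::new(Self {
            visited: Mutex::new(vec![false; build_rows]),
            remaining_partitions: Mutex::new(partitions),
        })
    }

    /// Merge a partition's local bitmask into the shared one. If this was
    /// the last partition to report, return the indices of build-side rows
    /// that no partition matched, so the caller can emit them.
    fn report_partition(&self, local_mask: &[bool]) -> Option<Vec<usize>> {
        let mut visited = self.visited.lock().unwrap();
        for (global, local) in visited.iter_mut().zip(local_mask) {
            *global |= *local;
        }
        let mut remaining = self.remaining_partitions.lock().unwrap();
        *remaining -= 1;
        if *remaining == 0 {
            // Last partition: collect unmatched build-side rows.
            Some(
                visited
                    .iter()
                    .enumerate()
                    .filter(|(_, matched)| !**matched)
                    .map(|(i, _)| i)
                    .collect(),
            )
        } else {
            None
        }
    }
}

fn main() {
    let state = SharedJoinState::new(4, 2);
    // Partition 0 matched build rows 0 and 2; it is not the last partition.
    assert_eq!(state.report_partition(&[true, false, true, false]), None);
    // Partition 1 matched row 1; as the last partition it receives the
    // unmatched build rows (row 3) to emit.
    let unmatched = state.report_partition(&[false, true, false, false]).unwrap();
    assert_eq!(unmatched, vec![3]);
    println!("unmatched build rows: {:?}", unmatched);
}
```

In a distributed setting the `Mutex`-guarded vector would be replaced by whatever coordination mechanism the engine provides (the hook only needs "merge bitmask" and "am I last?" semantics), which is why exposing it as an opt-in extension point keeps the single-node path untouched.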
