[GitHub] [arrow-datafusion] mingmwang commented on pull request #4620: Implement optimizer rule for reordering fact-dimension joins

GitBox Mon, 19 Dec 2022 18:02:09 -0800


mingmwang commented on PR #4620:
URL: https://github.com/apache/arrow-datafusion/pull/4620#issuecomment-1358734237


   > > > I prepare to review this PR carefully.
   > > 
   > > 
   > > I'm sorry I can't comment on this PR because I suffer from COVID :cold_face:
   > 
   > Sorry to hear that! Hope you recover soon. I don't think this PR will be 
ready for review until sometime in January anyway. I am still fine-tuning the 
version of this that I am implementing for Spark and will update this PR after 
the US holidays.
   
   
   
   Sure, I can help review the PR when you feel it is ready. 
   Regarding join reordering, one thing we need to take into account is the 
connectivity of the join nodes. 
   Connectivity can be inferred from the existing join and filter conditions. 
   For example: `(A innerJoin B on (A.id = B.id1)) innerJoin C on (B.id2 = 
C.id) where B.id2 = B.id`. In this case, A, B, and C are all connected. I'm not 
sure whether Spark covers all such cases, because SparkSQL does not implement 
EquivalenceProperties/EquivalentClass, but PostgreSQL does implement them, and 
join reordering can benefit from that.
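
To make the connectivity idea concrete, here is a minimal sketch (not DataFusion, Spark, or PostgreSQL code) that groups columns into equivalence classes with a union-find and then reports which tables share a class. The names `UnionFind`, `equivalence_classes`, and `connected_table_pairs` are hypothetical, and columns are assumed to be plain `"table.column"` strings. The extra predicate `B.id1 = B.id2` in the second call is also an assumption added for illustration; it is not part of the example above.

```python
from collections import defaultdict


class UnionFind:
    """Minimal union-find over hashable items (here: "table.column" strings)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def equivalence_classes(equalities):
    """Group columns that are transitively equal via join/filter equalities."""
    uf = UnionFind()
    for left, right in equalities:
        uf.union(left, right)
    classes = defaultdict(set)
    for col in list(uf.parent):
        classes[uf.find(col)].add(col)
    return list(classes.values())


def connected_table_pairs(equalities):
    """Return the table pairs that share at least one equivalence class."""
    pairs = set()
    for cls in equivalence_classes(equalities):
        tables = sorted({col.split(".")[0] for col in cls})
        for i, t1 in enumerate(tables):
            for t2 in tables[i + 1:]:
                pairs.add((t1, t2))
    return pairs


# Equalities taken from the example: two join conditions and one filter.
eqs = [("A.id", "B.id1"), ("B.id2", "C.id"), ("B.id2", "B.id")]
print(sorted(connected_table_pairs(eqs)))
# -> [('A', 'B'), ('B', 'C')]

# Hypothetical extra filter that merges the two classes.
print(sorted(connected_table_pairs(eqs + [("B.id1", "B.id2")])))
# -> [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

In this sketch, A and C are linked only through B until some predicate ties `B.id1` to `B.id2`. Once the classes merge, a reordering rule could derive `A.id = C.id` and join A with C directly, which is the kind of benefit equivalence classes give a join-reordering pass.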

