[GitHub] [spark] EnricoMi commented on pull request #39902: [SPARK-42349][PYTHON]Support pandas cogroup with multiple df

via GitHub Tue, 07 Feb 2023 00:25:24 -0800


EnricoMi commented on PR #39902:
URL: https://github.com/apache/spark/pull/39902#issuecomment-1420377190


   Excellent work. I would strongly recommend two things:
   - Let's make the existing CoGroup code handle many dataframes, so that a lot of code does not get duplicated.
   - Let's always expect the first argument of the UDF to be the key. Things get simpler that way, and always providing the key adds little overhead.
   
   But let's first hear whether Spark committers are happy to approve either.
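
To make the second suggestion concrete, here is a minimal pure-Python sketch of a "key first, then any number of groups" calling convention. It is not Spark's API and not the code in this PR: `cogroup_apply` and `count_rows` are hypothetical names, and lists of dict rows stand in for pandas DataFrames. The key is passed as a tuple, on the assumption that this mirrors how the existing two-dataframe cogroup UDF receives it.

```python
from collections import defaultdict


def cogroup_apply(func, key_column, *tables):
    """Hypothetical helper: group every table by key_column and call
    func(key, *groups) once per distinct key, key always first."""
    grouped = []
    for table in tables:
        groups = defaultdict(list)
        for row in table:
            groups[row[key_column]].append(row)
        grouped.append(groups)

    # Collect the union of keys across all tables, in first-seen order.
    keys, seen = [], set()
    for groups in grouped:
        for key in groups:
            if key not in seen:
                seen.add(key)
                keys.append(key)

    # A table with no rows for a key contributes an empty group.
    return [
        func((key,), *[groups.get(key, []) for groups in grouped])
        for key in keys
    ]


def count_rows(key, *groups):
    """Example UDF: the same signature works for 2, 3, or N tables."""
    return {"id": key[0], "counts": [len(g) for g in groups]}


a = [{"id": 1, "v": 10}, {"id": 2, "v": 20}]
b = [{"id": 1, "w": "x"}, {"id": 1, "w": "y"}]
c = [{"id": 2, "z": 0.5}]

result = cogroup_apply(count_rows, "id", a, b, c)
print(result)
```

Because the key is always the first positional argument, the caller never has to inspect the UDF's arity to decide whether to pass the key, which is the simplification argued for above.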


