EnricoMi commented on PR #39902: URL: https://github.com/apache/spark/pull/39902#issuecomment-1420377190
Excellent work. I would strongly recommend two things:
- Let's make the existing CoGroup code handle many DataFrames, so that a lot of code does not get duplicated.
- Let's always expect the first argument of the UDF to be the key; that simplifies things, and always providing the key adds little overhead.

But let's first hear whether the Spark committers are happy to approve either.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
