BryanCutler commented on issue #24981: [WIP][SPARK-27463][PYTHON] Support 
Dataframe Cogroup via Pandas UDFs- Arrow Stream Impl
URL: https://github.com/apache/spark/pull/24981#issuecomment-512979848
 
 
   @d80tb7 thanks for running the benchmarks; it's good to see we can use the
Arrow stream format without any significant penalty. It would be best to stick
with this PR if you can since, as @icexelloss said, there is already a lot of
good discussion here. As for the API, I prefer
`df1.groupby('id').cogroup(df2.groupby('id')).apply(func)` a little more, but
not too strongly. I agree we should come to a consensus on one API, though, and
not introduce an alternate form as well.
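
For reference, here is a rough plain-Python sketch of the semantics a cogrouped `apply(func)` is meant to have: both inputs are grouped by key and `func` is called once per key with the pair of groups. This is only an illustration, not the Spark implementation in this PR. The helper `cogroup_apply`, the row-as-dict representation, and the sample data are all hypothetical, and passing an empty group for a key that appears on only one side is an assumption.

```python
from collections import defaultdict


def cogroup_apply(left_rows, right_rows, key, func):
    """Hypothetical helper: group both inputs by `key`, then call
    func(left_group, right_group) once per distinct key."""
    left_groups = defaultdict(list)
    right_groups = defaultdict(list)
    for row in left_rows:
        left_groups[row[key]].append(row)
    for row in right_rows:
        right_groups[row[key]].append(row)

    out = []
    # Assumption: a key present on only one side still triggers a call,
    # with an empty list standing in for the missing group.
    for k in sorted(set(left_groups) | set(right_groups)):
        out.extend(func(left_groups.get(k, []), right_groups.get(k, [])))
    return out


# Made-up sample data standing in for df1 and df2.
df1 = [{"id": 1, "v": 10}, {"id": 1, "v": 20}, {"id": 2, "v": 30}]
df2 = [{"id": 1, "w": 1}, {"id": 3, "w": 5}]


def count_sizes(left, right):
    # Take the key from whichever side is non-empty.
    key = (left or right)[0]["id"]
    return [{"id": key, "n_left": len(left), "n_right": len(right)}]


result = cogroup_apply(df1, df2, "id", count_sizes)
```

Whichever spelling we settle on, the user-supplied function has the same shape: it takes the two groups for one key and returns the rows for that key.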

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
