[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

GitBox Wed, 11 Sep 2019 15:33:51 -0700

BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-530591873
 
 
   In response to @ueshin's question (please correct me if I'm wrong, @d80tb7):
   
   > I'm just wondering what if the group keys are different between the two 
   > grouped data.
   > Is it okay to execute as are, or should we check the both keys are the same 
   > lengths and types?
   
   The group keys can differ between the two sides; both are then passed into 
the UDF with the left and right DataFrames. Nothing currently requires them to 
be the same length or type, so the user has to make sure the UDF can handle 
this. For example, with `pandas.merge_asof`, different keys can be used on each 
side, which allows some flexibility in the comparison.
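
As a rough illustration of the `pandas.merge_asof` case, here is a minimal sketch in plain pandas, run outside Spark. The function `asof_join` and the column names (`trade_time`, `quote_time`, `price`, `bid`) are hypothetical and only stand in for what the body of a cogroup UDF might look like; this is not the API from this PR.

```python
import pandas as pd


def asof_join(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical UDF body: join two frames whose key columns differ."""
    # merge_asof requires each input to be sorted on its own key column.
    left = left.sort_values("trade_time")
    right = right.sort_values("quote_time")
    # Different key names on each side; by default each left row is matched
    # to the last right row whose key is <= the left key.
    return pd.merge_asof(left, right, left_on="trade_time", right_on="quote_time")


# Illustrative data only.
left = pd.DataFrame({"trade_time": [1, 5, 10], "price": [100.0, 101.0, 102.0]})
right = pd.DataFrame({"quote_time": [2, 3, 7], "bid": [99.0, 99.5, 100.5]})

result = asof_join(left, right)
print(result)
```

The first row has no earlier quote, so its `quote_time` and `bid` come back as NaN. The UDF has to cope with cases like this itself, since nothing checks that the two sides' keys line up.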


