[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

2019-09-17 Thread GitBox
BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-532464242
 
 
   We can refine the API in follow-ups if needed. A section should be added 
to the usage guide for this; I created 
https://issues.apache.org/jira/browse/SPARK-29126 to track it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

2019-09-17 Thread GitBox
BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-532462929
 
 
   Merged to master, thanks for your contribution, @d80tb7!





[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

2019-09-17 Thread GitBox
BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-532360918
 
 
   retest this please





[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

2019-09-17 Thread GitBox
BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-532328138
 
 
   retest this please





[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

2019-09-11 Thread GitBox
BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-530591873
 
 
   In response to @ueshin's question (please correct me if I'm wrong, @d80tb7):
   
   > I'm just wondering what if the group keys are different between the two 
   > grouped data.
   > Is it okay to execute as are, or should we check the both keys are the same 
   > lengths and types?
   
   The group keys can be different; each side's keys are then passed to the 
UDF in the left and right DataFrames, respectively. The API currently does not 
require the keys to have the same length or type, so the user has to make sure 
the UDF can handle this. For example, with `pandas.merge_asof`, different keys 
can be used, which allows some flexibility in the comparison.
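
As a rough illustration of the `pandas.merge_asof` point, here is a minimal pandas-only sketch of what the body of such a UDF might do with the two frames it receives. The data, the column names (`id`, `key`, `time`, `price`, `bid`), and the helper `asof_join` are all hypothetical, and the Spark cogroup wiring from this PR is deliberately left out.

```python
import pandas as pd

# Hypothetical frames standing in for the left and right sides of one cogroup.
# The grouping columns are deliberately named differently ("id" vs "key").
left = pd.DataFrame(
    {"id": [1, 1, 1], "time": [1, 5, 10], "price": [100.0, 101.0, 102.0]}
)
right = pd.DataFrame(
    {"key": [1, 1, 1], "time": [2, 3, 7], "bid": [99.0, 99.5, 100.5]}
)


def asof_join(left, right):
    # merge_asof requires both inputs to be sorted by the "on" column.
    # left_by/right_by let the two sides match on differently named keys.
    return pd.merge_asof(
        left.sort_values("time"),
        right.sort_values("time"),
        on="time",
        left_by="id",
        right_by="key",
    )


result = asof_join(left, right)
print(result)
```

Each left row is matched with the most recent right row whose `time` is not later than its own, so the first left row (time 1) has no match and gets a missing `bid`. Nothing here checks that the two key columns agree in type; as noted above, that responsibility stays with the UDF author.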





[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

2019-08-23 Thread GitBox
BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-524399804
 
 
   The current API looks good to me. Let me take another detailed pass through 
it next week, and then I can ping others for a look.





[GitHub] [spark] BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs

2019-08-20 Thread GitBox
BryanCutler commented on issue #24981: [SPARK-27463][PYTHON] Support Dataframe 
Cogroup via Pandas UDFs
URL: https://github.com/apache/spark/pull/24981#issuecomment-523219688
 
 
   This is looking really good, @d80tb7! I forget whether the previous 
discussions settled on the API. Is it still under debate?

