[GitHub] [spark] BryanCutler commented on issue #24095: [SPARK-27163][PYTHON] Cleanup and consolidate Pandas UDF functionality

GitBox Wed, 20 Mar 2019 20:37:20 -0700

BryanCutler commented on issue #24095: [SPARK-27163][PYTHON] Cleanup and 
consolidate Pandas UDF functionality
URL: https://github.com/apache/spark/pull/24095#issuecomment-475104137
 
 
   Apologies, I moved things around again for item (2): I also didn't really 
like having an option in `ArrowStreamPandasSerializer` to send 
`START_ARROW_STREAM`.
   
   Now `_create_batch(...)` is a method of `ArrowStreamPandasSerializer` 
(where I think it belongs), and a subclass used for Pandas UDFs overrides 
`dump_stream` so that it can send `START_ARROW_STREAM`.
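   
   A minimal sketch of the layout described above, for illustration only and 
not the code in the PR. The class names `BaseSerializerSketch` and 
`UDFSerializerSketch`, the `write_int` helper, the marker value, and the 
byte-string "batch" are all assumptions standing in for the real Arrow-based 
implementation.

```python
import io
import struct

# Placeholder marker value, assumed for this sketch only.
START_ARROW_STREAM = -6


def write_int(value, stream):
    """Write a 4-byte big-endian signed integer to the stream."""
    stream.write(struct.pack("!i", value))


class BaseSerializerSketch:
    """Hypothetical stand-in for ArrowStreamPandasSerializer.

    It owns batch creation and writes batches with no start marker.
    """

    def _create_batch(self, series):
        # The real method would build an Arrow record batch; this sketch
        # just encodes the values as comma-separated bytes.
        return ",".join(str(v) for v in series).encode("utf-8")

    def dump_stream(self, iterator, stream):
        for series in iterator:
            batch = self._create_batch(series)
            write_int(len(batch), stream)
            stream.write(batch)


class UDFSerializerSketch(BaseSerializerSketch):
    """Hypothetical subclass for Pandas UDFs.

    It overrides dump_stream so the start marker is written once,
    just before the first batch.
    """

    def dump_stream(self, iterator, stream):
        def with_start_marker():
            should_write_start = True
            for series in iterator:
                if should_write_start:
                    write_int(START_ARROW_STREAM, stream)
                    should_write_start = False
                yield series

        super().dump_stream(with_start_marker(), stream)


if __name__ == "__main__":
    out = io.BytesIO()
    UDFSerializerSketch().dump_stream([[1, 2], [3]], out)
    print(out.getvalue())
```

   With this split the base class needs no flag for the marker: callers pick 
the plain serializer or the UDF subclass, and only the subclass knows about 
`START_ARROW_STREAM`.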
   
   I think it's clearer this way because it's easier to see which serializer 
is used where. I also tried to improve the docs. Let me know what you think 
when you get a chance to take another look, @HyukjinKwon @ueshin. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

