BryanCutler commented on issue #24095: [SPARK-27163][PYTHON] Cleanup and consolidate Pandas UDF functionality
URL: https://github.com/apache/spark/pull/24095#issuecomment-475104137

Apologies, I moved things around again for item (2) because I didn't really like having an option in `ArrowStreamPandasSerializer` to send the `START_ARROW_STREAM` either. Now, `_create_batch(...)` is a method in `ArrowStreamPandasSerializer` (where it belongs, I think), and a subclass used for Pandas UDFs overrides `dump_stream` so that it can send `START_ARROW_STREAM`. I think it's clearer this way because it's easier to see which serializer is used where, and I also tried to improve the docs. Let me know what you think when you get the chance to take another look @HyukjinKwon @ueshin. Thanks!
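
To make the layout described above concrete, here is a rough sketch of the shape, not the code in the PR. The class names `SketchPandasSerializer` and `SketchPandasUDFSerializer`, the length-prefixed byte framing, and the marker value are all assumptions made for illustration; the real `_create_batch` builds Arrow record batches, whereas this stand-in just joins values into bytes so it runs without pyarrow.

```python
import io
import struct

# Placeholder value for this sketch; it stands in for the real
# START_ARROW_STREAM constant and is an assumption here.
START_ARROW_STREAM = -6


class SketchPandasSerializer:
    """Hypothetical stand-in for ArrowStreamPandasSerializer."""

    def _create_batch(self, series):
        # The real method would create an Arrow record batch from pandas
        # data. Here a "batch" is just comma-joined values as UTF-8 bytes.
        return ",".join(str(v) for v in series).encode("utf-8")

    def dump_stream(self, iterator, stream):
        # Write each batch as a length-prefixed payload, with no start marker.
        for series in iterator:
            batch = self._create_batch(series)
            stream.write(struct.pack("!i", len(batch)))
            stream.write(batch)


class SketchPandasUDFSerializer(SketchPandasSerializer):
    """Hypothetical stand-in for the subclass used for Pandas UDFs."""

    def dump_stream(self, iterator, stream):
        def init_stream_yield_batches():
            # Emit the start marker lazily, right before the first batch.
            should_write_start = True
            for series in iterator:
                if should_write_start:
                    stream.write(struct.pack("!i", START_ARROW_STREAM))
                    should_write_start = False
                yield series

        # Reuse the base class framing; only the marker is new.
        super().dump_stream(init_stream_yield_batches(), stream)


plain = io.BytesIO()
SketchPandasSerializer().dump_stream([[1, 2]], plain)

udf = io.BytesIO()
SketchPandasUDFSerializer().dump_stream([[1, 2]], udf)
```

With this split, the two outputs differ only by the leading marker, so which serializer sends `START_ARROW_STREAM` is decided by the class that is instantiated rather than by a constructor option.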
