[GitHub] spark issue #19349: [SPARK-22125][PYSPARK][SQL] Enable Arrow Stream format f...

BryanCutler Tue, 26 Sep 2017 11:51:46 -0700

Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/19349
  
    Nice job on refactoring `PythonRunner`!  I think we should just replace the 
Arrow file format with the stream format for pandas UDFs instead of adding a new 
conf to enable it, as long as all the issues are worked out. Along with being a 
little faster, the stream format is also easier on memory usage.  I'd like to do 
the same for `toPandas()`, but that can be a followup.  Is it possible to do away 
with the SQLConf and maybe rename some of these classes to be more general, e.g. 
`ArrowStreamPythonUDFRunner` -> `ArrowPythonRunner`?



