[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...

BryanCutler Tue, 28 Aug 2018 18:45:01 -0700

Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/21546
  
    @gatorsmile, this is just the format used for Arrow IPC between the JVM and 
Python process, and although it used the Arrow file format, nothing is 
persisted. There is no real reason to keep both formats: the stream format is 
better for our purposes and it is already what `pandas_udf`s use, so a bug in 
the Arrow format itself is unlikely. As with any change, a bug is possible, but 
this has been tested pretty thoroughly, and trying to keep the old code would 
get really messy and complicated.
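
    To illustrate the general difference between a stream-style and a 
file-style layout, here is a toy sketch using only the Python standard library. 
It is not Arrow's actual wire format and not the code in this PR; the framing 
and the helper names (`write_stream`, `read_stream`, `write_file`, `read_file`) 
are hypothetical. It assumes non-empty payloads and shows only that a stream 
can be consumed with forward reads, while a footer-indexed file needs a 
seekable source.

```python
import io
import struct


def write_stream(sink, batches):
    """Toy stream layout: 4-byte length prefix + payload, then a zero-length end marker."""
    for payload in batches:
        sink.write(struct.pack("<I", len(payload)))
        sink.write(payload)
    sink.write(struct.pack("<I", 0))  # end-of-stream marker (assumes no empty payloads)


def read_stream(source):
    """Yield payloads one at a time using forward reads only (no seek)."""
    while True:
        header = source.read(4)
        if len(header) < 4:
            raise EOFError("stream ended before the end marker")
        (size,) = struct.unpack("<I", header)
        if size == 0:
            return
        payload = source.read(size)
        if len(payload) < size:
            raise EOFError("truncated payload")
        yield payload


def write_file(sink, batches):
    """Toy file layout: payloads back to back, then a footer of (offset, size) pairs and a count."""
    index = []
    offset = 0
    for payload in batches:
        sink.write(payload)
        index.append((offset, len(payload)))
        offset += len(payload)
    for off, size in index:
        sink.write(struct.pack("<II", off, size))
    sink.write(struct.pack("<I", len(index)))


def read_file(source):
    """Read the footer at the end first, so the source must be complete and seekable."""
    source.seek(-4, io.SEEK_END)
    (count,) = struct.unpack("<I", source.read(4))
    source.seek(-4 - 8 * count, io.SEEK_END)
    index = [struct.unpack("<II", source.read(8)) for _ in range(count)]
    payloads = []
    for off, size in index:
        source.seek(off)
        payloads.append(source.read(size))
    return payloads


batches = [b"batch-0", b"batch-1", b"batch-2"]

stream_buf = io.BytesIO()
write_stream(stream_buf, batches)
stream_buf.seek(0)
stream_result = list(read_stream(stream_buf))

file_buf = io.BytesIO()
write_file(file_buf, batches)
file_result = read_file(file_buf)
```

    In this sketch both readers return the same payloads; the difference is 
that `read_stream` can start yielding as soon as the first message arrives, 
which fits a pipe or socket between two processes where nothing is persisted.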



