[GitHub] spark issue #20567: [SPARK-23380][PYTHON] Make toPandas fallback to non-Arro...

HyukjinKwon Wed, 14 Feb 2018 07:45:11 -0800

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20567
  
    @gatorsmile and @rxin,
    
    The problem here is that `toPandas` only fails later on unsupported types 
and, with Arrow, accepts `BinaryType` but converts it inconsistently 
(https://github.com/apache/spark/pull/20567#issuecomment-364639922), 
whereas `createDataFrame` falls back to the non-Arrow path in both cases.
    
    This is the last PySpark/Pandas interoperability issue (for now) that I 
found during testing, and I was thinking of targeting 2.3.0.
    
    So, for clarification, would you be uncomfortable with one of:
    
    1. making both `toPandas` and `createDataFrame` fall back with a warning
    2. making both `toPandas` and `createDataFrame` throw an exception
    3. adding a configuration to control the fallback for both
    
    to target 2.3.0 (or 2.3.1 if the vote fails)? FYI, the current one in this 
PR is 1.
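    
    As a rough illustration of the three options (a minimal sketch, not the 
code in this PR): `convert_with_fallback`, `UnsupportedTypeError`, 
`fake_arrow_path`, `fake_plain_path` and the `fallback_enabled` flag are all 
hypothetical names, and a `dict` value merely stands in for a type the Arrow 
path cannot handle. No Spark or Arrow API is used.

```python
import warnings


class UnsupportedTypeError(Exception):
    """Hypothetical stand-in for the error raised on an unsupported type."""


def fake_arrow_path(rows):
    # Hypothetical optimized path: pretend dict values are unsupported.
    if any(isinstance(r, dict) for r in rows):
        raise UnsupportedTypeError("dict values are not supported")
    return ("arrow", list(rows))


def fake_plain_path(rows):
    # Hypothetical non-optimized path that accepts everything.
    return ("plain", list(rows))


def convert_with_fallback(rows, fast_path, slow_path, fallback_enabled=True):
    """Try the fast path first and decide what to do when it fails.

    fallback_enabled=True  -> option 1: warn and use the slow path
    fallback_enabled=False -> option 2: re-raise the error
    Exposing the flag as a setting corresponds to option 3.
    """
    try:
        return fast_path(rows)
    except UnsupportedTypeError as exc:
        if not fallback_enabled:
            raise
        warnings.warn(
            "Optimized conversion failed (%s); falling back to the "
            "non-optimized path." % exc,
            UserWarning,
        )
        return slow_path(rows)
```

    With this sketch, `convert_with_fallback([{"a": 1}], fake_arrow_path, 
fake_plain_path)` warns and returns the plain result, while passing 
`fallback_enabled=False` lets the error propagate.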
    
    If so, let me open two PRs: one for the error message for now, targeting 
2.3.0 (or 2.3.1 if the vote fails), and one adding a configuration to 
control the fallback, targeting master (and maybe 2.3.1).
    
    Does that make sense to both of you?
    
    cc @cloud-fan too.




---

