Github user icexelloss commented on the issue:

    https://github.com/apache/spark/pull/18664
  
    Thanks @gatorsmile for the constructive feedback!
    
    I don't want to make this more complicated, but I also want to make sure we 
are aware that the Arrow and non-Arrow versions also differ in how they treat 
array and struct types:
    
    Array:
    ```
    non-Arrow:
    In [47]: type(df2.toPandas().array[0])
    Out[47]: list
    
    Arrow:
    In [45]: type(df2.toPandas().array[0])
    Out[45]: numpy.ndarray
    ```
    
    Struct:
    ```
    non-Arrow:
    In [35]: type(df.toPandas().struct[0])
    Out[35]: pyspark.sql.types.Row
    
    Arrow:
    In [37]: type(df.toPandas().struct[0])
    Out[37]: dict
    ```
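
A rough sketch of how calling code could tolerate both representations until the behavior is unified or documented. The helper name `normalize_cell` is hypothetical and not part of PySpark. It assumes only that Row-like values expose `asDict()` and that ndarray-like values expose `tolist()`, so it uses duck typing rather than importing either library:

```python
def normalize_cell(value):
    """Convert a toPandas() cell into plain dicts and lists.

    Hypothetical helper: relies on duck typing so it works whether the
    cell came from the Arrow or the non-Arrow code path.
    """
    # Row-like objects expose asDict(); check this first because
    # pyspark.sql.Row is also a tuple subclass.
    if hasattr(value, "asDict"):
        return {k: normalize_cell(v) for k, v in value.asDict().items()}
    # ndarray-like objects expose tolist(); NumPy scalars do too, but
    # for them tolist() returns a plain scalar rather than a list.
    if hasattr(value, "tolist"):
        converted = value.tolist()
        if isinstance(converted, list):
            return [normalize_cell(v) for v in converted]
        return converted
    if isinstance(value, dict):
        return {k: normalize_cell(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize_cell(v) for v in value]
    return value
```

Applied column-wise, for example `df.toPandas()["struct"].map(normalize_cell)`, this would yield dicts for struct cells and lists for array cells on either path. It does not address other differences between the two versions.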
    
    I think there should be a high-level doc capturing all the differences 
between the Arrow and non-Arrow versions.
    
    Unfortunately I cannot commit much time until Nov, but I am happy to help 
with review and discussion.

