[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

BryanCutler Wed, 03 Oct 2018 17:20:58 -0700

Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/22610
  
    > Thanks, @BryanCutler. WDYT about documenting the type map thing?
    
    I think that would help in the case of dates/times because those can get a 
little confusing. For primitives, I think it's pretty straightforward, so I 
don't know how much that would help. Maybe we just highlight some potential 
pitfalls?
    
    The problem here was that when a null value was introduced, Pandas 
automatically converted the data to float to insert a NaN value, and the Arrow 
conversion from float to bool is broken. When the data just had ints, the 
conversion seems ok, so it ended up giving inconsistent, confusing results. Not 
sure what might have helped here, it's just a nasty bug :)
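
    To illustrate the upcast described above, here is a minimal sketch that 
uses only pandas (no Spark or Arrow). The variable names are made up for 
illustration, and it only shows the int-to-float change when a null appears; 
it does not reproduce the Arrow float-to-bool conversion itself.

```python
import pandas as pd

# Integer data without nulls keeps an integer dtype.
ints = pd.Series([1, 0, 1])

# Introducing a null makes pandas upcast the column to a float dtype
# so that the missing entry can be stored as NaN.
with_null = pd.Series([1, None, 0])

print(ints.dtype)
print(with_null.dtype)
print(with_null.isna().tolist())
```

    So a UDF returning the same logical values can hand Arrow an integer 
column in one case and a float column in another, which is where the 
inconsistency comes from.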



