[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-23 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553756#comment-16553756 ] Hyukjin Kwon commented on SPARK-24760: +1 for not a problem resolution for now. > Pandas UDF does

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-23 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553315#comment-16553315 ] Wes McKinney commented on SPARK-24760: If data comes to Spark from pandas, any "NaN" values should

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-12 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541967#comment-16541967 ] Bryan Cutler commented on SPARK-24760: Yeah, createDataFrame is inconsistent with pandas_udf here,

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-12 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541209#comment-16541209 ] Mortada Mehyar commented on SPARK-24760: [~bryanc] thanks for the example. It looks like

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-10 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538924#comment-16538924 ] Bryan Cutler commented on SPARK-24760: Pandas uses NaNs as a special value that it interprets as a

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-10 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538234#comment-16538234 ] Mortada Mehyar commented on SPARK-24760: I still think something is not right here. It is true

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537893#comment-16537893 ] Bryan Cutler commented on SPARK-24760: Pandas interprets NaN to be missing data for numeric values

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537864#comment-16537864 ] Mortada Mehyar commented on SPARK-24760: Setting the "new" column to be nullable indeed makes it

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537817#comment-16537817 ] Mortada Mehyar commented on SPARK-24760: [~icexelloss] but NaN is not really a "null" value

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537662#comment-16537662 ] Li Jin commented on SPARK-24760: I think the issue here is that the output schema for the UDF is not

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-08 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536540#comment-16536540 ] Mortada Mehyar commented on SPARK-24760: cc [~icexelloss] [~hyukjin.kwon] > Pandas UDF does not