[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553756#comment-16553756
]
Hyukjin Kwon commented on SPARK-24760:
--
+1 for not a problem resolution for now.
> Pandas UDF does
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553315#comment-16553315
]
Wes McKinney commented on SPARK-24760:
--
If data comes to Spark from pandas, any "NaN" values should
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541967#comment-16541967
]
Bryan Cutler commented on SPARK-24760:
--
Yeah, createDataFrame is inconsistent with pandas_udf here,
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541209#comment-16541209
]
Mortada Mehyar commented on SPARK-24760:
[~bryanc] thanks for the example. It looks like
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538924#comment-16538924
]
Bryan Cutler commented on SPARK-24760:
--
Pandas uses NaNs as a special value that it interprets as a
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538234#comment-16538234
]
Mortada Mehyar commented on SPARK-24760:
I still think something is not right here. It is true
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537893#comment-16537893
]
Bryan Cutler commented on SPARK-24760:
--
Pandas interprets NaN to be missing data for numeric values
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537864#comment-16537864
]
Mortada Mehyar commented on SPARK-24760:
Setting the "new" column to be nullable indeed makes it
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537817#comment-16537817
]
Mortada Mehyar commented on SPARK-24760:
[~icexelloss] but NaN is not really a "null" value
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537662#comment-16537662
]
Li Jin commented on SPARK-24760:
I think the issue here is that the output schema for the UDF is not
[
https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536540#comment-16536540
]
Mortada Mehyar commented on SPARK-24760:
cc [~icexelloss] [~hyukjin.kwon]
> Pandas UDF does not