[ https://issues.apache.org/jira/browse/SPARK-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493223#comment-14493223 ]
Antonio Piccolboni commented on SPARK-6820:
-------------------------------------------
For the distinction between NAs and NULLs in R, see
http://www.r-bloggers.com/r-na-vs-null/. This seems a fairly dangerous move,
but I don't have a good alternative to suggest. For example, this is a valid
data frame:
# dput()-style construction of a data frame whose second column is a list
# column: row 3 holds NA in column 1 and NULL in column 2
dd <- structure(
  list(c.1..2..NA. = c(1, 2, NA), V2 = list(1, 2, NULL)),
  .Names = c("c.1..2..NA.", "V2"),
  row.names = c(NA, -3L), class = "data.frame"
)
# NA == NULL evaluates to logical(0), not NA: the two are not interchangeable
dd[3, 1] == dd[3, 2][[1]]
I am not sure how often real code relies on list columns that can contain
NULLs.
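For reference, a minimal base-R sketch of the difference (plain R, nothing
SparkR-specific assumed):

x <- c(1, 2, NA)    # NA is a missing value: the vector keeps length 3
length(x)           # 3
y <- c(1, 2, NULL)  # NULL is the absence of a value: it is silently dropped
length(y)           # 2
NA == 1             # NA         (a length-1 "unknown" result)
NULL == 1           # logical(0) (a zero-length result, no value at all)
# only a list (or a list column, as in dd above) can hold a NULL element
l <- list(1, 2, NULL)
length(l)           # 3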
> Convert NAs to null type in SparkR DataFrames
> ---------------------------------------------
>
> Key: SPARK-6820
> URL: https://issues.apache.org/jira/browse/SPARK-6820
> Project: Spark
> Issue Type: New Feature
> Components: SparkR, SQL
> Reporter: Shivaram Venkataraman
>
> While converting an RDD or a local R data frame to a SparkR DataFrame we
> need to handle missing values (NAs).
> We should convert NAs to Spark SQL's null type so that the conversion is
> handled correctly.
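A minimal sketch of the conversion the issue asks for, assuming the SparkR
1.4-style API (sparkR.init, sparkRSQL.init, createDataFrame, collect); the
null handling described in the comments is the proposed behavior, not
something the current code is confirmed to do:

library(SparkR)
sc <- sparkR.init()                # assumes a local Spark installation
sqlContext <- sparkRSQL.init(sc)

# local R data frame containing missing values (NAs)
local_df <- data.frame(name = c("a", "b", NA), age = c(30, NA, 25),
                       stringsAsFactors = FALSE)

# under the proposal, the NAs become Spark SQL null values on conversion
df <- createDataFrame(sqlContext, local_df)

# nulls would then behave as ordinary SQL nulls...
registerTempTable(df, "people")
sql(sqlContext, "SELECT name FROM people WHERE age IS NULL")

# ...and collecting back to R would round-trip them to NA
collect(df)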