[ https://issues.apache.org/jira/browse/SPARK-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493223#comment-14493223 ]
Antonio Piccolboni commented on SPARK-6820:
-------------------------------------------
For the distinction between NAs and NULLs in R, see
http://www.r-bloggers.com/r-na-vs-null/. This seems a fairly dangerous move,
but I don't have a good alternative to suggest. For example, this is a valid
data frame:
# dput()-style construction of a data frame whose second column is a list
# column: row 3 holds NA in column 1 and NULL in column 2
dd <- structure(
  list(c.1..2..NA. = c(1, 2, NA), V2 = list(1, 2, NULL)),
  .Names = c("c.1..2..NA.", "V2"),
  row.names = c(NA, -3L), class = "data.frame"
)
# NA == NULL evaluates to logical(0), not NA: the two are not interchangeable
dd[3, 1] == dd[3, 2][[1]]
I am not sure how often real code relies on list columns that can contain
NULLs.
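For reference, a minimal base-R sketch of the difference (plain R, nothing
SparkR-specific assumed):

x <- c(1, 2, NA)    # NA is a missing value: the vector keeps length 3
length(x)           # 3
y <- c(1, 2, NULL)  # NULL is the absence of a value: it is silently dropped
length(y)           # 2
NA == 1             # NA         (a length-1 "unknown" result)
NULL == 1           # logical(0) (a zero-length result, no value at all)
# only a list (or a list column, as in dd above) can hold a NULL element
l <- list(1, 2, NULL)
length(l)           # 3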
> Convert NAs to null type in SparkR DataFrames
> ---------------------------------------------
>
> Key: SPARK-6820
> URL: https://issues.apache.org/jira/browse/SPARK-6820
> Project: Spark
> Issue Type: New Feature
> Components: SparkR, SQL
> Reporter: Shivaram Venkataraman
>
> While converting an RDD or a local R data frame to a SparkR DataFrame we
> need to handle missing values (NAs).
> We should convert NAs to Spark SQL's null type so that the conversion is
> handled correctly.
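A minimal sketch of the conversion the issue asks for, assuming the SparkR
1.4-style API (sparkR.init, sparkRSQL.init, createDataFrame, collect); the
null handling described in the comments is the proposed behavior, not
something the current code is confirmed to do:

library(SparkR)
sc <- sparkR.init()                # assumes a local Spark installation
sqlContext <- sparkRSQL.init(sc)

# local R data frame containing missing values (NAs)
local_df <- data.frame(name = c("a", "b", NA), age = c(30, NA, 25),
                       stringsAsFactors = FALSE)

# under the proposal, the NAs become Spark SQL null values on conversion
df <- createDataFrame(sqlContext, local_df)

# nulls would then behave as ordinary SQL nulls...
registerTempTable(df, "people")
sql(sqlContext, "SELECT name FROM people WHERE age IS NULL")

# ...and collecting back to R would round-trip them to NA
collect(df)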