GitHub user olarayej commented on a diff in the pull request:
https://github.com/apache/spark/pull/8984#discussion_r41661065
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1881,3 +1881,31 @@ setMethod("as.data.frame",
collect(x)
}
)
+
+#' Returns the column types of a DataFrame.
+#'
+#' @name coltypes
+#' @title Get column types of a DataFrame
+#' @param x (DataFrame)
+#' @return value (character) A character vector with the column types of the given DataFrame
+#' @rdname coltypes
+setMethod("coltypes",
+ signature(x = "DataFrame"),
+ function(x) {
+ # TODO: This may be moved as a global parameter
+ # These are the supported data types and how they map to
+ # R's data types
+ DATA_TYPES <- c("string"="character",
+ "double"="numeric",
+ "int"="integer",
+ "long"="integer",
+ "boolean"="long"
+ )
+
+ # Get the data types of the DataFrame by invoking dtypes() function.
+ # Some post-processing is needed.
+ types <- as.character(t(as.data.frame(dtypes(x))[2, ]))
+
+ # Map Spark data types into R's data types
+ as.character(DATA_TYPES[types])
--- End diff --
@shivaram I agree. I could use the mapping below (got the short types from
schema.R:118):
Scala -> R
"string"="character",
"long"="integer",
"short"="integer",
"integer"="integer",
"byte"="integer",
"double"="numeric",
"float"="numeric",
"decimal"="numeric",
"boolean"="logical"
For any other type, I will return the Scala type name unchanged. Sounds good?
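
A minimal sketch of how that lookup with fallback could work, assuming the type names arrive as a plain character vector. `PRIMITIVE_TYPES` and `mapSparkTypeToR` are hypothetical names used only for illustration, not existing SparkR identifiers:

```r
# Sketch only: PRIMITIVE_TYPES and mapSparkTypeToR are hypothetical names,
# not part of SparkR.
PRIMITIVE_TYPES <- c("string" = "character",
                     "long" = "integer",
                     "short" = "integer",
                     "integer" = "integer",
                     "byte" = "integer",
                     "double" = "numeric",
                     "float" = "numeric",
                     "decimal" = "numeric",
                     "boolean" = "logical")

mapSparkTypeToR <- function(types) {
  # Look up each type name; names without a mapping come back as NA
  rTypes <- unname(PRIMITIVE_TYPES[types])
  # Fall back to the original type name wherever no mapping exists
  unmapped <- is.na(rTypes)
  rTypes[unmapped] <- types[unmapped]
  rTypes
}

mapSparkTypeToR(c("string", "boolean", "timestamp"))
```

Here "string" and "boolean" are translated to "character" and "logical", while "timestamp" has no entry in the mapping and is passed through as-is.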