[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

shivaram Fri, 16 Oct 2015 09:53:02 -0700

Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8984#discussion_r42263878
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1880,4 +1880,46 @@ setMethod("as.data.frame",
                   stop(paste("Unused argument(s): ", paste(list(...), collapse=", ")))
                 }
                 collect(x)
    +          }
    +)
    +
    +#' Returns the column types of a DataFrame.
    +#' 
    +#' @name coltypes
    +#' @title Get column types of a DataFrame
    +#' @param x (DataFrame)
    +#' @return value (character) A character vector with the column types of the given DataFrame
    +#' @rdname coltypes
    +setMethod("coltypes",
    +          signature(x = "DataFrame"),
    +          function(x) {
    +            # TODO: This may be moved as a global parameter
    +            # These are the supported data types and how they map to
    +            # R's data types
    +            DATA_TYPES <- c("string"="character",
    +                            "long"="integer",
    +                            "tinyint"="integer",
    +                            "short"="integer",
    +                            "integer"="integer",
    +                            "byte"="integer",
    +                            "double"="numeric",
    +                            "float"="numeric",
    +                            "decimal"="numeric",
    +                            "boolean"="logical"
    +            )
    --- End diff --
    
    The single-character names exist to reduce the amount of data serialized
    when we transfer these data types to the JVM. They are not meant to be
    remembered by anybody, so I don't see them being a source of confusion.
    @sun-rui also added tests which ensure these mappings don't break.
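
    To illustrate the size argument, here is a minimal sketch in base R. The
    tag table and both helper names are hypothetical and chosen only for
    illustration; they are not taken from SparkR's serializer.

```r
# Hypothetical one-character tags for a few R type names (illustrative only,
# not SparkR's actual table).
TYPE_TAGS <- c("integer" = "i", "character" = "c",
               "logical" = "b", "numeric" = "d")

# Look up the short tag for a single R type name; unknown names give NA.
tagFor <- function(rType) {
  unname(TYPE_TAGS[rType])
}

# Bytes saved per value when the tag is written instead of the full name.
bytesSaved <- function(rType) {
  length(charToRaw(rType)) - length(charToRaw(tagFor(rType)))
}
```

    The saving per value is small, but a type marker is written for every
    value sent across the R/JVM boundary, so it accumulates.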
    
    However, having a list of primitive types, complex types, and their
    mappings in a common file (types.R?) sounds good to me.
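
    A minimal sketch of what such a common file might contain, assuming
    hypothetical names (PRIMITIVE_TYPES, COMPLEX_TYPES, sparkTypesToR) that do
    not come from the pull request. The primitive mapping is copied from the
    diff above; the complex type list is only an example.

```r
# Hypothetical shared definitions for a types.R file. All names here are
# illustrative assumptions, not SparkR's API.

# Spark SQL primitive type names and the R types they map to
# (values copied from the DATA_TYPES vector in the diff).
PRIMITIVE_TYPES <- c("string"  = "character",
                     "long"    = "integer",
                     "tinyint" = "integer",
                     "short"   = "integer",
                     "integer" = "integer",
                     "byte"    = "integer",
                     "double"  = "numeric",
                     "float"   = "numeric",
                     "decimal" = "numeric",
                     "boolean" = "logical")

# Complex type names, kept in a separate list (example values only).
COMPLEX_TYPES <- c("array", "map", "struct")

# Map a character vector of Spark SQL type names to R type names.
# Names with no primitive mapping (e.g. complex types) come back as NA.
sparkTypesToR <- function(sparkTypes) {
  unname(PRIMITIVE_TYPES[sparkTypes])
}
```

    With the table in one place, a method such as coltypes() could look types
    up through the helper instead of defining its own local vector.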



