[GitHub] spark pull request #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

felixcheung Tue, 19 Sep 2017 20:34:09 -0700

Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19243#discussion_r139868488
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -998,33 +998,44 @@ setMethod("unique",
     #' sparkR.session()
     #' path <- "path/to/file.json"
     #' df <- read.json(path)
    +#' collect(sample(df, fraction = 0.5))
     #' collect(sample(df, FALSE, 0.5))
    -#' collect(sample(df, TRUE, 0.5))
    +#' collect(sample(df, TRUE, 0.5, seed = 3))
     #'}
     #' @note sample since 1.4.0
     setMethod("sample",
    -          signature(x = "SparkDataFrame", withReplacement = "logical",
    -                    fraction = "numeric"),
    -          function(x, withReplacement, fraction, seed) {
    -            if (fraction < 0.0) stop(cat("Negative fraction value:", fraction))
    +          signature(x = "SparkDataFrame"),
    +          function(x, withReplacement = FALSE, fraction, seed) {
    +            if (!is.numeric(fraction)) {
    +              stop(paste("fraction must be numeric; however, got", class(fraction)))
    +            }
    +            if (!is.logical(withReplacement)) {
    +              stop(paste("withReplacement must be logical; however, got", class(withReplacement)))
    +            }
    +
                 if (!missing(seed)) {
    +              if (is.null(seed) || is.na(seed)) {
    +                stop(paste("seed must not be NULL or NA; however, got", class(seed)))
    --- End diff --
    
    This error message doesn't actually work for NA, since `class(seed)` reports the type rather than the value:
    ```
    > class(NULL)
    [1] "NULL"
    > class(NA)
    [1] "logical"
    ```
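
    One possible way to make the message accurate, as a minimal sketch only: `describe_seed` and `check_seed` are hypothetical helper names used for illustration and are not part of SparkR or this PR. The sketch assumes `seed` is either NULL or a length-one value.
    ```r
    # Hypothetical helper (not in SparkR): label a seed value for an error message.
    describe_seed <- function(seed) {
      if (is.null(seed)) {
        "NULL"
      } else if (length(seed) == 1 && is.na(seed)) {
        "NA"                 # class(NA) would give "logical", which hides the NA
      } else {
        class(seed)[1]
      }
    }

    # Hypothetical validation mirroring the check in the diff above.
    check_seed <- function(seed) {
      if (is.null(seed) || is.na(seed)) {
        stop(paste("seed must not be NULL or NA; however, got", describe_seed(seed)))
      }
      invisible(seed)
    }
    ```
    With this, `check_seed(NA)` would fail with a message ending in "got NA" instead of "got logical".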


