Repository: spark
Updated Branches:
  refs/heads/branch-2.1 21afc4534 -> d30238f1b


[SPARK-19682][SPARKR] Issue warning (or error) when subset method "[[" takes vector index

## What changes were proposed in this pull request?
The `[[` method is supposed to take a single index and return a column. This is
different from base R, which takes a vector index. We should check for this and
issue a warning or error when a vector index is supplied (which is very likely,
given the behavior in base R).

Currently I'm issuing a warning message and just taking the first element of the
vector index. We could change this to an error if that's better.
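
For illustration, a minimal sketch of the new behavior (assuming an active SparkR
session; the `df` name here is just an example, not part of the patch):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(mtcars)

# A single index still returns a Column, as before.
df[["mpg"]]

# A vector index now triggers the warning, and only the first element is
# used, so this also returns the Column for "mpg".
df[[c("mpg", "cyl")]]
# Warning message:
# Subset index has length > 1. Only the first index is used.
```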

## How was this patch tested?
new tests

Author: actuaryzhang <actuaryzhan...@gmail.com>

Closes #17017 from actuaryzhang/sparkRSubsetter.

(cherry picked from commit 7bf09433f5c5e08154ba106be21fe24f17cd282b)
Signed-off-by: Felix Cheung <felixche...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d30238f1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d30238f1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d30238f1

Branch: refs/heads/branch-2.1
Commit: d30238f1b9096c9fd85527d95be639de9388fcc7
Parents: 21afc45
Author: actuaryzhang <actuaryzhan...@gmail.com>
Authored: Thu Feb 23 11:12:02 2017 -0800
Committer: Felix Cheung <felixche...@apache.org>
Committed: Thu Feb 23 11:12:18 2017 -0800

----------------------------------------------------------------------
 R/pkg/R/DataFrame.R                       |  8 ++++++++
 R/pkg/inst/tests/testthat/test_sparkSQL.R | 12 ++++++++++++
 2 files changed, 20 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d30238f1/R/pkg/R/DataFrame.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 986f1f1..d0f0979 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -1800,6 +1800,10 @@ setClassUnion("numericOrcharacter", c("numeric", "character"))
 #' @note [[ since 1.4.0
 setMethod("[[", signature(x = "SparkDataFrame", i = "numericOrcharacter"),
           function(x, i) {
+            if (length(i) > 1) {
+              warning("Subset index has length > 1. Only the first index is 
used.")
+              i <- i[1]
+            }
             if (is.numeric(i)) {
               cols <- columns(x)
               i <- cols[[i]]
@@ -1813,6 +1817,10 @@ setMethod("[[", signature(x = "SparkDataFrame", i = "numericOrcharacter"),
 #' @note [[<- since 2.1.1
 setMethod("[[<-", signature(x = "SparkDataFrame", i = "numericOrcharacter"),
           function(x, i, value) {
+            if (length(i) > 1) {
+              warning("Subset index has length > 1. Only the first index is 
used.")
+              i <- i[1]
+            }
             if (is.numeric(i)) {
               cols <- columns(x)
               i <- cols[[i]]

http://git-wip-us.apache.org/repos/asf/spark/blob/d30238f1/R/pkg/inst/tests/testthat/test_sparkSQL.R
----------------------------------------------------------------------
diff --git a/R/pkg/inst/tests/testthat/test_sparkSQL.R b/R/pkg/inst/tests/testthat/test_sparkSQL.R
index d9dd0f3..9608fa1 100644
--- a/R/pkg/inst/tests/testthat/test_sparkSQL.R
+++ b/R/pkg/inst/tests/testthat/test_sparkSQL.R
@@ -1007,6 +1007,18 @@ test_that("select operators", {
   expect_is(df[[2]], "Column")
   expect_is(df[["age"]], "Column")
 
+  expect_warning(df[[1:2]],
+                 "Subset index has length > 1. Only the first index is used.")
+  expect_is(suppressWarnings(df[[1:2]]), "Column")
+  expect_warning(df[[c("name", "age")]],
+                 "Subset index has length > 1. Only the first index is used.")
+  expect_is(suppressWarnings(df[[c("name", "age")]]), "Column")
+
+  expect_warning(df[[1:2]] <- df[[1]],
+                 "Subset index has length > 1. Only the first index is used.")
+  expect_warning(df[[c("name", "age")]] <- df[[1]],
+                 "Subset index has length > 1. Only the first index is used.")
+
   expect_is(df[, 1, drop = F], "SparkDataFrame")
   expect_equal(columns(df[, 1, drop = F]), c("name"))
   expect_equal(columns(df[, "age", drop = F]), c("age"))


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
