Repository: spark
Updated Branches:
  refs/heads/master a6e53a9c8 -> ed3cb1d21


[SPARK-9277] [MLLIB] SparseVector constructor must throw an error when the
declared number of elements is less than the array length

Check that the SparseVector size is at least as large as the number of
indices/values provided, and add tests for the constructor checks.
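
For illustration only, the constructor checks can be sketched outside of Spark
as below. SimpleSparseVector and SimpleSparseVectorDemo are hypothetical
stand-ins, not Spark classes, and the first error message is shortened; the
sketch models only the two length checks and assumes nothing else about the
real SparseVector.

```scala
// Hypothetical stand-in for SparseVector; only the constructor checks are modeled.
final class SimpleSparseVector(
    val size: Int,
    val indices: Array[Int],
    val values: Array[Double]) {

  // Existing check: every index needs exactly one value.
  require(indices.length == values.length,
    s"You provided ${indices.length} indices and ${values.length} values.")

  // Check added by this commit: no more entries than the declared size.
  require(indices.length <= size,
    s"You provided ${indices.length} indices and values, " +
      s"which exceeds the specified vector size ${size}.")
}

object SimpleSparseVectorDemo {
  // Returns true when construction is rejected with IllegalArgumentException,
  // which is what a failed require(...) throws.
  def rejects(size: Int, indices: Array[Int], values: Array[Double]): Boolean =
    try {
      new SimpleSparseVector(size, indices, values)
      false
    } catch {
      case _: IllegalArgumentException => true
    }

  def main(args: Array[String]): Unit = {
    // Mismatched indices/values lengths
    println(rejects(4, Array(1, 2, 3), Array(3.0, 5.0)))
    // More indices than the declared size
    println(rejects(3, Array(1, 2, 3, 4), Array(3.0, 5.0, 7.0, 9.0)))
    // Consistent arguments
    println(rejects(4, Array(1, 2, 3), Array(3.0, 5.0, 7.0)))
  }
}
```

Under these assumptions the first two calls are rejected and the third is
accepted. The added check compares lengths only; in this sketch it does not
verify that each individual index is smaller than the declared size.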

CC MechCoder jkbradley -- I am not sure whether a change also needs to happen
in the Python API. I didn't see any similar checks there to begin with, but I
don't know it well.

Author: Sean Owen <so...@cloudera.com>

Closes #7794 from srowen/SPARK-9277 and squashes the following commits:

e8dc31e [Sean Owen] Fix scalastyle
6ffe34a [Sean Owen] Check that SparseVector size is at least as big as the 
number of indices/values provided. And add tests for constructor checks.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ed3cb1d2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ed3cb1d2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ed3cb1d2

Branch: refs/heads/master
Commit: ed3cb1d21c73645c8f6e6ee08181f876fc192e41
Parents: a6e53a9
Author: Sean Owen <so...@cloudera.com>
Authored: Thu Jul 30 09:19:55 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Jul 30 09:19:55 2015 -0700

----------------------------------------------------------------------
 .../org/apache/spark/mllib/linalg/Vectors.scala      |  2 ++
 .../org/apache/spark/mllib/linalg/VectorsSuite.scala | 15 +++++++++++++++
 2 files changed, 17 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ed3cb1d2/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala b/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
index 0cb28d7..23c2c16 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
@@ -637,6 +637,8 @@ class SparseVector(
   require(indices.length == values.length, "Sparse vectors require that the dimension of the" +
     s" indices match the dimension of the values. You provided ${indices.length} indices and " +
     s" ${values.length} values.")
+  require(indices.length <= size, s"You provided ${indices.length} indices and values, " +
+    s"which exceeds the specified vector size ${size}.")
 
   override def toString: String =
     s"($size,${indices.mkString("[", ",", "]")},${values.mkString("[", ",", "]")})"

http://git-wip-us.apache.org/repos/asf/spark/blob/ed3cb1d2/mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala
----------------------------------------------------------------------
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala
index 03be411..1c37ea5 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala
@@ -57,6 +57,21 @@ class VectorsSuite extends SparkFunSuite with Logging {
     assert(vec.values === values)
   }
 
+  test("sparse vector construction with mismatched indices/values array") {
+    intercept[IllegalArgumentException] {
+      Vectors.sparse(4, Array(1, 2, 3), Array(3.0, 5.0, 7.0, 9.0))
+    }
+    intercept[IllegalArgumentException] {
+      Vectors.sparse(4, Array(1, 2, 3), Array(3.0, 5.0))
+    }
+  }
+
+  test("sparse vector construction with too many indices vs size") {
+    intercept[IllegalArgumentException] {
+      Vectors.sparse(3, Array(1, 2, 3, 4), Array(3.0, 5.0, 7.0, 9.0))
+    }
+  }
+
   test("dense to array") {
     val vec = Vectors.dense(arr).asInstanceOf[DenseVector]
     assert(vec.toArray.eq(arr))


