[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64526546 [Test build #23884 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23884/consoleFull) for PR 3468 at commit [`3b0e4e2`](https://github.com/apache/spark/commit/3b0e4e213d407b5fb42a4fd06bb157e3d2d10941). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64536426 [Test build #23884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23884/consoleFull) for PR 3468 at commit [`3b0e4e2`](https://github.com/apache/spark/commit/3b0e4e213d407b5fb42a4fd06bb157e3d2d10941). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3468 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64671123 @brkyvz Thanks! I've merged this into master. Will submit another PR for branch-1.2. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/3468 [SPARK-4614][MLLIB] Slight API changes in Matrix and Matrices Before we have a full picture of the operators we want to add, it might be safer to hide `Matrix.transposeMultiply` in 1.2.0. Another update we want to change is `Matrix.randn` and `Matrix.rand`, both of which should take a `Random` implementation. Otherwise, it is very likely to produce inconsistent RDDs. I also added some unit tests for matrix factory methods. All APIs are new in 1.2, so there is no incompatible changes. @brkyvz You can merge this pull request into a Git repository by running: $ git pull https://github.com/mengxr/spark SPARK-4614 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3468.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3468 commit 6bfd8a4c5a7d7f00b7c71ccf6ad8aab6c6caab92 Author: Xiangrui Meng m...@databricks.com Date: 2014-11-26T03:32:13Z hide transposeMultiply; add rng to rand and randn; add unit tests --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64510830 [Test build #23870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23870/consoleFull) for PR 3468 at commit [`6bfd8a4`](https://github.com/apache/spark/commit/6bfd8a4c5a7d7f00b7c71ccf6ad8aab6c6caab92). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/3468#discussion_r20916331 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/MatricesSuite.scala --- @@ -112,4 +116,50 @@ class MatricesSuite extends FunSuite { assert(sparseMat(0, 1) === 10.0) assert(sparseMat.values(2) === 10.0) } + + test(zeros) { +val mat = Matrices.zeros(2, 3).asInstanceOf[DenseMatrix] +assert(mat.numRows === 2) +assert(mat.numCols === 3) +assert(mat.values.forall(_ == 0.0)) + } + + test(ones) { +val mat = Matrices.ones(2, 3).asInstanceOf[DenseMatrix] +assert(mat.numRows === 2) +assert(mat.numCols === 3) +assert(mat.values.forall(_ == 1.0)) + } + + test(eye) { +val mat = Matrices.eye(2).asInstanceOf[DenseMatrix] +assert(mat.numCols === 2) +assert(mat.numCols === 2) +assert(mat.values.toSeq === Seq(1.0, 0.0, 0.0, 1.0)) --- End diff -- Small, but: `mat.values.toSeq === Seq(1.0, 0.0, 0.0, 1.0)` - `mat.values === Array(1.0, 0.0, 0.0, 1.0)` is there any reason the other wouldn't work? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user brkyvz commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64515054 Looks good to me! Just made one comment, no biggie though, it's fine as is (but if you decide to change it, there are 4 exact copies of it). One comment/question about using `Random`: What would be the problem if we just pass a seed? Since users wouldn't be able to use `XORShiftRandom`, which has better performance than `java.util.Random`, wouldn't passing the seed be enough to solve the inconsistency issue? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64515619 [Test build #23870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23870/consoleFull) for PR 3468 at commit [`6bfd8a4`](https://github.com/apache/spark/commit/6bfd8a4c5a7d7f00b7c71ccf6ad8aab6c6caab92). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64515623 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23870/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3468#discussion_r20920325 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/MatricesSuite.scala --- @@ -112,4 +116,50 @@ class MatricesSuite extends FunSuite { assert(sparseMat(0, 1) === 10.0) assert(sparseMat.values(2) === 10.0) } + + test(zeros) { +val mat = Matrices.zeros(2, 3).asInstanceOf[DenseMatrix] +assert(mat.numRows === 2) +assert(mat.numCols === 3) +assert(mat.values.forall(_ == 0.0)) + } + + test(ones) { +val mat = Matrices.ones(2, 3).asInstanceOf[DenseMatrix] +assert(mat.numRows === 2) +assert(mat.numCols === 3) +assert(mat.values.forall(_ == 1.0)) + } + + test(eye) { +val mat = Matrices.eye(2).asInstanceOf[DenseMatrix] +assert(mat.numCols === 2) +assert(mat.numCols === 2) +assert(mat.values.toSeq === Seq(1.0, 0.0, 0.0, 1.0)) --- End diff -- `Array.equals` only check the reference. ~~~ scala Array(1.0, 2.0) == Array(1.0, 2.0) res0: Boolean = false ~~~ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4614][MLLIB] Slight API changes in Matr...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3468#issuecomment-64526447 @brkyvz `XORShiftRandom` implements `Random`. So users can use it directly. Using seed will create problems when we want to generate a sequence of matrices, e.g., generate a random block matrices, because `randn` and `rand` do not return the next seed. If we using a predefined sequence of seeds, the generated sequence won't have good quality. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org