[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14156 @srowen OK, I'll close the PR for now. If I find a better way to optimize it, I will reopen it. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14156 I get it, though it sounds like there is not necessarily any such optimization now, and I'm actually not sure there can be. It could even be slower, since it introduces an extra copy. It is also somewhat harder to understand and different from its sibling method. I'm not sure we should do this until there is a demonstrable benefit.
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14156 @srowen Yeah, the function supplied here cannot be turned into SIMD instructions, but I think some parallelization could be done on large matrices. For example, we could split the matrix into several blocks and execute the in-place transform on them in parallel, although this hasn't been added to Breeze yet.
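The block-splitting idea in the comment above can be sketched roughly as follows. This is only an illustration on a flat `Array[Double]` using standard-library `Future`s; the names `ParallelTransformSketch`, `transformInPlace`, and `numBlocks` are hypothetical, it is not something Breeze or Spark provides, and whether it is actually faster would need to be measured.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Hypothetical sketch: not part of Breeze or Spark.
object ParallelTransformSketch {
  // Applies f to every element of data in place, one Future per contiguous block.
  def transformInPlace(data: Array[Double], numBlocks: Int, f: Double => Double): Unit = {
    require(numBlocks > 0, "numBlocks must be positive")
    // Ceiling division so that at most numBlocks blocks cover the whole array.
    val blockSize = math.max(1, (data.length + numBlocks - 1) / numBlocks)
    val tasks = (0 until data.length by blockSize).map { start =>
      Future {
        val end = math.min(start + blockSize, data.length)
        var i = start
        while (i < end) {
          data(i) = f(data(i)) // blocks are disjoint, so no two tasks touch the same index
          i += 1
        }
      }
    }
    // Wait for every block to finish before returning to the caller.
    Await.result(Future.sequence(tasks), Duration.Inf)
  }
}
```

For small matrices the scheduling overhead would likely outweigh any gain, which is in line with the request elsewhere in this thread for a demonstrable benefit first.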
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14156 That's the question indeed. I'm not sure, because the function that's supplied could be anything. I don't see how it could be converted to a vectorized operation automatically.
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14156 Yeah, currently it seems to add a little overhead (it does a copy), but I think it could take advantage of Breeze optimizations in the future, e.g., SIMD instructions or something?
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14156 I see, this copies x to y and then modifies y in place. OK. Is that more efficient? It seems like extra work, but does the transform method make up for it? Just seeing if this has actually been observed to speed it up or not.
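To make the comparison in the comment above concrete, here is a minimal sketch of the two variants on flat arrays. It assumes nothing about the actual `ApplyInPlace` code in the PR; `ApplySketch`, `applyDirect`, and `applyCopyThenTransform` are hypothetical names chosen for illustration.

```scala
// Hypothetical sketch: plain arrays stand in for the matrices x and y.
object ApplySketch {
  // Variant A: read x and write f(x) straight into y (one pass, no copy).
  def applyDirect(x: Array[Double], y: Array[Double], f: Double => Double): Unit = {
    require(x.length == y.length, "dimension mismatch")
    var i = 0
    while (i < x.length) {
      y(i) = f(x(i))
      i += 1
    }
  }

  // Variant B: copy x into y first, then transform y in place (two passes over y).
  def applyCopyThenTransform(x: Array[Double], y: Array[Double], f: Double => Double): Unit = {
    require(x.length == y.length, "dimension mismatch")
    System.arraycopy(x, 0, y, 0, x.length)
    var i = 0
    while (i < y.length) {
      y(i) = f(y(i))
      i += 1
    }
  }
}
```

Both variants leave the same values in `y` and leave `x` untouched; the open question in the thread is only whether the extra copy in variant B is ever repaid, which a benchmark would have to show.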
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14156 @srowen The := operator on a BDM simply copies one BDM into another, and it is widely used in the Breeze source. For example, look at the DenseMatrix.copy function in Breeze: it first uses `DenseMatrix.create` to create a new matrix with the same dimensions, `val result = DenseMatrix.create(...)`, and then uses `result := this` to copy itself into the matrix just created. The := operator works for DenseMatrix because DenseMatrix provides an implicit implementation of `OpSet`. In the `DenseMatrix` source file in Breeze, at line 985, there is: implicit val setMV_D:OpSet.InPlaceImpl2[...] = new SetDMDVOp[Double]() So the implementation is in the `SetDMDVOp` class, and we can see that `SetDMDVOp` is type-specialized for Double, so the compiled code should be efficient.
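As a rough illustration of the type-specialization point in the comment above, here is a minimal sketch of a specialized element-wise copy. `SetOp` and `SetOpDemo` are hypothetical stand-ins, not Breeze's `SetDMDVOp` or `OpSet`; the sketch only shows what the `@specialized` annotation does for a copy loop.

```scala
// Hypothetical stand-in for an in-place "set" operation; not Breeze's SetDMDVOp.
class SetOp[@specialized(Double) T] {
  // Copies src into dst element by element. Because T is specialized for Double,
  // the compiler also emits a variant that reads and writes primitive doubles
  // without boxing.
  def apply(dst: Array[T], src: Array[T]): Unit = {
    require(dst.length == src.length, "dimension mismatch")
    var i = 0
    while (i < src.length) {
      dst(i) = src(i)
      i += 1
    }
  }
}

object SetOpDemo {
  // Mirrors the shape of the copy described above: allocate a result of the
  // same size, then set it from the source.
  def copyOf(src: Array[Double]): Array[Double] = {
    val result = new Array[Double](src.length) // plays the role of DenseMatrix.create(...)
    val set = new SetOp[Double]
    set(result, src)                           // plays the role of result := this
    result
  }
}
```

Note that specialization makes the copy itself cheap; it says nothing about the cost of applying an arbitrary user-supplied function afterwards.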
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14156 Is there reasonable evidence this speeds things up? Just want to make sure this does not make it slower. Can you help me understand the := operator? I don't see how it helps compute y as a function of x here. I assume the method below can't use the same mechanism?
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14156 cc @srowen thanks!
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14156 Merged build finished. Test PASSed.
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14156 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62172/ Test PASSed.
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14156 **[Test build #62172 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62172/consoleFull)** for PR 14156 at commit [`c7b2059`](https://github.com/apache/spark/commit/c7b2059c5799404c3a3e99615e2ad7dc32989fda). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14156: [SPARK-16499][ML][MLLib] improve ApplyInPlace function i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14156 **[Test build #62172 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62172/consoleFull)** for PR 14156 at commit [`c7b2059`](https://github.com/apache/spark/commit/c7b2059c5799404c3a3e99615e2ad7dc32989fda).