Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221026877
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221026865
**[Test build #59144 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59144/consoleFull)**
for PR 11610 at commit
[`e9c80a8`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221026870
Build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221025962
**[Test build #59144 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59144/consoleFull)**
for PR 11610 at commit
[`e9c80a8`](https://gi
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221023289
Build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221023283
**[Test build #59136 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59136/consoleFull)**
for PR 11610 at commit
[`e9c80a8`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221023290
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-221022478
**[Test build #59136 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59136/consoleFull)**
for PR 11610 at commit
[`e9c80a8`](https://gi
Github user iyounus commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-218239032
@mengxr I looked into using DGELSD to solve `A^T A x = A^T b` as you
suggested. It works fine, but then the issue is how to calculate the errors on
the coefficients fo
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-217511021
Ping @iyounus ?
Github user iyounus commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-197468720
One problem with the eigendecomposition method is that for a rank-deficient
matrix, some of the eigenvalues can be extremely small (instead of being zero)
and their contr
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-197557069
@dbtsai There is a good chance of precision loss during the computation of
A^T A if A is ill-conditioned. A better approach is to factorize A directly. It
is similar to
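
To illustrate the precision concern above, here is a minimal Python sketch. It is not code from the PR: the matrix is a textbook-style example and `gram` is a hypothetical helper. A has two linearly independent columns, yet forming A^T A in double precision rounds away the `eps**2` terms and leaves an exactly singular matrix.

```python
def gram(rows):
    """Return A^T A for a matrix given as a list of rows (hypothetical helper)."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]

eps = 1e-8  # small enough that 1 + eps**2 rounds to 1.0 in double precision

# Columns (1, eps, 0) and (1, 0, eps) are linearly independent, so A has rank 2.
A = [[1.0, 1.0],
     [eps, 0.0],
     [0.0, eps]]

G = gram(A)
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
print(G)    # every entry has been rounded to 1.0
print(det)  # 0.0: the computed A^T A is singular even though A is not
```

Factorizing A directly (for example with QR or SVD) avoids squaring the small entries, which is presumably the motivation for the suggestion above.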
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-197213572
I'm not an expert in this area, but after thinking about it more, I don't think
we can use `DGELSD` which minimizes `||b - A*x||` using the singular value
decomposition (SVD)
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-197199401
Locally, we are solving `A^T A x = A^T b`. In the rank-deficient case, we can
compute the min-length least squares solution that also minimizes `\| x \|_2`,
which is uniqu
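
As a small worked illustration of the min-length idea (a sketch under assumed toy data, not the PR's implementation): when two columns of A are identical, the normal equations have infinitely many solutions, and the min-length one is the solution with the smallest Euclidean norm. The names `gram`, `solves_normal_equations` and `norm_sq` are hypothetical helpers, and only a handful of hand-picked candidate solutions are compared.

```python
def gram(rows):
    """Return A^T A for a matrix given as a list of rows (hypothetical helper)."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]

# Two identical columns make A rank-deficient.
A = [[1.0, 1.0],
     [2.0, 2.0],
     [3.0, 3.0]]
b = [2.0, 4.0, 6.0]

G = gram(A)  # A^T A = [[14, 14], [14, 14]], which is singular
rhs = [sum(r[j] * bi for r, bi in zip(A, b)) for j in range(2)]  # A^T b = [28, 28]

def solves_normal_equations(x):
    """Check G x == A^T b exactly (the toy data keeps all arithmetic exact)."""
    return [sum(G[i][j] * x[j] for j in range(2)) for i in range(2)] == rhs

def norm_sq(x):
    """Squared Euclidean length of x."""
    return sum(v * v for v in x)

# Every x with x[0] + x[1] == 2 solves the system; these are a few such points.
candidates = [(2.0, 0.0), (0.0, 2.0), (1.0, 1.0), (3.0, -1.0)]
assert all(solves_normal_equations(x) for x in candidates)

min_length = min(candidates, key=norm_sq)
print(min_length)  # (1.0, 1.0)
```

Among all solutions on the line `x[0] + x[1] == 2`, the point `(1.0, 1.0)` is the closest to the origin, so it is the min-length solution for this toy system.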
Github user iyounus commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-196960794
I'm a bit confused about the use of DGELSD. As far as I can tell, it
requires matrix A itself. But in the current implementation, we're decomposing
A^T A on the driver.
Github user dbtsai commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-196662210
I will vote for approach 1.
SVD will be the most stable algorithm, but also the slowest at O(mn^2 + n^3),
compared with Cholesky at O(mn^2) or QR at O(mn^2 - n^3/3) decomposition.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-196503166
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-196503171
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-196502654
**[Test build #53091 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53091/consoleFull)**
for PR 11610 at commit
[`e9c80a8`](https://g
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-196486264
@iyounus @dbtsai The normal equation approach will fail if the matrix A is
rank-deficient. This happens when there are constant columns. However, more
generally, it happen
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-196480702
**[Test build #53091 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53091/consoleFull)**
for PR 11610 at commit
[`e9c80a8`](https://gi
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55964826
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55964808
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55964768
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55964546
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55964451
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55964211
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55963818
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user dbtsai commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55963496
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,57 @@ private[ml] class WeightedLeastSquares(
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-195644840
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-195644841
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-195644779
**[Test build #52981 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52981/consoleFull)**
for PR 11610 at commit
[`652d2bd`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-195631468
**[Test build #52981 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52981/consoleFull)**
for PR 11610 at commit
[`652d2bd`](https://gi
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55909205
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,53 @@ private[ml] class WeightedLeastSquares(
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55909088
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,53 @@ private[ml] class WeightedLeastSquares(
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55908792
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -120,34 +160,47 @@ private[ml] class WeightedLeastSquares(
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55908419
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,53 @@ private[ml] class WeightedLeastSquares(
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55908431
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,53 @@ private[ml] class WeightedLeastSquares(
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55908346
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,53 @@ private[ml] class WeightedLeastSquares(
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/11610#discussion_r55908232
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala ---
@@ -108,6 +101,53 @@ private[ml] class WeightedLeastSquares(
Github user iyounus commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-194522061
I should point out that to identify constant features, I'm comparing
variance (aVar) to zero. But it can happen that the variance for constant
features may not be iden
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-194506646
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-194506644
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-194506333
**[Test build #52764 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52764/consoleFull)**
for PR 11610 at commit
[`9412ef4`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11610#issuecomment-194489209
**[Test build #52764 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52764/consoleFull)**
for PR 11610 at commit
[`9412ef4`](https://gi
GitHub user iyounus opened a pull request:
https://github.com/apache/spark/pull/11610
[SPARK-13777] [ML] Remove constant features from training in normal solver
(WLS)
## What changes were proposed in this pull request?
"normal" solver in LinearRegression uses Cholesky decom