Thanks for reporting the bug! I will take a look. -Xiangrui
On Thu, Oct 16, 2014 at 11:25 PM, Debasish Das wrote:
> Hi,
>
> I am validating the proximal algorithm for positive and bound constrained
> ALS and I came across the bug detailed in the JIRA while running ALS with
> NNLS:
>
> https://iss
Hi,
I am validating the proximal algorithm for positive and bound constrained
ALS and I came across the bug detailed in the JIRA while running ALS with
NNLS:
https://issues.apache.org/jira/browse/SPARK-3987
The ADMM-based proximal algorithm came up with the correct result...
Thanks.
Deb
On Thu, Oct 16, 2014 at 3:55 PM, shane knapp wrote:
> i really, truly hate non-deterministic failures.
Amen bruddah.
yeah, at this point it might be worth trying. :)
the absolutely irritating thing is that i am not seeing this happen w/any
other jobs other than the spark prb, nor does it seem to correlate w/time
of day, network or system load, or what slave it runs on. nor are we
hitting our limit of connectio
Thanks for continuing to look into this, Shane.
One suggestion that Patrick brought up, if we have trouble getting to the
bottom of this, is doing the git checkout ourselves in the run-tests-jenkins
script and cutting out the Jenkins git plugin entirely. That way we can
script retries and post fri
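
A minimal sketch of the "script retries" idea, in Python for illustration only. The real run-tests-jenkins script is not shown in this thread, and the helper name `run_with_retries`, its parameters, and the example git command are all assumptions, not anything taken from the Spark build scripts.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay_seconds=5.0, runner=subprocess.run):
    """Run cmd, retrying on a non-zero exit code.

    Returns the 1-based attempt number that succeeded and raises
    RuntimeError if every attempt fails. `runner` is injectable so the
    retry logic can be exercised without touching the network.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        # Wait before the next try, but not after the final failure.
        if attempt < attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"{cmd!r} failed after {attempts} attempts")
```

A call such as `run_with_retries(["git", "fetch", "origin"])` would then replace the plugin-driven fetch, and the script could report the attempt count itself.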
Hi Matt,
I’m not sure whether those tests will actually find this specific issue. The
tests that I linked to test Spark’s Zookeeper-based multi-master mode, whereas
it sounds like you’re seeing this issue in regular standalone cluster. In
those tests, the workers disconnect from the master be
the bad news is that we've had a couple more failures due to timeouts, but
the good news is that the frequency that these happen has decreased
significantly (3 in the past ~18hr).
seems like the git plugin downgrade has helped relieve the problem, but
hasn't fixed it. i'll be looking in to this m
Accumulators on the stage info page show the rolling lifetime value of
accumulators as well as per-task values, which is handy. I think it would be useful to
add another field to the “Accumulators” table that also shows the total for the
stage you are looking at (basically just a merge of the accumula
Just checked, QR is exposed by netlib: import org.netlib.lapack.Dgeqrf
For the equality and bound version, I will use QR...it will be faster than
the LU that I am using through jblas.solveSymmetric...
On Thu, Oct 16, 2014 at 8:34 AM, Debasish Das
wrote:
> @xiangrui should we add this epsilon in
@xiangrui should we add this epsilon inside the ALS code itself? That way, if
a user mistakenly sets the regularization to 0.0, LAPACK failures do not show
up...
@sean For the proximal algorithms I am using Cholesky for L1 and LU for
equality and bound constraints (since the matrix is quasi definite)...I am
The Gramian is at least positive semidefinite and will be definite if the
matrix is nonsingular, yes. That's usually but not always true.
The lambda*I matrix is positive definite, well, when lambda is positive.
Adding that makes it definite.
At least, lambda=0 could be rejected as invalid.
But t
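
To make the definiteness point concrete, here is a small pure-Python sketch. It is not the ALS or LAPACK code: the helper names `gramian`, `add_ridge`, and `cholesky` and the toy matrix are assumptions chosen for illustration. It builds a Gramian from a matrix with fewer rows than columns, shows that a plain Cholesky factorization rejects it, and shows that adding lambda*I with lambda > 0 makes the factorization go through.

```python
import math


def gramian(a):
    """Return A^T A for a matrix given as a list of rows."""
    n = len(a[0])
    return [[sum(row[i] * row[j] for row in a) for j in range(n)]
            for i in range(n)]


def add_ridge(g, lam):
    """Return G + lam * I."""
    n = len(g)
    return [[g[i][j] + (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def cholesky(m):
    """Return lower-triangular L with L L^T = m.

    Raises ValueError when a pivot is not strictly positive, i.e. when
    m is not positive definite.
    """
    n = len(m)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l


# One observation, two factors: the Gramian [[4, 4], [4, 4]] is singular.
a = [[2.0, 2.0]]
g = gramian(a)

# With lambda = 0.5 the regularized matrix is positive definite.
l_factor = cholesky(add_ridge(g, 0.5))
```

Calling `cholesky(g)` on the unregularized Gramian raises `ValueError`, which is the toy analogue of the failure seen when the regularization is 0.0.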