[
https://issues.apache.org/jira/browse/SPARK-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-1262:
-----------------------------
Comment: was deleted
(was: That makes sense. There may still be an interesting point here. What
would be interested to know is whether the size of |Ax-B| is consistently
larger or smaller for one solver or another. If you find the QR decomposition
has smaller differences and gives anecdotally better results, that would be a
compelling reason to consider the different decomposition. There was another
issue regarding solvePositive and mysteriously near-singular A that might be
related.)
> Numerical drift in computation of matrix inverse leads to invalid results in
> ALS
> --------------------------------------------------------------------------------
>
> Key: SPARK-1262
> URL: https://issues.apache.org/jira/browse/SPARK-1262
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Reporter: Michael Allman
>
> In what follows I cannot offer an expert analysis and remedy, however I will
> describe the problem I'm seeing and the strategy I've used to mitigate it.
> The ALS {{updateBlock()}} method includes a call to {{Solve.solvePositive()}}
> from JBlas. Generally speaking, when we call {{solvePositive(A, B)}} for
> symmetric, positive definite {{A}}, this method computes {{x}} from the
> matrix equation {{Ax = B}}. Or, in other words, it computes {{A}}^-1^{{B}}.
> As mentioned, one of the preconditions on A is that it be symmetric. In ALS,
> we call this method on {{fullXtX}} or {{fullXtX.add(YtY.value.get)}}, both of
> which should be symmetric. However, for implicit ALS and rank > 1, some kind
> of imprecision in the computation of this inverse tends to make successive
> values of {{fullXtX.add(YtY.value.get)}} less and less symmetric. From my
> experience this can and does alter the model produced by ALS significantly,
> leading to very different and counterintuitive user-item recommendations
> compared to a solution in which this asymmetry does not exist.
> An approach I've seen taken against this problem from the Oryx codebase is to
> cast each value in the vector returned from {{Solve.solvePositive()}} to a
> float. That has worked for me.
> I've also tried alternative implementations of a linear system solver. The
> solver that Oryx uses, {{RRQRDecomposition}} from commons math v3, seems to
> lead to better solutions, but still drifts unless the results are cast to
> floats.
> The Colt numerical computing libraries include a solver called
> {{QRDecomposition}}, which may be superior to the one used in JBlas. However,
> I haven't tested it.
> I'm working on a unit test to exercise this bug. I discovered it by "tracing"
> the behavior of {{ALS.scala}} with various logging statements inserted into
> the code.
> If nothing else, add a line before the calls to {{Solve.solvePositive()}} in
> {{updateBlock()}} to validate that {{fullXtX}} or
> {{fullXtX.add(YtY.value.get)}} are symmetric.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]