tygert commented on issue #16556: [SPARK-19184][MLlib] Improve numerical 
stability for method tallSkinnyQR.
URL: https://github.com/apache/spark/pull/16556#issuecomment-531974017
 
 
   To be honest, @srowen, this is far more likely at scale than in the 4x4 
case. That is how we found the problem: @hl475 eventually worked out a small 
case that was representative of what others had been observing. We got 
complaints that principal component analysis in Spark was broken, and it turned 
out that the cause was numerical instability. In principle, you could use a 
least-squares solver rather than inverting matrices if you wanted to rely on 
Breeze alone. There seems to be a larger issue, though: solving systems of 
linear equations by explicitly inverting matrices, with no reason for 
subspaces to align, is something that textbooks on numerical linear algebra 
warn against very early on. Ideally, whoever maintains MLlib would be familiar 
with condition numbers and numerical instability, though I fully realize that 
there may not be enough resources available to approach that ideal. If we end 
up using MLlib more where I work, then perhaps I can fix this in the future. 
Sorry for the distraction.
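
To make the point about explicit inversion concrete, here is a minimal sketch in plain Python. It is not Spark, MLlib, or Breeze code: the function names `lstsq_via_explicit_inverse` and `householder_lstsq`, and the 3x2 test matrix, are assumptions made up for this illustration. It solves the same least-squares problem twice, once by forming and inverting the Gram matrix A^T A, and once with a Householder QR factorization that never forms an inverse.

```python
import math


def lstsq_via_explicit_inverse(a, b):
    """Solve min ||a x - b|| for a two-column `a` by inverting A^T A explicitly.

    Illustrative only: this is the approach the comment warns against.
    """
    g00 = sum(row[0] * row[0] for row in a)
    g01 = sum(row[0] * row[1] for row in a)
    g11 = sum(row[1] * row[1] for row in a)
    det = g00 * g11 - g01 * g01
    if det == 0.0:
        raise ArithmeticError("A^T A is singular in floating point")
    inv = [[g11 / det, -g01 / det], [-g01 / det, g00 / det]]
    atb = [sum(row[j] * bi for row, bi in zip(a, b)) for j in range(2)]
    return [inv[i][0] * atb[0] + inv[i][1] * atb[1] for i in range(2)]


def householder_lstsq(a, b):
    """Solve min ||a x - b|| via Householder QR and back substitution.

    `a` is a list of m rows with n entries each (m >= n, full column rank).
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]  # overwritten with R in its upper triangle
    c = b[:]                   # overwritten with Q^T b
    for k in range(n):
        norm = math.sqrt(sum(r[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            raise ArithmeticError("rank-deficient column")
        # Choose the sign that avoids cancellation in v[0].
        alpha = -math.copysign(norm, r[k][k])
        v = [r[i][k] for i in range(k, m)]
        v[0] -= alpha
        vtv = sum(x * x for x in v)
        # Apply the reflector I - 2 v v^T / (v^T v) to the remaining columns.
        for j in range(k, n):
            s = 2.0 * sum(v[i] * r[k + i][j] for i in range(m - k)) / vtv
            for i in range(m - k):
                r[k + i][j] -= s * v[i]
        # Apply the same reflector to the right-hand side.
        s = 2.0 * sum(v[i] * c[k + i] for i in range(m - k)) / vtv
        for i in range(m - k):
            c[k + i] -= s * v[i]
    # Back substitution on the n x n upper-triangular system R x = Q^T b.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(r[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - tail) / r[i][i]
    return x


# An ill-conditioned 3x2 matrix whose columns are nearly parallel.
# The exact least-squares solution of A x = b is x = [1, 1].
EPS = 1e-8
A = [[1.0, 1.0], [EPS, 0.0], [0.0, EPS]]
b = [2.0, EPS, EPS]

print("Householder QR:", householder_lstsq(A, b))
try:
    print("Explicit inverse:", lstsq_via_explicit_inverse(A, b))
except ArithmeticError as err:
    print("Explicit inverse failed:", err)
```

In double precision, 1 + EPS**2 rounds to 1, so every entry of the Gram matrix becomes 1 and the explicit-inverse route has nothing left to invert, while the orthogonal factorization still recovers a solution close to [1, 1]. Forming A^T A squares the condition number of the problem, which is why small well-behaved cases can look fine and larger ones break.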
