2011/12/21 Radim Rehurek <[email protected]>:
> Heh. Is that the official Mahout philosophy for other algorithms as well?
> "Who cares about correctness, we are happy we can run it on some data at all,
> so shut up?" I hope you're not serious Ted.

"No", and calm down Radim, this isn't useful. I think it's easy to read
Ted's messages with the wrong voice. I don't think it's meant that way.

> The results are completely in line with Martinsson et al.'s [1] as well as my
> previous experiments: no power iteration steps with massive truncation =
> rubbish output. Accuracy improves exponentially with increasing no. of
> iteration steps (but see my initial warning re. numerical issues with higher
> number of steps if implemented naively).

That's great info. Do you have a distributed version of this? :)
I was actually hoping you would...

> From all the dev replies here -- no users actually replied -- I get the vibe
> that the accuracy discussion annoys you. Now, I dropped by to give a friendly
> hint about possible serious accuracy concerns, based on experience with
> mid-scale (billions of non-zeroes) SVD computations in one specific domain
> (term-document matrices in NLP). And possibly learning about your issues on
> tera-feature scale datasets in return, which I'm very interested in.
> Apparently neither of us is getting anything out of this, so I'll stop here.

Your interests are not the same as, say, mine, as a user of the SVD for
recs. A reconstruction with small error is good, all else equal, but all
else is not equal. The quality of my output does not scale proportionally
with accuracy. Unfortunately what you suggest simply doesn't exist in a
form I can use at scale, which is a big issue!

I don't think anyone pinged you to claim that what's in the project now is
even finished, let alone optimal. For example, I'm not sure if you've seen
the improvements Dmitriy has made in this area on a few open JIRA issues?
I am sure it's fair to call it a work in progress, as and when volunteers
want to create something better. But I'm sure that suggesting that this is
not useful or needs warnings is misguided... at least you may not be
considering how this is actually used in practice for, say, recs.
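
For anyone following along, the power-iteration point quoted above can be
sketched on a single machine. This is only a minimal pure-Python
illustration under assumptions of my own: the function names
(range_basis, projection_error, make_test_matrix) are hypothetical, the
test matrix is synthetic with slowly decaying singular values, and none of
it is Mahout code or the distributed version asked about. It
re-orthonormalizes after every multiplication, which is one common way to
avoid the numerical trouble of applying the power steps naively.

```python
import random

def matmul(A, B):
    """Dense product of two row-major list-of-lists matrices."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def orthonormalize(Y):
    """Orthonormalize the columns of Y with modified Gram-Schmidt."""
    basis = []
    for col in zip(*Y):
        v = list(col)
        for _ in range(2):  # second pass limits loss of orthogonality
            for q in basis:
                dot = sum(x * y for x, y in zip(q, v))
                v = [x - dot * y for x, y in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 1e-12:
            basis.append([x / norm for x in v])
    return transpose(basis)  # basis vectors become the columns

def range_basis(A, k, power_iters, rng):
    """Approximate orthonormal basis for the top-k column space of A."""
    At = transpose(A)
    omega = random_matrix(len(A[0]), k, rng)  # random Gaussian test matrix
    Q = orthonormalize(matmul(A, omega))
    for _ in range(power_iters):
        # Orthonormalize after each product instead of forming
        # (A A^T)^q A omega directly, which loses accuracy to round-off.
        Z = orthonormalize(matmul(At, Q))
        Q = orthonormalize(matmul(A, Z))
    return Q

def projection_error(A, Q):
    """Frobenius norm of A - Q Q^T A."""
    P = matmul(Q, matmul(transpose(Q), A))
    return sum((a - p) ** 2
               for ra, rp in zip(A, P) for a, p in zip(ra, rp)) ** 0.5

def make_test_matrix(m, n, rng):
    """Synthetic m x n matrix (m >= n) with singular values 1/sqrt(i)."""
    sigma = [1.0 / (i + 1) ** 0.5 for i in range(n)]
    U = orthonormalize(random_matrix(m, n, rng))
    V = orthonormalize(random_matrix(n, n, rng))
    US = [[u * s for u, s in zip(row, sigma)] for row in U]
    return matmul(US, transpose(V)), sigma

if __name__ == "__main__":
    rng = random.Random(0)
    A, sigma = make_test_matrix(60, 40, rng)
    k = 5
    best = sum(s * s for s in sigma[k:]) ** 0.5  # best possible rank-k error
    print("optimal rank-%d error: %.4f" % (k, best))
    for q in range(4):
        err = projection_error(A, range_basis(A, k, q, rng))
        print("power iterations = %d, error = %.4f" % (q, err))
```

Running it prints the rank-k projection error for 0 to 3 power iterations
next to the best achievable rank-k error, so the effect of the extra steps
on a flat spectrum can be checked directly. It says nothing about how much
that accuracy matters for recs, which is the separate question above.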
