PS: I think it is better if you reply in JIRA rather than to the email broadcast of it, since I don't monitor the broadcast and miss your posts.

On Mon, Nov 28, 2011 at 12:37 PM, Dmitriy Lyubimov <[email protected]> wrote:

> I think it is certainly ok for you to try, and your thoughts are even more
> appreciated because optimization of this stuff for big data that is also
> accurate seems to take more than one head to review.
>
> However, I've already planned on doing 817 in the next two months and
> finishing it in Q1 if I can work out the existing issues.
> The existing issues are both flow and performance and IMO require a tad
> more contemplation w.r.t. existing flow peculiarities before a reliable
> flow could be figured out.
> On top of it, at this point I am the primary maintainer of the SSVD code and I
> think you should know that introducing modifications which at this point
> seem fairly sizable may make it more difficult for me to maintain it --
> especially given we haven't considered the effect on existing power iterations
> yet and the future issue of introducing a Cholesky option (there's a pending
> issue for that as well). But I think you can catalyze that process, you
> already did a lot.
>
>
> On Mon, Nov 28, 2011 at 12:32 AM, Raphael Cendrillon <
> [email protected]> wrote:
>
>> Hi Dmitriy,
>>
>> If it's OK with you I'd like to try implementing this decoration.
>>
>> Any advice or guidance would be very much appreciated.
>>
>> Raphael.
>>
>> On 27 Nov, 2011, at 9:23 AM, Dmitriy Lyubimov (Commented) (JIRA) wrote:
>>
>> > Dmitriy Lyubimov commented on MAHOUT-817:
>> > -----------------------------------------
>> >
>> > For the column mean, the brute-force approach is probably the simplest: we'd
>> > have to decorate the input of A with mean subtraction.
>> >
>> >> Add PCA options to SSVD code
>> >> ----------------------------
>> >>
>> >> Key: MAHOUT-817
>> >> URL: https://issues.apache.org/jira/browse/MAHOUT-817
>> >> Project: Mahout
>> >> Issue Type: New Feature
>> >> Affects Versions: 0.6
>> >> Reporter: Dmitriy Lyubimov
>> >> Assignee: Dmitriy Lyubimov
>> >> Fix For: Backlog
>> >>
>> >> It seems that a simple solution should exist to integrate PCA mean
>> >> subtraction into SSVD algorithm without making it a pre-requisite step and
>> >> also avoiding densifying the big input.
>> >> Several approaches were suggested:
>> >> 1) subtract mean off B
>> >> 2) propagate mean vector deeper into algorithm algebraically where the
>> >> data is already collapsed to smaller matrices
>> >> 3) --?
>> >> It needs some math done first. I'll take a stab at 1 and 2 but
>> >> thoughts and math are welcome.
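
To make the trade-off in the quoted issue concrete, here is a minimal sketch (plain Python, not Mahout code) comparing the brute-force "decorate A with mean subtraction" route against pushing the mean through algebraically. It assumes the first SSVD projection has the form Y = A * Omega and uses the identity (A - 1 m^T) * Omega = A * Omega - 1 (m^T * Omega). All function names and the list-of-dicts sparse layout are hypothetical, chosen only for illustration.

```python
import random


def column_means(rows, n_cols):
    """Column means of a sparse matrix stored as a list of {col: value} dicts."""
    sums = [0.0] * n_cols
    for row in rows:
        for j, v in row.items():
            sums[j] += v
    return [s / len(rows) for s in sums]


def sparse_times_dense(rows, omega):
    """A * Omega, touching only the non-zero entries of A."""
    k = len(omega[0])
    out = []
    for row in rows:
        y = [0.0] * k
        for j, v in row.items():
            for c in range(k):
                y[c] += v * omega[j][c]
        out.append(y)
    return out


def centered_projection(rows, omega, mean):
    """(A - 1 m^T) * Omega computed as A * Omega - 1 (m^T * Omega).

    A is never densified: the mean enters only through one k-vector.
    """
    k = len(omega[0])
    shift = [sum(mean[j] * omega[j][c] for j in range(len(mean)))
             for c in range(k)]
    return [[y[c] - shift[c] for c in range(k)]
            for y in sparse_times_dense(rows, omega)]


def dense_centered_projection(rows, omega, mean):
    """Brute-force reference: densify each row, subtract the mean, multiply."""
    n, k = len(mean), len(omega[0])
    out = []
    for row in rows:
        dense = [row.get(j, 0.0) - mean[j] for j in range(n)]
        out.append([sum(dense[j] * omega[j][c] for j in range(n))
                    for c in range(k)])
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    a = [{0: 1.0, 3: 2.0}, {1: 4.0}, {0: 3.0, 2: 5.0}]  # 3 x 4, sparse
    omega = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(4)]
    m = column_means(a, 4)
    fast = centered_projection(a, omega, m)
    slow = dense_centered_projection(a, omega, m)
    print(max(abs(f - s) for fr, sr in zip(fast, slow) for f, s in zip(fr, sr)))
```

The correction term m^T * Omega is a single k-vector computed once, so under these assumptions the per-row work stays proportional to the number of non-zeros in that row. Whether the same trick carries through the later steps (B, power iterations) is exactly the math the issue says still needs doing.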
