Thanks, Dmitriy. I certainly understand. Perhaps I can find some other areas to contribute to.

On 28 Nov, 2011, at 12:37 PM, Dmitriy Lyubimov wrote:

> I think it is certainly ok for you to try, and your thoughts are even more
> appreciated because optimization of this stuff for big data that is also
> accurate seems to take more than one head to review.
>
> However, I've already planned on doing 817 in the next two months and
> finishing it in Q1 if I can work out existing issues.
> The existing issues are both flow and performance and IMO require a tad
> more contemplation w.r.t. existing flow peculiarities before a reliable
> flow could be figured out.
> On top of it, at this point I am the primary maintainer of the SSVD code
> and I think you should know that introducing modifications which at this
> point seem fairly sizable may make it more difficult for me to maintain it --
> especially given we haven't considered the effect on existing power
> iterations yet and the future issue of introducing a Cholesky option
> (there's a pending issue for that as well). But I think you can catalyze
> that process; you already did a lot.
>
>
> On Mon, Nov 28, 2011 at 12:32 AM, Raphael Cendrillon <
> [email protected]> wrote:
>
>> Hi Dmitriy,
>>
>> If it's OK with you I'd like to try implementing this decoration.
>>
>> Any advice or guidance would be very much appreciated.
>>
>> Raphael.
>>
>> On 27 Nov, 2011, at 9:23 AM, Dmitriy Lyubimov (Commented) (JIRA) wrote:
>>
>>> Dmitriy Lyubimov commented on MAHOUT-817:
>>> -----------------------------------------
>>>
>>> For the column mean, the brute-force approach is probably the simplest:
>>> we'd have to decorate the input of A with mean subtraction.
>>>
>>>> Add PCA options to SSVD code
>>>> ----------------------------
>>>>
>>>>                 Key: MAHOUT-817
>>>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-817
>>>>             Project: Mahout
>>>>          Issue Type: New Feature
>>>>    Affects Versions: 0.6
>>>>            Reporter: Dmitriy Lyubimov
>>>>            Assignee: Dmitriy Lyubimov
>>>>             Fix For: Backlog
>>>>
>>>>
>>>> It seems that a simple solution should exist to integrate PCA mean
>>>> subtraction into SSVD algorithm without making it a pre-requisite step and
>>>> also avoiding densifying the big input.
>>>> Several approaches were suggested:
>>>> 1) subtract mean off B
>>>> 2) propagate mean vector deeper into algorithm algebraically where the
>>>> data is already collapsed to smaller matrices
>>>> 3) --?
>>>> It needs some math done first. I'll take a stab at 1 and 2 but
>>>> thoughts and math are welcome.
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> If you think it was sent incorrectly, please contact your JIRA
>>> administrators:
>>> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>>> For more information on JIRA, see:
>>> http://www.atlassian.com/software/jira
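
For readers following the thread, the idea behind approach 2 in the issue (pushing the mean through algebraically instead of densifying the input) can be illustrated on the first projection step alone. The identity used is (A - 1*mu^T)*Omega = A*Omega - 1*(mu^T*Omega), where mu is the vector of column means and Omega is a dense projection matrix. The sketch below is a toy illustration in plain Python, not Mahout code and not necessarily how MAHOUT-817 was eventually resolved; the function names and the list-of-dicts sparse layout are assumptions made for the example.

```python
def column_means(rows, n_cols):
    """Column means of a sparse matrix stored as a list of {col: value} dicts."""
    sums = [0.0] * n_cols
    for row in rows:
        for j, v in row.items():
            sums[j] += v
    return [s / len(rows) for s in sums]


def sparse_times_dense(rows, omega):
    """Y = A * Omega, touching only the stored (non-zero) entries of A."""
    k = len(omega[0])
    out = []
    for row in rows:
        y = [0.0] * k
        for j, v in row.items():
            for c in range(k):
                y[c] += v * omega[j][c]
        out.append(y)
    return out


def centered_projection(rows, omega, n_cols):
    """(A - 1*mu^T) * Omega computed as A*Omega - 1*(mu^T*Omega).

    A is never densified: the mean only enters through the small
    k-element row vector mu^T * Omega, subtracted from every row of Y.
    """
    mu = column_means(rows, n_cols)
    k = len(omega[0])
    shift = [sum(mu[j] * omega[j][c] for j in range(n_cols)) for c in range(k)]
    y = sparse_times_dense(rows, omega)
    return [[y_i[c] - shift[c] for c in range(k)] for y_i in y]


def dense_centered_projection(rows, omega, n_cols):
    """Brute-force reference: densify each row, subtract the mean, multiply."""
    mu = column_means(rows, n_cols)
    k = len(omega[0])
    out = []
    for row in rows:
        dense = [row.get(j, 0.0) - mu[j] for j in range(n_cols)]
        out.append([sum(dense[j] * omega[j][c] for j in range(n_cols))
                    for c in range(k)])
    return out
```

The brute-force "decoration" of A mentioned in the JIRA comment corresponds to `dense_centered_projection`; the algebraic variant gives the same result while only iterating over the stored entries of A. Later stages of SSVD (power iterations, the B computation) would need their own treatment, which this sketch does not attempt.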
