I guess we should define what we mean by 'row' here. I'm thinking of a row of the 'distributed row matrix'. Is that also what you have in mind?
I've been working on row-wise mean computation as part of MAHOUT-880. I could add a column-wise mean job as well.

By the way, if you have any comments on the current patch (available at https://reviews.apache.org/r/2955/diff/2/), they would be much appreciated!

On Dec 5, 2011, at 11:00 AM, Ted Dunning <[email protected]> wrote:

> Row-wise mean usually means that a mean of each row is computed.
>
> I think that most PCA users would want column-wise means for subtraction.
>
> On Mon, Dec 5, 2011 at 10:58 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
> We probably need a row-wise mean computation job anyway as a separate MR
> step. Wanna take a stab?
> On Dec 5, 2011 10:34 AM, "Raphael Cendrillon" <[email protected]>
> wrote:
>
> > Given that this request seems to come up frequently, would it be worth
> > putting this approach under mahout-examples? Initially it could use the
> > brute force approach together with SSVD, and be updated later once
> > support is ready for mean subtraction within SSVD.
> >
> > I could put something together if there's interest.
> >
> > On Mon, Dec 5, 2011 at 9:40 AM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> >
> > > I am working on the additions to the SSVD algorithms, and the mods to
> > > the current solver will probably emerge in a matter of a month, my
> > > schedule permitting.
> > >
> > > However, a brute force approach is already possible. If your input is
> > > of moderate size, or if it is already dense, you could compute the mean
> > > and subtract it yourself very easily, and then shove it into the SSVD
> > > solver while requesting to produce either U or V, depending on whether
> > > you subtract the column-wise or row-wise mean.
> > >
> > > The only problem with the brute force approach is that it would densify
> > > originally sparse input. Depending on your problem and the # of machine
> > > nodes you can spare, it may or may not be a problem.
> > > On Dec 4, 2011 7:59 PM, "magicalo" <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > Is there an expected release date for the PCA algorithm as part of
> > > > Mahout?
> > > > Tx!
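
To make the row-wise versus column-wise distinction in this thread concrete, here is a minimal sketch in plain Python. It is not Mahout code and not the MAHOUT-880 patch. It assumes a small in-memory matrix stored as a list of rows, with rows as observations and columns as features; the function names (row_means, column_means, subtract_column_means, count_nonzeros) are hypothetical and chosen only for illustration. It also shows the densification effect of the brute force approach: subtracting a nonzero column mean turns the zeros in that column into nonzeros.

```python
def row_means(matrix):
    """Row-wise mean: one value per row."""
    return [sum(row) / len(row) for row in matrix]


def column_means(matrix):
    """Column-wise mean: one value per column."""
    n_rows = len(matrix)
    return [sum(col) / n_rows for col in zip(*matrix)]


def subtract_column_means(matrix):
    """Center each column by subtracting its mean (the usual PCA preprocessing)."""
    means = column_means(matrix)
    return [[value - m for value, m in zip(row, means)] for row in matrix]


def count_nonzeros(matrix):
    """Number of stored values that a sparse format would have to keep."""
    return sum(1 for row in matrix for value in row if value != 0)


# Assumed toy input: 3 observations (rows) x 3 features (columns), mostly zeros.
a = [
    [3.0, 0.0, 0.0],
    [0.0, 0.0, 6.0],
    [0.0, 0.0, 0.0],
]

centered = subtract_column_means(a)

print("row means:      ", row_means(a))
print("column means:   ", column_means(a))
print("nonzeros before:", count_nonzeros(a))
print("nonzeros after: ", count_nonzeros(centered))
```

Only the all-zero middle column stays sparse after centering; every column with a nonzero mean becomes fully dense, which is the cost Dmitriy describes for the brute force approach.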
