A bin/mahout job to turn A and B into rows of a1,a2,...,aN,b1,b2,...,bN? What extra options would you like? For example, would you want to apply different weights to matrix A vs. matrix B?
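To make the question concrete, here is a rough sketch of what such a job would do per row. It is plain Python rather than Mahout code, and every name in it (adjoin_rows, adjoin_matrices, weight_a, weight_b, num_cols_a) is made up for illustration. It assumes each matrix is a map from item key to a sparse row, that rows are matched by item key, and that B's columns are shifted past the end of A's columns, so B may have a different number of columns than A.

```python
def adjoin_rows(row_a, row_b, num_cols_a, weight_a=1.0, weight_b=1.0):
    """Adjoin two sparse rows given as {column_index: value} dicts.

    Columns of row_b are shifted right by num_cols_a so they land after
    the columns of row_a. The weights are hypothetical options that
    scale each side before adjoining.
    """
    combined = {col: weight_a * val for col, val in row_a.items()}
    for col, val in row_b.items():
        combined[num_cols_a + col] = weight_b * val
    return combined


def adjoin_matrices(matrix_a, matrix_b, num_cols_a, weight_a=1.0, weight_b=1.0):
    """Adjoin two sparse matrices given as {row_key: sparse_row} dicts.

    Rows are matched by key. A key missing from one matrix is treated
    as an all-zero row on that side (an assumption of this sketch).
    """
    result = {}
    for key in matrix_a.keys() | matrix_b.keys():
        result[key] = adjoin_rows(
            matrix_a.get(key, {}),
            matrix_b.get(key, {}),
            num_cols_a,
            weight_a,
            weight_b,
        )
    return result
```

With 30k features in A, num_cols_a would be 30000 and the first feature of B would become column 30000 of the combined matrix. Whether missing rows should be zero-filled or dropped is exactly the kind of option I am asking about.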

On Thu, Oct 13, 2011 at 10:11 AM, Ted Dunning <[email protected]> wrote:
> This is relatively easy to do at the code level, but I don't know of a
> command line level way to do this. As you suggest this involves adjoining
> the two matrices.
>
> If you use the feature hashing, adjoining works, but it is also possible to
> simply add the two matrices (assuming conformal sizes).
>
> On Thu, Oct 13, 2011 at 4:55 PM, Dan Brickley <[email protected]> wrote:
>
> > I have a matrix of 100,000 items x 30k features; and another of those
> > same 100,000 items, x however-many different features (from n-gram
> > collocation extraction). In current app, these are library holdings
> > and subject codes + extracted phrases. (later these should be 14
> > million items by somewhat but not shockingly larger feature space, if
> > that is useful to know)
> >
> > I'd like to compose these into a larger unified feature matrix, with
> > same row structure, and with feature columns drawing from both input
> > matrices. So far in this work I've managed to get by using bin/mahout
> > rather than firing up Eclipse and messing with Java; I'd be happy to
> > learn I can continue in this work style. But if custom code is needed
> > that's fine. Either way, some pointer would be much appreciated...
> >
> > thanks,
> >
> > Dan
> >
>

-- 
Lance Norskog
[email protected]
