This is relatively easy to do at the code level, but I don't know of a command-line way to do it. As you suggest, it involves adjoining the two matrices.
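
For illustration, here is a minimal sketch of what adjoining means at the code level. It deliberately does not use Mahout's matrix classes: the class `FeatureMatrices`, its methods, and the representation of each sparse row as a map from column index to value are all assumptions made for this example. The `add` method covers the case where both matrices are feature-hashed to the same width, so that their column indices already share one space.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout. Each sparse row is a map
// from column index to value; row i refers to the same item in both inputs.
final class FeatureMatrices {

    // Adjoin: row i of the result holds a's columns followed by b's columns,
    // with b's column indices shifted right by aCols (the width of a).
    static List<Map<Integer, Double>> adjoin(List<Map<Integer, Double>> a, int aCols,
                                             List<Map<Integer, Double>> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("row counts differ");
        }
        List<Map<Integer, Double>> out = new ArrayList<>();
        for (int i = 0; i < a.size(); i++) {
            Map<Integer, Double> row = new HashMap<>(a.get(i));
            for (Map.Entry<Integer, Double> e : b.get(i).entrySet()) {
                row.put(aCols + e.getKey(), e.getValue());
            }
            out.add(row);
        }
        return out;
    }

    // Add: element-wise sum. Only meaningful when both matrices have the
    // same number of columns, e.g. both hashed into the same feature space.
    static List<Map<Integer, Double>> add(List<Map<Integer, Double>> a,
                                          List<Map<Integer, Double>> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("row counts differ");
        }
        List<Map<Integer, Double>> out = new ArrayList<>();
        for (int i = 0; i < a.size(); i++) {
            Map<Integer, Double> row = new HashMap<>(a.get(i));
            for (Map.Entry<Integer, Double> e : b.get(i).entrySet()) {
                row.merge(e.getKey(), e.getValue(), Double::sum);
            }
            out.add(row);
        }
        return out;
    }
}
```

With 30k columns in the first matrix, `adjoin(first, 30000, second)` would place feature 0 of the second matrix in column 30000 of the result. A real job at this scale would do the same per-row merge in a distributed join keyed on the item ID rather than in memory.
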
If you use feature hashing, adjoining works, but it is also possible to simply add the two matrices (assuming conformable sizes).

On Thu, Oct 13, 2011 at 4:55 PM, Dan Brickley <[email protected]> wrote:
> I have a matrix of 100,000 items x 30k features; and another of those
> same 100,000 items, x however-many different features (from n-gram
> collocation extraction). In current app, these are library holdings
> and subject codes + extracted phrases. (later these should be 14
> million items by somewhat but not shockingly larger feature space, if
> that is useful to know)
>
> I'd like to compose these into a larger unified feature matrix, with
> same row structure, and with feature columns drawing from both input
> matrices. So far in this work I've managed to get by using bin/mahout
> rather than firing up Eclipse and messing with Java; I'd be happy to
> learn I can continue in this work style. But if custom code is needed
> that's fine. Either way, some pointer would be much appreciated...
>
> thanks,
>
> Dan
>
