I have a matrix of 100,000 items x 30,000 features, and another of those same 100,000 items x however many different features (from n-gram collocation extraction). In the current application, the items are library holdings and the features are subject codes plus extracted phrases. (Later this should grow to 14 million items, with a somewhat but not shockingly larger feature space, if that is useful to know.)
I'd like to combine these into a larger unified feature matrix with the same row structure, whose feature columns are drawn from both input matrices. So far in this work I've managed to get by using bin/mahout rather than firing up Eclipse and messing with Java, and I'd be happy to learn I can continue in this work style. But if custom code is needed, that's fine. Either way, some pointers would be much appreciated... thanks, Dan
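
To make the intended operation concrete, here is a minimal sketch in plain Python of the column-wise concatenation I have in mind. It is not Mahout code and says nothing about how Mahout would do this. The function name `hstack_sparse_rows`, the dict-of-dicts row layout, and the tiny sample data are all made up for illustration. It assumes both matrices are keyed by the same item ids and that the column count of the first matrix is known, so the second matrix's column indices can be shifted past it.

```python
def hstack_sparse_rows(left, right, left_width):
    """Concatenate two sparse, row-keyed matrices column-wise.

    left, right: dict mapping item id -> {column index: value}
                 (hypothetical layout, chosen only for this sketch)
    left_width:  number of columns in `left`; every column index
                 from `right` is shifted by this amount.
    """
    combined = {}
    # Take the union of item ids so an item missing from one input
    # still gets a row built from the other.
    for item_id in left.keys() | right.keys():
        row = dict(left.get(item_id, {}))
        for col, value in right.get(item_id, {}).items():
            row[col + left_width] = value
        combined[item_id] = row
    return combined


# Made-up sample data: 2 items, 3 subject-code columns, 4 phrase columns.
subject_codes = {"item-1": {0: 1.0, 2: 1.0}, "item-2": {1: 1.0}}
phrases = {"item-1": {0: 0.5}, "item-2": {1: 2.0, 3: 1.0}}

unified = hstack_sparse_rows(subject_codes, phrases, left_width=3)
print(unified["item-1"])  # {0: 1.0, 2: 1.0, 3: 0.5}
print(unified["item-2"])  # {1: 1.0, 4: 2.0, 6: 1.0}
```

The unified matrix here has 3 + 4 = 7 columns. The rows stay aligned by item id, and the phrase features occupy columns 3 through 6.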
