You still have to decide what to do with overlapping columns. Do you need map/reduce? Here is a map/reduce job that concatenates vectors with independent columns:
https://issues.apache.org/jira/browse/MAHOUT-884

It never got committed, so you may have to change it to match Mahout code drift. It would be great to have a map/reduce job that applies a named double function across two vectors.

----- Original Message -----
| From: "Stefan Kreuzer" <[email protected]>
| To: [email protected]
| Sent: Thursday, November 29, 2012 2:55:10 PM
| Subject: Re: How to concatenate Vectors?
|
| The vectors don't have the same cardinality, so vector1.plus(vector2)
| does not work.
| Is there a way to resize a given vector? Sorry, I am a complete
| Mahout noob.
|
| -----Original Message-----
| From: Ted Dunning <[email protected]>
| To: user <[email protected]>
| Sent: Thu, 29 Nov 2012 11:45 pm
| Subject: Re: How to concatenate Vectors?
|
| The most efficient way is probably to just add them.
|
| You can also use assign with a max function. Or you can write a special
| function if you want the left vector or the right one to have
| preference.
|
| Vector a, b;
| // method 1 (plus returns a new vector and leaves a unchanged)
| a.plus(b);
| // method 2 (assign modifies a in place)
| a.assign(b, Functions.MAX);
| // method 3 (the left value wins when it is positive)
| a.assign(b, new DoubleDoubleFunction() {
|   @Override
|   public double apply(double arg1, double arg2) {
|     return arg1 > 0 ? arg1 : arg2;
|   }
| });
|
| The assignment approaches have the problem that they do not operate in a
| sparse fashion except for a few known functions. It would be very easy to
| extend DoubleDoubleFunction so that there is an isSparseFriendly()
| method, or to introduce a new SparseFriendlyDoubleDoubleFunction class,
| to determine whether it would be safe to do the iteration in a sparse
| fashion. The last example would become:
|
| a.assign(b, new SparseFriendlyDoubleDoubleFunction() {
|   @Override
|   public double apply(double arg1, double arg2) {
|     return arg2 != 0 ? arg2 : arg1;
|   }
| });
|
| MAX is not a sparse-friendly function, of course, since max(x,0) != 0
| in general.
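To make the sparse-friendliness point a bit more concrete, here is a rough sketch in plain Java. It does not use Mahout's classes: a sparse vector is modelled as an index-to-value map, and the names SparseAssignSketch, assignSparse and keepsLeftWhenRightIsZero are made up for illustration. It assumes one reading of "sparse friendly", namely that f(x, 0) == x, so that entries where the right-hand vector is zero can be skipped.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleBinaryOperator;

// Hypothetical stand-in, not Mahout API: a sparse vector is a map from
// index to non-zero value.
final class SparseAssignSketch {

    // Applies a[i] = f(a[i], b[i]) only at indices where b has an entry.
    // This matches a dense pass only if f(x, 0) == x for every x.
    static void assignSparse(Map<Integer, Double> a,
                             Map<Integer, Double> b,
                             DoubleBinaryOperator f) {
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            double old = a.getOrDefault(e.getKey(), 0.0);
            double updated = f.applyAsDouble(old, e.getValue());
            if (updated == 0.0) {
                a.remove(e.getKey());   // keep the map free of explicit zeros
            } else {
                a.put(e.getKey(), updated);
            }
        }
    }

    // Spot-checks f(x, 0) == x on a few sample values. Passing is only
    // evidence, not proof, that skipping b's zeros is safe.
    static boolean keepsLeftWhenRightIsZero(DoubleBinaryOperator f,
                                            double... samples) {
        for (double x : samples) {
            if (f.applyAsDouble(x, 0.0) != x) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DoubleBinaryOperator rightWins = (x, y) -> y != 0 ? y : x;
        System.out.println(keepsLeftWhenRightIsZero(rightWins, -1.0, 0.0, 2.5));
        System.out.println(keepsLeftWhenRightIsZero(Math::max, -1.0, 0.0, 2.5));

        Map<Integer, Double> a = new TreeMap<>();
        a.put(8, 0.027);
        Map<Integer, Double> b = new TreeMap<>();
        b.put(86, 0.032);
        assignSparse(a, b, rightWins);
        System.out.println(a);
    }
}
```

Under this reading the "right wins" function passes the check, while max fails it for negative inputs, because max(-1, 0) is 0 rather than -1.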
|
| On Thu, Nov 29, 2012 at 2:24 PM, Stefan Kreuzer
| <[email protected]> wrote:
|
| > Hello,
| >
| > I don't understand what the best way is to concatenate (or merge) two
| > sparse vectors.
| > I.e. given two sparse vectors
| > {8:0.027,38:0.037,67:0.027}
| > and
| > {86:0.032,87:01042}
| > I need to build a new vector that contains all the values of the two:
| > {8:0.027,38:0.037,67:0.027,86:0.032,87:01042}
| >
| > What is the best / most efficient way to achieve this in Mahout?
| >
| > Best Regards
| > Stefan
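For the merge asked about in the original question, a minimal sketch of the idea in plain Java follows. It is not Mahout code: the sparse vectors are plain index-to-value maps, the class name SparseMergeSketch and the onOverlap parameter are invented for this example, and cardinality is not tracked at all, so the cardinality mismatch that breaks vector1.plus(vector2) does not arise here. The overlapping index 67 in the second vector is added on purpose to show the overlap rule.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleBinaryOperator;

// Hypothetical stand-in, not Mahout API: a sparse vector is a map from
// index to non-zero value.
final class SparseMergeSketch {

    // Copies every entry of left, then folds in right. An index present in
    // both inputs is resolved with onOverlap(leftValue, rightValue).
    static TreeMap<Integer, Double> merge(Map<Integer, Double> left,
                                          Map<Integer, Double> right,
                                          DoubleBinaryOperator onOverlap) {
        TreeMap<Integer, Double> result = new TreeMap<>(left);
        for (Map.Entry<Integer, Double> e : right.entrySet()) {
            result.merge(e.getKey(), e.getValue(),
                         (l, r) -> onOverlap.applyAsDouble(l, r));
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> a = new TreeMap<>();
        a.put(8, 0.027);
        a.put(38, 0.037);
        a.put(67, 0.027);

        TreeMap<Integer, Double> b = new TreeMap<>();
        b.put(86, 0.032);
        b.put(67, 0.5);   // made-up overlapping entry

        // The right-hand vector wins on overlapping indices.
        System.out.println(merge(a, b, (l, r) -> r));
        // Alternative policy: add overlapping values.
        System.out.println(merge(a, b, Double::sum));
    }
}
```

Passing the overlap rule as a function keeps the decision about overlapping columns explicit, which is the part that has to be settled before any of the approaches in this thread can be applied.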
