Thank you for all your responses.

ref. Dan Brickley:
------------------
hopefully you did dream ;-)

ref. Dmitriy Lyubimov:
----------------------
When I run `mahout ssvd -i A.seq -o A-ssvd/ -k 3 -p 0` I get an
IllegalArgumentException. You can find the traceback at
http://paste.pocoo.org/show/481168/ .

ref. Ted Dunning:
-----------------
I am running the M/R version of SVD in local mode. I didn't install
Hadoop beyond what comes in via `mvn install`. If I understand the code
correctly, the `--inMemory` argument is only relevant for the
"EigenVerificationJob", which I didn't run.

Here are the latest results for the calculations described in my
previous mail:

For 1:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344411508600611: {0:0.8940505788976013,1:0.05761556873901637,2:-0.44424543735613486}
Key: 1: Value: eigenVector1, eigenvalue = 0.0: {0:-0.3030457633656634,1:0.8081220356417685,2:-0.5050762722761053}
Key: 2: Value: eigenVector2, eigenvalue = -0.4362482432944815: {0:0.3299042704770375,1:0.5861904313011974,2:0.7399621277956934}
Count: 3

For 2:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344814282762082: {0:0.7369762290995766,1:0.3279852776056837,2:-0.5910090485061045}
Key: 1: Value: eigenVector1, eigenvalue = 0.17091518882717976: {0:0.9225878132457447,1:0.3812202473600341,2:0.05918487858557608}
Key: 2: Value: eigenVector2, eigenvalue = 0.0: {0:-0.5910090485061055,1:0.7369762290995774,2:-0.3279852776056802}
Key: 3: Value: eigenVector3, eigenvalue = -0.5157294715892533: {0:-0.32798527760568197,1:-0.5910090485061036,2:-0.7369762290995783}
Count: 4

For 3:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.344814080004587: {0:0.2870124314018251,1:-0.8054865010309287,2:0.5184740696291035}
Key: 1: Value: eigenVector1, eigenvalue = 0.4852290375835231: {0:0.9000472484774761,1:0.041469409433508436,2:-0.4338147514658307}
Key: 2: Value: eigenVector2, eigenvalue = 0.0: {0:0.3279311127797073,1:0.5911613863727806,2:0.7368781449689461}
Count: 3

For 4:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 11.34481428276208: {0:0.788451139115581,1:0.5058848349238699,2:0.3498933194866569}
Key: 1: Value: eigenVector1, eigenvalue = 0.5157294715892401: {0:-0.5910090485061453,1:0.7369762290995597,2:-0.32798527760564816}
Key: 2: Value: eigenVector2, eigenvalue = 0.1709151888272022: {0:-0.7369762290995447,1:-0.3279852776057236,2:0.5910090485061223}
Key: 3: Value: eigenVector3, eigenvalue = 0.0: {0:-0.3279852776056819,1:-0.5910090485061036,2:-0.7369762290995783}
Count: 4

For 5:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.7949818262315: {0:-0.3998289016610171,1:0.3486764982772797,2:0.8476800982361441}
Key: 1: Value: eigenVector1, eigenvalue = 0.0: {0:0.3244428422615253,1:-0.8111071056538125,2:0.4866642633922878}
Key: 2: Value: eigenVector2, eigenvalue = -2.2686660367578133: {0:0.8572477421969729,1:0.4696061783100697,2:0.21117846905213422}
Count: 3

For 6:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 9.903422603237882: {0:-0.305869782876591,1:-0.012493432384138303,2:0.9519913813004245}
Key: 1: Value: eigenVector1, eigenvalue = 6.002722238353203: {0:-0.7781330995244824,1:0.06366543541563939,2:0.624864458709054}
Key: 2: Value: eigenVector2, eigenvalue = 0.0: {0:0.2988138112963618,1:0.9481291552697455,2:0.10845003967736172}
Key: 3: Value: eigenVector3, eigenvalue = -3.906144841591079: {0:0.9039656974142156,1:-0.3176397630567398,2:0.2862708487144453}
Count: 4

For 7:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.04924152040162: {0:-0.4082482904638631,1:0.8164965809277261,2:-0.4082482904638631}
Key: 1: Value: eigenVector1, eigenvalue = 3.782617346103868: {0:0.7808892910047764,1:0.08072916428282848,2:-0.6194309624391194}
Key: 2: Value: eigenVector2, eigenvalue = 0.0: {0:0.47280571964327067,1:0.5716783495703939,2:0.6705509794975171}
Count: 3

For 8:
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: eigenVector0, eigenvalue = 7.964450219004663: {0:NaN,1:NaN,2:NaN}
Key: 1: Value: eigenVector1, eigenvalue = 7.000000000000002: {0:NaN,1:NaN,2:NaN}
Key: 2: Value: eigenVector2, eigenvalue = 0.753347668076679: {0:NaN,1:NaN,2:NaN}
Key: 3: Value: eigenVector3, eigenvalue = 0.0: {0:NaN,1:NaN,2:NaN}
Count: 4

ref. Danny Bickson:
-------------------
Thanks for your confirmation on how to use the rank. Regarding the
scale factor and orthogonalization: yes, I take them into account. I'm
running SVD from trunk without any changes. And even after commenting
out those parts of the code, the results are still wrong in cases
1, 2, 3, 7 and 8.

Thank you for your help.

Markus

> On 22 Sep 2011, at 18:37, Markus Holtermann
> <i...@markusholtermann.eu> wrote:
>
>> Hello there,
>>
>> I'm trying to run Mahout's Singular Value Decomposition but
>> realized that the resulting eigenvalues are wrong in most cases.
>> So I took two small 3x3 matrices and calculated their eigenvalues
>> and eigenvectors by hand and compared the results to Mahout.
>>
>> Only in one of eight cases the results for Mahout and my pen &
>> paper matched.
>>
>> Let's take A = {{1,2,3},{2,4,5},{3,5,6}} and B =
>> {{5,2,4},{-3,6,2},{3,-3,1}}
>>
>> As you can see, A is symmetric, B is not.
>>
>> I ran `mahout svd --output out/ --numRows 3 --numCols 3` eight
>> times with different arguments:
>>
>> 1) --input A --rank 3 --symmetric true   result is wrong
>> 2) --input A --rank 4 --symmetric true   result is wrong
>> 3) --input A --rank 3 --symmetric false  result is wrong
>> 4) --input A --rank 4 --symmetric false  result is CORRECT
>>
>> 5) --input B --rank 3 --symmetric true   result is wrong
>> 6) --input B --rank 4 --symmetric true   result is wrong
>> 7) --input B --rank 3 --symmetric false  result is wrong
>> 8) --input B --rank 4 --symmetric false  result is wrong
>>
>> To verify that my input data is correct, this is the result of
>> `mahout seqdumper`
>>
>> For A:
>> Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
>> Key: 0: Value: {0:1.0,1:2.0,2:3.0}
>> Key: 1: Value: {0:2.0,1:4.0,2:5.0}
>> Key: 2: Value: {0:3.0,1:5.0,2:6.0}
>> Count: 3
>>
>> For B:
>> Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
>> Key: 0: Value: {0:5.0,1:2.0,2:4.0}
>> Key: 1: Value: {0:-3.0,1:6.0,2:2.0}
>> Key: 2: Value: {0:3.0,1:-3.0,2:1.0}
>> Count: 3
>>
>> And finally, the correct eigenvalues should be:
>>
>> For A: λ1 = 11.3448, λ2 = -0.515729, λ3 = 0.170915
>>
>> For B: λ1 = 7, λ2 = 3, λ3 = 2
>>
>> So, are there any known bugs in Mahout's SVD implementation? Am I
>> doing something wrong? Is this algorithm known to produce wrong
>> results?
>>
>> Thanks in advance.
>>
>> Markus
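
For anyone who wants to double-check the pen & paper eigenvalues quoted above without Mahout, here is a minimal sketch in plain Python. It only tests that each quoted value λ satisfies det(M - λI) = 0 for the matrices A and B from the original mail. The helper names (`det3`, `char_poly`, `is_symmetric`) and the tolerance are my own choices for illustration and are not part of Mahout. The tolerance is loose because the values for A are rounded to about six significant digits.

```python
# Matrices A and B exactly as given in the original mail.
A = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
B = [[5, 2, 4], [-3, 6, 2], [3, -3, 1]]

# Pen & paper eigenvalues quoted in the original mail (A's are rounded).
EXPECTED = {
    "A": (A, [11.3448, -0.515729, 0.170915]),
    "B": (B, [7, 3, 2]),
}

# Assumed tolerance: loose enough to absorb the rounding of A's eigenvalues.
TOLERANCE = 1e-2


def det3(m):
    """Determinant of a 3x3 matrix, cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def char_poly(m, lam):
    """Evaluate det(m - lam*I); this is zero when lam is an eigenvalue of m."""
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)


def is_symmetric(m):
    """True if m equals its transpose."""
    return all(m[i][j] == m[j][i] for i in range(3) for j in range(3))


if __name__ == "__main__":
    for name, (matrix, eigenvalues) in EXPECTED.items():
        print(f"{name}: symmetric = {is_symmetric(matrix)}")
        for lam in eigenvalues:
            residual = char_poly(matrix, lam)
            verdict = "ok" if abs(residual) < TOLERANCE else "MISMATCH"
            print(f"  lambda = {lam}: det(M - lambda*I) = {residual:.6g} ({verdict})")
```

The same `char_poly` check can be applied to any eigenvalue printed by `mahout seqdumper`, which gives a quick way to tell a rounding difference from a genuinely different value.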