On Thu, Jun 10, 2010 at 7:24 AM, Richard Simon Just <[email protected]> wrote:

> How does the EigenVerificationJob represent V and S in the
> SequenceFile<IntWriteable, VectorWriteable> output? and I guess the same
> question for the DistributedLanczosSolver.

Oooh, you caught me in an ugly bit of code. The V output of EigenVerificationJob and DistributedLanczosSolver is, yes, just a SequenceFile<IntWritable,VectorWritable>, where the ints (the keys) are row numbers (which run from 0 up to reducedRank [well, roughly]).

S, on the other hand... is hackily encoded in the serialized "name" variable of each vector output by EigenVerificationJob. If you can think of a better place to put the couple dozen to couple hundred double values that come out of a Hadoop job, well, by all means, submit a patch and I'll tack it in there.

If you dump the vectors to the screen with the vectordumper command line script, you'll see the values (they're also printed to the console when you run EigenVerificationJob).

-jake
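
For readers who want to see the "scalar tucked into the name" trick in isolation, here is a minimal Java sketch. It is not Mahout code: the class NameEncodedEigenvalue, its methods, and the "e|<index>| = |<value>|" layout are all assumptions invented for illustration. Check the names that vectordumper actually prints before relying on any particular layout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: packs a double into a vector's name string and recovers it. */
final class NameEncodedEigenvalue {

  // ASSUMPTION: this "e|<index>| = |<value>|" layout is made up for the example;
  // it is not taken from the Mahout source.
  private static final Pattern NAME =
      Pattern.compile("e\\|(\\d+)\\| = \\|([^|]+)\\|");

  /** Builds a name string that carries the row index and the eigenvalue. */
  static String encode(int index, double eigenvalue) {
    return "e|" + index + "| = |" + eigenvalue + "|";
  }

  /** Pulls the eigenvalue back out of a name produced by encode(). */
  static double decode(String name) {
    Matcher m = NAME.matcher(name);
    if (!m.find()) {
      throw new IllegalArgumentException("no eigenvalue in name: " + name);
    }
    return Double.parseDouble(m.group(2));
  }

  public static void main(String[] args) {
    String name = encode(3, 12.5);
    System.out.println(name + " -> " + decode(name));
  }
}
```

The appeal of this approach is that each value travels with its vector and needs no second output file. The cost is that every consumer has to know the string layout and parse it, which is why it counts as a hack.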
