Richard, I think this is because ToItemPrefsMapper and ToUserVectorReducer are spitting out SequentialAccessSparseVectors with cardinality Integer.MAX_VALUE. That doesn't matter anywhere in the Taste code, because no dense vector is ever created with this cardinality. For SVD you do need dense vectors, so you really need to make sure the SASVectors have the correct cardinality.
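To make the cardinality point concrete, here is a rough sketch in plain Java. SparseVec is a hypothetical stand-in, not Mahout's SequentialAccessSparseVector, and withCardinality and toDense are names made up for illustration. It only shows why a sparse vector is happy with a huge declared cardinality while anything dense is not, and how the same entries could be carried over under a smaller cardinality by sharing the backing arrays.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a sparse vector; NOT Mahout's actual class. */
final class SparseVec {
    final int cardinality;   // declared dimension; final, so it cannot be changed in place
    final int[] indices;     // positions of the non-zero entries
    final double[] values;   // values at those positions

    SparseVec(int cardinality, int[] indices, double[] values) {
        if (indices.length != values.length) {
            throw new IllegalArgumentException("indices and values differ in length");
        }
        this.cardinality = cardinality;
        this.indices = indices;
        this.values = values;
    }

    /** Same data, new cardinality: shares the backing arrays instead of copying them. */
    SparseVec withCardinality(int newCardinality) {
        for (int index : indices) {
            if (index >= newCardinality) {
                throw new IllegalArgumentException("index " + index + " does not fit");
            }
        }
        return new SparseVec(newCardinality, indices, values);
    }

    /** Densifying allocates one double per dimension, so the cost follows the cardinality. */
    double[] toDense() {
        double[] dense = new double[cardinality];
        for (int i = 0; i < indices.length; i++) {
            dense[indices[i]] = values[i];
        }
        return dense;
    }
}

class CardinalityDemo {
    public static void main(String[] args) {
        // Two non-zero entries: cheap to hold no matter what the declared cardinality is.
        SparseVec huge = new SparseVec(Integer.MAX_VALUE, new int[] {2, 5}, new double[] {1.0, 3.5});

        // huge.toDense() would ask for roughly 16 GB of doubles and typically dies
        // with an OutOfMemoryError, so it is deliberately not called here.

        SparseVec fixed = huge.withCardinality(6);
        System.out.println(Arrays.toString(fixed.toDense()));
        // prints [0.0, 0.0, 1.0, 0.0, 0.0, 3.5]
    }
}
```

Whether the real vector classes can share their internals like this depends on which constructors they expose, so treat the reference-sharing part as an assumption.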
A simple M/R job that runs over the sequence files output by ToXYZPrefMapper/Reducer and spits out new Vectors with the correct cardinality but the same data should do the trick. To do it most efficiently, it may require new constructors for these vectors (i.e. ones that just copy references to the inner data structures but set a new value for the cardinality). You can't just modify the cardinality on an existing vector, because the field is probably final.

-jake

On Thu, Jul 15, 2010 at 1:37 AM, Richard Simon Just <[email protected]> wrote:

> I'm not exactly sure how or why, but could it be to do with DLS,
> DistributedRowMatrix, TimesSquaredJob etc still using mapred.JobConf
> instead of mapreduce.Job?
>
> I ask because when I give DLS a sequence file that's been created from
> the latest ToItemPrefsMapper and ToUserVectorReducer I get the
> exception. However when I use versions that still use mapred I don't get
> the exception.
