Re: [mllib] GradientDescent requires huge memory for storing weight vector

2015-01-12 Thread Reza Zadeh
I guess you're not actually using that many features (e.g. fewer than 10M); hashing the indices just makes it look that way. Is that correct? If so, a simple dictionary mapping each feature's hashed index to its rank can be broadcast and used everywhere, so you can pass mllib just the feature's rank as its index. Reza
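The remapping Reza describes can be sketched in plain Scala without a Spark dependency (the object name, `hashedIndices`, and the sample values are illustrative; in an actual job `indexToRank` would be built once on the driver and broadcast to executors):

```scala
object FeatureRankDemo {
  def main(args: Array[String]): Unit = {
    // Hashed feature indices observed in the data: large and sparse,
    // even though only four distinct features actually occur.
    val hashedIndices = Seq(1048573, 7, 999331, 7, 524287, 999331)

    // The dictionary: distinct hashed index -> dense 0-based rank.
    // In Spark this small map is what would be broadcast.
    val indexToRank: Map[Int, Int] =
      hashedIndices.distinct.sorted.zipWithIndex.toMap

    // Remap every feature to its rank; mllib then sees indices in
    // [0, 4) instead of indices up to ~1M, so the dense weight
    // vector allocated by GradientDescent stays small.
    val remapped = hashedIndices.map(indexToRank)
    println(remapped)
    println(indexToRank.size)
  }
}
```

The key point is that the weight vector's length is then bounded by the number of distinct features rather than by the hash range.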

[mllib] GradientDescent requires huge memory for storing weight vector

2015-01-12 Thread Tianshuo Deng
Hi, Currently in GradientDescent.scala, the weights are constructed as a dense vector: initialWeights = Vectors.dense(new Array[Double](numFeatures)). And numFeatures is determined in loadLibSVMFile as the max index of the features. But in the case of using a hash function to compute feature ind
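The problem Tianshuo is pointing at can be illustrated with a back-of-the-envelope sketch (plain Scala; the 28-bit hash range is an assumed example, not taken from the thread): because loadLibSVMFile sets numFeatures to the maximum feature index seen, a hash function that emits large indices forces the dense weight vector to be sized by the hash range, not by the number of real features.

```scala
object DenseWeightCost {
  def main(args: Array[String]): Unit = {
    // Suppose feature indices come from a 28-bit hash, so the max
    // index approaches 2^28 - 1 even if only a few thousand
    // distinct features actually appear in the data.
    val maxHashedIndex = (1 << 28) - 1

    // GradientDescent allocates one Double (8 bytes) per slot up to
    // numFeatures, i.e. the max index seen, not the distinct count.
    val numFeatures = maxHashedIndex + 1
    val bytes = numFeatures.toLong * 8L
    println(s"dense weight vector: ${bytes / (1L << 20)} MiB")
  }
}
```

At 2^28 slots that is 2 GiB for a single weight vector, which is the "huge memory" in the subject line; the rank-dictionary remapping in the reply avoids it.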