For the SVM implementation, I am not sure. For the SGD implementation, the number of possible features has no direct impact on memory usage, because the features are hashed into a feature vector of fixed size. The number of class labels has a linear impact on the amount of memory required, but the dependence isn't horrible. For complex problems, such as text combined with several other variables and high-order cross-terms, you will want 50,000 to 1,000,000 internal features, which means the scaling will be up to about 8MB per class label (1,000,000 weights at 8 bytes each, assuming double-precision weights). I am not sure how well it would perform with more than a hundred class labels in any case, so in practice the memory requirements are significant, but not absolutely massive.
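As a rough back-of-the-envelope check of those numbers, here is a minimal sketch in Python. It assumes a dense weight matrix with one 8-byte double per (class label, hashed feature) pair and ignores all other overhead; the function name and the sizes tried are illustrative only and are not taken from Mahout itself.

```python
# Assumption: each weight is stored as an 8-byte double.
BYTES_PER_WEIGHT = 8


def weight_memory_bytes(num_features, num_labels, bytes_per_weight=BYTES_PER_WEIGHT):
    """Estimate memory for a dense weight matrix.

    One row per class label, one column per hashed (internal) feature.
    The count of distinct raw features does not appear here, because
    hashing fixes the vector size up front.
    """
    return num_features * num_labels * bytes_per_weight


if __name__ == "__main__":
    # Try both ends of the suggested internal feature range.
    for num_features in (50_000, 1_000_000):
        for num_labels in (2, 10, 100):
            mb = weight_memory_bytes(num_features, num_labels) / 1e6
            print(f"{num_features:>9,} features x {num_labels:>3} labels: {mb:,.1f} MB")
```

Under these assumptions the estimate grows linearly in the number of labels: a million internal features costs about 8MB per label, so a hundred labels would need on the order of 800MB.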

On Sat, Jul 10, 2010 at 4:38 PM, Viva Friend <[email protected]> wrote:
>
> Also, will the number of features and class labels affect the memory
> usage for the Liblinear Mahout implementation?
>
