Setting a smaller value here will degrade prediction quality, because more features will collide in the hashed feature space. So it is not just a buffer size: it does affect the result of training. Increasing it beyond a certain point, however, will not significantly improve prediction quality, and it will increase memory usage.
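
To make the collision effect concrete, here is a rough sketch in Python. It is not Mahout's actual encoder: the MD5-based `hashed_index` helper, the `count_collisions` function, and the made-up 1000-word vocabulary are all assumptions for illustration only. It just counts how many feature names end up sharing a slot for a few vector sizes.

```python
import hashlib


def hashed_index(feature: str, num_features: int) -> int:
    """Map a feature name to a slot in a vector of size num_features.

    Hypothetical helper: uses MD5 only to get a deterministic hash,
    which is not what Mahout does internally.
    """
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_features


def count_collisions(features, num_features: int) -> int:
    """Count features that land in a slot already taken by an earlier feature."""
    used = set()
    collisions = 0
    for name in features:
        idx = hashed_index(name, num_features)
        if idx in used:
            collisions += 1
        else:
            used.add(idx)
    return collisions


# Made-up vocabulary standing in for text-like input data.
vocabulary = [f"word_{i}" for i in range(1000)]

for size in (100, 1000, 10000, 100000):
    print(size, count_collisions(vocabulary, size))
```

With 1000 distinct names and only 100 slots, at least 900 names must share a slot with another one, so the model cannot tell them apart. As the vector grows the collision count drops toward zero, after which a larger vector only costs memory.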
On Fri, Apr 27, 2012 at 11:01 PM, Yang <[email protected]> wrote:

> when I run mahout trainlogistic , there is an optional param --features
>
> from the book "Mahout in action", it says:
>
> --features
> The size of the internal feature vector to use in building the model. A
> larger value here can be helpful, especially with text-like input data
>
>
> so is this something like buffer size so it does not affect the result of
> the training? I thought the feature count to be considered in the model is
> already explicitly given by the --predictors param
>
>
> thanks
> yang
>
