normPower - Normalization is used to negate the effects of features that skew up results disproportionately.
Mathematically the norm of a vector [x,y,z] is x/(x^p + y^p + z^p), y/(x^p + y^p + z^p), z/(x^p + y^p + z^p) when p = 1 its 1-norm or Manhattan norm when p =2 its Euclidean norm..... logNormalize - measure of the log likelihood that an NGram is a significant unit Both of the above terms have been described in detail in Chapter 8 of Mahout in Action book. ________________________________ From: Neil Chaudhuri <[email protected]> To: "[email protected]" <[email protected]> Sent: Monday, December 12, 2011 1:04 PM Subject: DictionaryVectorizer Parameters Can I get an explanation of what the following means and how the values affect outcomes? normPower - L_p norm to be computed logNormalize - whether to use log normalization Thanks.
