I am attempting to validate the output of an L2 normalization function:

    data_l2 = preprocessing.normalize(data, norm='l2')  # raw data is at the end of this email
Output:

    array([[ 0.57649683,  0.53806371,  0.61492995],
           [-0.53806371, -0.57649683, -0.61492995],
           [ 0.3359268 ,  0.90089461, -0.2748492 ],
           [ 0.6676851 , -0.39566524, -0.63059148],
           [-0.70710678,  0.        ,  0.70710678],
           [-0.63116874,  0.45083482,  0.63116874]])

Each row is a set of three features of one observation. I am under the belief that the sum of the squared values of an instance (row) should be virtually equal to 1 (normalized).

*Problem - 1:* The np.square() function is returning the absolute value of the square, even when the square is clearly negative.

    np.square(-0.53806371)   returns  0.28951255601896408
    (-0.53806371**2)         returns -0.2895125560189641

The correct square of -0.53806371 is -0.2895125560189641 (a negative number); even my 10-year-old calculator gets it right. I can find nothing in the numpy documentation that indicates np.square() always returns the absolute value instead of the correctly signed value.

*Question:* Is there a way to force np.square() to return the correctly signed square value, not the absolute value?

*Problem - 2:* For some of the observations (rows), the sum of the squared values (which should be virtually 1) is nowhere near 1.

    print 0.57649683**2 + 0.53806371**2 + 0.61492995**2     # row 1
    0.9999999944260154    (this is virtually 1)

    print -0.63116874**2 + 0.45083482**2 + 0.63116874**2    # row 6
    0.203252034924        (*this is nowhere near 1*)

The sum of the squared values of an instance (row) should be virtually equal to 1.

*Question:* Is preprocessing.normalize(data, norm='l2') messing up, or is the raw data I am feeding into the normalization routine too unrealistic? (I made it up from both positive and negative numbers.)

*Raw Data*

    array([[ 1.5,  1.4,  1.6],
           [-1.4, -1.5, -1.6],
           [ 2.2,  5.9, -1.8],
           [ 5.4, -3.2, -5.1],
           [-1.4,  0. ,  1.4],
           [-1.4,  1. ,  1.4]])

Thanks,
Chris

P.S.: Not a real-world problem; just trying to understand the functionality of scikit-learn. I have only been working with the package for two weeks.
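For checking the two results above, here is a minimal sketch in Python 3 using NumPy only. It relies on Python's operator precedence: ** binds more tightly than a leading unary minus, so -0.53806371**2 is evaluated as -(0.53806371**2), whereas np.square(-0.53806371) squares the negative number itself. The row-wise division by np.linalg.norm is an assumed stand-in for preprocessing.normalize(data, norm='l2'), not scikit-learn's own code, and the variable names are illustrative.

```python
import numpy as np

x = -0.53806371

# Without parentheses the minus sign is applied AFTER squaring: -(0.53806371**2)
unparenthesized = -0.53806371**2
# With parentheses the negative number itself is squared
parenthesized = (-0.53806371)**2

print(unparenthesized)   # negative
print(parenthesized)     # positive
print(np.square(x))      # agrees with the parenthesized form

# Raw data from the email
data = np.array([[ 1.5,  1.4,  1.6],
                 [-1.4, -1.5, -1.6],
                 [ 2.2,  5.9, -1.8],
                 [ 5.4, -3.2, -5.1],
                 [-1.4,  0. ,  1.4],
                 [-1.4,  1. ,  1.4]])

# Assumed stand-in for preprocessing.normalize(data, norm='l2'):
# divide every row by its Euclidean length
data_l2 = data / np.linalg.norm(data, axis=1, keepdims=True)

# Sum of squares per row; np.square handles negative entries elementwise
row_sums = np.sum(np.square(data_l2), axis=1)
print(row_sums)

# Row 6 written out by hand, with and without parentheses on the negative value
a, b = 0.63116874, 0.45083482
row6_unparenthesized = -a**2 + b**2 + a**2    # the two a**2 terms cancel
row6_parenthesized = (-a)**2 + b**2 + a**2
print(row6_unparenthesized, row6_parenthesized)
```

With the parentheses in place, every row's sum of squares comes out at 1 to within floating-point rounding, including row 6.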