Hi Zoraida,
The Imputer assumes that your data is a numeric numpy array, or
convertible to one. The string "NA" is not a numeric value, so replace
those entries with np.nan, then use the Imputer with the default,
`missing_values='NaN'`.
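
Here is a minimal sketch of that replacement, assuming your rows arrive as
pipe-delimited strings like the one you posted. The names `raw_rows` and
`parse_row` are made up for illustration and are not part of scikit-learn:

```python
import numpy as np

# The sample row from the question; a real dataset would have many rows.
raw_rows = [
    "24881956.0|NA|1840.0|NA|NA|48.0|1.4|NA|-1.0|0.0|",
]

def parse_row(line, sep="|", na_token="NA"):
    """Split one line on sep and map the na_token string to np.nan."""
    fields = line.rstrip(sep).split(sep)  # drop the trailing separator first
    return [np.nan if field == na_token else float(field) for field in fields]

# Explicit float dtype: a leftover non-numeric string raises a ValueError
# in parse_row instead of silently producing an object or string array.
X = np.array([parse_row(row) for row in raw_rows], dtype=float)
```

Once `X` is a float array holding real NaNs, it can be passed to the Imputer
without overriding `missing_values`.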
It's easier to debug if you explicitly convert your data to a float
numpy
Hi all,
I am having problems when trying to deal with missing values. I am using
Imputer like this:
Pipeline([('imputerNA', Imputer(missing_values='NA', strategy='mean',
axis=0, verbose=4)), ('minmax', MinMaxScaler())]))]
My data looks like this:
24881956.0|NA|1840.0|NA|NA|48.0|1.4|NA|-1.0|0.0|