actuaryzhang commented on a change in pull request #17864: [SPARK-20604][ML] Allow imputer to handle numeric types
URL: https://github.com/apache/spark/pull/17864#discussion_r309451003

 ##########
 File path: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala
 ##########
 @@ -84,9 +84,15 @@ private[feature] trait ImputerParams extends Params with HasInputCols with HasOu
   * :: Experimental ::
   * Imputation estimator for completing missing values, either using the mean or the median
   * of the columns in which the missing values are located. The input columns should be of
 - * DoubleType or FloatType. Currently Imputer does not support categorical features
 + * numeric type. Currently Imputer does not support categorical features
   * (SPARK-15041) and possibly creates incorrect values for a categorical feature.
   *
 + * Note that the input columns are converted to Double data type internally to compute
 + * the mean/median value and impute the missing values, which are then casted back to

 Review comment:
 Great suggestion. Streamlined the doc

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
