Hi everyone,
I'm working on a classification problem in which several features have
many missing values. A quick Google search suggested that GradientBoosting
and RandomForest can both handle NaN values, but even with the
bleeding-edge repo, both classifiers raise an error on such input.
Is this just a case of Google misleading me? And does anyone have any
advice for dealing with missing values systematically? I have very few
'complete' rows, so I'd have to throw away a lot of data to get a clean
dataset with no NaNs.
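
To make the situation concrete, here is a minimal sketch using plain NumPy.
The toy matrix X and all variable names are made up for illustration, and
column-mean imputation is only one simple baseline, not something either
classifier does internally. It counts the 'complete' rows and then fills
each NaN with the mean of the observed values in its column:

```python
import numpy as np

# Hypothetical toy feature matrix; np.nan marks a missing value.
X = np.array([
    [1.0, np.nan, 3.0],
    [4.0, 5.0, np.nan],
    [7.0, 8.0, 9.0],
    [np.nan, 2.0, 6.0],
])

# Boolean mask of rows that contain no missing values at all.
complete_rows = ~np.isnan(X).any(axis=1)
n_complete = int(complete_rows.sum())

# Per-column means computed over the observed (non-NaN) values only.
col_means = np.nanmean(X, axis=0)

# Replace each NaN with the mean of its column (broadcast across rows).
X_imputed = np.where(np.isnan(X), col_means, X)

print("complete rows:", n_complete, "of", X.shape[0])
print(X_imputed)
```

With real data the column means should presumably be computed on the
training split only and then reused for the test split, so that no
information leaks from the held-out rows.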
Federico
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general