Hi Stuart Reynolds,
Like Jacob said, we have an active PR at
https://github.com/scikit-learn/scikit-learn/pull/5974
You could do:
git fetch https://github.com/raghavrv/scikit-learn.git missing_values_rf:missing_values_rf
git checkout missing_values_rf
python setup.py install
And try it out. I warn
Road, Johns Creek, GA 30097 |
dale.t.sm...@macys.com
From: scikit-learn
[mailto:scikit-learn-bounces+dale.t.smith=macys@python.org] On Behalf Of
Stuart Reynolds
Sent: Thursday, October 13, 2016 2:14 PM
To: scikit-learn@python.org
Subject: [scikit-learn] Missing data and decision trees
You can simply make a new binary feature (per feature that might have a
missing value) that is 1 if the value is missing and 0 otherwise. The RF
can then work out what to do with this information.
I don't know how this compares in practice to more sophisticated approaches.
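A minimal sketch of that indicator-feature idea, using only NumPy. The helper name add_missing_indicators and the fill_value placeholder are assumptions made for illustration; the placeholder is there only because the estimator would otherwise still see NaN in the original columns.

```python
import numpy as np


def add_missing_indicators(X, fill_value=0.0):
    """Append one 0/1 column per feature that contains at least one NaN.

    Hypothetical helper for illustration, not part of scikit-learn.
    NaNs in the original columns are replaced by `fill_value` (an assumed
    placeholder) so that estimators rejecting NaN can still be fitted.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)                      # boolean mask of missing cells
    cols_with_missing = missing.any(axis=0)    # features that need an indicator
    filled = np.where(missing, fill_value, X)  # placeholder where data is missing
    indicators = missing[:, cols_with_missing].astype(float)
    return np.hstack([filled, indicators])


X = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan],
              [7.0, 8.0, 9.0]])
X_aug = add_missing_indicators(X)
```

The augmented matrix could then be passed to a forest's fit method. For real use, the set of indicator columns would need to be decided on the training data and reused at prediction time, so that both matrices have the same layout.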
Raphael
On Thursday,
It's not a decision tree, but py-earth may also do what you need. It
handles missingness as described in section 3.4 here:
http://media.salford-systems.com/library/MARS_V2_JHF_LCS-108.pdf.
Basically, missingness is considered potentially predictive.
On Thu, Oct 13, 2016 at 11:20 AM, Jeff wrote:
I ran into this several times as well with the scikit-learn implementation
of GBM. Look at xgboost if you have not already (is there someone out
there who hasn't? :) -- it deals with missing values in the predictor
space in a very elegant manner.
http://xgboost.readthedocs.io/en/latest/python/pyt
I think Raghav is working on it in this PR:
https://github.com/scikit-learn/scikit-learn/pull/5974
The reason they weren't initially supported is likely that it involves a
lot of work and design choices to handle missing values appropriately, and
the discussion on the best way to handle it was pos
I'm looking for a decision tree and RF implementation that supports missing
data (without imputation) -- ideally in Python, Java/Scala or C++.
It seems that scikit-learn's decision tree algorithm doesn't allow this -- which
is disappointing because it's one of the few methods that should be able to
sensi