Re: [scikit-learn] Missing data and decision trees

2016-10-13 Thread Jason Rudy
It's not a decision tree, but py-earth may also do what you need. It handles missingness as described in section 3.4 here: <http://media.salford-systems.com/library/MARS_V2_JHF_LCS-108.pdf>. Basically, missingness is considered potentially predictive. On Thu, Oct 13, 2016 at 11:20 AM, Jeff wrote:
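
The idea in the message above, that missingness itself can carry predictive signal, can be illustrated with a small sketch. This is not py-earth's implementation and not the exact procedure from section 3.4 of the linked paper. It is a generic missingness-indicator encoding, and the helper names `is_missing` and `add_missingness_indicators` are hypothetical. Each input column becomes two columns: the value (with a placeholder where it was missing) and a 0/1 flag that a downstream model can use as a predictor.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def add_missingness_indicators(rows, fill_value=0.0):
    """Encode each column as (filled value, missingness flag).

    Assumption: a constant fill_value is acceptable as a placeholder,
    because the flag column tells the model which entries were missing.
    """
    encoded = []
    for row in rows:
        new_row = []
        for value in row:
            missing = is_missing(value)
            new_row.append(fill_value if missing else float(value))
            new_row.append(1.0 if missing else 0.0)
        encoded.append(new_row)
    return encoded


# Two samples, two features; one entry is None, another is NaN.
rows = [[1.5, None], [float("nan"), 2.0]]
encoded = add_missingness_indicators(rows)
```

Any model fit on `encoded` can then learn an effect for the flag columns separately from the effect of the observed values.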

[scikit-learn] Failing check_estimator on py-earth

2017-05-19 Thread Jason Rudy
I'm pushing to get py-earth ready for a release, but I'm having an issue with the check_estimator function on 32-bit Windows machines. Here is a link to the failing build on AppVeyor: https://ci.appveyor.com/project/jcrudy/py-earth/build/job/21r6838yh1bgwxw4. It appears that array conversion is p

Re: [scikit-learn] Failing check_estimator on py-earth

2017-05-19 Thread Jason Rudy
sting.assert_array_almost_equal(..., precision=2)
>
> or sth like that?
>
> Best,
> Sebastian
>
> > On May 19, 2017, at 6:10 PM, Jason Rudy wrote:
> >
> > I'm pushing to get py-earth ready for a release, but I'm having an issue
> with the check_estimat

Re: [scikit-learn] combining datasets from different sources

2017-09-05 Thread Jason Rudy
Thomas, This is sort of related to the problem I did my M.S. thesis on years ago: cross-platform normalization of gene expression data. If you google that term you'll find some papers. The situation is somewhat different, though, because with microarrays or RNA-seq you get thousands of data poin

[scikit-learn] check_estimator and score_samples method

2018-12-08 Thread Jason Rudy
Hi all, I'm working on updating py-earth for some recent changes in scikit-learn and Cython. It seems like check_estimator has been significantly improved, and I'm working through making py-earth compliant with it. I've hit the following issue, though. It seems check_estimator tests score_sampl

Re: [scikit-learn] check_estimator and score_samples method

2018-12-13 Thread Jason Rudy
Thanks, Joel. From your response I assume that the use of a y argument to score_samples is not a violation of the sklearn API, so I'll keep the method and find a workaround for the check_estimator test as it's currently written. I'll comment on the issue as well. On Mon, Dec 10, 2018 at 2:58 P