Hi,
for personal reasons I am writing a function to compute the outlier measure
from random forests
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#outliers
With a little more work I could include the function in the sklearn random
forest class.
Is the community interested? Should I
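
For readers following the thread, here is a minimal sketch of the outlier measure as I read the description on the linked page: within each class, sum the squared proximities of a sample to the other members of its class, divide the total number of samples by that sum, then centre on the class median and scale by the deviation from the median. The function name `outlier_measure`, the inclusion of the self-proximity term, and the use of the median absolute deviation as the scale are my assumptions, not something stated in the thread.

```python
import numpy as np


def outlier_measure(proximity, y):
    """Hypothetical helper: per-class outlier score from a proximity matrix."""
    proximity = np.asarray(proximity, dtype=float)
    y = np.asarray(y)
    n_samples = proximity.shape[0]
    scores = np.empty(n_samples)
    for label in np.unique(y):
        members = np.flatnonzero(y == label)
        # Proximities restricted to samples of the same class.
        within = proximity[np.ix_(members, members)]
        sum_sq = (within ** 2).sum(axis=1)
        # Raw measure: large when a sample is far from the rest of its class.
        raw = n_samples / np.where(sum_sq > 0, sum_sq, 1.0)
        median = np.median(raw)
        # Assumption: scale by the median absolute deviation.
        spread = np.median(np.abs(raw - median))
        scores[members] = (raw - median) / spread if spread > 0 else 0.0
    return scores


# Toy proximity matrix: samples 0-2 are close to each other, sample 3 is isolated.
toy_proximity = np.array([
    [1.00, 0.90, 0.80, 0.10],
    [0.90, 1.00, 0.85, 0.10],
    [0.80, 0.85, 1.00, 0.05],
    [0.10, 0.10, 0.05, 1.00],
])
toy_scores = outlier_measure(toy_proximity, np.zeros(4, dtype=int))
```

On the toy matrix the isolated sample receives the largest score, which is the behaviour the measure is meant to have.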
On Mon, Sep 08, 2014 at 10:05:58AM +0100, Luca Puggini wrote:
for personal reasons I am writing a function to compute the outlier
measure from random forests
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#outliers
with a little more work I can include the function in the
Hi Luca,
This may not be the fastest implementation, but random forest
proximities can be computed quite straightforwardly in Python given
our 'apply' function.
See for instance
https://github.com/glouppe/phd-thesis/blob/master/scripts/ch4_proximity.py#L12
From a personal point of view, I never
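
As an illustration of the point about 'apply', here is a minimal sketch, not a copy of the linked script: the proximity of two samples is taken as the fraction of trees in which they fall in the same leaf. The helper name `forest_proximity`, the iris data, and the choice to use all samples rather than only out-of-bag ones are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def forest_proximity(forest, X):
    """Hypothetical helper: fraction of trees in which two samples share a leaf."""
    # apply() returns, for each sample, the index of the leaf reached in each tree.
    leaves = forest.apply(X)  # shape (n_samples, n_trees)
    n_samples, n_trees = leaves.shape
    proximity = np.zeros((n_samples, n_samples))
    for t in range(n_trees):
        # Leaf indices are only comparable within the same tree.
        column = leaves[:, t]
        proximity += column[:, None] == column[None, :]
    return proximity / n_trees


X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
prox = forest_proximity(forest, X)
```

The loop keeps memory at one n_samples x n_samples array; for large datasets a sparse or blockwise computation would likely be needed.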
+1 -- looks like a very handy 3-liner :)
2014-09-08 16:14 GMT+02:00 Gilles Louppe g.lou...@gmail.com:
Hi Luca,
This may not be the fastest implementation, but random forest
proximities can be computed quite straightforwardly in Python given
our 'apply' function.
See for instance
+1 for seeing this implemented. I feel it would be a useful addition for
work we do here that involves use of random forests.
On Mon, Sep 8, 2014 at 3:14 PM, Gilles Louppe g.lou...@gmail.com wrote:
Hi Luca,
This may not be the fastest implementation, but random forest
proximities can be
This could be a transform method added to RandomForestClassifier /
RandomForestRegressor.
On Mon, Sep 8, 2014 at 11:14 PM, Gilles Louppe g.lou...@gmail.com wrote:
Hi Luca,
This may not be the fastest implementation, but random forest
proximities can be computed quite straightforwardly in
On Mon, Sep 08, 2014 at 11:49:26PM +0900, Mathieu Blondel wrote:
This could be a transform method added to RandomForestClassifier /
RandomForestRegressor.
I don't think that it can be a transform, because currently transform
cannot modify y (and that's really a problem).
G
I don't think that it can be a transform, because currently transform
cannot modify y (and that's really a problem).
Brainfart! I hadn't thought about the problem well enough. Please
disregard the previous message.
G
I am rather -1 on making this a transform. There are many ways to come
up with proximity measures in a forest -- in fact, I don't think
Breiman's is particularly well designed.
On 8 September 2014 16:52, Gael Varoquaux gael.varoqu...@normalesup.org wrote:
On Mon, Sep 08, 2014 at 11:49:26PM +0900,
On Mon, Sep 8, 2014 at 11:55 PM, Gilles Louppe g.lou...@gmail.com wrote:
I am rather -1 on making this a transform. There are many ways to come
up with proximity measures in a forest -- in fact, I don't think
Breiman's is particularly well designed.
I think this is actually an argument for
Variants include:
- Taking into account common internal nodes reached by two samples. In
this sense, proximity takes into account the paths that are common and
not only the leaves.
- Normalizing the counts by the number of training samples within the
common leaves (instead of simply counting +1
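
The first variant above (counting common nodes along the paths, not only the leaves) could be sketched as follows. This is one possible reading, not the poster's implementation: it uses the forest's `decision_path` method to get the nodes visited by each sample, counts the nodes two samples share across all trees, and divides by the size of the union of their paths. The helper name `path_proximity` and the union-based normalisation are my assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def path_proximity(forest, X):
    """Hypothetical helper: shared-path proximity between all pairs of samples."""
    # indicator[i, j] is nonzero if sample i passes through node j
    # (columns cover the nodes of all trees, concatenated).
    indicator, _ = forest.decision_path(X)
    # Number of nodes visited by both samples, summed over trees.
    shared = (indicator @ indicator.T).toarray().astype(float)
    path_len = np.diag(shared)
    # Assumption: normalise by the size of the union of the two paths.
    union = path_len[:, None] + path_len[None, :] - shared
    return shared / union


X_demo, y_demo = load_iris(return_X_y=True)
demo_forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_demo, y_demo)
path_prox = path_proximity(demo_forest, X_demo)
```

Because every sample passes through the root of every tree, no pair has a proximity of exactly zero under this definition, unlike the leaf-only count.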