Re: [Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Olivier Grisel
Thank you to both of you! I learned something new today :)

Re: [Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Gilles Louppe
Hi Olivier, There are indeed several ways to get feature "importances". As often, there is no strict consensus about what this word means. In our case, we implement the importance as described in [1] (often cited, but unfortunately rarely read...). It is sometimes called "gini importance" or "mea
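To make the term "gini importance" from the message above concrete, here is a minimal sketch. It assumes the common definition: for every node that splits on a feature, take the decrease in Gini impurity, weight it by the share of training samples reaching that node, sum per feature, and normalize. This is an illustration only, not scikit-learn's actual code; the function names and the `(feature, parent, left, right)` split format are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right, n_total):
    """Sample-weighted Gini decrease of one split (hypothetical helper).

    parent, left, right: class labels at the node and at its two children.
    n_total: number of training samples at the root of the tree.
    """
    n = len(parent)
    children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return (n / n_total) * (gini(parent) - children)


def importances(splits, n_features, n_total):
    """Sum the decreases per feature and normalize so they add up to 1.

    splits: iterable of (feature_index, parent, left, right) tuples,
    an assumed format chosen for this sketch.
    """
    totals = [0.0] * n_features
    for feature, parent, left, right in splits:
        totals[feature] += impurity_decrease(parent, left, right, n_total)
    norm = sum(totals)
    return [t / norm for t in totals] if norm > 0 else totals


# Toy tree: the root splits on feature 0, its left child splits on feature 1.
root = [0, 0, 0, 1, 1, 1]
toy_splits = [
    (0, root, [0, 0, 0, 1], [1, 1]),
    (1, [0, 0, 0, 1], [0, 0, 0], [1]),
]
toy_importances = importances(toy_splits, n_features=2, n_total=len(root))
```

For a forest, the same per-tree values would typically be averaged over all trees; that step is left out here to keep the sketch short.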

Re: [Scikit-learn-general] misleading example for DBSCAN?

2013-04-04 Thread Andreas Mueller
Hi Johannes. I think the example is just wrong. Can someone confirm this?
Cheers, Andy
On 04/04/2013 06:57 PM, Johannes Knopp wrote:
> Hi everyone,
>
> I just stumbled upon the example plot_dbscan.py at [1]. As far as I
> understand, the similar

Re: [Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Peter Prettenhofer
I posted a brief description of the algorithm. The method that we implement is briefly described in ESLII. Gilles is the expert here; he can give more details on the issue.
2013/4/4 Olivier Grisel
> The variable importance in scikit-learn's implementation of random
> forest is based on the prop

[Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Olivier Grisel
The variable importance in scikit-learn's implementation of random forest is based on the proportion of samples that were classified by the feature at some point during the evaluation of one of the decision trees. http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation This method

[Scikit-learn-general] misleading example for DBSCAN?

2013-04-04 Thread Johannes Knopp
Hi everyone, I just stumbled upon the example plot_dbscan.py at [1]. As far as I understand, the similarity matrix S is computed from the data in X and then it is used for clustering with DBSCAN. What confused me was that the documentation for DBSCAN.

Re: [Scikit-learn-general] PyCon 2013 scikit-learn tutorial videos online!

2013-04-04 Thread Olivier Grisel
Thanks Robert!