Re: [Scikit-learn-general] Regarding GSoC 2014

2014-02-06 Thread Andy
On 02/06/2014 06:47 PM, MIT SHAH wrote: > Is scikit-learn going to take part as a mentoring organization in GSoC > 2014? > If yes, which projects will be the focus? > Yes, under the wing of the Python Software Foundation. The projects will be determined by the participants. Th

Re: [Scikit-learn-general] Strange Error Message

2014-02-06 Thread Andy
I think this is the same issue as: https://github.com/scikit-learn/scikit-learn/issues/2809 Can you do X.max() and X.min()? I would guess that the values are too large to be represented by float32, which is what the trees use internally. HTH, Andy On 02/06/2014 06:38 PM, Lorenzo Isella wrote: >
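A minimal sketch of the check suggested above, assuming X is a dense NumPy array. The helper name fits_in_float32 and the toy arrays X_ok and X_bad are made up for illustration and are not the data from the thread. It compares X.min() and X.max() against the largest finite float32 value:

```python
import numpy as np

def fits_in_float32(X):
    """Return True if every value in X is finite and within the float32 range.

    Hypothetical helper for illustration; it is not part of scikit-learn.
    """
    X = np.asarray(X, dtype=np.float64)
    limit = np.finfo(np.float32).max  # largest finite float32, about 3.4e38
    if not np.all(np.isfinite(X)):
        return False
    return bool(X.max() <= limit and X.min() >= -limit)

# Toy data (assumed): one array inside the float32 range, one outside it.
X_ok = np.array([[1.0, -2.5], [3e10, 0.0]])
X_bad = np.array([[1.0, 1e40], [0.0, 2.0]])

print(X_ok.min(), X_ok.max(), fits_in_float32(X_ok))
print(X_bad.min(), X_bad.max(), fits_in_float32(X_bad))
```

If the check fails, rescaling or clipping the offending columns before fitting is one possible workaround; whether that is appropriate depends on the data.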

Re: [Scikit-learn-general] KModes

2014-02-06 Thread Andy
Hi Tim. I don't know of an implementation of the algorithm. For scikit-learn, this would currently not be suitable. Unfortunately there is no concept of categorical variables in scikit-learn, and all categorical variables need to be one-hot encoded. Cheers, Andy On 02/06/2014 06:32 PM, tim pi
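To illustrate what one-hot encoding a categorical variable means, here is a small NumPy-only sketch. The function name one_hot and the colors column are hypothetical examples, not anything from the thread. Each distinct category becomes its own 0/1 column:

```python
import numpy as np

def one_hot(column):
    """One-hot encode a 1-D array of category labels.

    Returns the sorted unique categories and a 0/1 matrix with one
    column per category. Illustrative helper, not a scikit-learn API.
    """
    column = np.asarray(column)
    # codes[i] is the index of column[i] within the sorted categories
    categories, codes = np.unique(column, return_inverse=True)
    encoded = np.zeros((len(column), len(categories)), dtype=np.float64)
    encoded[np.arange(len(column)), codes] = 1.0
    return categories, encoded

# Toy categorical feature (assumed data).
colors = np.array(["red", "green", "blue", "green"])
categories, encoded = one_hot(colors)
print(categories)
print(encoded)
```

scikit-learn also ships a OneHotEncoder in sklearn.preprocessing; the input types it accepts have varied between releases, so check the documentation for the installed version.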

[Scikit-learn-general] Regarding GSoC 2014

2014-02-06 Thread MIT SHAH
Is scikit-learn going to take part as a mentoring organization in GSoC 2014? If yes, which projects will be the focus?

Re: [Scikit-learn-general] Strange Error Message

2014-02-06 Thread Lorenzo Isella
On Wed, 05 Feb 2014 14:08:10 +0100, wrote: > Date: Wed, 5 Feb 2014 12:11:54 +0100 > From: federico vaggi > Subject: Re: [Scikit-learn-general] Strange Error Message > A quic

[Scikit-learn-general] KModes

2014-02-06 Thread tim pierson
Hi, Forgive the general inquiry, but I've been trying to find a Python implementation of k-modes clustering (for nominal/categorical data). Does anyone know of one in existence? (Would this be something the scikit-learn community would be interested in?) Thanks,

Re: [Scikit-learn-general] Negative feature_importances in random forest with sample_weights

2014-02-06 Thread Gilles Louppe
Vincent, I identified the bug and opened an issue at https://github.com/scikit-learn/scikit-learn/issues/2835 I will try to fix this in the next few days. Sorry for the inconvenience. Gilles On 6 February 2014 18:18, Gilles Louppe wrote: > Dear Vincent, > > On 6 February 2014 17:46, Vincent Arel

Re: [Scikit-learn-general] Negative feature_importances in random forest with sample_weights

2014-02-06 Thread Gilles Louppe
Dear Vincent, On 6 February 2014 17:46, Vincent Arel wrote: > Hi all, > > Gilles Louppe[1] suggests that feature importance in random forest > classifiers is calculated using the algorithm of Breiman (1984). I > imagine this is the same as formula 10.42 on page 368 of Hastie et > al.[2]. This for

[Scikit-learn-general] Negative feature_importances in random forest with sample_weights

2014-02-06 Thread Vincent Arel
Hi all, Gilles Louppe[1] suggests that feature importance in random forest classifiers is calculated using the algorithm of Breiman (1984). I imagine this is the same as formula 10.42 on page 368 of Hastie et al.[2]. This formula only has a sum, a squared term and an indicator, so I’m trying to f
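For reference, the formula mentioned above (10.42 in Hastie et al.) can be written as follows, reproduced here from memory of the book, so the notation should be checked against the text. For a single tree T with J - 1 internal nodes, the squared importance of variable X_ℓ is

```latex
\mathcal{I}_\ell^2(T) \;=\; \sum_{t=1}^{J-1} \hat{\imath}_t^{\,2} \, I\bigl(v(t) = \ell\bigr)
```

where v(t) is the variable used to split node t, \hat{\imath}_t^{\,2} is the estimated improvement in the split criterion at that node, and I(·) is the indicator function. Every term in the sum is a squared quantity multiplied by 0 or 1, so the expression as written cannot be negative.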