Re: [Scikit-learn-general] Possible Bug in GradientBoostingClassifier

2012-07-05 Thread Nikit Saraf
Hi Peter, As you recommended, I'll switch to master immediately and check with it. But I am unclear on flattening Y_1. Should it be flattened or kept as 2D? Thank you so much. And yes, congratulations on the win in the Kaggle competition!! Regards, Nikit Saraf On Fri, Jul 6, 2012 at 1:55 AM, Peter Prettenh

Re: [Scikit-learn-general] Possible Bug in GradientBoostingClassifier

2012-07-05 Thread Peter Prettenhofer
Ok, I strongly recommend using the current master because I added some bugfixes w.r.t. input verification after the 0.11 release. I added some more test cases for input verification and another fix (I didn't flatten the ``y`` array and you are passing a 2d ``Y_1``). I included them in this PR [1] w
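For readers following the thread, the shape issue Peter describes can be sketched as below. This is only an illustration, not the fix from the PR: the label values are made up, and only the column-vector shape mirrors what Nikit reported for ``Y_1``.

```python
import numpy as np

# Hypothetical stand-in for Y_1: string labels stored as a column vector,
# i.e. shape (n_samples, 1), as reported in the thread.
Y_1 = np.array([["a"], ["b"], ["a"], ["c"]])
print(Y_1.shape)

# Flatten to the 1-D shape (n_samples,) that is conventionally used for y.
y_flat = np.ravel(Y_1)
print(y_flat.shape)
```

Passing ``y_flat`` rather than ``Y_1`` to ``fit`` sidesteps the question of whether a given release flattens ``y`` internally.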

Re: [Scikit-learn-general] Possible Bug in GradientBoostingClassifier

2012-07-05 Thread Nikit Saraf
Hi Peter, It is most probably 0.11. I'm not so sure; it could be an older version too. Regards, Nikit Saraf On Fri, Jul 6, 2012 at 1:39 AM, Peter Prettenhofer wrote: > Nikit, > which version of sklearn do you use? master or 0.11? > > best, > Peter > > 2012/7/5 Nikit S

Re: [Scikit-learn-general] Possible Bug in GradientBoostingClassifier

2012-07-05 Thread Peter Prettenhofer
Nikit, which version of sklearn do you use? master or 0.11? best, Peter 2012/7/5 Nikit Saraf : > Hi Peter > > Thanks for the reply. > > dtype of Y_1 is 'string64' and its shape is (12137,1) > dtype of X_1 is 'float64' and its shape is (12137,100) > > And here is the example of 10 cases http://

Re: [Scikit-learn-general] Possible Bug in GradientBoostingClassifier

2012-07-05 Thread Nikit Saraf
Hi Peter, Thanks for the reply. dtype of Y_1 is 'string64' and its shape is (12137,1) dtype of X_1 is 'float64' and its shape is (12137,100) And here is the example of 10 cases http://paste.ubuntu.com/1077028/ It would be great if you could point out where I'm going wrong. Thank you so much for t

Re: [Scikit-learn-general] Possible Bug in GradientBoostingClassifier

2012-07-05 Thread Peter Prettenhofer
Hi Nikit, thanks for reporting - I added a test case for symbolic class labels and it works ok (class labels get mapped to internal class ids prior to fitting; see gradient_boosting.py:629:631) - I think the source of the error is something different. Can you check the dtype and shape of ``Y_1``?
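As a rough illustration of mapping symbolic class labels to integer class ids before fitting, here is one common NumPy idiom. It is a sketch under the assumption that sorted unique labels are used as the class list; it is not claimed to be the code at the lines of gradient_boosting.py that Peter cites, and the label values are invented.

```python
import numpy as np

# Hypothetical symbolic labels, already 1-D.
y = np.array(["b", "a", "c", "a", "b"])

# classes: sorted unique labels; y_ids: for each sample, the index of its
# label within classes (these are the "internal class ids").
classes, y_ids = np.unique(y, return_inverse=True)
print(classes)
print(y_ids)

# The mapping is invertible: indexing classes by the ids recovers the labels.
recovered = classes[y_ids]
```

With a mapping like this, string labels by themselves need not cause an error, which is consistent with Peter looking at the dtype and shape of ``Y_1`` instead.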

[Scikit-learn-general] Possible Bug in GradientBoostingClassifier

2012-07-05 Thread Nikit Saraf
I was trying to train a character recognition model with GradientBoostingClassifier. When I tried to run it, it gave me the following error: Traceback (most recent call last): File "charRecog.py", line 23, in <module> clf = GradientBoostingClassifier().fit(X_1,Y_1) File "/usr/local/lib/

Re: [Scikit-learn-general] Astronomy Tutorial

2012-07-05 Thread Fernando Perez
On Thu, Jul 5, 2012 at 10:24 AM, Gael Varoquaux wrote: > Do you have any plans to generate notebooks from enhanced valid Python > code? I would find that really handy as it would open the door to providing > notebook-like functionality without really depending on the notebook for > the development w

Re: [Scikit-learn-general] Astronomy Tutorial

2012-07-05 Thread Jacob VanderPlas
Olivier, Gael, Thanks for the detailed suggestions. The tutorial I'm preparing for is on Monday, July 16, so I'll be putting in a lot of effort in the next couple weeks. I think for present purposes, I'll plan to keep the tutorial and examples in the old paradigm of rst + source code with skel

Re: [Scikit-learn-general] Astronomy Tutorial

2012-07-05 Thread Gael Varoquaux
On Thu, Jul 05, 2012 at 10:19:49AM -0700, Fernando Perez wrote: > We've tried to make sure the format is as version control-friendly as > possible, within the limits of accepting that it's json. Do you have any plans to generate notebooks from enhanced valid Python code? I would find that really

Re: [Scikit-learn-general] Astronomy Tutorial

2012-07-05 Thread Fernando Perez
On Thu, Jul 5, 2012 at 9:34 AM, Gael Varoquaux wrote: > I am very clearly -1 on this suggestion for several reasons: You guys should definitely find a policy that works well for sklearn. I just want to provide some info here, not push for using notebooks in your default setup: > a. I worry very

Re: [Scikit-learn-general] Astronomy Tutorial

2012-07-05 Thread Gael Varoquaux
On Thu, Jul 05, 2012 at 06:47:35PM +0200, Olivier Grisel wrote: > None of this is doctested. And I don't want to pollute the code > with boilerplate to make that testable. It's up to the teacher to > check that those exercises still work prior to using them in an > interactive session. I afrai

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Gael Varoquaux
On Fri, Jul 06, 2012 at 01:44:43AM +0900, Mathieu Blondel wrote: >This is how I did it in my calibration plot PR too: >[2]https://github.com/scikit-learn/scikit-learn/pull/882 I am so far behind on PR reviewing :(. I hadn't looked at that so far. I am only starting to catch up with mail.

Re: [Scikit-learn-general] Astronomy Tutorial

2012-07-05 Thread Olivier Grisel
The tutorial itself and inline examples can stay in sphinx + doctests. I agree this is a great format for online publishing and maintenance checks using doctests. But converting the 3 or 4 short exercises in notebook format would be great: Here is the current code for the exercise snippets in my

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Mathieu Blondel
On Fri, Jul 6, 2012 at 12:31 AM, Gael Varoquaux wrote: > I must admit that so far I have been frowning away from adding code that > does any plotting to scikit-learn. I tend to be worried about the > maintenance code for such code. However, maybe having code that s

Re: [Scikit-learn-general] any obvious ideas for following test failures on armel and mipsel?

2012-07-05 Thread Gael Varoquaux
On Wed, Jul 04, 2012 at 11:43:38AM +0200, Olivier Grisel wrote: > > Thanks -- I will pick up this "lucky to succeed once" fix ;) > I think that using the numpy.random singleton (or any other mutable > singleton) in scikit-learn tests should be considered a failure in > itself as it breaks the test
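Olivier's point about the numpy.random singleton can be illustrated with a small sketch. The helper name ``make_data`` and the seed values are hypothetical; the idea is simply that a test drawing from its own ``RandomState`` gets the same data no matter what other code did to the global generator.

```python
import numpy as np

def make_data(seed=0):
    # Use a local generator instead of the global numpy.random singleton,
    # so the result does not depend on which tests ran earlier.
    rng = np.random.RandomState(seed)
    return rng.normal(size=(5, 3))

first = make_data(0)

# Simulate some unrelated test mutating the global generator state.
np.random.seed(123)
np.random.rand(10)

second = make_data(0)

# The locally seeded data is unaffected by the global state changes.
print(np.array_equal(first, second))
```

By contrast, data drawn through ``np.random.normal`` directly would depend on how many draws earlier code had already consumed from the shared state.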

Re: [Scikit-learn-general] Astronomy Tutorial

2012-07-05 Thread Gael Varoquaux
On Tue, Jul 03, 2012 at 12:24:43PM -0700, Jake Vanderplas wrote: > I turned in the first draft of my PhD thesis yesterday, Congratulations! > Should the loaders be moved to sklearn.datasets, so the data can be > used for general examples which are not associated with the tutorial? > Or do you thi

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Vlad Niculae
On Jul 5, 2012, at 17:31 , Gael Varoquaux wrote: > On Thu, Jul 05, 2012 at 05:08:13PM +0200, Peter Prettenhofer wrote: >> Indeed would be great to have a component to generate learning curves >> in sklearn - I have some custom code lying around but it's rather >> ugly... > > I must admit that so

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Andreas Mueller
On 07/05/2012 04:31 PM, Gael Varoquaux wrote: > On Thu, Jul 05, 2012 at 05:08:13PM +0200, Peter Prettenhofer wrote: >> Indeed would be great to have a component to generate learning curves >> in sklearn - I have some custom code lying around but it's rather >> ugly... > I must admit that so far I h

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Gael Varoquaux
On Thu, Jul 05, 2012 at 04:57:25PM +0200, Peter Prettenhofer wrote: > Model selection is my nemesis - little can be gained, everything lost :-) Agreed! > In the end I did 5x 5-fold CV - the error std between the repetitions > was around 0.005. I am a big fan of using ShuffleSplit to reduce the v
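The two validation schemes mentioned here, repeated 5-fold CV and ShuffleSplit, can be compared with a short sketch. It assumes a current scikit-learn, where these classes live in ``sklearn.model_selection`` (in 2012 they were in ``sklearn.cross_validation``). The synthetic data and the ``Ridge`` model are placeholders, not the competition setup.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, ShuffleSplit, cross_val_score

# Synthetic regression data (placeholder for a real dataset).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=200)

model = Ridge(alpha=1.0)

# 5 repetitions of 5-fold CV, each with a different shuffle of the data.
rep_means = []
for seed in range(5):
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    rep_means.append(-scores.mean())

# Spread of the mean error between repetitions.
print("std between repetitions:", np.std(rep_means))

# ShuffleSplit: 25 independent random train/test splits (80% / 20%).
ss = ShuffleSplit(n_splits=25, test_size=0.2, random_state=0)
ss_scores = -cross_val_score(model, X, y, cv=ss, scoring="neg_mean_squared_error")
print("ShuffleSplit mean error:", ss_scores.mean())
```

Both schemes fit the model 25 times here; ShuffleSplit lets the number of splits and the test fraction be chosen independently, whereas K-fold ties them together.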

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Gael Varoquaux
On Thu, Jul 05, 2012 at 05:08:13PM +0200, Peter Prettenhofer wrote: > Indeed would be great to have a component to generate learning curves > in sklearn - I have some custom code lying around but it's rather > ugly... I must admit that so far I have been frowning away from adding code that does an

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Olivier Grisel
2012/7/5 Peter Prettenhofer : > 2012/7/5 Olivier Grisel : >> 2012/7/5 Emanuele Olivetti : >>> On 07/05/2012 09:45 AM, Andreas Mueller wrote: Hey Peter. Pretty awesome feat! Thanks for all the work you put into the ensemble module! A blog post about this competition would re

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Olivier Grisel
2012/7/5 Peter Prettenhofer : > 2012/7/5 Olivier Grisel : >> 2012/7/5 Emanuele Olivetti : >>> On 07/05/2012 08:49 AM, Olivier Grisel wrote: 2012/7/5 Peter Prettenhofer : > ... > > I've to check with the competition organizers whether its ok to put > the source code on github -

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Peter Prettenhofer
2012/7/5 Olivier Grisel : > 2012/7/5 Emanuele Olivetti : >> On 07/05/2012 08:49 AM, Olivier Grisel wrote: >>> 2012/7/5 Peter Prettenhofer : ... I've to check with the competition organizers whether its ok to put the source code on github - I'll keep you posted. >>> If so that wo

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Peter Prettenhofer
2012/7/5 Olivier Grisel : > 2012/7/5 Emanuele Olivetti : >> On 07/05/2012 09:45 AM, Andreas Mueller wrote: >>> Hey Peter. >>> Pretty awesome feat! Thanks for all the work you put into the ensemble >>> module! >>> >>> A blog post about this competition would really be great :) >>> >>> I was wonderin

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Olivier Grisel
2012/7/5 Emanuele Olivetti : > On 07/05/2012 08:49 AM, Olivier Grisel wrote: >> 2012/7/5 Peter Prettenhofer : >>> ... >>> >>> I've to check with the competition organizers whether its ok to put >>> the source code on github - I'll keep you posted. >> If so that would be a great blog post topic. Loo

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Olivier Grisel
2012/7/5 Emanuele Olivetti : > On 07/05/2012 09:45 AM, Andreas Mueller wrote: >> Hey Peter. >> Pretty awesome feat! Thanks for all the work you put into the ensemble >> module! >> >> A blog post about this competition would really be great :) >> >> I was wondering, was there much difference in perf

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Emanuele Olivetti
On 07/05/2012 08:49 AM, Olivier Grisel wrote: > 2012/7/5 Peter Prettenhofer : >> ... >> >> I've to check with the competition organizers whether its ok to put >> the source code on github - I'll keep you posted. > If so that would be a great blog post topic. Looking forward to it. > Hi, For what

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Emanuele Olivetti
On 07/05/2012 09:45 AM, Andreas Mueller wrote: > Hey Peter. > Pretty awesome feat! Thanks for all the work you put into the ensemble > module! > > A blog post about this competition would really be great :) > > I was wondering, was there much difference in performance between GBRT > and RF? Hi, I

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-05 Thread Olivier Grisel
Thanks very much for the wrap up Vlad. Could you please document how to use the %memit and %mrun tools in the performance chapter of the scikit-learn documentation? http://scikit-learn.org/stable/developers/performance.html Those are great tools BTW. Also continuous monitoring of memory usage in

[Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-05 Thread Vlad Niculae
Hello friends, As the midterm evaluation is approaching, I pushed the pedal to the metal and my blog and github profile have seen a lot of activity recently. I would like to link to everything from one place, and that place will be this e-mail. So this is what happened: -- I wrote a couple of m

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Vlad Niculae
Congratulations Peter! Excellent work as always! Vlad On Jul 5, 2012, at 00:48 , Emanuele Olivetti wrote: > Dear All, > > As some of you may have already noticed, Peter (Prettenhofer) has > just won the "Online Product Sales" competition on kaggle.com > beating 365 teams: > http://www.kaggle.

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread federico vaggi
If you get an OK - that would be absolutely amazing, especially if you broke it down and explained the different tweaks. Speaking for myself - I am only super interested in seeing how you set up the grid search on the EC2 instances :) Congratulations on a fantastic job, and this will draw a lot m

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread xinfan meng
On Thu, Jul 5, 2012 at 4:45 PM, Andreas Mueller wrote: > Hey Peter. > Pretty awesome feat! Thanks for all the work you put into the ensemble > module! > > A blog post about this competition would really be great :) > > I was wondering, was there much difference in performance between GBRT > and RF

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Andreas Mueller
Hey Peter. Pretty awesome feat! Thanks for all the work you put into the ensemble module! A blog post about this competition would really be great :) I was wondering, was there much difference in performance between GBRT and RF? We should do a "hall of fame" on the website listing citations an

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Jaques Grobler
Congratulations!!! Very well done 2012/7/5 Olivier Grisel > 2012/7/5 Peter Prettenhofer : > > Hi everybody, > > > > thanks a lot for your congratulations. It has been a tight race indeed > > and I have to consider myself lucky that I ended up on the first place > > - as Olivier already said scor

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Olivier Grisel
2012/7/5 Peter Prettenhofer : > Hi everybody, > > thanks a lot for your congratulations. It has been a tight race indeed > and I have to consider myself lucky that I ended up on the first place > - as Olivier already said score differences among the top teams are > really small. > > Anyways, it's a