Re: [Scikit-learn-general] [ANN] scikit-learn 0.16b1 is out!

2015-03-11 Thread Mathieu Blondel
On Tue, Mar 10, 2015 at 12:01 PM, Andy wrote: > On 03/09/2015 10:44 PM, Joel Nothman wrote: > > Congratulations! This has been a long time coming, and if not only for the > swathe of features it'll be great to see the documentation improvements > appearing on stable soon! > > My thoughts on dev

[Scikit-learn-general] Gap statistic

2015-03-11 Thread Saul Spatz
The gap statistic is a popular method of choosing the "optimum" k in k-means. I've written an implementation, but it doesn't appear to be useful for my client's data, and I'd like to contribute it. I'm attaching my script so you can see what I'm talking about. I know I have to document it, and I

Re: [Scikit-learn-general] SciKitLearn, OpenML, Google Summer of Code

2015-03-11 Thread Joaquin Vanschoren
Yes, sure, but there may be other platforms that connect to it? Thanks, Joaquin On Wed, Mar 11, 2015 at 10:44 PM Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > On Wed, Mar 11, 2015 at 09:41:01PM +, Joaquin Vanschoren wrote: > > We mention mldata in the description. OpenML does a lo

Re: [Scikit-learn-general] SciKitLearn, OpenML, Google Summer of Code

2015-03-11 Thread Gael Varoquaux
On Wed, Mar 11, 2015 at 09:41:01PM +, Joaquin Vanschoren wrote: > We mention mldata in the description. OpenML does a lot more than sharing > datasets, it's a complete open science platform: you can share complete > experiments (data, code, results), and it automatically links all datasets, > a

Re: [Scikit-learn-general] GSoC2015 Hyperparameter Optimization topic

2015-03-11 Thread Christof Angermueller
I will have a closer look at the different optimization approaches and start to work on an outline for this topic. Does anybody know of further optimization approaches that were not mentioned below and that we could consider? Is there anybody else interested in this topic? Christof On 201503

Re: [Scikit-learn-general] SciKitLearn, OpenML, Google Summer of Code

2015-03-11 Thread Andreas Mueller
Hi Joaquin. I think this sounds like a great project. I remember there being similar efforts earlier, like mldata. Anyhow, if you have mentors from the openml project, I'd be happy to co-mentor from the sklearn side (possibly not full-on mentor, depending on how many scikit-learn projects we wil

Re: [Scikit-learn-general] random forests - number of samples

2015-03-11 Thread Andreas Mueller
By default bootstrap=True, so a bootstrap sample is used. That means the number of samples drawn is the same as the original number of samples, but only about 2/3 of the distinct data points are used; the rest are duplicates. For efficiency, the samples are actually represented using sample weights. On 03/11/2015
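
The "about 2/3" figure above can be checked with a small simulation. This is only a sketch using the Python standard library, not scikit-learn's own code: the helper name bootstrap_counts, the seed, and n = 10000 are assumptions made for illustration. It draws n indices with replacement, stores the result as per-sample counts (which play the role of sample weights), and compares the fraction of distinct points with the expected value 1 - (1 - 1/n)^n, roughly 0.632 for large n.

```python
import random
from collections import Counter


def bootstrap_counts(n_samples, seed=0):
    """Draw n_samples indices with replacement and count each index.

    Hypothetical helper for illustration; the counts act like sample weights.
    """
    rng = random.Random(seed)
    return Counter(rng.randrange(n_samples) for _ in range(n_samples))


n = 10000
counts = bootstrap_counts(n)

# Total number of draws equals the original number of samples.
total_draws = sum(counts.values())

# Fraction of distinct data points that appear at least once in the bootstrap.
unique_fraction = len(counts) / n

# Probability that a given point is drawn at least once in n draws.
expected_fraction = 1.0 - (1.0 - 1.0 / n) ** n

print(total_draws, round(unique_fraction, 3), round(expected_fraction, 3))
```

Points that never appear in counts have an implicit weight of zero; they are the out-of-bag samples for that tree.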

Re: [Scikit-learn-general] SciPy 2015 Austin

2015-03-11 Thread Nelle Varoquaux
> We can probably also email one of the organizers (I think they are > listed on the site?) and find out if we can edit or add an addendum. > It is strange - I am almost 100% positive we could edit the proposals > in past years. > It is not the same system as last year. Admins can edit proposals.

Re: [Scikit-learn-general] SciPy 2015 Austin

2015-03-11 Thread Justin Vincent
Organizer here :)! We switched to a new tool this year and are working out the kinks. Will look into how to edit a proposal. Best, Justin On Wed, Mar 11, 2015 at 8:57 AM Kyle Kastner wrote: > We can probably also email one of the organizers (I think they are > listed on the site?) and find out

Re: [Scikit-learn-general] Scikit-learn sprint in Paris, April 2nd

2015-03-11 Thread Gilles Louppe
I'll be there! On 11 March 2015 at 16:42, Nelle Varoquaux wrote: > Hi all, > > Gael, Alex, Vincent and I are organizing a sprint in Paris, the day before > Pydata Paris [1]. The sprint will take place in Télécom ParisTech (13th > arrondissement) [2], from 9am until people are bored or we get kick

[Scikit-learn-general] Application for GSoC (possible mentors Andreas or Michael )

2015-03-11 Thread Luca Puggini
Hi Andreas, Michael and all the others :-) I am writing because I need a mentor to discuss my proposal for the GSoC. At the moment I am writing a document regarding some ideas for the cross decomposition module. It would be nice if Michael or someone else wanted to discuss it. I am writing about

[Scikit-learn-general] random forests - number of samples

2015-03-11 Thread Luca Puggini
In the original version of Breiman's random forest, n samples are selected with replacement to build each tree. So each tree should be built with n samples, where n is the total number of samples (they are chosen with replacement, though). I do not know if sklearn follows this standard. I hope this

Re: [Scikit-learn-general] NDVI smoothing using Savitzky-Golay

2015-03-11 Thread Chris Holdgraf
I'd also check out the pandas package, as it might do what you want with timeseries. I'd also note that you should use caution when doing Machine Learning on time-series, especially after you smooth your data. This introduces autocorrelations into your signal which migh
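
The point about smoothing and autocorrelation can be illustrated with a short sketch. It is an assumption-laden example, not code from the thread: a plain moving average stands in for the Savitzky-Golay filter to keep it dependency-free, and the helper names, window size, and seed are hypothetical. White noise has a lag-1 autocorrelation near zero; after smoothing, neighbouring values share most of their inputs and become strongly correlated.

```python
import random


def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def lag1_autocorrelation(values):
    """Sample autocorrelation between each value and its successor."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    numerator = sum(a * b for a, b in zip(centered[:-1], centered[1:]))
    denominator = sum(c * c for c in centered)
    return numerator / denominator


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # independent samples
smoothed = moving_average(noise, window=11)

raw_r1 = lag1_autocorrelation(noise)
smooth_r1 = lag1_autocorrelation(smoothed)
print(round(raw_r1, 3), round(smooth_r1, 3))
```

A consequence for model evaluation is that randomly shuffled train/test splits of a smoothed series can place near-duplicate neighbours on both sides of the split.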

Re: [Scikit-learn-general] SciPy 2015 Austin

2015-03-11 Thread Kyle Kastner
We can probably also email one of the organizers (I think they are listed on the site?) and find out if we can edit or add an addendum. It is strange - I am almost 100% positive we could edit the proposals in past years. Kyle On Wed, Mar 11, 2015 at 10:22 AM, Andreas Mueller wrote: > Unfortunate

[Scikit-learn-general] Scikit-learn sprint in Paris, April 2nd

2015-03-11 Thread Nelle Varoquaux
Hi all, Gael, Alex, Vincent and I are organizing a sprint in Paris, the day before Pydata Paris [1]. The sprint will take place in Télécom ParisTech (13th arrondissement) [2], from 9am until people are bored or we get kicked out of the school. For security reasons, we need to issue the list of peo

[Scikit-learn-general] random forests - number of samples

2015-03-11 Thread Pagliari, Roberto
How many samples does a single tree of a random forest use? Or does it use all samples?

Re: [Scikit-learn-general] SciPy 2015 Austin

2015-03-11 Thread Andreas Mueller
Unfortunately we can't edit the submission any more. But I am pretty confident that we get accepted, and then maybe we can do a final version? On 03/11/2015 10:18 AM, Nelle Varoquaux wrote: On 11 March 2015 at 15:05, Andreas Mueller > wrote: We submitted a ful

Re: [Scikit-learn-general] SciPy 2015 Austin

2015-03-11 Thread Nelle Varoquaux
On 11 March 2015 at 15:05, Andreas Mueller wrote: > We submitted a full day, so I'd rather join forces with you, than do two > sets. > Though obviously it's up to you ;) > Sounds good to me. The submission looks good too. Can you still edit it? (I think the answer is no). Cheers, N > > On 03/

[Scikit-learn-general] SciKitLearn, OpenML, Google Summer of Code

2015-03-11 Thread Joaquin Vanschoren
Dear SKLearn people, I'm new here, so here's a short intro: I am a machine learning researcher and I am setting up OpenML.org, a platform to share ML experiments online, including data, code, and results (models, predictions). It is meant as a collaboration performance (collaborative model building)

Re: [Scikit-learn-general] NDVI smoothing using Savitzky-Golay

2015-03-11 Thread Ronnie Ghose
also time signal ML is somewhat specialized On Wed, Mar 11, 2015 at 2:33 AM, Leo Kris Palao wrote: > Hi Gael, > > Thanks for your quick reply. My apologies. > > Thanks, > -Leo > > On Wed, Mar 11, 2015 at 1:48 PM, Gael Varoquaux < > gael.varoqu...@normalesup.org> wrote: > >> Hi, >> >> Filter