[Scikit-learn-general] Sampling in grid_search randomized_grid_search

2016-02-19 Thread Stylianos Kampakis
Hi everyone, I was thinking of implementing a tweak that makes it possible to sample randomly from a dataset when using grid search. This would be particularly useful for big datasets. The sampling takes place during each round of grid search. Does anyone think this would be worth submitting to scikit-l

Re: [Scikit-learn-general] Sampling in grid_search randomized_grid_search

2016-02-19 Thread Sebastian Raschka
Hi, Stelios, I am wondering, how did you implement this tweak? Just a thought, but instead of adding extra functionality inside the GridSearch class, what about using a random training data selector (transformer) as a pipeline object? Something along the lines of class RandomRowSelector(object)

Re: [Scikit-learn-general] Sampling in grid_search randomized_grid_search

2016-02-19 Thread Gael Varoquaux
That won't work, as it modifies the number of samples, which breaks the scikit-learn pipeline. Please add this use case to the PR on the scikit-learn enhancement proposal that discusses a possible modification to scikit-learn: https://github.com/scikit-learn/enhancement_proposals/pull/2 Cheers
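
To illustrate the objection above, here is a minimal pure-Python sketch, not code from the thread. The class name RandomRowSelector comes from the truncated message above, but its body, the `fraction` and `seed` parameters, and the toy data are assumptions made for this example. It shows that a transformer only receives and returns X, so dropping rows leaves y at its original length.

```python
import random


class RandomRowSelector(object):
    """Hypothetical transformer that keeps a random fraction of the rows of X."""

    def __init__(self, fraction=0.5, seed=0):
        # `fraction` and `seed` are illustrative parameters, not from the thread.
        self.fraction = fraction
        self.seed = seed

    def fit(self, X, y=None):
        # Nothing to learn; follow the usual fit/transform convention.
        return self

    def transform(self, X):
        # transform() only sees X, so y cannot be subsampled alongside it.
        rng = random.Random(self.seed)
        n_keep = max(1, int(len(X) * self.fraction))
        keep = sorted(rng.sample(range(len(X)), n_keep))
        return [X[i] for i in keep]


# Toy data: 10 samples with one feature each, and 10 labels.
X = [[float(i)] for i in range(10)]
y = [i % 2 for i in range(10)]

X_sub = RandomRowSelector(fraction=0.5).fit(X, y).transform(X)

# A later pipeline step would be fitted on X_sub together with the
# original y, and their lengths no longer agree.
print(len(X_sub), len(y))
```

The mismatch between the two lengths is what a downstream estimator would run into when fitted on the subsampled X and the full y.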

[Scikit-learn-general] 2 job postings: Data Scientist - Data Preparation AND Entry-Level Marketing Statistician (San Diego, CA)

2016-02-19 Thread Lisa Solomon
job opportunities: Data Scientist- Data Preparation AND Entry-Level Marketing Statistician (San Diego, CA) Salford Systems is an advanced data mining software and consulting company based in San Diego, California. We enjoy an open, creative environment, and have developed an international rep

[Scikit-learn-general] Query about GSoC 2016

2016-02-19 Thread Atharva
Hi, I would like to know if Scikit Learn Team is planning to offer projects in GSoC this year. If yes then can you tell me what project ideas are planned to be offered as projects in GSoC 2016? Thanks, Atharva

[Scikit-learn-general] Adding EarthRegressor

2016-02-19 Thread Devashish Deshpande
Hi everyone, I was browsing through the projects that had been offered last year for GSoC and came across the GAM project which wasn't taken up I believe. I was reading up about MARS (called Earth due to TM issues!) and also took a look at the massive PR by Jason Rudy on including pyearth into skl