Hi everyone,
I was thinking of implementing a tweak that makes it possible to sample randomly
from a dataset when using grid search. This would be particularly useful for
big datasets. The sampling takes place during each round of grid search.
Does anyone think this would be worth submitting to scikit-l
Hi, Stelios,
I am wondering, how did you implement this tweak? Just a thought, but instead
of adding extra functionality inside the GridSearch class, what about using a
random training data selector (transformer) as a pipeline object? Something
along the lines of
class RandomRowSelector(object)
That won't work, as it modifies the number of samples, which breaks
the scikit-learn pipeline.
Please add this use case to the PR on the scikit-learn enhancement
proposal that discusses a possible modification to scikit-learn:
https://github.com/scikit-learn/enhancement_proposals/pull/2
Cheers
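
For what it's worth, here is a minimal sketch of one way to get random
subsampling during grid search without changing the number of samples inside
a pipeline. It is not the tweak proposed above and not the RandomRowSelector
idea: it swaps in a cross-validation splitter (ShuffleSplit) that draws a
small random train/test subset for each split. It assumes a scikit-learn
version that ships sklearn.model_selection; the synthetic dataset, the
estimator, and the parameter grid are placeholders chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, ShuffleSplit

# Synthetic stand-in for a big dataset (illustrative only).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Each split trains on 200 randomly chosen rows and scores on 200 others,
# so every fit during the search sees only a small random subsample.
cv = ShuffleSplit(n_splits=3, train_size=200, test_size=200, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=cv,
)
search.fit(X, y)

# Sizes of the subsamples used in each split.
split_sizes = [(len(train), len(test)) for train, test in cv.split(X)]
print(split_sizes)
print(search.best_params_)
```

Note that the same random splits are reused for every parameter combination,
and that with the default refit=True the best estimator is refit on the full
dataset at the end; pass refit=False if that final fit is too expensive.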
Job opportunities: Data Scientist - Data Preparation AND Entry-Level Marketing
Statistician
(San Diego, CA)
Salford Systems is an advanced data mining software and consulting company
based in San Diego, California. We enjoy an open, creative environment, and
have developed an international rep
Hi,
I would like to know whether the scikit-learn team is planning to offer projects
in GSoC this year. If so, could you tell me which project ideas are planned
for GSoC 2016?
Thanks,
Atharva
Hi everyone,
I was browsing through the projects that were offered last year for
GSoC and came across the GAM project, which I believe wasn't taken up. I was
reading up on MARS (called Earth due to TM issues!) and also took a look
at the massive PR by Jason Rudy on including pyearth into skl