You can just do this via a CV object. For example, use
StratifiedShuffleSplit(n_splits=5, train_size=.1, test_size=.1)
and your training and test sets will each be a randomly sampled, disjoint
10% of the data, repeated 5 times.
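
A minimal sketch of how that could look, assuming the current
sklearn.model_selection API. The synthetic dataset, the LogisticRegression
estimator and the parameter grid are illustrative assumptions, not from
this thread:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit

# Synthetic stand-in for a "big" dataset (assumption for illustration).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Each of the 5 splits draws a stratified 10% for training and a
# disjoint stratified 10% for testing.
cv = StratifiedShuffleSplit(n_splits=5, train_size=0.1, test_size=0.1,
                            random_state=0)

# Collect the splits to check that train and test never overlap.
splits = list(cv.split(X, y))
for train_idx, test_idx in splits:
    assert np.intersect1d(train_idx, test_idx).size == 0

# The grid search then fits every candidate on the subsampled splits only.
search = GridSearchCV(LogisticRegression(max_iter=1000),
                      param_grid={"C": [0.1, 1.0, 10.0]},
                      cv=cv)
search.fit(X, y)
print(search.best_params_)
```

Note that with the default refit=True, GridSearchCV refits the best
candidate on the full X, y after the search; pass refit=False if that
final fit is too expensive.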



On 02/19/2016 11:42 AM, Gael Varoquaux wrote:
> That won't work, as it is modifying the number of samples, which breaks
> the scikit-learn pipeline.
>
> Please add this usecase in the PR on the scikit-learn enhancement
> proposal that discusses a possible modification to scikit-learn:
> https://github.com/scikit-learn/enhancement_proposals/pull/2
>
> Cheers,
>
> Gaël
>
> On Fri, Feb 19, 2016 at 11:36:29AM -0500, Sebastian Raschka wrote:
>> Hi, Stelios,
>> I am wondering, how did you implement this tweak? Just a thought, but 
>> instead of adding extra functionality inside the GridSearch class, what 
>> about using a random training data selector (transformer) as a pipeline 
>> object? Something along the lines of
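>> import numpy as np
>> class RandomRowSelector(object):
>>      # fraction and random_state are assumed parameters, added so the sketch runs
>>      def __init__(self, fraction=0.1, random_state=None):
>>          self.fraction = fraction
>>          self.random_state = random_state
>>      def _some_random_sampling_function(self, X, y):
>>          # pick row indices uniformly at random, without replacement
>>          rng = np.random.RandomState(self.random_state)
>>          n_rows = max(1, int(self.fraction * X.shape[0]))
>>          return rng.choice(X.shape[0], size=n_rows, replace=False)
>>      def transform(self, X, y):
>>          sampled_rows = self._some_random_sampling_function(X, y)
>>          return X[sampled_rows, :], y[sampled_rows]
>>      def fit(self, X, y=None):
>>          return self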
>> Best,
>> Sebastian
>>> On Feb 19, 2016, at 7:56 AM, Stylianos Kampakis 
>>> <stylianos.kampa...@gmail.com> wrote:
>>> Hi everyone,
>>> I was thinking of implementing a tweak that makes it possible to sample randomly 
>>> from a dataset when using grid search. This would be particularly useful for 
>>> big datasets. The sampling takes place during each round of grid search.
>>> Does anyone think this would be worth submitting to scikit-learn?
>>> Best regards,
>>> Stelios
>>> ------------------------------------------------------------------------------
>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance
>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>>> Monitor end-to-end web transactions and take corrective actions now
>>> Troubleshoot faster and improve end-user experience. Signup Now!
>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>


