Hi Andres, hi Andy,

Indeed, in practice I have also needed to cross-validate time series 
differently from how sklearn's TimeSeriesSplit does it.
I fully support the idea of such a contribution, Andres.

As Andy mentioned, the main option would be a "rolling window" or, as I usually 
call it, a "sliding window" technique.
I think this is what you meant. To make sure we understand each other, here is 
a short explanation:

Think of your data laid out chronologically along a time axis.
Set a constant test set length (an interval) that will "slide" along that axis. 
The training set is then simply all the data that comes before the first 
observation in the test set.
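
To make the description above concrete, here is a minimal sketch in plain 
Python. It is only an illustration under my own assumptions, not sklearn API 
and not Andres's code: the function name sliding_test_window_split and the 
parameters test_size, min_train_size and step are hypothetical names chosen 
for this example. min_train_size is meant to mirror the "minimum size of 
CV-training set" from Andres's list.

```python
def sliding_test_window_split(n_samples, test_size, min_train_size=1, step=None):
    """Yield (train_indices, test_indices) for chronologically sorted data.

    Hypothetical helper for illustration only. The test window has a
    constant length and slides forward in time; the training set is
    every sample located before the first test sample.
    """
    if test_size < 1 or min_train_size < 1:
        raise ValueError("test_size and min_train_size must be >= 1")
    if step is None:
        # By default, move the window by its own length (no overlap).
        step = test_size

    start = min_train_size  # index of the first test sample
    while start + test_size <= n_samples:
        train_indices = list(range(0, start))
        test_indices = list(range(start, start + test_size))
        yield train_indices, test_indices
        start += step


# Usage: 10 observations, test windows of 2, at least 4 training samples.
for train_indices, test_indices in sliding_test_window_split(
    10, test_size=2, min_train_size=4
):
    print("train:", train_indices, "test:", test_indices)
```

In this sketch the training set grows as the window slides. A fixed-length 
training set (Andy's "rolling window") could be obtained by keeping only the 
last few training indices of each split.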

I have attached a slide I used when presenting this principle.
Andy, this is probably not exactly what you had in mind, but I think it is close.

Thanks,
Sylvain




> On 28 Apr 2017, at 17:48, Andreas Mueller <t3k...@gmail.com> wrote:
> 
> Hey Andres.
> I think there might be a PR for that.
> Can you explain the minimum size of the training set? How is that used?
> I thought the other main option would be "rolling window" cross validation
> to use a fixed length cv training set.
> 
> So the two options to me were rolling window and what we're doing right now.
> Can you elaborate on the other use cases, like minimum size of the training 
> set
> and why you would want the other options with a variable length training set?
> 
> Thanks,
> Andy
> 
> On 04/27/2017 09:44 AM, andres lago wrote:
>> Hello,
>>   I'd like to contribute a new feature to sklearn: cross-validation of 
>> time series. It's an evolution of the current functionality 
>> implemented by TimeSeriesSplit.
>> 
>>   TimeSeriesSplit only allows the user to set the number of folds. In real 
>> life, when performing the cross validation of time series, other parameters 
>> are required, for instance:
>>     -minimum size of CV-training set
>>     -size of CV-test set
>>     -fixed or variable length of CV-training set.
>> 
>>   The functionality is inspired by the R library 'caret'.   
>> 
>>   If you agree, I can share my code. I developed it for a project with the 
>> French rail company SNCF. It's in production now.
>> 
>>   Regards,
>>     Andres 
>> 
>> 
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org <mailto:scikit-learn@python.org>
>> https://mail.python.org/mailman/listinfo/scikit-learn 
>> <https://mail.python.org/mailman/listinfo/scikit-learn>
> 
