Hi Andres, hi Andy,

In practice I have also needed to cross-validate time series differently from what sklearn's TimeSeriesSplit does. Andres, I fully support the idea of such a contribution.
As Andy mentioned, the main option would be a "rolling window" or, as I usually call it, a "sliding window" technique. I think this is what you meant. So that we understand each other, here is a short explanation:

Think of your data sorted chronologically along a time axis. Set a constant test set length (interval) which will "slide" along the time axis. The training set is then simply all the data that comes before the first sample of the test set.

I have attached a slide I used when presenting this principle. Andy, this probably wasn't your exact idea, but I think it is close to it.

Thanks,
Sylvain

> On 28 Apr 2017, at 17:48, Andreas Mueller <t3k...@gmail.com> wrote:
>
> Hey Andres.
> I think there might be a PR for that.
> Can you explain the minimum size of the training set? How is that used?
> I thought the other main option would be "rolling window" cross validation
> to use a fixed length cv training set.
>
> So the two options to me were rolling window and what we're doing right now.
> Can you elaborate on the other use cases, like minimum size of the training set
> and why you would want the other options with a variable length training set?
>
> Thanks,
> Andy
>
> On 04/27/2017 09:44 AM, andres lago wrote:
>> Hello,
>> I'd like to contribute with a new functionality in sklearn. It's the cross
>> validation of time series. It's an evolution of the current functionality,
>> implemented by TimeSeriesSplit.
>>
>> TimeSeriesSplit only allows the user to set the number of folds. In real
>> life, when performing the cross validation of time series, other parameters
>> are required, for instance:
>> -minimum size of CV-training set
>> -size of CV-test set
>> -fixed or variable length of CV-training set.
>>
>> The functionality is inspired by the R library 'caret'.
>>
>> If you agree, I can share my code. I developed it for a project with the
>> french rail company SNCF. It's in production now.
>>
>> Regards,
>> Andres
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
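
To make the sliding-window scheme described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name `sliding_window_split` and its parameters (`test_size`, `min_train_size`, `max_train_size`, `step`) are hypothetical. They are not part of scikit-learn and are not the code Andres refers to. With `max_train_size=None` the training set is everything before the test window. Setting `max_train_size` gives the fixed-length training set that Andy calls "rolling window".

```python
def sliding_window_split(n_samples, test_size, min_train_size=1,
                         max_train_size=None, step=None):
    """Yield (train_indices, test_indices) for chronologically sorted samples.

    Hypothetical helper for illustration only; not a scikit-learn API.
    """
    if test_size < 1 or min_train_size < 1:
        raise ValueError("test_size and min_train_size must be >= 1")
    # Assumption: by default the test window advances by its own length,
    # so consecutive test sets do not overlap.
    step = test_size if step is None else step
    if step < 1:
        raise ValueError("step must be >= 1")

    test_start = min_train_size
    while test_start + test_size <= n_samples:
        # Training data is everything before the test window ...
        train_start = 0
        if max_train_size is not None:
            # ... optionally truncated to the most recent samples.
            train_start = max(0, test_start - max_train_size)
        train = list(range(train_start, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test
        test_start += step


# Example: 10 time-ordered samples, test window of 2, at least 4 training samples.
for train, test in sliding_window_split(10, test_size=2, min_train_size=4):
    print(train, test)
```

An iterable of (train, test) index pairs like this one should be usable as the `cv` argument of scikit-learn's model selection utilities such as `cross_val_score`, which accept custom splits in that form.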