The splitting for the time series could be:
- deterministic, that is, with windows that grow into the past: the train set is everything up to point t, and the test set is everything from t onwards.
- a variant of the above that limits the sizes of the folds.
- based on shuffling "blocks" that are approximately independent (i.e. the time series forgets its past after t observations); this should re-use the existing shuffling code.
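
To make the three options concrete, here is a minimal sketch in plain Python. It is not Shogun code and does not use Shogun's API: the function names, the parameters (max_train, max_test, block_size, seed) and the way the split points t are spaced evenly are all assumptions made for illustration only.

```python
import random
from typing import Iterator, List, Optional, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def forward_chaining_splits(
    n: int,
    num_folds: int,
    max_train: Optional[int] = None,
    max_test: Optional[int] = None,
) -> Iterator[Split]:
    """Deterministic splits: train on indices before t, test on indices from t.

    Hypothetical helper. max_train / max_test optionally cap the fold sizes
    (the "limited size" variant); the split points t are evenly spaced.
    """
    if num_folds < 1 or n < num_folds + 1:
        raise ValueError("need at least num_folds + 1 observations")
    step = n // (num_folds + 1)
    for k in range(1, num_folds + 1):
        t = k * step
        train_start = 0 if max_train is None else max(0, t - max_train)
        test_end = n if max_test is None else min(n, t + max_test)
        yield list(range(train_start, t)), list(range(t, test_end))


def shuffled_block_splits(
    n: int, block_size: int, num_folds: int, seed: int = 0
) -> Iterator[Split]:
    """Randomised splits over contiguous blocks of block_size observations.

    Hypothetical helper. It assumes blocks are approximately independent, so
    whole blocks (not single observations) are shuffled and dealt to folds.
    """
    blocks = [list(range(s, min(s + block_size, n))) for s in range(0, n, block_size)]
    if not 2 <= num_folds <= len(blocks):
        raise ValueError("num_folds must be between 2 and the number of blocks")
    random.Random(seed).shuffle(blocks)
    for k in range(num_folds):
        # Every num_folds-th shuffled block goes to the test set of fold k.
        test = sorted(i for block in blocks[k::num_folds] for i in block)
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test
```

For example, forward_chaining_splits(10, 4) first yields train [0, 1] and test [2, ..., 9], and last yields train [0, ..., 7] and test [8, 9]. Within one fold the train and test indices are disjoint; it is the training sets of successive folds that overlap, since each contains the previous one. Only the block variant involves randomness, so only that one would benefit from being run several times.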

2017-08-08 16:34 GMT+01:00 sahil chaddha <[email protected]>:

> I read the implementation of cross-validation splitting and
> cross-validation. The build_subset() implements a random process to build
> the subsets and thus, makes sense to run evaluate_one_run() several times
> in cross-validation. Does the time-series split also require random
> process? And also, generate_subset_indices() is like test set and
> generate_subset_inverse() is like train set. So, to respect the time, the
> subsets are bound to have a non-empty intersection. Am I right?
>
> *Sahil Chaddha*
> Third Year Undergraduate Student
> Department of Metallurgy and Materials Engineering
> IIT Kharagpur, West Bengal - 721302
> +91-7872705997, LinkedIn
> <https://www.linkedin.com/in/sahil-chaddha-a0a376b7/> | Github
> <https://github.com/Sahil333>
>
> On Mon, Aug 7, 2017 at 1:56 PM, Fernando J. Iglesias García <
> [email protected]> wrote:
>
>> Welcome Sahil!
>>
>> Great that you have already successfully set up your dev environment.
>>
>> For this particular task, I think it will be useful to get familiar with
>> Shogun's cross-validation. You could start by checking the related examples
>> (like this one
>> <https://github.com/shogun-toolbox/shogun/blob/develop/examples/undocumented/libshogun/splitting_standard_crossvalidation.cpp>).
>> Then, you can get into understanding how the splitting strategy is
>> implemented internally (you can find the implementation by following the
>> appropriate include file from the example). You will also need to
>> understand details about the time-series splitting strategy, the links in
>> the github issue will be useful for this.
>>
>> After, you should be ready to start implementing the time-series
>> splitting. Let us know how it goes.
>>
>> Hope that helps!
>>
>> Cheers,
>> Fernando.
>>
>> On 5 August 2017 at 20:29, sahil chaddha <[email protected]> wrote:
>>
>>> Ma'am/Sir,
>>>
>>> I want to work on this
>>> https://github.com/shogun-toolbox/shogun/issues/3847. But I have no
>>> idea where to start. I am new to such big projects. Can anyone guide
>>> me through it? I have already setup the environment, ran tests and
>>> examples successfully.
>>>
>>> *Sahil Chaddha*
>>> Fourth Year Undergraduate Student
>>> Department of Metallurgy and Materials Engineering
>>> IIT Kharagpur, West Bengal - 721302
>>> +91-7872705997, LinkedIn
>>> <https://www.linkedin.com/in/sahil-chaddha-a0a376b7/> | Github
>>> <https://github.com/Sahil333>
>>>
