Re: [shogun] ISSUE #3847

Heiko Strathmann Tue, 08 Aug 2017 09:57:23 -0700

The splitting for the time series could be

- deterministic, that is, with a training window that grows into the past:
the train set is everything up to point t, the test set is everything from t
onwards.
- there could be variants of this that limit the sizes of the folds
- shuffling "blocks" that are approximately independent (i.e. the time
series forgets its past after t observations); this should re-use the
existing code for shuffling
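
To make the three options concrete, here is a minimal sketch in plain Python.
It is not Shogun code and does not use Shogun's API: the function names
(expanding_window_split, shuffled_blocks) and the parameters max_train,
max_test and block_size are hypothetical, chosen only for illustration.

```python
import random


def expanding_window_split(n, t, max_train=None, max_test=None):
    """Split indices 0..n-1 at time point t (hypothetical helper).

    Train is everything before t, test is everything from t onwards.
    max_train / max_test optionally limit the fold sizes.
    """
    if not 0 < t < n:
        raise ValueError("t must satisfy 0 < t < n")
    train_start = 0 if max_train is None else max(0, t - max_train)
    test_end = n if max_test is None else min(n, t + max_test)
    return list(range(train_start, t)), list(range(t, test_end))


def shuffled_blocks(n, block_size, seed=None):
    """Cut 0..n-1 into contiguous blocks and shuffle the block order.

    Order inside each block is preserved; only whole blocks move. This
    assumes blocks of length block_size are approximately independent.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = [list(range(start, min(start + block_size, n)))
              for start in range(0, n, block_size)]
    random.Random(seed).shuffle(blocks)
    return blocks


if __name__ == "__main__":
    # Deterministic: growing train window, test is the remainder.
    for t in (4, 6, 8):
        print(expanding_window_split(10, t))
    # Variant with limited fold sizes.
    print(expanding_window_split(10, 6, max_train=3, max_test=2))
    # Block shuffling with a fixed seed for reproducibility.
    print(shuffled_blocks(10, 4, seed=0))
```

Note that in this sketch the train and test indices never overlap, and only
the block-shuffling variant involves randomness, so only that one would
benefit from repeated runs.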


2017-08-08 16:34 GMT+01:00 sahil chaddha <[email protected]>:

> I read the implementation of cross-validation splitting and
> cross-validation. build_subset() implements a random process to build
> the subsets, so it makes sense to run evaluate_one_run() several times
> in cross-validation. Does the time-series split also require a random
> process? Also, generate_subset_indices() acts like the test set and
> generate_subset_inverse() like the train set. So, to respect the time
> order, the subsets are bound to have a non-empty intersection. Am I right?
>
> *Sahil Chaddha*
> Third Year Undergraduate Student
> Department of Metallurgy and Materials Engineering
> IIT Kharagpur, West Bengal - 721302
> +91-7872705997 <+91%2078727%2005997>,  LinkedIn
> <https://www.linkedin.com/in/sahil-chaddha-a0a376b7/> | Github
> <https://github.com/Sahil333>
>
> On Mon, Aug 7, 2017 at 1:56 PM, Fernando J. Iglesias García <
> [email protected]> wrote:
>
>> Welcome Sahil!
>>
>> Great that you have already successfully set up your dev environment.
>>
>> For this particular task, I think it will be useful to get familiar with
>> Shogun's cross-validation. You could start by checking the related examples
>> (like this one
>> <https://github.com/shogun-toolbox/shogun/blob/develop/examples/undocumented/libshogun/splitting_standard_crossvalidation.cpp>).
>> Then, you can get into understanding how the splitting strategy is
>> implemented internally (you can find the implementation by following the
>> appropriate include file from the example). You will also need to
>> understand details about the time-series splitting strategy, the links in
>> the github issue will be useful for this.
>>
>> After that, you should be ready to start implementing the time-series
>> splitting. Let us know how it goes.
>>
>> Hope that helps!
>>
>> Cheers,
>> Fernando.
>>
>> On 5 August 2017 at 20:29, sahil chaddha <[email protected]> wrote:
>>
>>> Ma'am/Sir,
>>>
>>>    I want to work on
>>> https://github.com/shogun-toolbox/shogun/issues/3847, but I have no idea
>>> where to start. I am new to such big projects. Can anyone guide me through
>>> it? I have already set up the environment and run the tests and examples
>>> successfully.
>>>
>>> *Sahil Chaddha*
>>> Fourth Year Undergraduate Student
>>> Department of Metallurgy and Materials Engineering
>>> IIT Kharagpur, West Bengal - 721302
>>> +91-7872705997 <+91%2078727%2005997>,  LinkedIn
>>> <https://www.linkedin.com/in/sahil-chaddha-a0a376b7/> | Github
>>> <https://github.com/Sahil333>
>>>
>>
>>
>
