I can't see the difference between a bad optimization result and a failed test on out-of-sample data.
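For anyone who wants to poke at this question, here is a minimal sketch in Python. It assumes a toy momentum rule and pure-noise "returns"; every name in it (`make_noise_returns`, `pnl`, `optimize`, the grid) is made up for illustration and none of it is JBookTrader code. The optimizer is allowed to search as much as it likes on the in-sample part, and the chosen parameters are then scored exactly once on data it never saw.

```python
import random

def make_noise_returns(n, seed):
    """Hypothetical data: n i.i.d. zero-mean returns with nothing to exploit."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def pnl(returns, lookback, direction):
    """Toy rule: follow (direction=+1) or fade (direction=-1) the sign of the
    sum of the previous `lookback` returns. Returns the summed profit."""
    total = 0.0
    for t in range(lookback, len(returns)):
        trend = sum(returns[t - lookback:t])
        position = direction if trend > 0 else -direction
        total += position * returns[t]
    return total

def optimize(returns, grid):
    """Exhaustive search: pick the parameter pair with the best in-sample profit."""
    return max(grid, key=lambda p: pnl(returns, *p))

data = make_noise_returns(2000, seed=42)
in_sample, out_of_sample = data[:1500], data[1500:]

# 40 lookbacks x 2 directions = 80 candidate parameter settings.
grid = [(lb, d) for lb in range(1, 41) for d in (1, -1)]

best = optimize(in_sample, grid)          # search freely, in-sample only
in_pnl = pnl(in_sample, *best)            # what the optimizer reports
out_pnl = pnl(out_of_sample, *best)       # scored once, never re-optimized

print("best (lookback, direction):", best)
print("in-sample profit per trade:     %.4f" % (in_pnl / (len(in_sample) - best[0])))
print("out-of-sample profit per trade: %.4f" % (out_pnl / (len(out_of_sample) - best[0])))
```

Because the grid contains both directions of every rule, the best in-sample profit can never be negative, even though the data is noise by construction. In this sketch the in-sample number only says how well the search fit the data it was given. The out-of-sample number is the only one that was not produced by the search.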
On Wed, Dec 8, 2010 at 12:53 PM, Astor <[email protected]> wrote:

> Thanks Klaus. Excellent description. Can you post your JBT extension for backtests? I would love to incorporate that into my program.
>
> ------------------------------
> *From:* Klaus <[email protected]>
> *To:* JBookTrader <[email protected]>
> *Sent:* Wed, December 8, 2010 2:48:13 PM
> *Subject:* [JBookTrader] Re: Dynamic Parameter Optimization
>
> Use of out-of-sample data is a must in all machine learning approaches (and this is actually what we do here). (So yes, I also take that approach; it is actually the reason why I built the JBT extension for batches of backtests.)
>
> Perhaps it can be better understood if one looks at the danger of presenting all the data. What can happen is that the strategy that is generated (and actually JBT does not support learning of parameters, but only of parameter settings) is sort of memorizing the presented data, and then provides good results there, but not beyond. The result does not generalize to further data.
>
> It is like with people: if you really want someone to understand something, you will teach him (presenting examples is one approach to teaching). But in the end you want to know whether he really understood (i.e., got the principles and is able to use them to solve new problems) or whether he just memorized. The only way to find this out is to show him something he has not seen before...
>
> That is what use of out-of-sample data means. Simulated forward trading is another way to achieve the same result, but then you are doing it in real time (i.e., you need weeks), whereas if you have more data, you can do this in minutes with testing.
>
> On 8 Dec., 15:15, Astor <[email protected]> wrote:
> > >people do a lot of silly things. A lot of them agree on these silly things
> >
> > True enough. This thing, though, has been the subject of so much academic research that it is probably not one of them.
> > Of course, just like you, everybody wants to use as much data as possible for model optimization, so a technique called "bootstrapping" is used, which is similar to the walk-forward optimization that I was proposing.
> >
> > ________________________________
> > From: ShaggsTheStud <[email protected]>
> > To: [email protected]
> > Sent: Tue, December 7, 2010 11:13:26 PM
> > Subject: Re: [JBookTrader] Re: Dynamic Parameter Optimization
> >
> > I dunno, people do a lot of silly things. A lot of them agree on these silly things.
> >
> > If you had extra data, why would you not use it to see the sensitivity of your parameters?
> >
> > On Tue, Dec 7, 2010 at 8:34 PM, Astor <[email protected]> wrote:
> >
> > Shaggs, I wish I could claim credit for this approach but it is not my approach. It is a standard statistical methodology used by every professional Quant shop, without exceptions. In institutional settings, you could never get any strategy past the Investment Committee without presenting strong out-of-sample results.
> >
> > >This is not to say that sensitivity to parameter changes, robustness checks, etc. need not be done. They still need to be done on in-sample data.
> >
> > ________________________________
> > From: ShaggsTheStud <[email protected]>
> > >To: [email protected]
> > >Sent: Tue, December 7, 2010 8:53:49 PM
> > >Subject: Re: [JBookTrader] Re: Dynamic Parameter Optimization
> >
> > >So the difference between our approaches is that in your approach you look at the 1 "optimal" case, and then you try it on some other set of data to verify it. Ok, that is interesting.
> >
> > >I much prefer to have one set of data, and look at the optimization map, and view the sensitivities to changes in the parameters.
> > >A robust strategy will not be picking out small "local minimums"; it will have a wide plateau of profitability, and have a good distribution on the trades graph where there are not large periods of drawdown.
> >
> > >I think my method is more robust, and would yield better real world performance than your method, but I can't prove it.
> >
> > >On Tue, Dec 7, 2010 at 5:52 PM, Astor <[email protected]> wrote:
> >
> > >>No, you do not do the same thing on both sets. You optimize and test different models on the in-sample set only. You can do it as much as is necessary to get good results. You test only the final model on your out-of-sample set, and you cannot change the model or re-optimize parameters and re-test on out-of-sample. Out-of-sample is like virginity, - once used it is gone.
> >
> > >>Results from out-of-sample are what you expect to get in real trading.
> >
> > ________________________________
> > From: ShaggsTheStud <[email protected]>
> > >>To: [email protected]
> > >>Sent: Tue, December 7, 2010 7:01:03 PM
> > >>Subject: Re: [JBookTrader] Re: Dynamic Parameter Optimization
> >
> > >>Doing the same thing on two different sets of data seems identical to doing it on one combined set of data. How is it different?
> >
> > >>On Tue, Dec 7, 2010 at 4:14 AM, Astor <[email protected]> wrote:
> >
> > >>The "in-sample" set is where you develop your model and optimize your parameters. Because optimization searches through a very large number of possible parameter values, it finds those values which best fit the data in this set. In a different data set, such as the one that may occur in real trading, these parameters may prove perfectly useless. In Quant research, such a situation is (derogatively) referred to as "datamining" or overfitting.
> > >>With enough model parameters and extensive optimization, I can get perfect accuracy predicting "in-sample" lottery winners. Of course that model will not work to predict the next "out-of-sample" lottery winner.
> >
> > >>>The "out-of-sample" set is a way to verify that the found model and its parameters are general instead of unique to the "in-sample" development set. Combining the two sets into a single set defeats that purpose.
> >
> > ________________________________
> > From: ShaggsTheStud <[email protected]>
> > >>>To: [email protected]
> > >>>Sent: Mon, December 6, 2010 10:21:59 PM
> > >>>Subject: Re: [JBookTrader] Re: Dynamic Parameter Optimization
> >
> > >>>That whole "in sample" and "out of sample" data thing strikes me as very odd. If it works on the in-sample and not the out-sample, it's going to have a bad distribution as a single set, so why not just combine it?
> >
> > >>>On Sun, Dec 5, 2010 at 5:56 AM, Astor <[email protected]> wrote:
> >
> > >>>>>we would be required to significantly shorten our optimization periods, thus incurring a penalty of standard error in our confidence bands.
> >
> > >>>>I understand your concern Eugene. However, it is important to recognize that in strategy development and validation there are two sets of data and two sets of confidence bands. The first set is used for strategy development and parameter optimization and is often called "in-sample". The second set is used only to validate the strategy performance and is called "out-of-sample".
> >
> > >>>>If the confidence interval is very broad (standard error is large) in the "in-sample" data, your strategy is not reliable and should not be used.
> >
> > >>>>If the "in-sample" results are good and have acceptable confidence intervals, the next step is validation of the strategy on "out-of-sample" data.
> > >>>>Because "out-of-sample" data has not been used for parameter optimization, the results obtained on this data are far more important than those from "in-sample". If the "out-of-sample" confidence interval is too broad, the validation results are not reliable and the strategy should not be used.
> >
> > >>>>It is extremely common that the available data set is too small to partition the data into in- and out-of-sample sets of adequate size. In financial research, the data set size is usually limited not by the data availability but by the data stationarity. To create valid sample sizes from small data, a technique called "leave-one-out" or "bootstrapping" or "jackknifing" is used. In those techniques the model is developed on the entire data except for one "holdout" point, then tested on this point. Then a different point is selected and the process is repeated. The validation results are obtained by combining the results of holdout points. Walk-forward optimization is an example of this technique and actually reduces standard error in the more important "out-of-sample" test.
> >
> > >>>>>better model would be the one which not only accounts for the supply/demand, but also for its changing elasticity over time
> >
> > >>>>That is definitely so and is often driven by seasonality as well as regime shifts. For futures, such as ES, the elasticity could drift in response to the proximity of the expiration date or as a result of changing market sentiment or increased trading in spot or in "dark pools", which impacts demand but is not reflected in bid/ask quotes.
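The walk-forward procedure described in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions: the momentum rule, the synthetic noise data, and every name and number (`pnl`, `walk_forward`, `max_step`, the window lengths) are hypothetical and are not JBookTrader's optimizer. The sketch also uses the variant in which each re-optimization is limited to the neighbourhood of the previous parameter value, so the parameter can only drift gradually.

```python
import random

def pnl(returns, lookback, begin, end):
    """Toy momentum rule scored on returns[begin:end]. Each signal uses only
    returns strictly before time t, so there is no look-ahead."""
    total = 0.0
    for t in range(max(begin, lookback), end):
        trend = sum(returns[t - lookback:t])
        total += (1.0 if trend > 0 else -1.0) * returns[t]
    return total

def walk_forward(returns, train_len, test_len, start_param, max_step, lo, hi):
    """Re-optimize on a rolling training window, then score the chosen
    parameter on the next, unseen window. The search is restricted to
    within max_step of the previously chosen parameter."""
    results = []
    param = start_param
    start = 0
    while start + train_len + test_len <= len(returns):
        split = start + train_len
        nearby = range(max(lo, param - max_step), min(hi, param + max_step) + 1)
        param = max(nearby, key=lambda lb: pnl(returns, lb, start, split))
        # Out-of-sample score for this fold: the data after `split` was not
        # used to choose `param`.
        results.append((param, pnl(returns, param, split, split + test_len)))
        start += test_len  # roll both windows forward
    return results

rng = random.Random(7)
returns = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # hypothetical data

folds = walk_forward(returns, train_len=300, test_len=100,
                     start_param=10, max_step=2, lo=1, hi=40)
total_oos = sum(score for _, score in folds)

for i, (param, score) in enumerate(folds):
    print("fold %d: lookback=%d, out-of-sample profit=%.3f" % (i, param, score))
print("combined out-of-sample profit: %.3f" % total_oos)
```

Only the per-fold test scores are combined into the final figure, which is what makes the whole run an out-of-sample result even though every data point after the first training window is eventually used for testing.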
> >
> > >>>>>the manner in which its parameters change over time is not intuitive at all
> >
> > >>>>If the value of the parameters themselves is not intuitive, then its change over time is very likely not to be intuitive as well and vice versa. Most non-intuitive parameter changes happen when the optimization surface is very flat or has many local maxima. Then a minor change in the data can put you into a very different local maximum and cause very unsettling parameter jumps. That is why restricting the optimization region to the vicinity of the most recent parameter values allows for parameters to only drift gradually. Then trends in parameter changes can be spotted and understood intuitively.
> >
> > ________________________________
> > From: nonlinear5 <[email protected]>
> > >>>>To: JBookTrader <[email protected]>
> > >>>>Sent: Sat, December 4, 2010 11:34:20 PM
> > >>>>Subject: [JBookTrader] Re: Dynamic Parameter Optimization
> >
> > >>>>> Eugene, your comment goes to the need to have a sufficiently large backtest database relative to the number of adjustable parameters, so that the results are statistically significant. How does that relate to potential non-stationarity of parameters?
> >
> > >>>>The non-stationarity of parameters is a problem, indeed. However, some things are more or less absolute. Think of the supply/demand relationship. If you can capture its essence in the strategy, that should work today, tomorrow, and 10 years in the future. Now, I do acknowledge that a better model would be the one which not only accounts for the supply/demand, but also for its changing elasticity over time. However, such a model would be more complex, more difficult to understand, and more time-consuming to test.
> > >>>>Perhaps more importantly, while the supply/demand law by itself is quite intuitive, the manner in which its parameters change over time is not intuitive at all. The best we can hope for in our walk-forward optimization is that whatever parameters were "optimal" in a recent period would still be optimal in the next period. For the sake of this hope, we would be required to significantly shorten our optimization periods, thus incurring a penalty of standard error in our confidence bands.
> >
> > >>>>--
> > >>>>You received this message because you are subscribed to the Google Groups "JBookTrader" group.
> > >>>>To post to this group, send email to [email protected].
> > >>>>To unsubscribe from this group, send email to [email protected].
> > >>>>For more options,
> >
> > ...

--
You received this message because you are subscribed to the Google Groups "JBookTrader" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/jbooktrader?hl=en.
