No, this is not discouraging, but it does underscore the reality that it takes real effort (measured in CPU time and man-hours) to get a Strategy that works for "real reasons", not just some over-optimized junk fitted to too small a dataset.
Thanks for your insight, it just helps me confirm that the hours I'm spending are in the right direction based on your experience.

Marcus

On Tue, Jan 15, 2013 at 5:01 PM, Eugene Kononov <[email protected]> wrote:

> Marcus,
>
> My optimization workflow is the same as yours. Here are some comments:
>
> 1. If the backtest chart takes too long to render, it's probably because
> you set the "bar size" at a high resolution. On my machine, rendering the
> 2-year chart with the "1-hr" bar size takes a few seconds. If you'd like a
> higher resolution, you can specify a specific backtesting period and then
> create the chart with that.
>
> 2. With regards to the speed of the optimizer, I profiled and optimized
> the hell out of it. If I run it through a profiler now, the only
> bottlenecks it shows are evenly distributed in the low-level JDK
> libraries. In other words, I don't think there is anything left to
> optimize. Both the "divide-and-conquer" and the "brute force" optimizers
> engage all CPU cores. On my mid-level machine (i7, 4 cores, Windows
> 64-bit, 8GB RAM), the optimizer speed for typical strategies is about 150
> million samples per second. That is to say, if my data file has 150
> million 1-second samples and I optimize 60 strategies, it would take 60
> seconds. Just like you, I do overnight runs to optimize millions of
> strategies. Yes, that's a lot of work, and a lot of trial and error. One
> thing worth experimenting with is the "strategies per processor" setting
> in the advanced optimization dialog. The default is 50, but on specific
> hardware and OS, some other number is likely to improve the throughput.
> Some people have also explored parallel optimization using GridGain; JBT
> optimization can be easily distributed to different machines. All in all,
> the historical market depth data files contain a *lot* of data, so a lot
> of computational power is required to crunch through all the numbers.
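Eugene's description of the brute-force pass, with all cores engaged and work handed out in batches of "strategies per processor", can be sketched roughly as follows. This is an illustrative Python sketch, not JBT's actual Java code; `evaluate` and the toy scoring surface are hypothetical stand-ins for replaying one parameter set over the history:

```python
import itertools
import os
from concurrent.futures import ThreadPoolExecutor

def evaluate(params):
    # Hypothetical stand-in: replay one parameter set over the whole
    # history and return a performance score (e.g. PI).
    fast, slow = params
    return -(fast - 5) ** 2 - (slow - 20) ** 2  # toy surface peaking at (5, 20)

def brute_force(param_ranges, batch_size=50):
    """Score every combination, handing batches of `batch_size`
    combinations (the "strategies per processor" idea) to a worker pool."""
    combos = list(itertools.product(*param_ranges))
    batches = [combos[i:i + batch_size]
               for i in range(0, len(combos), batch_size)]
    scores = {}
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
        batch_results = pool.map(lambda b: [evaluate(p) for p in b], batches)
        for batch, batch_scores in zip(batches, batch_results):
            scores.update(zip(batch, batch_scores))
    return max(scores, key=scores.get)

best = brute_force([range(1, 11), range(10, 31)])  # best == (5, 20)
```

In Java the analogous batching would go through an ExecutorService; the point of the tunable batch size is only that throughput depends on how much useful work each task does relative to scheduling overhead, which is why the best value varies by hardware and OS.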
> I've done so much optimization with JBT over the years that I can tell
> what the optimizer is doing just by listening to the fan speed. I am not
> kidding!
>
> 3. Optimization itself is a delicate process. The dangers of
> over-optimization (or over-fitting) are well publicized. I look for
> several things in the optimization results:
> a) What you call a "best island" (I call it a "high plateau") must be
> broad enough in all dimensions. That is, instead of a random "spike" in
> performance, it must be a wide area of elevated performance. Try to set
> the parameter ranges to 15% away from the center of the plateau. How much
> degradation in performance is there as you move away from the center?
> b) The number of trades must be high enough. It's very easy to get
> infinitely high performance metrics (such as PF and PI) if the number of
> trades is low. This is just a game of permutations. The larger the number
> of trades, the more significant the results. In fact, I think this
> statistical significance scales as the square root of the number of
> trades.
> c) The data file must cover a sufficiently long period of time. I prefer
> at least 1 year.
> d) The number of parameters must be low. I prefer below 5.
> e) Almost exclusively, I use PI as my optimization selection criterion. I
> believe this metric is superior to the other ones in JBT.
>
> 4. After the optimization job is completed, I select the optimal set of
> parameters and run a backtest with that set. Next, I pop up the chart and
> look at all the *losing* trades. Here is where I attempt to improve the
> candidate strategy. Is there a commonality between all the losing trades?
> Is there a single pre-condition in my strategy which would have prevented
> these losers? Next, I add a precondition and run the next optimization
> job. And so it goes.
>
> 5. Oftentimes, there are too many indicators and preconditions.
> The strategy performs well, but what contributes to the good performance?
> Here is where I often do factor analysis: eliminate an indicator or a
> condition, and run the optimization job again. By how much did the
> performance degrade? If it's only by a little, then this particular
> indicator or condition is probably worthless.
>
> 6. After I identify a candidate for trading, I typically forward test it,
> to make sure I have not missed anything gross.
>
> 7. Now the candidate is ready for trading. You think you are done? Nope!
> Now comes the part where you need to continuously monitor the live
> performance over time, and decide when to re-optimize and when to
> discontinue the strategy, in case the market has shifted away from the
> mode for which the strategy was well suited.
>
> Hope this is not very discouraging. I am not aware of an easy way around
> it.
>
> On Tue, Jan 15, 2013 at 6:01 PM, Marcus Williford <[email protected]> wrote:
>
>> I wanted to see if anyone would post additional advice on optimization
>> workflows that work, and perhaps help me explore ways to improve my own.
>> Here are some things that work for me so far, but I'm not yet satisfied
>> with the efficiency.
>>
>> What works for me:
>> - I come up with a model theory in my head about what might improve my
>> trades. I do this by observing prior trades (backtest utility), and I
>> try to see obvious things the Strategy didn't get right. For this, I
>> love the "chart" feature of the backtest data.
>> - I then code some changes and watch the backtest again, knowing that it
>> isn't optimized, but I look for some improvement with any algo changes.
>> - Then I run divide-and-conquer optimization across a broad range of
>> min/max. But I never can tell at what scale I want to operate. How fast
>> is a fast EMA period, etc.? I'm always a bit unsure if even my broad
>> range is broad enough, but I eventually settle on something.
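The broad-range-then-narrow workflow described above (divide-and-conquer over wide min/max bounds) amounts to a coarse-to-fine search. A minimal one-dimensional sketch of the idea, with a toy objective standing in for a real backtest score:

```python
def coarse_to_fine(objective, lo, hi, points=5, rounds=3):
    """Sample a coarse grid over [lo, hi], then repeatedly zoom the
    range in around the best sample. A 1-D sketch of the idea behind a
    divide-and-conquer parameter search; real optimizers work over
    several dimensions at once."""
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = max(grid, key=objective)
        # Narrow the search range to the neighborhood of the best point.
        lo, hi = max(lo, best - step), min(hi, best + step)
    return best

# Toy objective with a single peak at x = 37; a few rounds of
# zooming land close to it without exhaustively scanning [0, 100].
peak = coarse_to_fine(lambda x: -(x - 37) ** 2, 0, 100)
```

The trade-off is the familiar one: a coarse grid can step over a narrow spike entirely, which is one more reason to prefer a broad plateau over a spike in the results.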
>> - If the divide-and-conquer method starts to look promising, I note
>> (and hope to see) hot spots in the heat map. Unfortunately, it is
>> usually too sparse to really see and understand what is going on. So, I
>> take a guess at some ranges for each param.
>> - Finally, if all went well, I set up an overnight run (usually around
>> 3,000,000+ "combinations"), which takes forever on my blazing-fast new
>> MacBook retina with 16GB RAM, etc. Usually 20 hours with the fan on
>> high. Maybe I need AppleCare after all.
>> - Now maybe I have something; I take the best island area from this
>> process, back-test it, and start to study trades again. Mostly, this
>> puts me back into a loop of trying to make another Strategy change to
>> improve it further.
>>
>> So, this is the process I came up with so far. It works, I get
>> improvements, but it is very slow going.
>>
>> What might need improvement in either my process or JBookTrader:
>> - I wish to see good/bad trades fast! The chart graphs everything, and
>> hence takes forever. I am thinking of a "trade view" chart, which shows
>> only the area around a trade in great detail, then advances quickly to
>> the next trade. This would speed up my review of trades.
>> - The optimizer: figuring out why it takes so long. I know, 3,000,000
>> combinations is a lot of work. But I feel like maybe studying this in a
>> profiler and trying to make some improvements. Never mind the wild
>> ideas I have about farming out work to a dynamic cluster of EC2 servers
>> using Hadoop.
>> - Indicator graphing. I have placed indicators I don't even use into
>> Strategies, just to see them in the graph and get a feel for whether
>> they make sense. Maybe people use some external program for this? If
>> so, how do you tell if your indicator logic is correct? For now, I
>> dirty up my strategies with unused indicators, just to see them on the
>> graphs.
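On checking indicator logic outside the strategy code: one lightweight option is to re-implement the indicator in a few lines of a scripting language and run it on a small, hand-checkable series. A sketch for a standard EMA (illustrative only; JBT's own indicator classes are Java, and this assumes the common smoothing factor 2/(period+1)):

```python
def ema(prices, period):
    """Exponential moving average with smoothing alpha = 2 / (period + 1),
    seeded with the first price."""
    alpha = 2.0 / (period + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out

# Hand-checkable series: with period=3, alpha is exactly 0.5, so every
# step is just the midpoint of the new price and the previous EMA value.
fast = ema([10.0, 11.0, 12.0, 11.0, 10.0], 3)
# fast == [10.0, 10.5, 11.25, 11.125, 10.5625]
```

From here the series could be plotted with any external charting tool and eyeballed against the JBT chart, instead of embedding unused indicators in a Strategy just to see them graphed.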
>> - I considered trying to wire up ScalaLab (and learn it) to my Java
>> strategies, so I can play with more advanced analysis without rolling
>> my own code for math routines. For example, write a ScalaLab adapter to
>> run my strategies, and use it like MATLAB to make improvements. I could
>> then use the same Java code for both the math package and for trading.
>> A cool idea, but I only got as far as installing ScalaLab, never mind
>> learning it.
>> - Move this onto a giant Linux box in my home, and offload the
>> optimization for now.
>>
>> As you can see, I'm all over the place; any advice? Did anyone else
>> have any of these ideas? Maybe you did, and already executed on them?
>> Since I have so many thoughts, each of which requires a lot of work,
>> I'm seeking some feedback from people who have been doing this for
>> years. I'm a newbie. Maybe I just need to learn how to use what we have
>> better?
>>
>> Meanwhile, running a giant CL optimization; maybe I'll have something
>> to trade soon.
>>
>> Marcus
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "JBookTrader" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/jbooktrader?hl=en.
