Nick, you are making one wrong assumption: that the default optimization process runs separate optimizations per individual symbol. Instead, default optimization works at the portfolio level, which is why ALL symbols of the backtest universe need to be included. Optimized parameters for one symbol have no value if I do not know their impact at the portfolio level, i.e. on all the other symbols. Hope this makes sense.
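To make the portfolio-level point concrete, here is a minimal toy sketch in Python. Every name and number below is a made-up illustration, not AmiBroker code: the key shape is that each candidate parameter value is scored against all symbols at once, because capital is shared across them.

```python
# Minimal sketch of portfolio-level optimization (hypothetical toy
# code, not AmiBroker internals): each parameter value is scored
# against ALL symbols together, because shared capital couples them.

def signal(prices, period):
    """Toy signal: +1 if the last price is above the mean of the
    trailing `period` prices, else -1."""
    window = prices[-period:]
    return 1 if prices[-1] > sum(window) / len(window) else -1

def portfolio_score(universe, period, capital=100_000):
    """Score one parameter value on the whole universe at once.
    Capital is split across all symbols, so a single symbol's
    result is meaningless in isolation."""
    per_symbol = capital / len(universe)
    pnl = 0.0
    for prices in universe.values():
        direction = signal(prices, period)
        last_return = (prices[-1] - prices[-2]) / prices[-2]
        pnl += per_symbol * direction * last_return
    return pnl

def optimize(universe, periods):
    """Keep the parameter value with the best PORTFOLIO result."""
    return max(periods, key=lambda p: portfolio_score(universe, p))

universe = {
    "AAA": [5, 20, 10, 10, 12],
    "BBB": [20, 19, 18, 19, 17],
}
best = optimize(universe, [2, 3, 4])
```

Optimizing `signal`'s period per symbol could pick a different winner for each symbol, but those per-symbol winners say nothing about the combined result once shared capital couples the positions together, which is why the whole universe has to be evaluated at every parameter step.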
PS --- In [email protected], "nhall" <[EMAIL PROTECTED]> wrote: > > Hello Tomasz, > > Thanks for all you've done with AmiBroker. It is a great program. In > your response to your dual-core optimization comments, there's > something I've been wondering about. You said that the core cache > becomes a limitation for AFL because backtesting is so memory > intensive and the memory interface speed is fixed no matter the number > of cores on a processor. > > Currently, if I do an Optimization over a large watchlist, AB will run > the AFL with the given parameters for the entire watchlist all the way > until it has the overall system result for the entire watchlist, for > the given parameters. Then it alters the parameters according to the > Optimization and reruns the AFL for the entire watchlist again. This > continues for all the Optimization iterations. > > As I understand you, the problem with this is that the data for the > watchlist takes up a lot more memory than will fit in a core's cache, > so if you have multiple cores doing this processing simultaneously, > they will be fighting each other for memory bandwidth. > > Would it be possible to alter the order of events? If I'm running an > Optimization with 100 combinations I don't need to see the results > from each combination until the entire set has been processed. What if > the Optimization sequence of events was changed to run the AFL for > just one symbol from the watchlist, then alter the parameters and run > the AFL again for the *same symbol*, just with different parameters, > and continue this for all the combinations of the Optimization. After > signals have been generated for this particular symbol for all > parameter combinations, the signals can be stored in memory and then > it can move on to the next symbol. After all the symbols have been > processed, AB can do the backtesting for all the signals. 
>
> The advantage of this is that if I am Optimizing 100 combinations of
> parameters on a watchlist of 5000 symbols, hopefully 1 symbol can fit
> in the processor's cache and it can do 100 runs through the AFL,
> generating signals, before it has to fetch more data from the memory.
> This could provide some concurrency, as another core could do the same
> thing for a different symbol. The Optimization would be more efficient
> with more combinations.
>
> Does this make sense? I know that I glossed over details of how the
> cache really works, and also that I do not know the internals of
> AmiBroker and could be missing some critical information, so this
> idea may not work at all. But maybe it could help. Thanks for
> listening!
>
> Nick
>
> --- In [email protected], "Tomasz Janeczko" <groups@> wrote:
> >
> > Hello,
> >
> > It is a perfectly valid question.
> >
> > First, it does not really matter whether the process goes through
> > both ends, or sequentially with one core going through odd steps and
> > another through even steps, and at first look it seems like this
> > would give a significant speed-up.
> >
> > BUT... in the real world things are uglier than in theory.
> > I did lots of testing and profiling (measuring execution time of
> > code at function level), and dual-thread execution on a dual-core
> > processor is faster if and only if each core can execute accessing
> > data only from its own on-chip data cache.
> > This is unfortunately NOT the case for backtesting/optimization.
> > On-chip caches are usually limited to well below 1MB. Almost every
> > backtest requires way more than 1MB.
> > Now what happens if you run code that uses more memory?
> > BOTH cores need to access on-board (regular) RAM.
> > Both cores do this through a single memory interface that is SHARED
> > between the cores, accessing one memory that runs at a fixed speed
> > (no matter whether 1 or 8 cores access the memory, it cannot respond
> > quicker than the factory limit, and one core is fast enough to
> > actually need to WAIT for memory).
> >
> > Now if you run on 2 or more cores, they have to wait for the same,
> > single shared memory that runs at a constant pace, slow enough for
> > one core, not to mention more.
> >
> > The net result is that if you actually try to run something that
> > needs more than 1MB of data and does not fit into the individual
> > data cache, performance drops down to actually single-core. What's
> > more, it can run slower because of the additional overhead of
> > thread management.
> >
> > And it is not imagination or theory. I did actual code profiling and
> > I was surprised when I tested multi-threaded code. It works up to 2x
> > faster on dual core, BUT ONLY IF you don't access more than the size
> > of the on-chip per-core data cache, or your code needs way more
> > calculation than memory access.
> > If your code does a LOT of memory access (more than 1MB) and does it
> > QUICKLY (backtesting is extremely memory intensive and AFL scans
> > through memory like crazy), all advantages of running on multiple
> > cores are gone.
> >
> > BTW: what I did in this upgrade to speed up the
> > backtest/optimization was to reduce the COUNT of memory accesses to
> > the absolute minimum required. As it turns out, even a single CPU
> > core was waiting for memory.
> >
> > Best regards,
> > Tomasz Janeczko
> > amibroker.com
> > ----- Original Message -----
> > From: "tipequity" <tagroups@>
> > To: <[email protected]>
> > Sent: Friday, October 05, 2007 4:20 AM
> > Subject: [amibroker] Re: Optimization speed increase in 5.01
> >
> > > Tomasz, at the risk of sounding stupid, I am gonna run this idea by
> > > you.
> > > Since AB, during backtests and optimizations, works on a list of
> > > stocks, why not have one CPU core (on dual-core CPUs) work on
> > > symbols from the top of the list and another core work on symbols
> > > from the bottom of the list? Like burning a candle from both ends.
> > >
> > > Regards
> > >
> > > Kam
> > >
> > > --- In [email protected], "Tomasz Janeczko" <groups@>
> > > wrote:
> > >>
> > >> Hello,
> > >>
> > >> If you are running optimizations using the new version, I would
> > >> love to hear about the timings you get compared with the old one.
> > >> Note that optimization with the new version may run even 2 times
> > >> faster (or more), but the actual speed increase depends on how
> > >> complex the formula is, how often the system trades, and how
> > >> large the baskets are. Speed increases are larger with simpler
> > >> formulas, because AFL execution speed did NOT change. The only
> > >> thing that has changed is the collection of signals (1st backtest
> > >> phase) and the entire 2nd phase of the backtest.
> > >> As it turns out, when backtesting very simple formulas, the AFL
> > >> code execution is less than 20% of the total time; the rest is
> > >> collecting signals, sorting them according to score, and the 2nd
> > >> phase of the backtest (the actual trading simulation).
> > >> These latter areas were the subject of performance tweaking.
> > >>
> > >> Best regards,
> > >> Tomasz Janeczko
> > >> amibroker.com
> > >
> > > Please note that this group is for discussion between users only.
> > >
> > > To get support from AmiBroker please send an e-mail directly to
> > > SUPPORT {at} amibroker.com
> > >
> > > For NEW RELEASE ANNOUNCEMENTS and other news always check DEVLOG:
> > > http://www.amibroker.com/devlog/
> > >
> > > For other support material please check also:
> > > http://www.amibroker.com/support.html
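As a footnote on the reordering Nick describes in the quoted thread (keep one symbol's data hot in cache and sweep every parameter combination over it before moving to the next symbol): it can be sketched as below. All names here are invented for illustration and imply nothing about AmiBroker's internals, and per the portfolio-level point, this only reorders the signal-generation phase; the trade simulation would still need the collected signals from all symbols.

```python
# Two orderings of the same optimization work. The results are
# identical; only the memory-access pattern differs. In the first,
# each parameter step re-touches every symbol's data (cache-unfriendly
# for large watchlists); in the second, one symbol's price array stays
# hot while all parameter combinations run over it. Toy names only.

def run_signal(prices, param):
    """Toy stand-in for one AFL pass: count bars above a threshold."""
    return sum(1 for p in prices if p > param)

def param_outer(universe, params):
    """Current-style order: parameters in the outer loop,
    symbols in the inner loop."""
    results = {}
    for param in params:
        results[param] = {sym: run_signal(prices, param)
                          for sym, prices in universe.items()}
    return results

def symbol_outer(universe, params):
    """Proposed order: symbols outer, parameters inner, so each
    symbol's data is reused across all parameter steps."""
    results = {param: {} for param in params}
    for sym, prices in universe.items():
        for param in params:
            results[param][sym] = run_signal(prices, param)
    return results

universe = {"AAA": [1, 5, 3, 7], "BBB": [2, 2, 9, 4]}
params = [2, 4, 6]
# Both orderings produce the same signals for every (param, symbol).
assert param_outer(universe, params) == symbol_outer(universe, params)
```

The trade-off the thread debates is exactly this: the loop interchange changes nothing about the computed signals, only about which data each core touches and how often it must refill its cache from shared RAM.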
