Nick, you are making one wrong assumption: that the default optimization process runs separate optimizations per individual symbol. Instead, default optimization works at the portfolio level, which is why ALL symbols of the backtest universe need to be included. Optimized parameters for one symbol have no value if I do not know their impact at the portfolio level, i.e. on all the other symbols. Hope this makes sense.
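To make the portfolio-level point concrete, here is a minimal toy sketch in Python. Every name and number below is a made-up illustration, not AmiBroker code: the key shape is that each candidate parameter value is scored against all symbols at once, because capital is shared across them.

```python
# Minimal sketch of portfolio-level optimization (hypothetical toy
# code, not AmiBroker internals): each parameter value is scored
# against ALL symbols together, because shared capital couples them.

def signal(prices, period):
    """Toy signal: +1 if the last price is above the mean of the
    trailing `period` prices, else -1."""
    window = prices[-period:]
    return 1 if prices[-1] > sum(window) / len(window) else -1

def portfolio_score(universe, period, capital=100_000):
    """Score one parameter value on the whole universe at once.
    Capital is split across all symbols, so a single symbol's
    result is meaningless in isolation."""
    per_symbol = capital / len(universe)
    pnl = 0.0
    for prices in universe.values():
        direction = signal(prices, period)
        last_return = (prices[-1] - prices[-2]) / prices[-2]
        pnl += per_symbol * direction * last_return
    return pnl

def optimize(universe, periods):
    """Keep the parameter value with the best PORTFOLIO result."""
    return max(periods, key=lambda p: portfolio_score(universe, p))

universe = {
    "AAA": [5, 20, 10, 10, 12],
    "BBB": [20, 19, 18, 19, 17],
}
best = optimize(universe, [2, 3, 4])
```

Optimizing `signal`'s period per symbol could pick a different winner for each symbol, but those per-symbol winners say nothing about the combined result once shared capital couples the positions together, which is why the whole universe has to be evaluated at every parameter step.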
PS --- In [email protected], "nhall" <[EMAIL PROTECTED]> wrote: > > Hello Tomasz, > > Thanks for all you've done with AmiBroker. It is a great program. In > your response to your dual-core optimization comments, there's > something I've been wondering about. You said that the core cache > becomes a limitation for AFL because backtesting is so memory > intensive and the memory interface speed is fixed no matter the number > of cores on a processor. > > Currently, if I do an Optimization over a large watchlist, AB will run > the AFL with the given parameters for the entire watchlist all the way > until it has the overall system result for the entire watchlist, for > the given parameters. Then it alters the parameters according to the > Optimization and reruns the AFL for the entire watchlist again. This > continues for all the Optimization iterations. > > As I understand you, the problem with this is that the data for the > watchlist takes up a lot more memory than will fit in a core's cache, > so if you have multiple cores doing this processing simultaneously, > they will be fighting each other for memory bandwidth. > > Would it be possible to alter the order of events? If I'm running an > Optimization with 100 combinations I don't need to see the results > from each combination until the entire set has been processed. What if > the Optimization sequence of events was changed to run the AFL for > just one symbol from the watchlist, then alter the parameters and run > the AFL again for the *same symbol*, just with different parameters, > and continue this for all the combinations of the Optimization. After > signals have been generated for this particular symbol for all > parameter combinations, the signals can be stored in memory and then > it can move on to the next symbol. After all the symbols have been > processed, AB can do the backtesting for all the signals. 
>
> The advantage of this is that if I am Optimizing 100 combinations of
> parameters on a watchlist of 5000 symbols, hopefully 1 symbol can fit
> in the processor's cache and it can do 100 runs through the AFL,
> generating signals, before it has to fetch more data from the memory.
> This could provide some concurrency, as another core could do the same
> thing for a different symbol. The Optimization would be more efficient
> with more combinations.
>
> Does this make sense? I know that I glossed over details of how the
> cache really works, and also that I do not know the internals of
> AmiBroker and could be missing some critical information, so this
> idea may not work at all. But maybe it could help. Thanks for
> listening!
>
> Nick
>
> --- In [email protected], "Tomasz Janeczko" <groups@> wrote:
> >
> > Hello,
> >
> > It is a perfectly valid question.
> >
> > First, it does not really matter whether the process goes through
> > both ends, or sequentially with one core going through odd steps and
> > another through even steps, and at first look it seems like this
> > would give a significant speed-up.
> >
> > BUT... in the real world things are uglier than in theory.
> > I did lots of testing and profiling (measuring execution time of
> > code at function level), and dual-thread execution on a dual-core
> > processor is faster if and only if each core can execute accessing
> > data only from its own on-chip data cache.
> > This is unfortunately NOT the case for backtesting/optimization.
> > On-chip caches are usually limited to well below 1MB. Almost every
> > backtest requires way more than 1MB.
> > Now what happens if you run code that uses more memory?
> > BOTH cores need to access on-board (regular) RAM.
> > Both cores do this through a single memory interface that is SHARED
> > between the cores, accessing one memory that runs at a fixed speed
> > (no matter whether 1 or 8 cores access the memory, it cannot respond
> > quicker than the factory limit, and one core is fast enough to
> > actually need to WAIT for memory).
> >
> > Now if you run on 2 or more cores, they have to wait for the same,
> > single shared memory that runs at a constant pace, slow enough for
> > one core, not to mention more.
> >
> > The net result is that if you actually try to run something that
> > needs more than 1MB of data and does not fit into the individual
> > data cache, performance drops down to actually single-core. What's
> > more, it can run slower because of the additional overhead of
> > thread management.
> >
> > And it is not imagination or theory. I did actual code profiling and
> > I was surprised when I tested multi-threaded code. It works up to 2x
> > faster on dual core, BUT ONLY IF you don't access more than the size
> > of the on-chip per-core data cache, or your code needs way more
> > calculation than memory access.
> > If your code does a LOT of memory access (more than 1MB) and does it
> > QUICKLY (backtesting is extremely memory intensive and AFL scans
> > through memory like crazy), all advantages of running on multiple
> > cores are gone.
> >
> > BTW: what I did in this upgrade to speed up the
> > backtest/optimization was to reduce the COUNT of memory accesses to
> > the absolute minimum required. As it turns out, even a single CPU
> > core was waiting for memory.
> >
> > Best regards,
> > Tomasz Janeczko
> > amibroker.com
> > ----- Original Message -----
> > From: "tipequity" <tagroups@>
> > To: <[email protected]>
> > Sent: Friday, October 05, 2007 4:20 AM
> > Subject: [amibroker] Re: Optimization speed increase in 5.01
> >
> > > Tomasz, at the risk of sounding stupid, I am gonna run this idea by
> > > you.
> > > Since AB, during backtests and optimizations, works on a list of
> > > stocks, why not have one CPU core (on dual-core CPUs) work on
> > > symbols from the top of the list and another core work on symbols
> > > from the bottom of the list? Like burning a candle from both ends.
> > >
> > > Regards
> > >
> > > Kam
> > >
> > > --- In [email protected], "Tomasz Janeczko" <groups@>
> > > wrote:
> > >>
> > >> Hello,
> > >>
> > >> If you are running optimizations using the new version, I would
> > >> love to hear about the timings you get compared with the old one.
> > >> Note that optimization with the new version may run even 2 times
> > >> faster (or more), but the actual speed increase depends on how
> > >> complex the formula is, how often the system trades, and how
> > >> large the baskets are. Speed increases are larger with simpler
> > >> formulas, because AFL execution speed did NOT change. The only
> > >> thing that has changed is the collection of signals (1st backtest
> > >> phase) and the entire 2nd phase of the backtest.
> > >> As it turns out, when backtesting very simple formulas, the AFL
> > >> code execution is less than 20% of the total time; the rest is
> > >> collecting signals, sorting them according to score, and the 2nd
> > >> phase of the backtest (the actual trading simulation).
> > >> These latter areas were the subject of performance tweaking.
> > >>
> > >> Best regards,
> > >> Tomasz Janeczko
> > >> amibroker.com
> > >
> > > Please note that this group is for discussion between users only.
> > >
> > > To get support from AmiBroker please send an e-mail directly to
> > > SUPPORT {at} amibroker.com
> > >
> > > For NEW RELEASE ANNOUNCEMENTS and other news always check DEVLOG:
> > > http://www.amibroker.com/devlog/
> > >
> > > For other support material please check also:
> > > http://www.amibroker.com/support.html
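As a footnote on the reordering Nick describes in the quoted thread (keep one symbol's data hot in cache and sweep every parameter combination over it before moving to the next symbol): it can be sketched as below. All names here are invented for illustration and imply nothing about AmiBroker's internals, and per the portfolio-level point, this only reorders the signal-generation phase; the trade simulation would still need the collected signals from all symbols.

```python
# Two orderings of the same optimization work. The results are
# identical; only the memory-access pattern differs. In the first,
# each parameter step re-touches every symbol's data (cache-unfriendly
# for large watchlists); in the second, one symbol's price array stays
# hot while all parameter combinations run over it. Toy names only.

def run_signal(prices, param):
    """Toy stand-in for one AFL pass: count bars above a threshold."""
    return sum(1 for p in prices if p > param)

def param_outer(universe, params):
    """Current-style order: parameters in the outer loop,
    symbols in the inner loop."""
    results = {}
    for param in params:
        results[param] = {sym: run_signal(prices, param)
                          for sym, prices in universe.items()}
    return results

def symbol_outer(universe, params):
    """Proposed order: symbols outer, parameters inner, so each
    symbol's data is reused across all parameter steps."""
    results = {param: {} for param in params}
    for sym, prices in universe.items():
        for param in params:
            results[param][sym] = run_signal(prices, param)
    return results

universe = {"AAA": [1, 5, 3, 7], "BBB": [2, 2, 9, 4]}
params = [2, 4, 6]
# Both orderings produce the same signals for every (param, symbol).
assert param_outer(universe, params) == symbol_outer(universe, params)
```

The trade-off the thread debates is exactly this: the loop interchange changes nothing about the computed signals, only about which data each core touches and how often it must refill its cache from shared RAM.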
