That was promised for delivery in April, right after the last AmiBroker
conference I attended, which was several years ago.

 

I would suspect AB could do it internally better than my relatively crude
external method.

 

But if you have a quad ... why wait?

 

  _____  

From: [email protected] [mailto:[EMAIL PROTECTED] On Behalf
Of Ronald Davis
Sent: Sunday, June 15, 2008 10:16 AM
To: [email protected]
Subject: Re: [amibroker] Multi Core Optimization, L2 Cache & Optimization
Run Times

 

When TJ modifies AmiBroker to take full advantage of two cores, can we
expect that a single instance of AmiBroker will produce speed improvements
similar to what you just posted?    Ron D

 

 



Fred Tonetti <[EMAIL PROTECTED]> wrote:

Given TJ's comments about:

 

-          The amount of memory utilized in processing symbols of data 

-          Whether or not this would fit in the L2 cache 

-          The effect it would have on optimizations when it didn't

 

I finally got around to running a little benchmark for Multi Core
Optimization using the program I wrote and posted (MCO), a new version of
which I'll be posting shortly.

 

These tests were run under the following conditions:

 

-          A less than state of the art laptop with 

o        Core 2 Duo 1.86 GHz processor

o        2 MB of L2 Cache

 

-          Watch Lists of symbols each of which 

o        Contains twice as many symbols as the previous list, i.e. 1, 2, 4,
8, 16, 32, 64, 128, 256

o        Contains symbols with ~5,000 bars of data each.

 

Given the above:

 

-          Each symbol should require 160,000 bytes i.e. ~5,000 bars * 32
bytes per bar

-          Loading more than 13 symbols (2 MB / 160,000 bytes per symbol)
should cause L2 cache misses to occur
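
The arithmetic behind those two figures can be checked with a short Python
sketch. It uses only the numbers given above; reading "2 MB" as 2 * 1024 *
1024 bytes is my assumption, and it ignores everything else that competes for
the L2 cache (code, indicator arrays, the OS), so treat the 13 as an upper
bound rather than a measured limit.

```python
# Figures from the post; 32 bytes per bar is the post's own estimate.
BARS_PER_SYMBOL = 5_000
BYTES_PER_BAR = 32
L2_CACHE_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" means 2 MiB

bytes_per_symbol = BARS_PER_SYMBOL * BYTES_PER_BAR
max_symbols_in_l2 = L2_CACHE_BYTES // bytes_per_symbol

# Watch list sizes used in the benchmark: 1, 2, 4, ..., 256
watch_list_sizes = [2 ** k for k in range(9)]

# True if the raw price data for the whole watch list fits in L2
fits = {n: n * bytes_per_symbol <= L2_CACHE_BYTES for n in watch_list_sizes}

print("bytes per symbol:", bytes_per_symbol)
print("max symbols in L2:", max_symbols_in_l2)
for n in watch_list_sizes:
    print(n, "symbols:", "fits" if fits[n] else "does not fit")
```

Under these assumptions the 8-symbol list is the last one that fits and the
16-symbol list is the first one that does not.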

 

Results:

 

-          See the attached data & chart

 

There are several interesting things I find regarding the results.

 

-          The "dent" in the curve, looking left to right, occurs right where
you'd think it would, between 8 symbols and 16 symbols, i.e. between the point
at which all data can be loaded to and accessed from the L2 cache and the
point where it no longer can.

-          The "dent" occurs in the same place running either one or two
instances of AB

-          The "dent" while clearly visible is hardly traumatic in terms of
run times

-          Running two instances of AB instead of one saves a consistent 40%
in run time, regardless of the number of symbols.

-          This is also in line with how much CPU is utilized when running
one instance of AB, which on the test machine is typically in the 54 - 60%
range.
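
Why a 54 - 60% CPU reading is in line with a 40% saving can be sketched in a
few lines of Python. This is a simplified model, not a measurement: it assumes
throughput scales with the share of total CPU in use, that two instances
together can reach 100%, and that the work splits evenly between them. The
function name is made up for illustration.

```python
def expected_savings(single_instance_cpu_share):
    """Fraction of run time saved by going from one instance to two.

    Simplified model (an assumption): one instance uses the given share
    of total CPU, two instances together use 100%, and throughput is
    proportional to CPU in use. Run time then shrinks to
    single_instance_cpu_share of the original.
    """
    return 1.0 - single_instance_cpu_share


# CPU utilization range reported for one instance on the test machine
low_share, high_share = 0.54, 0.60

savings_at_low = expected_savings(low_share)    # upper bound on savings
savings_at_high = expected_savings(high_share)  # lower bound on savings

observed_savings = 0.40                  # figure reported in the post
speedup = 1.0 / (1.0 - observed_savings)  # same figure as a speed ratio

print(f"model predicts {savings_at_high:.0%} to {savings_at_low:.0%} savings")
print(f"observed {observed_savings:.0%} savings = {speedup:.2f}x faster")
```

In this model the predicted saving runs from 40% to 46%, so the observed 40%
sits at the low end of the range, which is what you would expect once the
two instances start sharing the cache and memory bus.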

 

I have a new toy that I'll be trying these benchmarks on again shortly, i.e.
a dual Core 2 Duo quad at 3.0 GHz.

 

 
