On 17/02/2014 3:32 AM, Ed Jaffe wrote:
> On 2/16/2014 1:10 AM, Binyamin Dissen wrote:
>> Say I have two words,
>>
>> CURRENT  DS    F
>> SUM      DS    F
>>
>> I want to add CURRENT to SUM, but most of the time CURRENT will be zero.
>> CURRENT and SUM are not adjacent (different data lines)
>
> If only a single unit of work and both fields are in the same 256-byte
> cache line, I predict it will be faster just to add them - no branch.
> However, if multiple units of work simultaneously executing on
> different CPs will be doing this operation, then I would consider the
> branch technique to avoid cache "thrash."
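
For concreteness, the two techniques being compared can be sketched
roughly as follows. This is C rather than assembler, and the function
names add_always and add_if_nonzero are made up for illustration. Which
one is actually faster depends on the machine and on how the cache lines
are shared, so treat it as a sketch of the idea, not a measurement.

```c
#include <stdint.h>

/* Variant 1: always add. Load, add, store; no branch.
 * The store touches SUM's cache line even when CURRENT is zero. */
static inline void add_always(int32_t *sum, const int32_t *current)
{
    *sum += *current;
}

/* Variant 2: the "branch technique". Test CURRENT first and skip the
 * update when it is zero, so SUM's cache line is not written (and need
 * not be taken exclusive) in the common case. */
static inline void add_if_nonzero(int32_t *sum, const int32_t *current)
{
    int32_t c = *current;
    if (c != 0)
        *sum += c;
}
```

Both variants leave SUM with the same value; the only difference is
whether SUM is stored to when CURRENT is zero.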
It's interesting you should mention that because I was looking at a
concurrent queue implementation for x86 just the other day, and all the
node data structures were padded out to a full cache line. That seemed
like an enormous waste of space, but I suppose it's a space-for-time
trade-off. If a consumer were popping while a producer was pushing and
the queue was small, unpadded nodes could share a cache line and that
would thrash the cache. Phew, concurrent programming is tricky!