On 17/02/2014 3:32 AM, Ed Jaffe wrote:
On 2/16/2014 1:10 AM, Binyamin Dissen wrote:
Say I have two words,

              CURRENT  DS    F
              SUM      DS    F

I want to add CURRENT to SUM, but most of the time CURRENT will be zero.
CURRENT and SUM are not adjacent (they are in different data lines).

If there is only a single unit of work and both fields are in the same 256-byte cache line, I predict it will be faster just to add them, with no branch. However, if multiple units of work executing simultaneously on different CPs will be doing this operation, then I would consider the branch technique to avoid cache "thrash."
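For anyone following along, here is a rough C sketch of the two alternatives being compared. It is only an illustration under my own assumptions: the function names are made up, and a compiler or a hand-coded assembler sequence may behave differently. The point is that the branching form never stores into SUM when CURRENT is zero, so the line holding SUM does not have to be taken exclusive by that CP.

```c
#include <stdint.h>

/* Unconditional variant: always loads CURRENT, adds it, and stores SUM.
 * The store happens even when CURRENT is zero, so the cache line that
 * holds SUM is written on every call. */
void add_always(int32_t *sum, const int32_t *current)
{
    *sum += *current;
}

/* Branching variant: tests CURRENT first and skips the update when it
 * is zero. In the common case only CURRENT is read and SUM is left
 * untouched. (Hypothetical helper name, not from the original post.) */
void add_if_nonzero(int32_t *sum, const int32_t *current)
{
    int32_t c = *current;
    if (c != 0)
        *sum += c;
}
```

Both functions leave SUM with the same value; they differ only in whether a store is issued when CURRENT is zero, which is what matters once several CPs share the line.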

It's interesting you should mention that, because I was looking at a concurrent queue implementation for x86 just the other day, and all the node data structures were padded to fill a cache line. That seemed like an enormous waste of space, but I suppose it's a space-for-time trade-off. Without the padding, a consumer popping while a producer was pushing on a small queue would thrash the cache. Phew, concurrent programming is tricky!
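A minimal sketch of that kind of padding in C11, as I understand the idea. The 64-byte line size is an assumption (common on x86, whereas the z line mentioned above is 256 bytes), and the struct and field names are hypothetical, not taken from the implementation I was reading.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed line size; adjust for the target CPU */

/* Aligning the first member to the line size forces the whole struct to
 * that alignment, and the compiler pads its size up to a full line. */
struct node {
    alignas(CACHE_LINE) struct node *next;
    void *payload;
};

/* Keep the consumer's and producer's pointers on separate lines so a
 * pop and a push do not invalidate each other's cached line. */
struct queue {
    alignas(CACHE_LINE) struct node *head;   /* touched by the consumer */
    alignas(CACHE_LINE) struct node *tail;   /* touched by the producer */
};

static_assert(sizeof(struct node) == CACHE_LINE,
              "each node occupies exactly one cache line");
static_assert(alignof(struct node) == CACHE_LINE,
              "nodes start on a cache-line boundary");
```

With two pointers per node, most of each line is padding, which is the waste of space I was grumbling about.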
