On 9 Feb 2014, at 15:53, Greg Parker gpar...@apple.com wrote:
On Feb 9, 2014, at 12:19 AM, Gerriet M. Denkmann gerr...@mdenkmann.de wrote:
The real app (which I am trying to optimise) has actually two loops: one is
counting, the other one is modifying. Which seems to be good news.
But I
On Sat, Feb 8, 2014, at 11:35 PM, Gerriet M. Denkmann wrote:
But using two threads takes much longer than just using one!
How could this happen?
Because now you've got two CPUs fighting over one cache line?
Optimization is hard. Throwing more threads at it is not a panacea.
--Kyle Sluder
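
A minimal sketch of the cache-line point above. The thread does not show the original loop, so everything here is an assumption for illustration: the names `count_nonzero_parallel`, `count_chunk`, `padded_count_t` and `chunk_t` are hypothetical, the 64-byte cache line size is assumed, and plain pthreads stand in for dispatch_apply so that the example builds without libdispatch. Each thread counts into a local variable and stores its result once, into a slot whose counter is a full cache line away from its neighbours' counters, so the CPUs do not keep invalidating each other's lines.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_THREADS 4   /* assumed worker count, for illustration only */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One result slot per thread. The padding keeps the count fields of
   neighbouring slots CACHE_LINE bytes apart, so no two counters can
   land in the same cache line. */
typedef struct {
    size_t count;
    char pad[CACHE_LINE - sizeof(size_t)];
} padded_count_t;

/* The half-open byte range [start, end) handled by one thread. */
typedef struct {
    const uint8_t *bytes;
    size_t start;
    size_t end;
    padded_count_t *slot;
} chunk_t;

/* Count the non-zero bytes of one chunk. The running total lives in a
   local variable; shared memory is written exactly once, at the end. */
static void *count_chunk(void *arg) {
    chunk_t *c = arg;
    size_t local = 0;
    for (size_t i = c->start; i < c->end; i++) {
        if (c->bytes[i] != 0) {
            local++;
        }
    }
    c->slot->count = local;
    return NULL;
}

/* Split the array into NUM_THREADS contiguous chunks, count each chunk
   on its own thread, then add up the per-thread results. */
size_t count_nonzero_parallel(const uint8_t *bytes, size_t n) {
    pthread_t threads[NUM_THREADS];
    int started[NUM_THREADS];
    chunk_t chunks[NUM_THREADS];
    padded_count_t slots[NUM_THREADS];
    size_t per_thread = n / NUM_THREADS;

    for (int t = 0; t < NUM_THREADS; t++) {
        chunks[t].bytes = bytes;
        chunks[t].start = (size_t)t * per_thread;
        /* The last chunk also takes the remainder of the division. */
        chunks[t].end = (t == NUM_THREADS - 1) ? n
                                               : chunks[t].start + per_thread;
        chunks[t].slot = &slots[t];
        started[t] = (pthread_create(&threads[t], NULL,
                                     count_chunk, &chunks[t]) == 0);
        if (!started[t]) {
            /* Could not start a thread: do this chunk on the caller. */
            count_chunk(&chunks[t]);
        }
    }

    size_t total = 0;
    for (int t = 0; t < NUM_THREADS; t++) {
        if (started[t]) {
            pthread_join(threads[t], NULL);
        }
        total += slots[t].count;
    }
    return total;
}
```

With dispatch_apply the shape would presumably be the same: the block passed to dispatch_apply plays the role of `count_chunk`, and the per-iteration results go into padded slots rather than into adjacent elements of a small shared array. Whether this helps in the real app can only be settled by measuring.
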
On 9 Feb 2014, at 14:57, Kyle Sluder k...@ksluder.com wrote:
On Sat, Feb 8, 2014, at 11:35 PM, Gerriet M. Denkmann wrote:
But using two threads takes much longer than just using one!
How could this happen?
Because now you've got two CPUs fighting over one cache line?
Optimization is
On Feb 9, 2014, at 00:19 , Gerriet M. Denkmann gerr...@mdenkmann.de wrote:
But I would really like to understand what I should do.
You might get a happier outcome by using a different approach that’s designed
for this sort of thing. For example:
— One of the vector frameworks, like
On Feb 9, 2014, at 12:19 AM, Gerriet M. Denkmann gerr...@mdenkmann.de wrote:
The real app (which I am trying to optimise) has actually two loops: one is
counting, the other one is modifying. Which seems to be good news.
But I would really like to understand what I should do. Trial and error
The WWDC 2011 session "Blocks and Grand Central Dispatch in Practice" talks
about cache line size, which I believe is relevant here.
You can read my notes from that session here:
http://blog.yvs.eu.com/2013/07/blocks-and-grand-central-dispatch-in-practice/
Kevin
On 9 Feb 2014, at 08:53, Greg Parker
I am trying to optimise a Cocoa app which spends most of its time in a
for-loop looking at the bytes of a huge array.
So I decided to use dispatch_apply to divide the work of the for-loop among
different CPUs (I seem to have 8 of them).
Note: no two threads ever share a common byte of this