I am trying to optimise a Cocoa app which spends most of its time in a 
for-loop looking at the bytes of a huge array.

So I decided to use dispatch_apply to divide the work of the for-loop among 
different CPUs (I seem to have 8 of them).
Note: no two threads ever touch the same byte of this array.

When I just count the bytes, it behaves as expected: using more threads makes 
it work faster (first column: number of threads, second column: elapsed seconds):

# count (no modify) 999888777 bytes in 1 ... 5 threads
1       2.28948
2       1.31898
3       0.889931
4       0.736646
5       0.799812

But when the for-loop also modifies the byte array, then I get:

# count and modify 999888777 bytes in 1 ... 5 threads
1       2.26273
2       3.62382
3       2.41613
4       2.11105
5       2.54171

Here, using only one thread takes about the same time as before, which is not 
very surprising.
But using two threads takes much longer than using just one!
How can this happen?

The code is:

- (IBAction)doClicked: sender
{
    BOOL alsoModify = self.selectedMode == mod_CountModify;

    fprintf(stderr, "\n# count %s %lu bytes in %lu ... %lu threads\n",
            alsoModify ? "and modify" : "(no modify)",
            self.nbrBytes, self.minThread, self.maxThread);

    size_t arraySize = self.nbrBytes * sizeof(unsigned char);
    unsigned char *byteArray = malloc( arraySize );
    memset( byteArray, 1, arraySize );

    // One partial sum per thread; the block reaches them through sumPointer.
    NSUInteger sums[self.maxThread];
    NSUInteger *sumPointer = sums;

    dispatch_queue_t queue =
        dispatch_get_global_queue( DISPATCH_QUEUE_PRIORITY_HIGH, 0 );

    for( size_t nbrThreads = self.minThread;
         nbrThreads <= self.maxThread;
         nbrThreads++ )
    {
        NSDate *date = [ NSDate date ];

        // Every thread gets bytesPerThread bytes; the last one also
        // takes the remainder of the division.
        NSUInteger bytesPerThread = _nbrBytes / nbrThreads;
        NSUInteger bytesInLastThread =
            _nbrBytes - bytesPerThread * ( nbrThreads - 1 );

        for( NSUInteger i = 0; i < nbrThreads; i++ ) sums[i] = 0;

        dispatch_apply( nbrThreads, queue, ^void(size_t idx)
            {
                NSUInteger start = idx * bytesPerThread;
                NSUInteger len = idx + 1 == nbrThreads ?
                    bytesInLastThread : bytesPerThread;
                NSUInteger *threadSumPointer = sumPointer + idx;

                for( NSUInteger i = 0; i < len; i++ )
                {
                    *threadSumPointer += byteArray[ start + i ];
                    if ( alsoModify ) byteArray[ start + i ] &= 0xf;
                }
            }
        );

        NSUInteger total = 0;
        for( NSUInteger i = 0; i < nbrThreads; i++ ) total += sums[i];

        NSTimeInterval elapsedTime = -[ date timeIntervalSinceNow ];
        fprintf(stderr, "%lu\t%g\n", nbrThreads, elapsedTime);
    }

    free( byteArray );
}
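
To back up the claim that no two threads share a byte, the split used above can be checked on its own. This is only a minimal plain-C sketch of the same arithmetic; the names chunk_t, chunk_for and chunks_tile_exactly are made up for illustration and are not part of the app:

```c
#include <assert.h>
#include <stddef.h>

/* Half-open byte range [start, start + len) handled by one worker.
   (Hypothetical helper type, not taken from the app.) */
typedef struct {
    size_t start;
    size_t len;
} chunk_t;

/* Same arithmetic as in the posted loop: every chunk gets
   total / nbr_chunks bytes, the last chunk also takes the remainder. */
static chunk_t chunk_for(size_t total, size_t nbr_chunks, size_t idx)
{
    assert(nbr_chunks > 0 && idx < nbr_chunks);

    size_t per_chunk = total / nbr_chunks;
    size_t in_last = total - per_chunk * (nbr_chunks - 1);

    chunk_t c;
    c.start = idx * per_chunk;
    c.len = (idx + 1 == nbr_chunks) ? in_last : per_chunk;
    return c;
}

/* Returns 1 if the chunks cover [0, total) with no gap and no overlap,
   0 otherwise. */
static int chunks_tile_exactly(size_t total, size_t nbr_chunks)
{
    size_t expected_start = 0;

    for (size_t idx = 0; idx < nbr_chunks; idx++) {
        chunk_t c = chunk_for(total, nbr_chunks, idx);
        if (c.start != expected_start) return 0;  /* gap or overlap */
        expected_start = c.start + c.len;
    }
    return expected_start == total;
}
```

Calling chunks_tile_exactly(999888777, n) for n = 1 ... 5 should return 1 each time, so the byte ranges themselves are disjoint. The sketch says nothing about the sums array, which the threads reach through neighbouring elements.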

Gerriet.

