Re: [Prime] Multi-threaded carry propagation

George Woltman Fri, 22 Sep 2006 18:46:54 -0700

At 09:54 AM 9/14/2006, George Woltman wrote:
>1)  Give each thread a contiguous set of data blocks to crunch through.
>Thread 1 gets blocks 0 - 127, thread 2 gets 128 - 255, etc.
>
>2)  Have the 8 threads start on data blocks 0-7.  Put a lock around the
>carry code such that it insures that data blocks are processed in order.
>
>I lean toward the second solution.  Does anyone have any real-world
>experiences or alternative suggestions?


Upon further reflection, I think the first solution is the only attractive one.
Both methods should have the same theoretical throughput, but when there
isn't one CPU for each thread, then the first solution is better.  Here's why:

Say there are 7 available CPUs for 8 threads. Threads 0 to 6 start processing
blocks 0-6 and prefetching blocks 8-14.  Now all 7 threads stall and the
thread processing block 7 is run destroying all that nice prefetching we did.

My fear with the 2nd proposed solution that I was originally considering
is that the serialization done at the carry propagation step will give the OS
more opportunities to preempt one of our threads destroying the benefits
of prefetching.


I implemented multi-threading in pass 2 and it is working great.  The true data
independence made it a pretty easy chore.  I've not started multi-threading
in pass 1 as I'm still looking for better solutions.  It would be nice if the
algorithm worked well no matter how many CPUs are available.  So far no luck.
If there are 8 CPUs and 8 threads, then as soon as one CPU is busy with other
work throughput is cut in half.  Yuk.


_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime

Re: [Prime] Multi-threaded carry propagation

Reply via email to