Bjorn Lisper wrote:

> It depends on what you compare with. Multicore CPUs will probably have
> cores that are simpler than current processor cores, which means you will
> want to have some parallelism. Cf. a superscalar processor, which really in
> a sense is a parallel machine, but where you add some complex control
> hardware to make it emulate a sequential machine. Running a multicore CPU on
> a single core at a time corresponds to running a superscalar machine where
> you use none of the superscalarity, or to running a pipelined CPU (another
> form of parallelism) without utilizing any of the pipelining. This is not
> what you want.


Well, Intel has 2 cores now, 4 cores soon. AMD has 2 cores now...
IBM has the 8-core Power5... These are full-spec superscalar cores.

When I am thinking about exploiting parallelism, I am thinking about
2/4-CPU (4/8 with hyperthreading) Intel boxes, maybe 8/16-CPU AMD
Opteron boxes, or 8-way Power5 boxes.

> Sure you can get some overlap, which can be quite big if the task to perform
> per list element is big, but a dependence graph with O(n) nodes will still
> have O(n) depth => at maximum constant speedup.


A constant speedup per CPU. Imagine a loop:

   loop :: [[a]] -> Int
   loop []    = 0
   loop (a:b) = length a + loop b

we can run as many "length" calculations in parallel as we have CPUs.
(Here `length` may take significant time, as `a` could be long.)
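To make the idea concrete, here is a minimal sketch (my own illustration,
not from the original mail) of how the loop above could be annotated with
GHC's `par`/`pseq` primitives from base, so the run-time system may spark
each `length` computation onto a spare core while the recursion proceeds:

```haskell
import GHC.Conc (par, pseq)

-- Sketch: spark each `length a` so an idle core can evaluate it
-- while the recursion walks down the rest of the list. With a
-- non-threaded runtime the sparks are simply evaluated in place,
-- so the result is the same either way.
parLoop :: [[a]] -> Int
parLoop []    = 0
parLoop (a:b) = n `par` (rest `pseq` (n + rest))
  where
    n    = length a
    rest = parLoop b

main :: IO ()
main = print (parLoop [[1 .. 1000], [1 .. 2000], [1 .. 3000]])
```

Whether this wins anything in practice depends on each `length` being
expensive enough to outweigh the spark overhead.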

> I'm not talking about the next generation but further ahead. Maybe these
> architectures will have some kind of on-chip shared cache, who knows, but
> that memory will then still be much more expensive to access than the memory
> in a local core. So you will want to use the core-local memory all the time,
> and only communicate between processor cores which are quite close on the
> chip.


One step at a time: are you suggesting we fail to take advantage of
architectures that are on the horizon because things may be different
in the far future?

> No, I mean automatic techniques to detect the parallelism statically (and
> schedule and allocate work statically), which will only work well for
> restricted classes of programs.



Well, this is how I imagine things happening as well: do as much static
analysis as you can at compile time. You end up with most dependencies
resolved, but a few will be unknown until runtime (due to IO, for example).
These remaining few are the ones you might want to try speculative
execution on, if you have the spare CPU cycles (and the machine would
otherwise be idle).
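As a hypothetical sketch of that last step (my illustration, not something
from the thread): with `par` you can spark both branches of a choice whose
condition is only known at runtime, so idle cores can start on them early.
The losing branch is never forced, so its spark is simply discarded:

```haskell
import GHC.Conc (par)

-- Speculative choice: spark both branches while the runtime-
-- dependent condition is still being evaluated. Whichever branch
-- is not selected is never demanded, so the only cost of the
-- speculation is otherwise-idle CPU cycles.
speculate :: Bool -> a -> a -> a
speculate cond t e = t `par` (e `par` (if cond then t else e))

main :: IO ()
main = print (speculate (even (123456789 :: Int))
                        (sum [1 .. 100 :: Int])
                        (product [1 .. 10 :: Int]))
```

Note that this is only safe because the branches are pure; speculating on
IO actions would need a much more careful mechanism.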

   Keean.
_______________________________________________
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell
