Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-26 Thread Mitar
Hi!

On Sat, Jul 26, 2008 at 3:17 AM, Ben Lippmeier [EMAIL PROTECTED] wrote:
 http://valgrind.org/info/tools.html

No support for Mac OS X. :-(


Mitar
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-26 Thread Mitar
Hi!

On Sat, Jul 26, 2008 at 1:35 PM, Mitar [EMAIL PROTECTED] wrote:
 No support for Mac OS X. :-(

Apple provides Shark in the Xcode Tools, which has something called the
L2 Cache Miss Profile. I will just have to understand the results it
produces.


Mitar


Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-25 Thread Mitar
Hi!

 If we spend so long blocked on memory reads that we're only utilising
 50% of a core's time then there's lots of room for improvements if we
 can fill in that wasted time by running another thread.

How can you see how much your program waits because of L2 misses?
I have been playing lately with dual Quad-Core Intel Xeon Mac Pros
with 12 MB of L2 cache per CPU and a 1.6 GHz bus speed, and it would be
interesting to check these things there.


Mitar


Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-25 Thread Ben Lippmeier


http://valgrind.org/info/tools.html

On 26/07/2008, at 11:02 AM, Mitar wrote:


Hi!


If we spend so long blocked on memory reads that we're only utilising
50% of a core's time then there's lots of room for improvements if we
can fill in that wasted time by running another thread.


How can you see how much your program waits because of L2 misses?
I have been playing lately with dual Quad-Core Intel Xeon Mac Pros
with 12 MB of L2 cache per CPU and a 1.6 GHz bus speed, and it would be
interesting to check these things there.


Mitar




Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-25 Thread Don Stewart
A tool originally developed to measure cache misses in GHC :)

Ben.Lippmeier:
 
 http://valgrind.org/info/tools.html
 
 On 26/07/2008, at 11:02 AM, Mitar wrote:
 
 Hi!
 
 If we spend so long blocked on memory reads that we're only utilising
 50% of a core's time then there's lots of room for improvements if we
 can fill in that wasted time by running another thread.
 
 How can you see how much your program waits because of L2 misses?
 I have been playing lately with dual Quad-Core Intel Xeon Mac Pros
 with 12 MB of L2 cache per CPU and a 1.6 GHz bus speed, and it would be
 interesting to check these things there.
 
 
 Mitar
 


Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-25 Thread Duncan Coutts

On Sat, 2008-07-26 at 03:02 +0200, Mitar wrote:
 Hi!
 
  If we spend so long blocked on memory reads that we're only utilising
  50% of a core's time then there's lots of room for improvements if we
  can fill in that wasted time by running another thread.
 
 How can you see how much your program waits because of L2 misses?
 I have been playing lately with dual Quad-Core Intel Xeon Mac Pros
 with 12 MB of L2 cache per CPU and a 1.6 GHz bus speed, and it would be
 interesting to check these things there.

Take a look at the paper that Ben referred to

http://www.cl.cam.ac.uk/~am21/papers/msp02.ps.gz

They use hardware performance counters.

Duncan



Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-24 Thread Duncan Coutts

On Thu, 2008-07-24 at 16:43 +1200, Richard A. O'Keefe wrote:
 On 24 Jul 2008, at 3:52 am, Duncan Coutts wrote:
 [Sun have donated a T5120 server + USD10k to develop
   support for Haskell on the SPARC.]
 
 This is wonderful news.
 
 I have a 500MHz UltraSPARC II on my desktop running Solaris 2.10.

I have a 500MHz UltraSPARC II on my desktop running Gentoo Linux. :-)

 Some time ago I tried to install GHC 6.6.1 on it, but ended up
 with something that compiles to C ok, but then invokes some C
 compiler with option -fwrapv, which no compiler on that machine
 accepts, certainly none that was present when I installed it.

I've got GHC 6.8.2 working, but only with -fvia-C and only unregisterised.

-fwrapv is an option to some version of gcc, but I couldn't tell you
which.

 I would really love to be able to use GHC on that machine.

Me too :-), or in my case, to use it a bit quicker. Unregisterised GHC
builds are pretty slow.

 I also have an account on a T1 server, but the research group
 who Sun gave it to chose to run Linux on it, of all things.

Our tentative plan is to partition our T2 server using logical domains
and run both Solaris and Linux. We'd like to set up ghc build bots on
both OSs.

 So binary distributions for SPARC/Solaris and SPARC/Linux would
 be very very nice things for this new project to deliver early.

I guess this is something that anyone with an account on the box could
do. So once we get to the stage where we're handing out accounts then
hopefully this would follow.

The project isn't aiming to get the registerised C backend working
nicely, we're aiming to get a decent native backend. That should also be
much less fragile by not depending on the version of gcc so closely.

 (Or some kind of source distribution that doesn't need a working
 GHC to start with.)

That's a tad harder. It needs a lot of build system hacking.

Duncan



Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-24 Thread John Meacham
Neat stuff. I used to work at Sun in the Solaris kernel group; the SPARC
architecture is quite elegant. I wonder if we can find an interesting
use for the register windows in a Haskell compiler. Many compilers for
non-C-like languages (such as Boquist's, which jhc is based on, in
spirit if not code) just ignore the windows and treat the architecture
as having a flat 32-register file.

John


-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-24 Thread Duncan Coutts

On Thu, 2008-07-24 at 14:38 -0700, John Meacham wrote:
 Neat stuff. I used to work at Sun in the Solaris kernel group; the SPARC
 architecture is quite elegant. I wonder if we can find an interesting
 use for the register windows in a Haskell compiler. Many compilers for
 non-C-like languages (such as Boquist's, which jhc is based on, in
 spirit if not code) just ignore the windows and treat the architecture
 as having a flat 32-register file.

Right. GHC on SPARC has also always disabled the register window when
running Haskell code (at least for registerised builds) and only uses it
when using the C stack and calling C functions.

We should discuss this with our project adviser from the SPARC compiler
group.

The problem of course is recursion and deeply nested call stacks which
don't make good use of register windows because they keep having to
interrupt to spill them to the save area.

I vaguely wondered if they might be useful for leaf calls or more
generally where we can see statically that the call depth is small (and
we can see all callers of said function, since it'd change the calling
convention).

But now you mention it, I wonder if there is anything even more cunning
we could do, perhaps with lightweight threads or something. Or perhaps
an area to quickly save registers at GC safe points.

Duncan



Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-24 Thread Ben Lippmeier


On 25/07/2008, at 8:55 AM, Duncan Coutts wrote:

Right. GHC on SPARC has also always disabled the register window when
running Haskell code (at least for registerised builds) and only uses it
when using the C stack and calling C functions.



I'm not sure whether register windows and continuation-based back-ends
are ever going to be very good matches - I don't remember the last
time I saw a 'ret' instruction in the generated code :). If there's a
killer application for register windows in GHC it'd be something tricky.


I'd be more interested in the 8 hardware threads per core. [1]
suggests that (single-threaded) GHC code spends over half its time
stalled due to L2 data cache misses. 64 threads per machine is a good
incentive for trying out a few `par` calls..
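A `par` spark really is just a one-combinator change. Here is a minimal
sketch (the fib workload and the cutoff of 20 are made up for
illustration; `par` and `pseq` are re-exported from GHC.Conc in base,
so no extra packages are needed):

```haskell
-- A minimal sketch of semi-explicit parallelism with `par`/`pseq`.
-- The workload and the cutoff are illustrative only; real code would
-- spark chunks coarse enough to outweigh the scheduling overhead.
import GHC.Conc (par, pseq)

-- Toy workload: naive Fibonacci.
fib :: Int -> Integer
fib n
  | n < 2     = fromIntegral n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Evaluate the two recursive calls in parallel: `par` sparks the
-- first operand, `pseq` demands the second before combining.
parFib :: Int -> Integer
parFib n
  | n < 20    = fib n                      -- below cutoff: sequential
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = parFib (n - 1)
    y = parFib (n - 2)

main :: IO ()
main = print (parFib 25)                   -- prints 75025
```

Compiled with -threaded and run with +RTS -N, the sparks can be picked
up by idle hardware threads; without that, the program still runs
correctly, just sequentially.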


Ben.

[1] http://www.cl.cam.ac.uk/~am21/papers/msp02.ps.gz



Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-24 Thread Duncan Coutts

On Fri, 2008-07-25 at 10:38 +1000, Ben Lippmeier wrote:

 I'd be more interested in the 8 hardware threads per core. [1]
 suggests that (single-threaded) GHC code spends over half its time
 stalled due to L2 data cache misses.

Right, that's what I think is most interesting and why I wanted to get
this project going in the first place. 

If we spend so long blocked on memory reads that we're only utilising
50% of a core's time then there's lots of room for improvements if we
can fill in that wasted time by running another thread. So that's the
supposed advantage of multiplexing several threads per core. If Haskell
is suffering more than other languages with the memory latency and low
utilisation then we've also got most to gain with this multiplexing
approach.

 64 threads per machine is a good incentive for trying out a few `par`
 calls..

Of course then it means we need to have enough work to do. Indeed we
need quite a bit just to break even because each core is relatively
stripped down without all the out-of-order execution etc.

Duncan



Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-24 Thread Richard A. O'Keefe


On 25 Jul 2008, at 10:55 am, Duncan Coutts wrote:

The problem of course is recursion and deeply nested call stacks which
don't make good use of register windows because they keep having to
interrupt to spill them to the save area.


A fair bit of thought was put into SPARC V9 to make saving and restoring
register windows a lot cheaper than it used to be. (And the Sun C
compiler learned how to do TRO.)

It's nice to have 3 windows:

  C world        startup
  Haskell world  normal Haskell code
  millicode      special support code

so that normal code doesn't have to leave registers spare for millicode
routines.




Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-24 Thread Ben Lippmeier


On 25/07/2008, at 12:42 PM, Duncan Coutts wrote:


Of course then it means we need to have enough work to do. Indeed we
need quite a bit just to break even because each core is relatively
stripped down without all the out-of-order execution etc.


I don't think that will hurt too much. The code that GHC emits is very
regular and the basic blocks tend to be small. A good proportion of it
is just for copying data between the stack and the heap. On the upside,
it's all very clean and amenable to some simple peephole optimization /
compile-time reordering.


I remember someone telling me that one of the outcomes of the Itanium
project was that they didn't get the (low-level) compile-time
optimizations to perform as well as they had hoped. The reasoning was
that a highly speculative, out-of-order processor with all the
trimmings has a lot more dynamic information about the state of the
program, and can make decisions on the fly that are better than
anything you could get statically at compile time. -- does anyone
have a reference for this?


Anyway, this problem is moot with GHC code. There's barely any
instruction-level parallelism to exploit anyway, but adding an extra
hardware thread is just a `par` away.


To quote a talk on the paper mentioned earlier: "GHC programs turn an
Athlon into a 486 with a high clock speed!"


Ben.




[Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-23 Thread Duncan Coutts

 http://haskell.org/opensparc/

I am very pleased to announce a joint project between Sun Microsystems
and the Haskell.org community to exploit the high performance
capabilities of Sun's latest multi-core OpenSPARC systems via Haskell!

 http://opensparc.net/

Sun has donated a powerful 8 core SPARC Enterprise T5120 Server to the
Haskell community, and $10,000 to fund a student, to further develop
support for high performance Haskell on the SPARC.

The aim of the project is to improve the SPARC native code generator
in GHC and to demonstrate and improve the results of parallel Haskell
benchmarks. The student will work with a mentor from Haskell.org and
an adviser from Sun's SPARC compiler team.

  ** We are now inviting applications from students **

Please forward this announcement to any and all mailing lists where you
think interested students might be lurking. Further details for
students may be found below, and on the project website.

Haskell and Multi-core Systems
--

The latest generation of multi-core machines poses a number of problems
for traditional languages and parallel programming techniques.
Haskell, in contrast, supports a wealth of approaches for writing
correct parallel programs: traditional explicit threads and locks
(forkIO and MVars), pure parallel evaluation strategies (par) and also
Software Transactional Memory (STM).
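By way of a concrete toy illustration of those three styles (this is
not code from the project, just a sketch that compiles with a stock
GHC, using forkIO/MVar from Control.Concurrent and the STM and
par/pseq primitives re-exported by GHC.Conc):

```haskell
-- A minimal, self-contained sketch of the three approaches:
-- explicit threads and locks, STM, and pure parallel evaluation.
import Control.Concurrent (forkIO, newMVar, newEmptyMVar, putMVar,
                           takeMVar, modifyMVar_, readMVar)
import GHC.Conc (atomically, newTVarIO, readTVar, writeTVar, par, pseq)

main :: IO ()
main = do
  -- 1. Explicit threads and locks: forkIO + MVar.
  done <- newEmptyMVar
  box  <- newMVar (0 :: Int)
  _ <- forkIO (modifyMVar_ box (return . (+ 1)) >> putMVar done ())
  takeMVar done                     -- wait for the child thread
  n <- readMVar box

  -- 2. Software Transactional Memory: composable atomic updates.
  tv <- newTVarIO (0 :: Int)
  atomically (readTVar tv >>= writeTVar tv . (+ 1))
  m <- atomically (readTVar tv)

  -- 3. Pure parallel evaluation: spark one operand with `par`.
  let a = sum [1 .. 1000 :: Int]
      b = product [1 .. 10 :: Int]
      r = a `par` (b `pseq` a + b)  -- a may be evaluated on another core

  print (n, m, r)
```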

GHC has supported lightweight preemptable threads for a long time, and
for the last couple of years it has been able to take advantage of
machines with multiple CPUs or CPU cores. The GHC runtime has also
recently gained a parallel garbage collector.

OpenSPARC
-

We think the UltraSPARC T1/T2 architecture is a very interesting
platform for Haskell, in particular the way that each core multiplexes
many threads to hide memory latency. Memory latency is a performance
bottleneck for Haskell code because the execution model uses a lot of
memory indirections.

Essentially, when one thread blocks due to a main memory read, the
next thread is able to continue. This is in contrast to traditional
architectures where the CPU core would stall until the result of the
memory read was available. This approach can achieve high utilisation
as long as there is enough parallelism available.

The Project
---

GHC is increasingly relying on its native code backend for high
performance. Respectable single-threaded performance is a prerequisite
for decent parallel performance. The first stage of the project
therefore is to implement a new SPARC native code generator, taking
advantage of the recent and ongoing infrastructure improvements in the
C-- and native layers of the GHC backend. There is some existing support
for SPARC in the native code generator but it has not kept up with
changes in the GHC backend in the last few years.

Once the code generator is working we will want to get a range of
single threaded and parallel benchmarks running and look for various
opportunities for improvement. There is plenty of ongoing work on the
generic parts of the GHC backend and run-time system so the project
will focus on SPARC-specific aspects.

The UltraSPARC T1/T2 architecture supports very fast thread
synchronisation (by taking advantage of the fact that all threads
share the same L2 cache). We would like to optimise the
synchronisation primitives in the GHC libraries and run-time system to
take advantage of this. This should provide the basis for exploring
whether the lower synchronisation costs make it advantageous to use
more fine-grained parallelism.
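As a rough sketch of the kind of synchronisation cost in question (a
made-up micro-benchmark, not part of the project's benchmark plans),
two threads ping-ponging a value through a pair of MVars exercises
exactly the primitives that a shared L2 cache should make cheap:

```haskell
-- A hypothetical MVar ping-pong micro-benchmark. Two threads bounce
-- an Int back and forth n times; timing this loop would give a crude
-- estimate of round-trip synchronisation cost.
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)

pingPong :: Int -> IO Int
pingPong n = do
  ping <- newEmptyMVar
  pong <- newEmptyMVar
  -- Child thread: take from ping, increment, reply on pong, n times.
  _ <- forkIO $ replicateM_ n (takeMVar ping >>= putMVar pong . (+ 1))
  -- Parent thread: drive n round trips, threading the value through.
  let loop 0 acc = return acc
      loop k acc = do
        putMVar ping acc
        v <- takeMVar pong
        loop (k - 1) v
  loop n 0

main :: IO ()
main = pingPong 10000 >>= print   -- prints 10000
```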

The Server
--

The T5120 server has a T2 UltraSPARC processor with 8 cores running at
1.2GHz. Each core multiplexes 8 threads giving 64 hardware threads
overall. It comes equipped with 32GB of memory. It also has two 146GB
10k RPM SAS disks.

 http://www.sun.com/servers/coolthreads/t5120/
 http://www.sun.com/processors/UltraSPARC-T2/

This server is a donation to the whole Haskell community. We will make
accounts available on the same basis as the existing community server as
soon as is practical. Our friends at Chalmers University of Technology
are kindly hosting the server on our behalf.

We will encourage people to use the server for building, testing and
benchmarking their Haskell software on SPARC, under both Solaris and
Linux.

Student applications


This is a challenging and exciting project and will need a high
calibre student. Familiarity with Haskell is obviously important as is
some experience with code generation for RISC instruction sets.

The summer is now upon us so we do not expect students to be able to
work 3 months all in one go.

We are inviting students to suggest their own schedule when they apply.
This may involve blocks of time in the next 9 months or so. It should
add up to the equivalent of 3 months full time work.

The application process is relatively informal. Students should 

Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-23 Thread Richard A. O'Keefe

On 24 Jul 2008, at 3:52 am, Duncan Coutts wrote:
[Sun have donated a T5120 server + USD10k to develop
 support for Haskell on the SPARC.]

This is wonderful news.

I have a 500MHz UltraSPARC II on my desktop running Solaris 2.10.
Some time ago I tried to install GHC 6.6.1 on it, but ended up
with something that compiles to C ok, but then invokes some C
compiler with option -fwrapv, which no compiler on that machine
accepts, certainly none that was present when I installed it.

I would really love to be able to use GHC on that machine.

I also have an account on a T1 server, but the research group
who Sun gave it to chose to run Linux on it, of all things.

So binary distributions for SPARC/Solaris and SPARC/Linux would
be very very nice things for this new project to deliver early.
(Or some kind of source distribution that doesn't need a working
GHC to start with.)




Re: [Haskell-cafe] ANNOUNCE: Sun Microsystems and Haskell.org joint project on OpenSPARC

2008-07-23 Thread Brandon S. Allbery KF8NH


On 2008 Jul 24, at 0:43, Richard A. O'Keefe wrote:


So binary distributions for SPARC/Solaris and SPARC/Linux would
be very very nice things for this new project to deliver early.
(Or some kind of source distribution that doesn't need a working
GHC to start with.


I'm still working on SPARC/Solaris here as well.  (Still trying to get
a build that doesn't produce executables that throw "schedule
re-entered unsafely" immediately on startup.)


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon universityKF8NH

