RE: Compiling large source files

2009-08-06 Thread Simon Peyton-Jones
It should be pretty much linear (modulo a log(n) factor, but then log(n) is 
practically constant, <= 64 or so).

But people often report that GHC is slow (perhaps non-linearly so) when 
compiling vast blobs of literal data.  Because there is a reasonable workaround 
(just parse the data), which actually makes your program more flexible (since 
you can change the data without recompiling), we have serially failed to look 
hard at this problem.  I wish someone would nail it, though. 
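
A minimal sketch of the "just parse the data" workaround, assuming the list is
a plain list of integers stored one per line. The file name table.dat and the
helper parseTable are hypothetical, and main writes a small stand-in file first
only so that the example runs on its own:

```haskell
-- Hypothetical helper: turn the text of the data file into a list,
-- assuming one integer per line.
parseTable :: String -> [Int]
parseTable = map read . lines

main :: IO ()
main = do
  -- Stand-in for the real data file (85k elements, as in the report).
  writeFile "table.dat" (unlines (map show [1 .. 85000 :: Int]))
  -- The program only contains the parser; the data lives outside the source.
  contents <- readFile "table.dat"
  let table = parseTable contents
  print (length table)
```

With this arrangement the compiler never sees the literal data, and the data
can be changed without recompiling.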

Simon

| -Original Message-
| From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-
| users-boun...@haskell.org] On Behalf Of Serge D. Mechveliani
| Sent: 04 August 2009 13:31
| To: Simon Marlow
| Cc: glasgow-haskell-users@haskell.org
| Subject: Re: Compiling large source files
| 
| On Tue, Aug 04, 2009 at 09:12:37AM +0100, Simon Marlow wrote:
|  I suggest not using Haskell for your list.  Put the data in a file and
|  read it at runtime, or put it in a static C array and link it in.
| 
|  On 03/08/2009 22:09, Günther Schmidt wrote:
|  Hi Thomas,
|  yes, a source file with a single literal list with 85k elements.
| 
| 
| People,
| when a program only defines and returns a String constant of n
| literals, how much memory does ghc-6.10.4 need to compile it?
| O(n), or maybe O(n^2), or ...
| 
| Regards,
| 
| -
| Serge Mechveliani
| mech...@botik.ru
| 
| ___
| Glasgow-haskell-users mailing list
| Glasgow-haskell-users@haskell.org
| http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



What is the mutator?

2009-08-06 Thread Jason Dusek
  I've been reading a little about GC latency and have run
  across statements like this:

One solution to the GC synchronisation problem would be to
implement a concurrent garbage collector. Typically,
however, concurrent GC adds some overhead to the mutator,
since it must synchronise with the collector.

[...] some thunks are
never “black-holed”, so giving a potential performance win.
Unfortunately, in the parallel setting, it substantially
enlarges the time window in which two or more duplicate
threads might evaluate the same thunk, and thus [...]
 -- Comparing and Optimising Parallel Haskell Implementations
for Multicore Machines

  What is the mutator?

--
Jason Dusek


gcc version for GHC 6.10.4 under Sparc/Solaris

2009-08-06 Thread Christian Maeder
Hi Ian,

could you add a note on the download page that
GCC version 4.3.x is not suited for:

http://www.haskell.org/ghc/dist/6.10.4/maeder/ghc-6.10.4-sparc-sun-solaris2.tar.bz2

The binary-dist was compiled using gcc-4.2.2 (but it also works with,
e.g., gcc-3.4.4).

page
http://hackage.haskell.org/trac/ghc/wiki/Building/Solaris
says:

GCC version 4.1.2 is recommended
and
GHC has not yet been updated to understand the assembly output of GCC
version 4.3.x.

Maybe a ticket should be opened for gcc-4.3.x under Sparc/Solaris.

Did anyone use gcc-4.3.x under x86 Solaris? (I don't think it'll have
the same problem.)

Cheers Christian



Re: use gtar and not tar under solaris

2009-08-06 Thread Duncan Coutts
On Tue, 2009-08-04 at 10:15 +0200, Christian Maeder wrote:
 Hi,
 
 I've just been informed that unpacking the binary (i386) solaris
 distribution using bunzip2 and tar:

It may work better in future if you use a non-GNU tar to pack it up in
the first place. GNU tar uses a non-standard tar format by default.
Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
format file. It's also possible to use gnu tar to make standard tar
format files, using --format ustar rather than gnu tar's default of
--format gnu.
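
One way to see which of the two formats a given tarball uses is to look at the
magic field of its first header block. A rough sketch, assuming the commonly
documented header layout (8 bytes at offset 257: "ustar\0" followed by "00"
for POSIX ustar, "ustar  \0" for GNU tar). The names TarFlavour, classify and
classifyFile are made up for illustration and are not part of any library:

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withBinaryFile)

-- Illustrative classification only; anything else (V7 and so on) is Unknown.
data TarFlavour = Ustar | GnuTar | Unknown
  deriving (Eq, Show)

-- Classify a header block by the magic and version bytes at offset 257.
classify :: String -> TarFlavour
classify header
  | magic == "ustar\NUL" ++ "00" = Ustar
  | magic == "ustar  \NUL"       = GnuTar
  | otherwise                    = Unknown
  where
    magic = take 8 (drop 257 header)

-- Read the first 512-byte block of a file and classify it.
-- evaluate forces the result before the handle is closed.
classifyFile :: FilePath -> IO TarFlavour
classifyFile path = withBinaryFile path ReadMode $ \h -> do
  contents <- hGetContents h
  evaluate (classify (take 512 contents))

main :: IO ()
main = do
  -- Synthetic headers, so the example runs without any tarball present.
  let pad n = replicate n '\NUL'
      ustarHeader = pad 257 ++ "ustar\NUL" ++ "00" ++ pad 247
      gnuHeader   = pad 257 ++ "ustar  \NUL" ++ pad 247
  print (classify ustarHeader)
  print (classify gnuHeader)
```

classifyFile could be pointed at an unpacked (bunzip2'ed) distribution tarball
to check what the packer actually produced.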

Duncan
(who now knows an unhealthy amount about tar file formats after writing
a Haskell package to read them)



Re: gcc version for GHC 6.10.4 under Sparc/Solaris

2009-08-06 Thread Duncan Coutts
On Thu, 2009-08-06 at 10:04 +0200, Christian Maeder wrote:
 Hi Ian,
 
 could you add a note on the download page that
 GCC version 4.3.x is not suited for:
 
 http://www.haskell.org/ghc/dist/6.10.4/maeder/ghc-6.10.4-sparc-sun-solaris2.tar.bz2
 
 The binary-dist was compiled using gcc-4.2.2 (but also works i.e. for
 gcc-3.4.4)

I should also note that there is a GHC 6.10.4 binary for Sparc/Linux
that is now included with Gentoo. It's got all features turned on except
for split objects (which fails due to mixing ld -r and --relax flags).
In particular it's a registerised via-C build with ghci, TH and
profiling working.

It's a distro package not a generic relocatable GHC binary tarball so
there's no point putting it on the ghc download page, but it's there
nevertheless if people want it (look for the gentoo ghc ebuild).

Duncan



Re: use gtar and not tar under solaris

2009-08-06 Thread Duncan Coutts
On Thu, 2009-08-06 at 12:30 +0100, Duncan Coutts wrote:
 On Tue, 2009-08-04 at 10:15 +0200, Christian Maeder wrote:
  Hi,
  
  I've just been informed that unpacking the binary (i386) solaris
  distribution using bunzip2 and tar:
 
 It may work better in future if you use a non-GNU tar to pack it up in
 the first place. GNU tar uses a non-standard tar format by default.
 Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
 format file. It's also possible to use gnu tar to make standard tar
 format files, using --format ustar rather than gnu tar's default of
 --format gnu.

In fact I think I'd always advocate using the USTAR tar format over the
GNU tar format when distributing software, since portability is of prime
concern. This is what cabal-install does. I'd recommend ghc do it too. I
also filed a ticket for darcs dist about this some time ago.

Duncan



Re: What is the mutator?

2009-08-06 Thread Bulat Ziganshin
Hello Jason,

Thursday, August 6, 2009, 11:38:08 AM, you wrote:

 One solution to the GC synchronisation problem would be to
 implement a concurrent garbage collector. Typically,
 however, concurrent GC adds some overhead to the mutator,
 since it must synchronise with the collector.

 [...] some thunks are
 never “black-holed”, so giving a potential performance win.
 Unfortunately, in the parallel setting, it substantially
 enlarges the time window in which two or more duplicate
 threads might evaluate the same thunk, and thus [...]

I'm not an expert, but: a lazy Haskell value is an expression that is yet
to be computed. When evaluation of this value starts, it is replaced by a
black hole, a special value. An attempt to compute a black-holed value (in
the same thread) means that we have a cyclic computation dependency, and an
exception is triggered. Once the value of the thunk has been evaluated, it
is written back by code called the mutator.


-- 
Best regards,
 Bulat  mailto:bulat.zigans...@gmail.com



Re: What is the mutator?

2009-08-06 Thread Malcolm Wallace
 i'm not an expert, but: [...] once value of thunk is evaluated, it's
 written back by code called mutator


Whilst that is indeed mutation, it is not what is usually referred to  
as the mutator in the context of garbage collection.  Quite simply,  
the mutator is the actual running program, as opposed to the GC,  
which is part of the underlying runtime system.  Conceptually, the  
mutator and GC are the two mutually-exclusive threads of control that  
modify the heap.  Usually one must halt while the other runs.


Regards,
Malcolm



Re: use gtar and not tar under solaris

2009-08-06 Thread Christian Maeder
ghc's configure script set:

TAR = /opt/csw/bin/gtar
ZIP = zip

in mk/config.mk although I've got /usr/bin/tar, too (and earlier in my
path).

Cheers Christian

Duncan Coutts wrote:
 On Tue, 2009-08-04 at 10:15 +0200, Christian Maeder wrote:
 Hi,

 I've just been informed that unpacking the binary (i386) solaris
 distribution using bunzip2 and tar:
 
 It may work better in future if you use a non-GNU tar to pack it up in
 the first place. GNU tar uses a non-standard tar format by default.
 Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
 format file. It's also possible to use gnu tar to make standard tar
 format files, using --format ustar rather than gnu tar's default of
 --format gnu.
 
 Duncan
 (who now knows an unhealthy amount about tar file formats after writing
 a Haskell package to read them)
 


RE: Compiling large source files

2009-08-06 Thread Simon Peyton-Jones
Gunther

Could you make a Trac bug report for this, and attach your source file?

It'd help if you could first check that things are still bad with GHC 6.10.4.

Another useful thing would be to provide data on whether the behaviour is 
non-linear.  Eg try with 1k, 2k, 4k, 8k, etc elements in your list and see how 
it behaves.
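
A sketch of how such a series of test inputs could be generated, assuming the
problematic module is essentially one literal list of Ints. The module names
(Big1000 and so on) and the binding bigList are invented for illustration:

```haskell
import Data.List (intercalate)

-- Render a module whose only binding is a literal list of n Ints.
literalModule :: String -> Int -> String
literalModule name n = unlines
  [ "module " ++ name ++ " where"
  , ""
  , "bigList :: [Int]"
  , "bigList = [" ++ intercalate ", " (map show [1 .. n]) ++ "]"
  ]

-- Write Big1000.hs, Big2000.hs, Big4000.hs and Big8000.hs.
main :: IO ()
main = mapM_ write [1000, 2000, 4000, 8000]
  where
    write n =
      let name = "Big" ++ show n
      in  writeFile (name ++ ".hs") (literalModule name n)
```

Each generated file can then be compiled on its own (for example with
ghc -c Big1000.hs) while recording time and peak memory, to see whether
doubling the number of elements roughly doubles the cost.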

Providing reproducible and well-characterised bug reports greatly increases the 
likelihood that we'll fix it! 

Thanks

Simon


| -Original Message-
| From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-users-
| boun...@haskell.org] On Behalf Of Günther Schmidt
| Sent: 03 August 2009 22:09
| To: Thomas DuBuisson
| Cc: glasgow-haskell-users@haskell.org
| Subject: Re: Compiling large source files
| 
| Hi Thomas,
| 
| yes, a source file with a single literal list with 85k elements.
| 
| 
| Günther
| 
| 
| On 03.08.2009 at 22:20, Thomas DuBuisson
| thomas.dubuis...@gmail.com wrote:
| 
|  Can you define "very large" and "compiler"?  I know an old version of
|  GHC (6.6?) would eat lots of memory when there were absurd numbers of
|  let statements.
| 
|  Thomas
| 
|  2009/8/3 Günther Schmidt red...@fedoms.com:
|  Hi all,
| 
|  I'm having trouble compiling very large source files, the compiler
|  eats 2GB and then dies. Is there a way around it?
| 
|  Günther


RE: GHC 6.10.2 consuming lots of memory while compiling - help?

2009-08-06 Thread Simon Peyton-Jones
Justin

Presumably DeliveryManagementQueries uses TH to generate lots of glop?  
*** Desugar:
Result size = 616,969

That's a big program!  What kind of glop is it?

Maybe it's the same kind of thing as one of these?
http://hackage.haskell.org/trac/ghc/query?status=new&status=assigned&status=reopened&type=compile-time+performance+bug&order=priority

Regardless, if you can make a reproducible test case that'd help us. Probably 
you can do this simply by generating a file with lots of repeated glop of the 
same kind as your TH is spitting out.  

Simon


| -Original Message-
| From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-users-
| boun...@haskell.org] On Behalf Of Justin Bailey
| Sent: 24 July 2009 20:03
| To: glasgow-haskell-users@haskell.org
| Subject: GHC 6.10.2 consuming lots of memory while compiling - help?
| 
| I apologize in advance for the vagueness of my report here - it's one
| of those situations I'm not sure how to cut it down to size yet.
| 
| I have a module that uses HaskellDB and Template Haskell together. The
| module itself depends on 23 other modules, each of which give a type
| definition for a particular database table or view. I only mention
| that to emphasize that the module depends on some big types (HList
| records w/ 20+ members) and on compile-time generated code.
| 
| My problem is this - when GHC compiles the module, it consumes 1.2 GB
| of memory, takes about 10 minutes, and finally produces an object
| file. The memory usage seems related to template haskell, but I'm not
| positive.
| 
| I've attached verbose output from compiling the module in question.
| The command line I used was:
| 
|   ghc -v --make -c DeliveryManagementQueries.hs -XEmptyDataDecls
|     -XTypeSynonymInstances -XTemplateHaskell
| 
| Now for my question - please ignore the specifics of
| haskelldb/template haskell - any suggestions for figuring out what GHC
| is doing, besides tried-and-true divide and conquer?
| 
| Justin


Re: use gtar and not tar under solaris

2009-08-06 Thread Matthias Kilian
On Thu, Aug 06, 2009 at 12:30:51PM +0100, Duncan Coutts wrote:
  I've just been informed that unpacking the binary (i386) solaris
  distribution using bunzip2 and tar:
 
 It may work better in future if you use a non-GNU tar to pack it up in
 the first place. GNU tar uses a non-standard tar format by default.
 Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
 format file. It's also possible to use gnu tar to make standard tar
 format files, using --format ustar rather than gnu tar's default of
 --format gnu.

Is there something like pax(1) available on solaris? If so, it
should be preferred, because it's a POSIX tool, so there's some
hope that it behaves the same on different systems.

pax(1) should be available on all BSD systems, and to my knowledge,
there's a pax package available at least for Debian Linux.

Ciao,
Kili


Re: What is the mutator?

2009-08-06 Thread Jost Berthold

Message: 1
Date: Thu, 6 Aug 2009 00:38:08 -0700
From: Jason Dusek jason.du...@gmail.com
Subject: What is the mutator?
To: glasgow-haskell-users@haskell.org

  I've been reading a little about GC latency and have run
  across statements like this:

One solution to the GC synchronisation problem would be to
implement a concurrent garbage collector. Typically,
however, concurrent GC adds some overhead to the mutator,
since it must synchronise with the collector.

[...] some thunks are
never “black-holed”, so giving a potential performance win.
Unfortunately, in the parallel setting, it substantially
enlarges the time window in which two or more duplicate
threads might evaluate the same thunk, and thus [...]

 -- Comparing and Optimising Parallel Haskell Implementations
for Multicore Machines

  What is the mutator?


Hi Jason,

as Malcolm already said, the mutator in this text is the/a  thread
evaluating some Haskell expression. Just to add some more details to the 
picture...


In general, a Haskell expression is a computation represented as a graph 
in the heap. Haskell evaluates lazily and does not have to fully 
evaluate every part of it for the program to finish. Unevaluated parts 
are thunks. As soon as one of potentially several concurrent (mutator) 
threads starts to evaluate a thunk, it is replaced by a blackhole, which 
keeps other threads out of it until the node in the graph is evaluated 
(say, to a list cons (:), with probably unevaluated head and tail). Then 
the blackhole is updated with the new value. Other threads block on the 
blackhole in the meantime (so not necessarily an exception in the case 
of concurrent mutator threads) and are woken up by the update.
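
The thunk-and-update behaviour described above can be observed in a small
single-threaded sketch. It relies on Debug.Trace to make the moment of
evaluation visible; the binding name total is just an example:

```haskell
import Debug.Trace (trace)

-- A top-level thunk: nothing is computed until the value is demanded.
total :: Int
total = trace "evaluating thunk" (sum [1 .. 100])

main :: IO ()
main = do
  putStrLn "thunk exists, but has not been demanded yet"
  print (total + total)  -- first demand: the thunk is evaluated and updated
  print total            -- already a value: no second evaluation is needed
```

With GHC one would expect the trace message (printed on stderr) to appear only
once, because the heap node is overwritten with its value after the first
evaluation.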


The passage you quote above is about two separate aspects:

1. Garbage collection and mutator running concurrently: while they 
usually do, they do not _have_ to exclude each other, but not doing so 
means that the objects they are treating have to be locked.


2. About Blackholing: in the sequential evaluation (where hitting a 
blackhole indeed means to have a loop), some better performance can be 
gained by not blackholing a thunk immediately, so this was done in GHC 
earlier. However, it increases the chance for 2 mutator threads to 
evaluate the same thunk (double work), and we got better performance by 
blackholing immediately.
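
A minimal sketch of the situation in point 2: two mutator threads demanding the
same thunk. The name raceOnThunk is hypothetical. Whether the second thread
blocks on a blackhole or duplicates the work depends on the runtime and its
settings (and real parallelism needs a program linked with -threaded and run
with several capabilities), but either way both threads must end up with the
same value:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Two threads force one shared thunk and report what they saw.
raceOnThunk :: IO Bool
raceOnThunk = do
  let shared = sum [1 .. 1000000 :: Int]  -- one thunk, captured by both workers
  results <- newEmptyMVar
  let worker = evaluate shared >>= putMVar results
  _ <- forkIO worker
  _ <- forkIO worker
  a <- takeMVar results
  b <- takeMVar results
  return (a == b)

main :: IO ()
main = raceOnThunk >>= print
```

Duplicate evaluation is safe here because the thunk is pure; the cost is only
the repeated work, which is the trade-off discussed above.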


Cheers,
Jost


Re: use gtar and not tar under solaris

2009-08-06 Thread Christian Maeder
Matthias Kilian wrote:
 On Thu, Aug 06, 2009 at 12:30:51PM +0100, Duncan Coutts wrote:
 I've just been informed that unpacking the binary (i386) solaris
 distribution using bunzip2 and tar:
 It may work better in future if you use a non-GNU tar to pack it up in
 the first place. GNU tar uses a non-standard tar format by default.
 Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
 format file. It's also possible to use gnu tar to make standard tar
 format files, using --format ustar rather than gnu tar's default of
 --format gnu.
 
 Is there something like pax(1) available on solaris? If so, it
 should be preferred, because it's a POSIX tool, so there's some
 hope that it behaves the same on different systems.

Yes, pax is available under solaris. I thought GNU tar is the standard
packer under unix. (The usage message of pax is less clear.)

Below is a part of gtar --help

Cheers Christian

 Archive format selection:

   -H, --format=FORMAT   create archive of the given format.

 FORMAT is one of the following:

   gnu      GNU tar 1.13.x format
   oldgnu   GNU format as per tar <= 1.12
   pax      POSIX 1003.1-2001 (pax) format
   posix    Same as pax
   ustar    POSIX 1003.1-1988 (ustar) format
   v7       old V7 tar format


Re: use gtar and not tar under solaris

2009-08-06 Thread Matthias Kilian
On Thu, Aug 06, 2009 at 08:54:49PM +0200, Christian Maeder wrote:
  Is there something like pax(1) available on solaris? If so, it
  should be preferred, because it's a POSIX tool, so there's some
  hope that it behaves the same on different systems.
 
 Yes, pax is available under solaris. I thought GNU tar is the standard
 packer under unix.

Depends on what `standard' means ;-)

- tar has been there forever on unices, with several slightly
  incompatible format extensions

- GNU tar is just another implementation, typically used on Linux, and it
  has its own incompatible format extensions.

- pax is (or should be) available everywhere, its behaviour is defined
  by POSIX, it should (by default) create archives readable by most
  tar implementations, but almost nobody knows about it ;-)

  http://www.opengroup.org/onlinepubs/9699919799/utilities/pax.html

I wonder whether Duncan read and understood that bit of
documentation; I didn't even read all of it ;-)

 (The usage message of pax is less clear.)

The manpage (and of course the POSIX definition) are hard stuff, too.
However, to create an archive, you can use something like

$ pax -wf foo.tar directory

Ciao,
Kili


Re: What is the mutator?

2009-08-06 Thread Jason Dusek
2009/08/06 Jost Berthold berth...@mathematik.uni-marburg.de:
 as Malcolm already said, the mutator in this text is the/a
 thread evaluating some Haskell expression.

  I want to thank everyone for taking the time to clarify that
  to me; I'm now much more able to follow discussions of Haskell
  garbage collection.

 1.  Garbage collection and mutator running concurrently: while
 they usually do, they do not _have_ to exclude each other,
 but not doing so means that the objects they are treating
 have to be locked.

  So this is the part that actually led me here. Say you are
  implementing a network server, for example -- you don't want
  to have big spikes in the request latency due to GC. Not that
  Haskell is so much worse off relative to Java, say; Erlang is
  the only language I'm aware of that takes concurrent GC
  seriously. However, it seems that this problem is hard to
  solve for Haskell:

Parallel GC is when the whole system stops and performs
multi-threaded GC, as opposed to concurrent GC, which is
when the GC runs concurrently with the program. We think
concurrent GC is unlikely to be practical in the Haskell
setting, due to the extra synchronisation needed in the
mutator. However, there may always be clever techniques that
we haven't discovered, and synchronisation might become less
expensive, so the balance may change in the future.

 -- Simon Marlow

  So I wonder, to what degree is GC latency controllable in
  Haskell? It seems that, pending further research, we can not
  hope for concurrent GC.

 2.  About Blackholing: in the sequential evaluation (where
 hitting a blackhole indeed means to have a loop), some
 better performance can be gained by not blackholing a thunk
 immediately, so this was done in GHC earlier. However, it
 increases the chance for 2 mutator threads to evaluate the
 same thunk (double work), and we got better performance by
 blackholing immediately.

  Could blackholing too early result in non-termination
  ("...hitting a blackhole indeed means to have a loop")? Then
  it's not just a matter of performance when we do it?

--
Jason Dusek