RE: Compiling large source files
It should be pretty much linear (modulo a log(n) factor, but then log(n) is practically constant (=64 or so)). But people often report that GHC is slow (perhaps non-linearly so) when compiling vast blobs of literal data.

Because there is a reasonable workaround (just parse the data), which actually makes your program more flexible (since you can change the data without recompiling), we have serially failed to look hard at this problem. I wish someone would nail it, though.

Simon

| -----Original Message-----
| From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-
| users-boun...@haskell.org] On Behalf Of Serge D. Mechveliani
| Sent: 04 August 2009 13:31
| To: Simon Marlow
| Cc: glasgow-haskell-users@haskell.org
| Subject: Re: Compiling large source files
|
| On Tue, Aug 04, 2009 at 09:12:37AM +0100, Simon Marlow wrote:
| > I suggest not using Haskell for your list. Put the data in a file and
| > read it at runtime, or put it in a static C array and link it in.
| >
| > On 03/08/2009 22:09, Günther Schmidt wrote:
| > > Hi Thomas,
| > > yes, a source file with a single literal list with 85k elements.
|
| People,
| when a program only defines and returns a String constant of n
| literals, how much memory needs ghc-6.10.4 to compile it ?
| O(n), or may be O(n^2), or ...
|
| Regards,
|
| -
| Serge Mechveliani
| mech...@botik.ru

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
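The "just parse the data" workaround mentioned above might look roughly like the following. This is only a minimal sketch under assumptions that are not from the thread: the data is taken to be one integer per line, and the file name `table.txt` and the helpers `parseTable` and `loadTable` are hypothetical names chosen for illustration.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- | Parse one integer per line; lines that do not parse are skipped.
-- (The one-value-per-line layout is an assumption of this sketch.)
parseTable :: String -> [Int]
parseTable = mapMaybe readMaybe . lines

-- | Load the table at runtime instead of compiling it in as a literal list.
loadTable :: FilePath -> IO [Int]
loadTable path = parseTable <$> readFile path

main :: IO ()
main = do
  -- "table.txt" is a hypothetical file name, not one from the thread.
  table <- loadTable "table.txt"
  putStrLn ("loaded " ++ show (length table) ++ " elements")
```

With this layout the compiler only ever sees a few lines of code, and the data can be changed without recompiling, which is the flexibility the message refers to.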
What is the mutator?
I've been reading a little about GC latency and have run across statements like this:

  One solution to the GC synchronisation problem would be to implement a concurrent garbage collector. Typically, however, concurrent GC adds some overhead to the mutator, since it must synchronise with the collector.

  some thunks are never “black-holed”, so giving a potential performance win. Unfortunately, in the parallel setting, it substantially enlarges the time window in which two or more duplicate threads might evaluate the same thunk, and thus

  -- Comparing and Optimising Parallel Haskell Implementations for Multicore Machines

What is the mutator?

--
Jason Dusek
gcc version for GHC 6.10.4 under Sparc/Solaris
Hi Ian,

could you add a note on the download page that GCC version 4.3.x is not suited for:

http://www.haskell.org/ghc/dist/6.10.4/maeder/ghc-6.10.4-sparc-sun-solaris2.tar.bz2

The binary-dist was compiled using gcc-4.2.2 (but also works e.g. with gcc-3.4.4).

The page http://hackage.haskell.org/trac/ghc/wiki/Building/Solaris says:

  GCC version 4.1.2 is recommended and GHC has not yet been updated to understand the assembly output of GCC version 4.3.x.

Maybe a ticket should be opened for gcc-4.3.x under Sparc/Solaris.

Did anyone use gcc-4.3.x under x86 Solaris? (I don't think it'll have the same problem.)

Cheers Christian
Re: use gtar and not tar under solaris
On Tue, 2009-08-04 at 10:15 +0200, Christian Maeder wrote:
> Hi,
>
> I've just been informed that unpacking the binary (i386) solaris
> distribution using bunzip2 and tar:

It may work better in future if you use a non-GNU tar to pack it up in the first place. GNU tar uses a non-standard tar format by default. Solaris tar would likely have more luck unpacking a POSIX/USTAR tar format file.

It's also possible to use gnu tar to make standard tar format files, using --format ustar rather than gnu tar's default of --format gnu.

Duncan
(who now knows an unhealthy amount about tar file formats after writing a Haskell package to read them)
Re: gcc version for GHC 6.10.4 under Sparc/Solaris
On Thu, 2009-08-06 at 10:04 +0200, Christian Maeder wrote:
> Hi Ian,
>
> could you add a note on the download page that GCC version 4.3.x is not
> suited for:
>
> http://www.haskell.org/ghc/dist/6.10.4/maeder/ghc-6.10.4-sparc-sun-solaris2.tar.bz2
>
> The binary-dist was compiled using gcc-4.2.2 (but also works e.g. with
> gcc-3.4.4)

I should also note that there is a GHC 6.10.4 binary for Sparc/Linux that is now included with Gentoo. It's got all features turned on except for split objects (which fails due to mixing ld -r and --relax flags). In particular it's a registerised via-C build with ghci, TH and profiling working.

It's a distro package, not a generic relocatable GHC binary tarball, so there's no point putting it on the ghc download page, but it's there nevertheless if people want it (look for the gentoo ghc ebuild).

Duncan
Re: use gtar and not tar under solaris
On Thu, 2009-08-06 at 12:30 +0100, Duncan Coutts wrote:
> On Tue, 2009-08-04 at 10:15 +0200, Christian Maeder wrote:
> > Hi,
> >
> > I've just been informed that unpacking the binary (i386) solaris
> > distribution using bunzip2 and tar:
>
> It may work better in future if you use a non-GNU tar to pack it up in
> the first place. GNU tar uses a non-standard tar format by default.
> Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
> format file.
>
> It's also possible to use gnu tar to make standard tar format files,
> using --format ustar rather than gnu tar's default of --format gnu.

In fact I think I'd always advocate using the USTAR tar format over the GNU tar format when distributing software, since portability is of prime concern.

This is what cabal-install does. I'd recommend ghc do it too. I also filed a ticket for darcs dist about this some time ago.

Duncan
Re: What is the mutator?
Hello Jason,

Thursday, August 6, 2009, 11:38:08 AM, you wrote:

> One solution to the GC synchronisation problem would be to implement a
> concurrent garbage collector. Typically, however, concurrent GC adds
> some overhead to the mutator, since it must synchronise with the
> collector.
>
> some thunks are never “black-holed”, so giving a potential performance
> win. Unfortunately, in the parallel setting, it substantially enlarges
> the time window in which two or more duplicate threads might evaluate
> the same thunk, and thus

i'm not an expert, but:

a lazy haskell value is some expression to compute. when evaluation of this value starts, it's replaced by a black hole - a special value. an attempt to compute a black-holed value (in the same thread) means that we have a cyclic computation dependency - an exception is triggered. once the value of the thunk is evaluated, it's written back by code called the mutator

--
Best regards,
 Bulat    mailto:bulat.zigans...@gmail.com
Re: What is the mutator?
> i'm not an expert, but:
> once value of thunk is evaluated, it's written back by code called
> mutator

Whilst that is indeed mutation, it is not what is usually referred to as the mutator in the context of garbage collection. Quite simply, the mutator is the actual running program, as opposed to the GC, which is part of the underlying runtime system.

Conceptually, the mutator and GC are the two mutually-exclusive threads of control that modify the heap. Usually one must halt while the other runs.

Regards,
Malcolm
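GHC's own runtime statistics use the same mutator/GC split, which may help make the terminology concrete. The sketch below assumes a reasonably recent GHC whose base library provides `GHC.Stats.getRTSStats` (this API is newer than the GHC versions discussed in this thread), and a program started with `+RTS -T` so that statistics are collected. The helper `nsToMs` and the sample workload are made up for illustration.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Convert nanoseconds to milliseconds for display.
nsToMs :: Int64 -> Double
nsToMs ns = fromIntegral ns / 1e6

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; run the program with +RTS -T"
    else do
      -- Some work done by the mutator, i.e. by the program itself.
      -- Multiplying big Integers allocates, so the GC gets to run as well.
      let digits = length (show (product [1 .. 3000 :: Integer]))
      putStrLn ("digits: " ++ show digits)
      stats <- getRTSStats
      -- CPU time spent running the program versus collecting garbage.
      putStrLn ("mutator CPU (ms): " ++ show (nsToMs (mutator_cpu_ns stats)))
      putStrLn ("GC CPU (ms):      " ++ show (nsToMs (gc_cpu_ns stats)))
```

The two figures correspond to the two threads of control described above: time spent in the running program and time spent in the collector.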
Re: use gtar and not tar under solaris
ghc's configure script set:

TAR = /opt/csw/bin/gtar
ZIP = zip

in mk/config.mk although I've got /usr/bin/tar, too (and earlier in my path).

Cheers Christian

Duncan Coutts wrote:
> On Tue, 2009-08-04 at 10:15 +0200, Christian Maeder wrote:
> > Hi,
> >
> > I've just been informed that unpacking the binary (i386) solaris
> > distribution using bunzip2 and tar:
>
> It may work better in future if you use a non-GNU tar to pack it up in
> the first place. GNU tar uses a non-standard tar format by default.
> Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
> format file.
>
> It's also possible to use gnu tar to make standard tar format files,
> using --format ustar rather than gnu tar's default of --format gnu.
>
> Duncan
> (who now knows an unhealthy amount about tar file formats after writing
> a Haskell package to read them)
RE: Compiling large source files
Gunther

Could you make a Trac bug report for this, and attach your source file? It'd help if you could first check that things are still bad with GHC 6.10.4.

Another useful thing would be to provide data on whether the behaviour is non-linear. Eg try with 1k, 2k, 4k, 8k, etc elements in your list and see how it behaves.

Providing reproducible and well-characterised bug reports greatly increases the likelihood that we'll fix it!

Thanks

Simon

| -----Original Message-----
| From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-users-
| boun...@haskell.org] On Behalf Of Günther Schmidt
| Sent: 03 August 2009 22:09
| To: Thomas DuBuisson
| Cc: glasgow-haskell-users@haskell.org
| Subject: Re: Compiling large source files
|
| Hi Thomas,
|
| yes, a source file with a single literal list with 85k elements.
|
| Günther
|
| On 03.08.2009 at 22:20, Thomas DuBuisson
| thomas.dubuis...@gmail.com wrote:
|
| > Can you define "very large" and "compiler"? I know an old version of
| > GHC (6.6?) would eat lots of memory when there were absurd numbers of
| > let statements.
| >
| > Thomas
| >
| > 2009/8/3 Günther Schmidt red...@fedoms.com:
| > > Hi all,
| > >
| > > I'm having trouble compiling very large source files, the compiler
| > > eats 2GB and then dies. Is there a way around it?
| > >
| > > Günther
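The doubling experiment suggested above (1k, 2k, 4k, 8k, ... elements) could be set up with a small generator along these lines. This is a sketch under assumptions: the original source file is not shown in the thread, so a literal list of `Int`s stands in for it, and the module names `Big1000`, `Big2000`, ... and the function `literalModule` are hypothetical.

```haskell
import Data.List (intercalate)

-- | Source text of a module that defines a single literal list of n Ints.
-- The module name is a parameter so that it can match the file name.
literalModule :: String -> Int -> String
literalModule name n =
  unlines
    [ "module " ++ name ++ " where"
    , ""
    , "bigList :: [Int]"
    , "bigList = [" ++ intercalate ", " (map show [1 .. n]) ++ "]"
    ]

main :: IO ()
main = mapM_ write (take 8 (iterate (* 2) 1000))
  where
    -- Writes Big1000.hs, Big2000.hs, ..., Big128000.hs to the current directory.
    write n = do
      let name = "Big" ++ show n
      writeFile (name ++ ".hs") (literalModule name n)
```

Each generated file can then be compiled separately (for instance with something like `ghc -c Big1000.hs +RTS -s`, assuming the compiler binary accepts RTS options) and the reported time and memory compared across sizes to see whether the growth looks linear.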
RE: GHC 6.10.2 consuming lots of memory while compiling - help?
Justin

Presumably DeliveryManagementQueries uses TH to generate lots of glop?

  *** Desugar: Result size = 616,969

That's a big program! What kind of glop is it? Maybe it's the same kind of thing as one of these?

http://hackage.haskell.org/trac/ghc/query?status=new&status=assigned&status=reopened&type=compile-time+performance+bug&order=priority

Regardless, if you can make a reproducible test case that'd help us. Probably you can do this simply by generating a file with lots of repeated glop of the same kind as your TH is spitting out.

Simon

| -----Original Message-----
| From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-users-
| boun...@haskell.org] On Behalf Of Justin Bailey
| Sent: 24 July 2009 20:03
| To: glasgow-haskell-users@haskell.org
| Subject: GHC 6.10.2 consuming lots of memory while compiling - help?
|
| I apologize in advance for the vagueness of my report here - it's one
| of those situations I'm not sure how to cut it down to size yet.
|
| I have a module that uses HaskellDB and Template Haskell together. The
| module itself depends on 23 other modules, each of which give a type
| definition for a particular database table or view. I only mention
| that to emphasize that the module depends on some big types (HList
| records w/ 20+ members) and on compile-time generated code.
|
| My problem is this - when GHC compiles the module, it consumes 1.2 GB
| of memory, takes about 10 minutes, and finally produces an object
| file. The memory usage seems related to template haskell, but I'm not
| positive.
|
| I've attached verbose output from compiling the module in question.
| The command line I used was:
|
|   ghc -v --make -c DeliveryManagementQueries.hs -XEmptyDataDecls
|     -XTypeSynonymInstances -XTemplateHaskell
|
| Now for my question - please ignore the specifics of
| haskelldb/template haskell - any suggestions for figuring out what GHC
| is doing, besides tried-and-true divide and conquer?
|
| Justin
Re: use gtar and not tar under solaris
On Thu, Aug 06, 2009 at 12:30:51PM +0100, Duncan Coutts wrote:
> > I've just been informed that unpacking the binary (i386) solaris
> > distribution using bunzip2 and tar:
>
> It may work better in future if you use a non-GNU tar to pack it up in
> the first place. GNU tar uses a non-standard tar format by default.
> Solaris tar would likely have more luck unpacking a POSIX/USTAR tar
> format file.
>
> It's also possible to use gnu tar to make standard tar format files,
> using --format ustar rather than gnu tar's default of --format gnu.

Is there something like pax(1) available on solaris? If so, it should be preferred, because it's a POSIX tool, so there's some hope that it behaves the same on different systems.

pax(1) should be available on all BSD systems, and to my knowledge, there's a pax package available at least for Debian Linux.

Ciao,
	Kili
Re: What is the mutator?
> Date: Thu, 6 Aug 2009 00:38:08 -0700
> From: Jason Dusek jason.du...@gmail.com
> Subject: What is the mutator?
>
> I've been reading a little about GC latency and have run across
> statements like this:
>
>   One solution to the GC synchronisation problem would be to implement
>   a concurrent garbage collector. Typically, however, concurrent GC
>   adds some overhead to the mutator, since it must synchronise with
>   the collector.
>
>   some thunks are never “black-holed”, so giving a potential
>   performance win. Unfortunately, in the parallel setting, it
>   substantially enlarges the time window in which two or more
>   duplicate threads might evaluate the same thunk, and thus
>
>   -- Comparing and Optimising Parallel Haskell Implementations for
>      Multicore Machines
>
> What is the mutator?

Hi Jason,

as Malcolm already said, the mutator in this text is the/a thread evaluating some Haskell expression. Just to add some more details to the picture...

In general, a Haskell expression is a computation represented as a graph in the heap. Haskell evaluates lazily and does not have to fully evaluate every part of it for the program to finish. Unevaluated parts are thunks.

As soon as one of potentially several concurrent (mutator) threads starts to evaluate a thunk, it is replaced by a blackhole, which keeps other threads out of it until the node in the graph is evaluated (say, to a list cons (:), with probably unevaluated head and tail). Then the blackhole is updated with the new value. Other threads block on the blackhole in the meantime (so not necessarily an exception in the case of concurrent mutator threads) and are woken up by the update.

The passage you quote above is about two separate aspects:

1. Garbage collection and mutator running concurrently: while they usually do, they do not _have_ to exclude each other, but not doing so means that the objects they are treating have to be locked.

2. About Blackholing: in the sequential evaluation (where hitting a blackhole indeed means to have a loop), some better performance can be gained by not blackholing a thunk immediately, so this was done in GHC earlier. However, it increases the chance for 2 mutator threads to evaluate the same thunk (double work), and we got better performance by blackholing immediately.

Cheers,
Jost
Re: use gtar and not tar under solaris
Matthias Kilian wrote:
> On Thu, Aug 06, 2009 at 12:30:51PM +0100, Duncan Coutts wrote:
> > > I've just been informed that unpacking the binary (i386) solaris
> > > distribution using bunzip2 and tar:
> >
> > It may work better in future if you use a non-GNU tar to pack it up
> > in the first place. GNU tar uses a non-standard tar format by
> > default. Solaris tar would likely have more luck unpacking a
> > POSIX/USTAR tar format file.
> >
> > It's also possible to use gnu tar to make standard tar format files,
> > using --format ustar rather than gnu tar's default of --format gnu.
>
> Is there something like pax(1) available on solaris? If so, it should
> be preferred, because it's a POSIX tool, so there's some hope that it
> behaves the same on different systems.

Yes, pax is available under solaris. I thought GNU tar is the standard packer under unix.

(The usage message of pax is less clear.)

Below is a part of gtar --help

Cheers Christian

 Archive format selection:

  -H, --format=FORMAT        create archive of the given format.

 FORMAT is one of the following:

    gnu                      GNU tar 1.13.x format
    oldgnu                   GNU format as per tar <= 1.12
    pax                      POSIX 1003.1-2001 (pax) format
    posix                    Same as pax
    ustar                    POSIX 1003.1-1988 (ustar) format
    v7                       old V7 tar format
Re: use gtar and not tar under solaris
On Thu, Aug 06, 2009 at 08:54:49PM +0200, Christian Maeder wrote:
> > Is there something like pax(1) available on solaris? If so, it
> > should be preferred, because it's a POSIX tool, so there's some hope
> > that it behaves the same on different systems.
>
> Yes, pax is available under solaris. I thought GNU tar is the standard
> packer under unix.

Depends on what `standard' means ;-)

- tar has been there forever on unices, with several slightly incompatible format extensions
- GNU tar is just another implementation, typically used on Linux, and it has its own incompatible format extensions.
- pax is (or should be) available everywhere, its behaviour is defined by POSIX, it should (by default) create archives readable by most tar implementations, but almost nobody knows about it ;-)

http://www.opengroup.org/onlinepubs/9699919799/utilities/pax.html

I wonder whether Duncan did read and understand that bit of documentation, I didn't even read all of it ;-)

> (The usage message of pax is less clear.)

The manpage (and of course the POSIX definition) are hard stuff, too. However, to create an archive, you can use something like

	$ pax -wf foo.tar directory

Ciao,
	Kili
Re: What is the mutator?
2009/08/06 Jost Berthold berth...@mathematik.uni-marburg.de:
> as Malcolm already said, the mutator in this text is the/a thread
> evaluating some Haskell expression.

I want to thank everyone for taking the time to clarify that to me; I'm now much more able to follow discussions of Haskell garbage collection.

> 1. Garbage collection and mutator running concurrently: while they
> usually do, they do not _have_ to exclude each other, but not doing so
> means that the objects they are treating have to be locked.

So this is the part that actually led me here. Say you are implementing a network server, for example -- you don't want to have big spikes in the request latency due to GC. Not that Haskell is so much worse off relative to Java, say; Erlang is the only language I'm aware of that takes concurrent GC seriously.

However, it seems that this problem is hard to solve for Haskell:

  Parallel GC is when the whole system stops and performs multi-threaded GC, as opposed to concurrent GC, which is when the GC runs concurrently with the program. We think concurrent GC is unlikely to be practical in the Haskell setting, due to the extra synchronisation needed in the mutator. However, there may always be clever techniques that we haven't discovered, and synchronisation might become less expensive, so the balance may change in the future.

  -- Simon Marlow

So I wonder, to what degree is GC latency controllable in Haskell? It seems that, pending further research, we can not hope for concurrent GC.

> 2. About Blackholing: in the sequential evaluation (where hitting a
> blackhole indeed means to have a loop), some better performance can be
> gained by not blackholing a thunk immediately, so this was done in GHC
> earlier. However, it increases the chance for 2 mutator threads to
> evaluate the same thunk (double work), and we got better performance
> by blackholing immediately.

Could blackholing too early result in non-termination ("...hitting a blackhole indeed means to have a loop")? Then it's not just a matter of performance when we do it?

--
Jason Dusek