Poor array behaviour

2009-12-22 Thread Herk, Robert van
Hi all,

I stumbled upon something odd with respect to arrays. I know about GHC not 
doing card marking and traversing whole arrays one each GC for each array with 
alterations, but still I don't understand the behaviour.

The situation is like this. I am building a compiler. This compiler manipulates 
a large graph (for optimizations, etc.). This graph is in memory. As the graph 
is vast, we did a lot of effort to optimize access to it. The nodes in the 
graph are IORefs to some data structure, and this data structure contains its 
edges.

Each node stores its edges in buckets. This is because edges have different 
types (some are control flow links, others are other types of dependencies). 
Most of the graph manipulations only traverse over one type of edge, so we 
figured it would be faster to store the edges in buckets to support such 
queries. Inside the buckets, there are Data.Maps containing the actual edges 
for that bucket. The keys in this map are the target nodes of that edges, which 
are IORefsOrds, which are pairs of a unique integers and a IORef, such that 
they can be ordered and used as keys in a Map. The values are lists of edges to 
that target.

The weird thing is in the buckets. Per node, all buckets are stored in an 
array.  We gave each edge type an integer key.

And we use that key as array index to determine the bucket. I've tried 
implementing this array in two ways:

1) Each node contains a immutable array with IORefs. The IORefs contain the 
actual Data.Maps for the buckets. So, for instance, initializing a node looks 
something like

import Data.Array

uBucket = 8 //There are 8 buckets because there are 8 types of edges.

emptyEdges =
  do  buckets <- sequence ( take uBucket $ repeat (newIORef Map.empty) )
let myArray = listArray (0, 7) buckets
return myArray

So in this solution we have an extra layer of indirection, namely the IORefs, 
but the array is immutable. Because when the edges change for a particular 
node, we can write directly into the IORef and leave the array untouched.

2) Each node contains a mutable array that contains the Data.Maps directly. So, 
for instance, initializing a node looks something like:

import Data.Array.IO

emptyEdges =
   do myArray <- newArray (0, 7) Map.empty
return $! myArray

Of course, one expect 2 to be quickest. However, it turns out that for 2, the 
application is spending much more time in GC, and __even without GC__ the 
application is still slower! I find both things rather weird: I know that for 
huge arrays I am expected to suffer from the missing card-marking "bug", but my 
array sizes are only 8.

Yet the difference I get in GC time are huge:

Solution 1
-
  13,476,008,560 bytes allocated in the heap
   1,714,767,712 bytes copied during GC
 151,518,528 bytes maximum residency (23 sample(s))
   1,743,176 bytes maximum slop
 325 MB total memory in use (2 MB lost due to fragmentation)

  Generation 0: 25689 collections, 0 parallel,  2.43s,  2.53s elapsed
  Generation 1:23 collections, 0 parallel,  1.89s,  2.05s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time   20.58s  ( 20.80s elapsed)
  GCtime4.32s  (  4.58s elapsed)
  RPtime0.00s  (  0.00s elapsed)
  PROF  time0.00s  (  0.00s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time   24.90s  ( 25.38s elapsed)

  %GC time  17.4%  (18.0% elapsed)

  Alloc rate654,751,131 bytes per MUT second

  Productivity  82.6% of total user, 81.1% of total elapsed

Solution 2


  15,901,133,296 bytes allocated in the heap
   9,083,063,848 bytes copied during GC
 117,501,208 bytes maximum residency (23 sample(s))
   1,902,568 bytes maximum slop
 265 MB total memory in use (2 MB lost due to fragmentation)

  Generation 0: 30315 collections, 0 parallel, 44.73s, 44.89s elapsed
  Generation 1:23 collections, 0 parallel,  1.78s,  1.91s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time   25.93s  ( 26.31s elapsed)
  GCtime   46.51s  ( 46.80s elapsed)
  RPtime0.00s  (  0.00s elapsed)
  PROF  time0.00s  (  0.00s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time   72.43s  ( 73.11s elapsed)

  %GC time  64.2%  (64.0% elapsed)

  Alloc rate613,325,569 bytes per MUT second

  Productivity  35.8% of total user, 35.5% of total elapsed

Is the behaviour I am seeing to be expected? And if so, wouldn't it make more 
sense to implement Data.Array.IO internally such that it contains an immutable 
array of IORefs? I also saw that in the next GHC, card marking will be done per 
128 array items. Yet this behaviour seems to point out that at least for my 
application problems can already occur with array size 8. And is it expected 
that solution 1), that has an extra layer of indirection, still outperforms 
solution 2) even with GC times substracted?

Regards,
Robert

The information contained in this message m

Re: Weird warnings when using ViewPatterns

2009-12-22 Thread Bas van Dijk
On Mon, Dec 21, 2009 at 10:17 PM, Bas van Dijk  wrote:
> On Mon, Dec 21, 2009 at 9:03 PM, Robert Greayer  wrote:
>> There's been some improvement at least in 6.12.1, see:
>> http://hackage.haskell.org/trac/ghc/ticket/2395
>
> Thanks for pointing me to the ticket!
>
> I'm emerging ghc-6.12.1 right now to try it out (I'm on Gentoo Linux).

All weird warnings that I got when using ViewPatterns are gone when
using 6.12.1.

Thanks,

Bas
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHC-6.12.1: broken configure

2009-12-22 Thread Jens Petersen
2009/12/19 Kirill A. Shutemov :
> I want to build ghc for i586-alt-linux-gnu.

Why? :)

> Why do not use config.guess to guess correct host/target/build
> instead of reinvent wheel?

While I sympathize (I gave up long ago trying to use  host/target/build
with ghc - we use them by default in Fedora) - I guess the short answer
is something like cross-compiling is not supported or the effort
to support host/target/build is more than the win?  But those
are just my assumptions - I leave a real answer to a ghc buildsystem
expert.

If you really want a fix searching trac and opening a ticket would be
more effective probably.
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: x64 linux, x86 program and libgmp shared library

2009-12-22 Thread Christian Maeder
Christian Maeder schrieb:
> I think, libgmp.so* should be available under /usr/lib64/ if you use ghc.
> 
> There are rpm packages, i.e. SuSE
>   gmp-4.2.1-58
>   gmp-devel-4.2.1-58
> 
> In order to create binaries with libgmp being statically linked in, it
> is possible to copy libgmp.a into ghc'c libdir so that those binaries
> can run on systems with libgmp.so.
  without the shared library.
> 
> HTH Christian
> 
> Bulat Ziganshin schrieb:
>> Hello glasgow-haskell-users,
>>
>> i compile my program for linux using ghc 6.6.1 (32-bit)
>>
>> user of my program asks: "any plans of making a static build for
>> linux? that one does not work under x86-64 due to dependencies,
>> missing libgmp which it seems is not available under 64bit. i could be
>> wrong tho."
>>
>> how this may be resolved? should libgmp be available in x64 linux, or
>> i should change compile options?
>>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: x64 linux, x86 program and libgmp shared library

2009-12-22 Thread Christian Maeder
I think, libgmp.so* should be available under /usr/lib64/ if you use ghc.

There are rpm packages, i.e. SuSE
  gmp-4.2.1-58
  gmp-devel-4.2.1-58

In order to create binaries with libgmp being statically linked in, it
is possible to copy libgmp.a into ghc'c libdir so that those binaries
can run on systems with libgmp.so.

HTH Christian

Bulat Ziganshin schrieb:
> Hello glasgow-haskell-users,
> 
> i compile my program for linux using ghc 6.6.1 (32-bit)
> 
> user of my program asks: "any plans of making a static build for
> linux? that one does not work under x86-64 due to dependencies,
> missing libgmp which it seems is not available under 64bit. i could be
> wrong tho."
> 
> how this may be resolved? should libgmp be available in x64 linux, or
> i should change compile options?
> 
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cabal-install-0.8 final testing

2009-12-22 Thread Duncan Coutts
On Mon, 2009-12-21 at 11:34 +, Gracjan Polak wrote:
> Duncan Coutts  googlemail.com> writes:
> 
> >   if flag(test)
> >   Buildable: True
> >   Build-depends: base<5, bytestring, HUnit, directory
> >   else
> >   Buildable: False
> > 
> 
> Is this solution good for the time being? If so, I'll change it to make peace
> and happiness prevail among cabal users.

I suggest you don't change anything until we decide what the semantics
are supposed to be in this case.

> Side question: mmaptest is meant to be devel/testing thing only that is not
> build during normal usage. Is there a better way to achieve such purpose?

Not something purpose designed at the moment.

Duncan

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users