[Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Johan Tibell
Hi Aleksandar, I thought it'd be educational to do some back-of-the-envelope calculations to see how much memory we'd expect to use to store words in a HashMap ByteString Int. First, let's start by looking at how much memory one ByteString uses. Here's the definition of ByteString [1]: data
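A back-of-the-envelope estimate of this kind can be sketched as a small calculator. This is only an illustration: the overhead figures passed in `main` (words of fixed overhead per key and per map entry) are hypothetical placeholders, not numbers taken from the post, and they depend on the GHC and library versions in use.

```haskell
module Main where

-- | Round a byte count up to a whole number of machine words.
roundUpToWord :: Int -> Int -> Int
roundUpToWord wordSize n = ((n + wordSize - 1) `div` wordSize) * wordSize

-- | Estimated heap bytes for one key: a fixed overhead (in words)
-- plus the payload, rounded up to a word boundary.
keyBytes :: Int -> Int -> Int -> Int
keyBytes wordSize overheadWords payloadLen =
  overheadWords * wordSize + roundUpToWord wordSize payloadLen

-- | Estimated heap bytes for a map holding n entries whose keys
-- have the given average payload length.
mapBytes :: Int -> Int -> Int -> Int -> Int -> Int
mapBytes wordSize keyOverheadWords perEntryWords avgLen n =
  n * (perEntryWords * wordSize + keyBytes wordSize keyOverheadWords avgLen)

main :: IO ()
main = do
  let wordSize    = 8  -- 64-bit machine
      keyOverhead = 9  -- ASSUMPTION: placeholder, check your bytestring version
      perEntry    = 5  -- ASSUMPTION: placeholder, check your map implementation
  print (keyBytes wordSize keyOverhead 8)
  print (mapBytes wordSize keyOverhead perEntry 8 1000000)
```

The point of keeping the overheads as parameters is that the estimate can be redone quickly once the real per-constructor costs have been read off the data definitions.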

Re: [Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?

2011-06-01 Thread austin seipp
On Tue, May 31, 2011 at 11:13 PM, dm-list-haskell-c...@scs.stanford.edu wrote: It definitely felt like I was running up against something like the monomorphism restriction, but my bindings were function and not pattern bindings, so I couldn't understand what was going on.  I had even gone and

Re: [Haskell-cafe] ANNOUNCE: CamHac: Haskell Hackathon in Cambridge, UK, 12-14 August 2011

2011-06-01 Thread Simon Marlow
On 26/05/2011 15:35, Simon Marlow wrote: CamHac is happening - come and spend a long weekend in Cambridge hacking Haskell code in great surroundings with fantastic company! Full details on the wiki page: http://www.haskell.org/haskellwiki/CamHac When: Friday-Sunday 12-14 August 2011 Where:

[Haskell-cafe] ANN: dtd-types 0.2.0.0

2011-06-01 Thread Yitzchak Gale
The dtd-types package provides types for processing XML DTDs in Haskell. These types are intended to be compatible with and extend the set of types provided by John Millikin's xml-types package. The consensus seems to be to leave this as a separate package and not to merge it with xml-types.

Re: [Haskell-cafe] *GROUP HUG*

2011-06-01 Thread Henning Thielemann
Adrien Haxaire schrieb: I fully agree. These are two of the three reasons which made me choose haskell as the functional language to learn. Coding fortran all day, I wanted a new approach on programming. The strong scientific roots of haskell would give me stuff to learn and discover for a

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread John Lato
From: Brandon Moore brandon_m_mo...@yahoo.com I was worried data sharing might mean your keys retain entire 64K chunks of the input. However, it seems enumLines depends on the StringLike ByteString instance, which just converts to and from String. That can't be efficient, but I suppose it

Re: [Haskell-cafe] *GROUP HUG*

2011-06-01 Thread Adrien Haxaire
On Wed, 01 Jun 2011 11:46:36 +0200, Henning Thielemann wrote: Really, you can write foldr in terms of foldl? So far I was glad I could manage the opposite direction. i didn't try it, that was just an example of how strange/interesting the enthusiasm appeared to me when i started Haskell.

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread John Lato
On Wed, Jun 1, 2011 at 12:55 AM, Aleksandar Dimitrov aleks.dimit...@googlemail.com wrote: On Tue, May 31, 2011 at 11:30:06PM +0100, John Lato wrote: None of these leak space for me (all compiled with ghc-7.0.3 -O2). Performance was pretty comparable for every version, although

Re: [Haskell-cafe] *GROUP HUG*

2011-06-01 Thread Daniel Fischer
On Wednesday 01 June 2011 12:25:06, Adrien Haxaire wrote: On Wed, 01 Jun 2011 11:46:36 +0200, Henning Thielemann wrote: Really, you can write foldr in terms of foldl? So far I was glad I could manage the opposite direction. i didn't try it, that was just an example of how

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread Daniel Fischer
On Wednesday 01 June 2011 12:13:54, John Lato wrote: From: Brandon Moore brandon_m_mo...@yahoo.com I was worried data sharing might mean your keys retain entire 64K chunks of the input. However, it seems enumLines depends on the StringLike ByteString instance, which just converts

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread Daniel Fischer
On Wednesday 01 June 2011 12:28:28, John Lato wrote: There are a few solutions to this. The first is to make a copy of the bytestring so only the required data is retained. In my experiments this wasn't helpful, but it would depend on your corpus. The second is to start with smaller chunks.
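The first suggestion (copying the bytestring so that only the required bytes are retained) might look roughly like the sketch below. It assumes a `Data.Map.Strict` map rather than the HashMap from the thread, and `countWords` is a hypothetical helper name; `B.copy` and `B.words` are the real functions from `Data.ByteString.Char8`.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B
import           Data.List (foldl')
import qualified Data.Map.Strict as M

-- | Count the words of one input chunk. Each key handed to the map
-- is a fresh copy, so no key stored in the map is a slice that
-- keeps the whole chunk alive.
countWords :: M.Map B.ByteString Int -> B.ByteString -> M.Map B.ByteString Int
countWords m0 chunk = foldl' step m0 (B.words chunk)
  where
    step m w = M.insertWith (+) (B.copy w) 1 m

main :: IO ()
main = do
  let chunk  = B.pack "the quick brown fox jumps over the lazy dog the end"
      counts = countWords M.empty chunk
  print (M.lookup (B.pack "the") counts)
```

This version copies on every insertion, including for words already in the map, which is simple but allocates short-lived garbage. Looking the word up first and copying only on a miss avoids that cost.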

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread Edward Z. Yang
That sounds like a plausible reason why naive copying explodes space. Something like string interning would be good here... and since you're hashing already... Edward Excerpts from Daniel Fischer's message of Wed Jun 01 06:46:24 -0400 2011: On Wednesday 01 June 2011 12:28:28, John Lato wrote:
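One way to picture the interning idea is a table that maps each word to a single canonical copy, so that repeated occurrences share one small buffer. This is a sketch only: `InternTable` and `intern` are hypothetical names, and an ordered `Data.Map.Strict` stands in for the hash-based structure the message alludes to.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M

-- | Maps a word to its canonical, independently allocated copy.
type InternTable = M.Map B.ByteString B.ByteString

-- | Return the canonical copy of a word, copying it only the
-- first time the word is seen.
intern :: B.ByteString -> InternTable -> (B.ByteString, InternTable)
intern w table =
  case M.lookup w table of
    Just canonical -> (canonical, table)
    Nothing ->
      let c = B.copy w
      in (c, M.insert c c table)

main :: IO ()
main = do
  let chunk = B.pack "to be or not to be"
      step (acc, t) w = let (c, t') = intern w t in (c : acc, t')
      (ws, table) = foldl step ([], M.empty) (B.words chunk)
  print (M.size table)  -- number of distinct words
  print (length ws)     -- number of words seen
```

Because a copy is made only on a miss, the number of copies is bounded by the number of distinct words rather than by the number of occurrences.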

Re: [Haskell-cafe] maybe a GHC bug but not sure

2011-06-01 Thread Edward Amsden
And after a lot more sleep and some digging, it turns out that the build script was forcing GCC to build the .o file as a 32 bit binary, and thus causing the magic mismatch. On Mon, May 30, 2011 at 11:20 PM, Edward Amsden eca7...@cs.rit.edu wrote: When building the Haskell Objective C bindings

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread Aleksandar Dimitrov
Hi John, I think the issue is data sharing, as Brandon mentioned. A bytestring consists of an offset, length, and a pointer. You're using a chunk size of 64k, which means the generated bytestrings all have a pointer to that 64k of data. Suppose there's one new word in that 64k, and it's

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread Ketil Malde
Aleksandar Dimitrov aleks.dimit...@googlemail.com writes: Now, here's some observations: on a 75M input file (minuscule, compared to what I actually need) this program will eat 30M of heap space (says profiling) and return in 14 secs. I have two problems with that: a) that's too much heap

Re: [Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Aleksandar Dimitrov
Hello Johan, On Wed, Jun 01, 2011 at 08:52:04AM +0200, Johan Tibell wrote: I thought it'd be educational to do some back-of-the-envelope calculations to see how much memory we'd expect to use to store words in a HashMap ByteString Int. Thank you for your writeup, which is very informative!

[Haskell-cafe] Oracle Sessions in Takusen

2011-06-01 Thread Dmitry Olshansky
Hello, Could anyone explain strange behavior of Takusen with OracleDB (OraClient 11.x)? Several sequential sessions give Segmentation Fault error. In case of nested sessions it works well. {-# LANGUAGE ScopedTypeVariables #-} module Main where import Database.Oracle.Enumerator import

Re: [Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Johan Tibell
On Wed, Jun 1, 2011 at 4:24 PM, Aleksandar Dimitrov aleks.dimit...@googlemail.com wrote: One additional thought: it might be interesting to provide this outside of this mailing list, perhaps as a documentation addendum to unordered-containers, since it really explains the size needs for

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread Johan Tibell
Hi Aleks, On Wed, Jun 1, 2011 at 12:14 AM, Aleksandar Dimitrov aleks.dimit...@googlemail.com wrote: I implemented your method, with these minimal changes (i.e. just using a main driver in the same file.) countUnigrams :: Handle -> IO (M.Map S.ByteString Int) countUnigrams = foldLines (\ m s ->

Re: [Haskell-cafe] Oracle Sessions in Takusen

2011-06-01 Thread Jason Dagit
On Wed, Jun 1, 2011 at 7:44 AM, Dmitry Olshansky olshansk...@gmail.com wrote: Hello, Could anyone explain strange behavior of Takusen with OracleDB (OraClient 11.x)? Several sequential sessions give Segmentation Fault error. In case of nested sessions it works well. I'm CC'ing the takusen

Re: [Haskell-cafe] Oracle Sessions in Takusen

2011-06-01 Thread Jason Dagit
On Wed, Jun 1, 2011 at 9:01 AM, Jason Dagit dag...@gmail.com wrote: I'm CC'ing the takusen email list so that Oleg and Alistair will see your message.  They are more familiar with the Oracle support than I am. I should really link to the original message:

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread malcolm.wallace
Just out of interest, did you try reading the input as plain old Strings? They may be unfashionable these days, and have their own type of badness in space and time performance, but might perhaps be a win over ByteStrings for extremely large datasets. Regards, Malcolm On 01 Jun, 2011, at 02:49

Re: [Haskell-cafe] Haskell School of Expression (graphics)

2011-06-01 Thread Jason Dagit
On Mon, May 30, 2011 at 1:19 AM, Jerzy Karczmarczuk jerzy.karczmarc...@unicaen.fr wrote: Henk-Jan van Tuyl commenting Andrew Coppin ... (about HOpenGL and H.Platform) ... Uh... yes, you might be right about that. However, AFAIK you still need something with which to create a rendering

Re: [Haskell-cafe] *GROUP HUG*

2011-06-01 Thread Ivan Tarasov
On Wed, Jun 1, 2011 at 3:28 AM, Daniel Fischer daniel.is.fisc...@googlemail.com wrote: On Wednesday 01 June 2011 12:25:06, Adrien Haxaire wrote: On Wed, 01 Jun 2011 11:46:36 +0200, Henning Thielemann wrote: Really, you can write foldr in terms of foldl? So far I was glad I could

[Haskell-cafe] [iteratee] how to do nothing .. properly

2011-06-01 Thread Sergey Mironov
Hi. Would anybody explain a situation with iter6 and iter7 below? Strange thing - first one consumes no input, while second consumes it all, while all the difference is peek which should do no processing (just copy next item in stream and return to user). What I am trying to do - is to write an

Re: [Haskell-cafe] *GROUP HUG*

2011-06-01 Thread Tom Murphy
How about this: myFoldr :: (a -> b -> b) -> b -> [a] -> b myFoldr f z xs = foldl' (\s x v -> s (x `f` v)) id xs $ z Cheers, Ivan Great! Now I really can say Come on! It's fun! I can write foldr with foldl!
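Laid out as a complete program, the definition quoted above can be checked against the Prelude's `foldr`. The `main` driver and its test inputs are additions for illustration; `myFoldr` itself is the definition from the message.

```haskell
module Main where

import Data.List (foldl')

-- | foldr expressed through foldl': the left fold builds up a
-- function of type b -> b, which is finally applied to z.
myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr f z xs = foldl' (\s x v -> s (x `f` v)) id xs $ z

main :: IO ()
main = do
  -- Rebuilding a list with (:) shows that element order is preserved.
  print (myFoldr (:) [] [1, 2, 3 :: Int])
  -- A non-associative operator shows the right-nested bracketing.
  print (myFoldr (-) 0 [10, 3, 2 :: Int] == foldr (-) 0 [10, 3, 2])
```

Unlike the real `foldr`, this version has to walk the whole list before it produces anything, so it does not terminate on infinite lists even when `f` is lazy in its second argument.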

Re: [Haskell-cafe] *GROUP HUG*

2011-06-01 Thread Don Stewart
http://stackoverflow.com/questions/6172004/writing-foldl-using-foldr/6172270#6172270 Thank Graham Hutton and Richard Bird. On Wed, Jun 1, 2011 at 7:12 PM, Tom Murphy amin...@gmail.com wrote: How about this: myFoldr :: (a -> b -> b) -> b -> [a] -> b myFoldr f z xs = foldl' (\s x v -> s (x `f` v))

Re: [Haskell-cafe] *GROUP HUG*

2011-06-01 Thread Albert Y. C. Lai
On 11-06-01 07:15 PM, Don Stewart wrote: http://stackoverflow.com/questions/6172004/writing-foldl-using-foldr/6172270#6172270 Thank Graham Hutton and Richard Bird. Another one along the same line: http://www.vex.net/~trebla/haskell/natprim.xhtml Yet one more, along the tangent:

Re: [Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Jason Dagit
On Wed, Jun 1, 2011 at 8:33 AM, Johan Tibell johan.tib...@gmail.com wrote: On Wed, Jun 1, 2011 at 4:24 PM, Aleksandar Dimitrov aleks.dimit...@googlemail.com wrote: One additional thought: it might be interesting to provide this outside of this mailing list, perhaps as a documentation

[Haskell-cafe] Haskell Weekly News: Issue 184

2011-06-01 Thread Daniel Santa Cruz
Welcome to issue 184 of the HWN, a newsletter covering developments in the Haskell community. This release covers the week of May 22 to 28, 2011. Announcements The newsletter has not been posting new library announcements, but Ivan Lazar's announcement of his new wl-pprint-text

Re: [Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Johan Tibell
On Thu, Jun 2, 2011 at 5:10 AM, Jason Dagit dag...@gmail.com wrote: One of the cool things about SO is that you can answer your own question.  For example, you might do that if you're anticipating an FAQ.  I think asking this question on SO and reposting your answer from this thread would be