Re[4]: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-08 Thread Bulat Ziganshin
Hello Jason,

Thursday, May 7, 2009, 9:06:33 PM, you wrote:

 you are right again. so, that remains: you shouldn't suppose that
 user have read 90's GC paper. give a short excerpt of it: how
 generational GC works and how memory usage converts to memory
 footprint. then descriptions of RTS options will serve as
 modifications to this baseline

 Perhaps you could write a description here to the list or the wiki and
 after a few iterations it could be included in the GHC manual?

ok, once i will find spare time

-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Anton van Straaten

FFT wrote:

On Wed, May 6, 2009 at 8:20 PM, Anton van Straaten
an...@appsolutions.com wrote:


The app is written for a client under NDA, so a blog about it would have to
be annoyingly vague.



No doubt the potential for encountering space leaks goes up as one writes
less pure code, persist more things in memory, and depend on more libraries.


Exactly. I'm worried about, e.g. needing to use something as simple as
a stream of prime numbers (see the recent thread about leaks there)


Haskell lets you easily create infinite lists, which is a powerful and 
useful feature.  But if you abuse that feature, you'll have problems, 
because you don't have infinite memory.  The further you traverse an 
infinite list, while maintaining a reference to its head, the more 
memory you use.  It'd be a stretch to characterize this as hard.


I don't see much connection between this and the space efficiency of 
long-running programs.


I feel a bit like we're discussing a monster under the bed: i bet it's 
huge! its teeth must be so sharp!  Maybe we should just look under the 
bed, i.e. implement what we need and see what happens?


Anton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Ketil Malde
Anton van Straaten an...@appsolutions.com writes:

 Exactly. I'm worried about, e.g. needing to use something as simple as
 a stream of [...]

 Haskell lets you easily create infinite lists, which is a powerful
 and useful feature.

This has bit me on several occasions, and I think streaming over
an infinite list with too little strictness has been the cause of most
of my space leaks.  I've become pretty good at spotting this in
advance now, so I no longer need to whip out the profiling as often.

The archetypical case is parsing a large file into a non-strict data
structure.  The data structure will then tend to hang onto the whole
input, instead of the result, which typically is much more compact.
Making the data structure strict solves the problem.

 It'd be a stretch to characterize this as hard.

I guess it's hard in the sense that a) it's something different from
what you need to think about in other languages, and b) it's not
really an error, so you don't get as much help from the compiler or
RTS. 

 I don't see much connection between this and the space efficiency of
 long-running programs.

Except a long-running program could be doing something like
parsing a conceptually infinite input stream.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: [Haskell-cafe] Is Haskell a Good Choice for Web Applications?(ANN: Vocabulink)

2009-05-07 Thread Tim Docker
I think that multi-threading in combination with laziness makes space
usage harder to manage. In fact, just today
I have discovered a problem with a long running server process with a
subtle space leak.

With a regular process that communicates with the outside world via IO,
I know that the act of communicating a
value causes it to be fully evaluated. However, with a multi threaded
process, communication occurs via writes
to TVars/IOVars and nothing gets evaluated. This gives lots of
opportunities for space leaks. In this
particularly case cleanup code was removing a large entry from a map
stored in a Tvar. Because that map is only
read infrequently, however, the memory release is delayed.

This is the second such problem I've found. The profiling tools do help
in discovering them, but it still needs
a bit of thought and analysis. I wonder if, for my application, I should
work out some means of deepseqing every
value written to a Tvar.

Tim

-Original Message-
From: haskell-cafe-boun...@haskell.org
[mailto:haskell-cafe-boun...@haskell.org] On Behalf Of wren ng thornton
Sent: Thursday, 7 May 2009 2:06 PM
To: haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] Is Haskell a Good Choice for Web
Applications?(ANN: Vocabulink)

FFT wrote:
 Anton van Straaten wrote:
 
  The app is written for a client under NDA, so a blog about it would 
  have to be annoyingly vague.
 
  No doubt the potential for encountering space leaks goes up as one 
  writes less pure code, persist more things in memory, and depend on
more libraries.
 
 Exactly. I'm worried about, e.g. needing to use something as simple as

 a stream of prime numbers (see the recent thread about leaks there)

The issues here are going to be the same in Haskell as in every other
language. There's always a tradeoff between the memory of caching old
results vs the time of recalculating them. At present no language's
RTS/GC is smart enough to make that tradeoff for you, and so memoization
must be done manually.

There are some tricks to help make this easier (e.g. weak pointers), but
the problem will always remain. The only thing that makes this perhaps
trickier than in other languages is that, AFAIK/R, the reflection API to
ask the RTS how it's feeling about memory at any given point isn't
terribly portable (between Haskell compilers) or polished/pretty. Then
again, the other GC languages I've dealt with aren't much better and are
often worse.

--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: [Haskell-cafe] Is Haskell a Good Choice for Web Applications?(ANN: Vocabulink)

2009-05-07 Thread Tim Docker
I think that multi-threading in combination with laziness makes space
usage harder to manage. In fact, just today
I have discovered a problem with a long running server process with a
subtle space leak.

With a regular process that communicates with the outside world via IO,
I know that the act of communicating a
value causes it to be fully evaluated. However, with a multi threaded
process, communication occurs via writes
to TVars/IOVars and leaves thunks unevaluated. This gives lots of
opportunities for space leaks. In this
particularly case cleanup code was removing a large entry from a map
stored in a Tvar. Because that map is only
read infrequently, however, the memory release is delayed.

This is the second such problem I've found. The profiling tools do help
in discovering them, but it still needs
a bit of thought and analysis. I wonder if, for my application, I should
work out some means of deepseqing every
value written to a Tvar.

Tim 

-Original Message-
From: haskell-cafe-boun...@haskell.org
[mailto:haskell-cafe-boun...@haskell.org] On Behalf Of Anton van
Straaten
Sent: Thursday, 7 May 2009 4:28 PM
To: haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] Is Haskell a Good Choice for Web
Applications?(ANN: Vocabulink)

FFT wrote:
 On Wed, May 6, 2009 at 8:20 PM, Anton van Straaten 
 an...@appsolutions.com wrote:
 
 The app is written for a client under NDA, so a blog about it would 
 have to be annoyingly vague.
 
 No doubt the potential for encountering space leaks goes up as one 
 writes less pure code, persist more things in memory, and depend on
more libraries.
 
 Exactly. I'm worried about, e.g. needing to use something as simple as

 a stream of prime numbers (see the recent thread about leaks there)

Haskell lets you easily create infinite lists, which is a powerful and
useful feature.  But if you abuse that feature, you'll have problems,
because you don't have infinite memory.  The further you traverse an
infinite list, while maintaining a reference to its head, the more
memory you use.  It'd be a stretch to characterize this as hard.

I don't see much connection between this and the space efficiency of
long-running programs.

I feel a bit like we're discussing a monster under the bed: i bet it's
huge! its teeth must be so sharp!  Maybe we should just look under the
bed, i.e. implement what we need and see what happens?

Anton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Simon Marlow

On 06/05/2009 21:18, Bulat Ziganshin wrote:

Hello FFT,

Wednesday, May 6, 2009, 11:59:53 PM, you wrote:


I've heard it's hard to contain a long-running Haskell application in
a finite amount of memory


not exactly. you may alloc fixed pool of memory to application (say, 1gb)
if you know that it never need more memory. but as far as you don't do
it, memory usage grows with each major GC. ghc just don't provide
any way to return memory to OS (there is a ticket on it, you can add
yourself to CC list to vote for its resolution)


http://hackage.haskell.org/trac/ghc/ticket/698

But let's be clear: this is not a memory leak, the issue is only that 
GHC's runtime retains as much memory as was required by the program's 
high-water-mark memory usage.  Fixing this will never reduce the 
high-water-mark.  So I'm not sure what you meant by memory usage grows 
with each major GC.


Cheers,
Simon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Bulat Ziganshin
Hello Simon,

Thursday, May 7, 2009, 2:04:05 PM, you wrote:

 I've heard it's hard to contain a long-running Haskell application in
 a finite amount of memory

 not exactly. you may alloc fixed pool of memory to application (say, 1gb)
 if you know that it never need more memory. but as far as you don't do
 it, memory usage grows with each major GC. ghc just don't provide
 any way to return memory to OS (there is a ticket on it, you can add
 yourself to CC list to vote for its resolution)

 http://hackage.haskell.org/trac/ghc/ticket/698

 But let's be clear: this is not a memory leak, the issue is only that 
 GHC's runtime retains as much memory as was required by the program's 
 high-water-mark memory usage.  Fixing this will never reduce the 
 high-water-mark.  So I'm not sure what you meant by memory usage grows
 with each major GC.

1. none said that it's memory leak, at least in haskell code

2. it seems that one of us doesn't know intimate details of GHC GC :)
the following is my understanding, learned from +RTS -s/-S listings
with ghc 6.4 or so:

copying GC: let's imagine that we perform major GC when 100 mb is
allocated, of those 10 mb are live. doing GC, ghc will alloc 10+ mb
from OS, and promote 100 mb freed to allocation area. so from this
point program will occupy 110mb of memory (although only 10 mb is
really used ATM)

next major GC will occur when all these 100 mb will be allocated, i.e.
overall memory allocated will be 110 mb. again, GHC will allocate as
much memory as required for live data, increasing program footprint


compacting GC: if major GC occurs when 100 mb is allocated, GHC
increase memory footprint 2x (controlled by +RTS -F), and then perform
GC. or it will perform GC and only then alloc more memory? i don't
remember


Simon, it will be VERY USEFUL if GC description that says about these
behaviors will be added to manual. i'm definitely not the
authoritative source on this topic and it seems that even you forget
some details as time goes. this topic is very important for large
programs and this question regularly arises here. we need
authoritative description that pays attention to all GC-related RTS
switches



-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Simon Marlow

On 07/05/2009 11:51, Bulat Ziganshin wrote:

Hello Simon,

Thursday, May 7, 2009, 2:04:05 PM, you wrote:


I've heard it's hard to contain a long-running Haskell application in
a finite amount of memory

not exactly. you may alloc fixed pool of memory to application (say, 1gb)
if you know that it never need more memory. but as far as you don't do
it, memory usage grows with each major GC. ghc just don't provide
any way to return memory to OS (there is a ticket on it, you can add
yourself to CC list to vote for its resolution)



http://hackage.haskell.org/trac/ghc/ticket/698



But let's be clear: this is not a memory leak, the issue is only that
GHC's runtime retains as much memory as was required by the program's
high-water-mark memory usage.  Fixing this will never reduce the
high-water-mark.  So I'm not sure what you meant by memory usage grows
with each major GC.


1. none said that it's memory leak, at least in haskell code


You said memory usage grows with each major GC, which sounded like a 
leak, I just wanted to clarify.



2. it seems that one of us doesn't know intimate details of GHC GC :)
the following is my understanding, learned from +RTS -s/-S listings
with ghc 6.4 or so:

copying GC: let's imagine that we perform major GC when 100 mb is
allocated, of those 10 mb are live. doing GC, ghc will alloc 10+ mb
from OS,


Yes.


and promote 100 mb freed to allocation area.


No.  Well, it depends on the GC settings.  With the default GC settings, 
the allocation area for each core is fixed at 512k (the docs are a bit 
out of date and say 256k, I've just fixed that).  The old generation is 
allowed to grow to 2x its previous size by default before being 
collected.  This is tunable with +RTS -Fn.


In your example, the live data is 10MB, so the old generation will be 
allowed to grow to 20MB before being collected, meanwhile the young 
generation has 512KB for the allocation area, plus step 2 of the young 
generation which will be at most another 512KB and usually a lot 
smaller.  So 21MB max in total.  When the old generation is next 
collected, assuming 10MB are still live, we'll need in total 21MB + 10MB 
of copied live data = 31MB.


The problem referred to originally in this thread is that even though 
the program might be running with a steady memory usage of 31MB, if in 
the past it needed 100MB that extra memory is retained by the RTS and 
won't be freed back to the OS.



so from this
point program will occupy 110mb of memory (although only 10 mb is
really used ATM)


Yes - if 100MB has been allocated, and 10MB copied, then the total 
memory that the RTS has allocated from the OS will be 110MB.



next major GC will occur when all these 100 mb will be allocated, i.e.
overall memory allocated will be 110 mb.


No - see above.


again, GHC will allocate as
much memory as required for live data, increasing program footprint


compacting GC: if major GC occurs when 100 mb is allocated, GHC
increase memory footprint 2x (controlled by +RTS -F), and then perform
GC. or it will perform GC and only then alloc more memory? i don't
remember


exactly as above, except that the heap is compacted in-place rather than 
the live data being copied, so no additional memory is required by the 
GC.  Well, there's a little memory used for the bitmaps and the mark 
stack, but that will be a few percent at most.



Simon, it will be VERY USEFUL if GC description that says about these
behaviors will be added to manual. i'm definitely not the
authoritative source on this topic and it seems that even you forget
some details as time goes. this topic is very important for large
programs and this question regularly arises here. we need
authoritative description that pays attention to all GC-related RTS
switches


The -A and -F flags are pretty well documented in the manual:

http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html#rts-options-gc

I'm happy to add more text here, what else would you like to see?

Cheers,
Simon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Bulat Ziganshin
Hello Simon,

Thursday, May 7, 2009, 5:45:53 PM, you wrote:

 out of date and say 256k, I've just fixed that).  The old generation is
 allowed to grow to 2x its previous size by default before being 
 collected.

you are right. i just checked old logs - seems that previously i just
misinterpreted them

 we need  authoritative description that pays attention to all
 GC-related RTS  switches

 I'm happy to add more text here, what else would you like to see?

how about text that describes GC scenario like you was wrote in this
letter? plus, description of RTS +S report format - it may be added to
this scenario, i.e. after this major GC, the following line will be
printed to +S report: ...

for example, i still doesn't understand why

old_live + alloc - collect != new_live

in the report below (i extracted two sequential lines):

AllocCollectLiveGCGC TOT TOT  Page Flts
bytes bytes bytes  user  elapuserelap
   265016303104840008  0.00  0.000.110.1700  (Gen:  0)
   265612266240904196  0.00  0.000.110.1700  (Gen:  0)


and completely separate topic - +RTS -s report description also
doesn't exist

-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Simon Marlow

On 07/05/2009 15:17, Bulat Ziganshin wrote:

Hello Simon,

Thursday, May 7, 2009, 5:45:53 PM, you wrote:


out of date and say 256k, I've just fixed that).  The old generation is
allowed to grow to 2x its previous size by default before being
collected.


you are right. i just checked old logs - seems that previously i just
misinterpreted them


we need  authoritative description that pays attention to all
GC-related RTS  switches



I'm happy to add more text here, what else would you like to see?


how about text that describes GC scenario like you was wrote in this
letter? plus, description of RTS +S report format - it may be added to
this scenario, i.e. after this major GC, the following line will be
printed to +S report: ...

for example, i still doesn't understand why

old_live + alloc - collect != new_live


The output has changed: it now counts copied data, since as you say 
collect is a function of the other columns.



in the report below (i extracted two sequential lines):

 AllocCollectLiveGCGC TOT TOT  Page Flts
 bytes bytes bytes  user  elapuserelap
265016303104840008  0.00  0.000.110.1700  (Gen:  0)
265612266240904196  0.00  0.000.110.1700  (Gen:  0)


and completely separate topic - +RTS -s report description also
doesn't exist


Scroll down in that section I linked to before:

http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html#rts-options-gc

Cheers,
Simon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Bulat Ziganshin
Hello Simon,

Thursday, May 7, 2009, 6:58:02 PM, you wrote:

 and completely separate topic - +RTS -s report description also
 doesn't exist

 Scroll down in that section I linked to before:

 http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html#rts-options-gc

you are right again. so, that remains: you shouldn't suppose that
user have read 90's GC paper. give a short excerpt of it: how
generational GC works and how memory usage converts to memory
footprint. then descriptions of RTS options will serve as
modifications to this baseline

one minor update: manual said

For each garbage collection, we print:

* How many bytes we allocated this garbage collection.

i think it should be How many bytes we allocated *SINCE* last garbage
collection. 


-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[2]: [Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-07 Thread Jason Dagit
On Thu, May 7, 2009 at 9:55 AM, Bulat Ziganshin
bulat.zigans...@gmail.com wrote:
 Hello Simon,

 Thursday, May 7, 2009, 6:58:02 PM, you wrote:

 and completely separate topic - +RTS -s report description also
 doesn't exist

 Scroll down in that section I linked to before:

 http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html#rts-options-gc

 you are right again. so, that remains: you shouldn't suppose that
 user have read 90's GC paper. give a short excerpt of it: how
 generational GC works and how memory usage converts to memory
 footprint. then descriptions of RTS options will serve as
 modifications to this baseline

Bulat,

Perhaps you could write a description here to the list or the wiki and
after a few iterations it could be included in the GHC manual?

Thanks,
Jason
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread FFT
I've heard it's hard to contain a long-running Haskell application in
a finite amount of memory, but this is probably not a problem if your
web site sleeps 0.001% of the time (like XMonad), or you can restart
it every once in a while without anyone noticing.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Don Stewart
fft1976:
 I've heard it's hard to contain a long-running Haskell application in
 a finite amount of memory, but this is probably not a problem if your

Hmm. Gossip driven development?

 web site sleeps 0.001% of the time (like XMonad), or you can restart
 it every once in a while without anyone noticing.

Keeping footprints small and stable isn't so hard. After all, we have
wonderful heap profiling tools. I recommend the type-based heap
profiler, in particular. Run your app for a week, and look at the heap
graph. You'll know if things are ok. (Try this in C++ !)

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Bulat Ziganshin
Hello FFT,

Wednesday, May 6, 2009, 11:59:53 PM, you wrote:

 I've heard it's hard to contain a long-running Haskell application in
 a finite amount of memory

not exactly. you may alloc fixed pool of memory to application (say, 1gb)
if you know that it never need more memory. but as far as you don't do
it, memory usage grows with each major GC. ghc just don't provide
any way to return memory to OS (there is a ticket on it, you can add
yourself to CC list to vote for its resolution)

-- 
Best regards,
 Bulatmailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Jason Dagit
On Wed, May 6, 2009 at 1:01 PM, Don Stewart d...@galois.com wrote:
 fft1976:
 I've heard it's hard to contain a long-running Haskell application in
 a finite amount of memory, but this is probably not a problem if your

 Hmm. Gossip driven development?

I don't mean to undermine your marketing efforts, but I don't think
this is gossip driven.

I know from experience that lambdabot tends to be leaky.  Otherwise,
lambdabot wouldn't be running on my server to begin with.  And, even
so, Cale monitors lambdabot to make sure it is not using too many
resources (and I complain when/if I notice it).  I have heard similar
stories related to hpaste and happs.  I have also experienced it with
writing a forever loop in Haskell that did polling from channels.  I
would leave my app running for, say, 4 hours and it would be using
tons of memory.

 web site sleeps 0.001% of the time (like XMonad), or you can restart
 it every once in a while without anyone noticing.

 Keeping footprints small and stable isn't so hard. After all, we have
 wonderful heap profiling tools. I recommend the type-based heap
 profiler, in particular. Run your app for a week, and look at the heap
 graph. You'll know if things are ok. (Try this in C++ !)

In C++ we have valgrind right?  But, yes, C++ has issues too with
memory leaks.  I think Java does as well.  In fact, probably every
programming language has issues here :)  Garbage collectors are good
at removing errors from freeing too soon and other manual memory
management issues, but in my experience they actually cause an
increase in space leaks by being conservative.

I think it's fair to say that keeping the memory usage low of a long
running Haskell app is hard, but that is a general issue not just a
Haskell issue.  It's hard in most languages.  I think what we need to
address this is more information about preventative measures.  What
programming styles cause the problem and which ones solve it.  I would
say that I lack confidence recommending anyone to use Haskell for long
running processes because I don't understand well the problem of
keeping the usage low.  If it is a well documented problem with
documented solutions (more than just profiling), then I would regain
my confidence because I know the problem can be worked around
reliably.  Does this make sense?  Maybe it's already well documented?

In particular, we need expert Haskell programmers, such as Don, to
write more about how they avoid space leaks in long running apps.
Again, profiling is nice, but that's more of a tuning effort.

Thanks,
Jason
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Andrew Coppin

Jason Dagit wrote:

I don't mean to undermine your marketing efforts, but I don't think
this is gossip driven.

I know from experience that lambdabot tends to be leaky.  Otherwise,
lambdabot wouldn't be running on my server to begin with.  And, even
so, Cale monitors lambdabot to make sure it is not using too many
resources (and I complain when/if I notice it).  I have heard similar
stories related to hpaste and happs.  I have also experienced it with
writing a forever loop in Haskell that did polling from channels.  I
would leave my app running for, say, 4 hours and it would be using
tons of memory.
  



I think it's fair to say that keeping the memory usage low of a long
running Haskell app is hard, but that is a general issue not just a
Haskell issue.  It's hard in most languages.  I think what we need to
address this is more information about preventative measures.  What
programming styles cause the problem and which ones solve it.  I would
say that I lack confidence recommending anyone to use Haskell for long
running processes because I don't understand well the problem of
keeping the usage low.  If it is a well documented problem with
documented solutions (more than just profiling), then I would regain
my confidence because I know the problem can be worked around
reliably.  Does this make sense?  Maybe it's already well documented?

In particular, we need expert Haskell programmers, such as Don, to
write more about how they avoid space leaks in long running apps.
Again, profiling is nice, but that's more of a tuning effort.
  


Mmm, interesting. Clearly I don't run anything for long enough to notice 
this effect.


(I just had a look at a Haskell program I wrote myself which I happen to 
have running in the background right now. Process Explorer tells me it's 
used 4 hours of CPU time, and it's memory graph is still flat. It was 
reading 106 MB shortly after I started it, and it still says 106 MB now. 
OTOH, it's a fairly trivial program, so...)


But if you're going to advocate more expert knowledge being diseminated, 
I certainly won't argue against that! :-D Export knowledge is something 
it seems difficult to have too much of...


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Don Stewart
dagit:

 In particular, we need expert Haskell programmers, such as Don, to
 write more about how they avoid space leaks in long running apps.
 Again, profiling is nice, but that's more of a tuning effort.

I talk a bit about that in my LondonHUG talk:


http://www.galois.com/blog/2009/04/27/engineering-large-projects-in-haskell-a-decade-of-fp-at-galois/

As I said earlier: stress test with heap profiling on (is one way to
absolutely ensure you know what's going on).

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Anton van Straaten

Jason Dagit wrote:

I know from experience that lambdabot tends to be leaky.  Otherwise,
lambdabot wouldn't be running on my server to begin with.  And, even
so, Cale monitors lambdabot to make sure it is not using too many
resources (and I complain when/if I notice it).  I have heard similar
stories related to hpaste and happs.


FWIW, I have an internal HAppS application that's been running 
continuously since November last year, used daily, with stable memory usage.



I have also experienced it with
writing a forever loop in Haskell that did polling from channels.  I
would leave my app running for, say, 4 hours and it would be using
tons of memory.


If you posted an example of that, there are probably people who'd be 
interested in debugging it.



I think it's fair to say that keeping the memory usage low of a long
running Haskell app is hard, but that is a general issue not just a
Haskell issue.  It's hard in most languages.  


I don't agree with this.  This should not be hard in language 
implementations with good garbage collectors, that don't have known 
limitations (this excludes e.g. conservative GC and reference-counting 
systems).  In my experience, it's not hard to write stable long-running 
code in good implementations of languages like Haskell, Scheme, Common 
Lisp, or Java.



I think what we need to
address this is more information about preventative measures.  What
programming styles cause the problem and which ones solve it.  I would
say that I lack confidence recommending anyone to use Haskell for long
running processes because I don't understand well the problem of
keeping the usage low.  If it is a well documented problem with
documented solutions (more than just profiling), then I would regain
my confidence because I know the problem can be worked around
reliably.  Does this make sense?  Maybe it's already well documented?


This doesn't completely make sense to me, in that either a program has a 
space leak, or it doesn't.  If it does, you debug it and resolve it.  If 
a leak really is due to a problem with the language implementation or 
standard libraries, then that should be identified and reported.  There 
shouldn't be any deep mystery here.  At the very least, it should be 
possible to point to the relevant trac tickets if there really is a problem.


Anton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Jason Dagit
On Wed, May 6, 2009 at 2:28 PM, Don Stewart d...@galois.com wrote:
 dagit:

 In particular, we need expert Haskell programmers, such as Don, to
 write more about how they avoid space leaks in long running apps.
 Again, profiling is nice, but that's more of a tuning effort.

 I talk a bit about that in my LondonHUG talk:

    
 http://www.galois.com/blog/2009/04/27/engineering-large-projects-in-haskell-a-decade-of-fp-at-galois/

 As I said earlier: stress test with heap profiling on (is one way to
 absolutely ensure you know what's going on).

I wish I could have gone to the talk.  I have a feeling the slides
only tell part of the story.

I did see these bullet points which I think are related:
This one on Laziness:
• Makes time and space reasoning harder!
–Mostly harmless in practice
– Stress testing tends to reveal retainers
– Graphical profiling knocks it dead
• Must be able to precisely enable/disable
• Be careful with exceptions and mutation
• whnf/rnf/! are your friends

This one on performance:
• Really precise performance requires
expertise
• Libraries are helping reify “oral traditions”
about optimization
• Still a lack of clarity about performance
techniques in the broader Haskell
community though

The last bullet point makes me think you do agree with me :)

I thought about it more since my last email, and I think what I want
is something like the Effective C++ and Effective Java books, but for
Haskell.  A series of expert advice items with advice about when to
use them.  Real-World Haskell is a very good title, but I feel like
it's at a different audience and level.  Looking over Real-World
haskell I see that some of these topics are discussed, which is really
good.  In particular, Chapter 25 would be valuable to anyone trying to
find space leaks.  There you discuss reduction to normal form, for
example, and some strictness issues and how to control evaluation.

While I'm thinking out loud, it would be very cool if someone wrote
some articles, say for the monad reader, that follow the formula of
the Effective C++ books.  Write up the oral traditions of how to
effectively use Haskell along with the dos/don'ts of each idiom.  I
think it could make a nice companion to the wisdom of Real-World
Haskell.

Have we made any tools yet that analyze haskell code and give warnings
about styles and idioms that are prone to causing space leaks?  For
example, leaky folds are a well known problem.  I know Neil Mitchel
was working on some utilities to offer advice about Haskell source.
Anyone know if his programs can detect potential space leaks?

Thanks,
Jason
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Bryan O'Sullivan
On Wed, May 6, 2009 at 4:12 PM, Jason Dagit da...@codersbase.com wrote:


 While I'm thinking out loud, it would be very cool if someone wrote
 some articles, say for the monad reader, that follow the formula of
 the Effective C++ books.


The last couple of times I've wanted a book like that, I wrote the book
myself. It's a very effective way to get the book you want, compared to
wishing.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Jason Dagit
On Wed, May 6, 2009 at 3:54 PM, Anton van Straaten
an...@appsolutions.com wrote:
 FWIW, I have an internal HAppS application that's been running continuously
 since November last year, used daily, with stable memory usage.

Do you have advice about the way you wrote you app?  Things you
knowingly did to avoid space leaks?  Maybe a blog about your HAppS
app?

 I have also experienced it with
 writing a forever loop in Haskell that did polling from channels.  I
 would leave my app running for, say, 4 hours and it would be using
 tons of memory.

 If you posted an example of that, there are probably people who'd be
 interested in debugging it.

I'm not sure if I still have the source code or not.  I did ask people
at the time and the only thing I recall now was that I was told to
make sure that I fully evaluate the thunks I create on each iteration
of my forever loop.  This was at least 2 years ago now.  I don't know
if I ever resolved it as it was just a toy benchmark anyway and I
wasn't really happy with the results.

 I think it's fair to say that keeping the memory usage low of a long
 running Haskell app is hard, but that is a general issue not just a
 Haskell issue.  It's hard in most languages.

 I don't agree with this.  This should not be hard in language
 implementations with good garbage collectors, that don't have known
 limitations (this excludes e.g. conservative GC and reference-counting
 systems).  In my experience, it's not hard to write stable long-running code
 in good implementations of languages like Haskell, Scheme, Common Lisp, or
 Java.

There are certainly cases where no automatic garbage collector could
know when it is safe to collect certain things.  A quick google search
for java space leaks turned up this article:
http://www.ibm.com/developerworks/java/library/j-leaks/

I think wikipedia uses the term logical leak for the type of space
leak I'm thinking of.  The garbage collector thinks you care about an
object but in fact, you want it to be freed.  Yes, it's because of a
bug, but these are bugs that tend to be subtle and tedious.

 I think what we need to
 address this is more information about preventative measures.  What
 programming styles cause the problem and which ones solve it.  I would
 say that I lack confidence recommending anyone to use Haskell for long
 running processes because I don't understand well the problem of
 keeping the usage low.  If it is a well documented problem with
 documented solutions (more than just profiling), then I would regain
 my confidence because I know the problem can be worked around
 reliably.  Does this make sense?  Maybe it's already well documented?

 This doesn't completely make sense to me, in that either a program has a
 space leak, or it doesn't.  If it does, you debug it and resolve it.  If a
 leak really is due to a problem with the language implementation or standard
 libraries, then that should be identified and reported.  There shouldn't be
 any deep mystery here.  At the very least, it should be possible to point to
 the relevant trac tickets if there really is a problem.

The ambiguity is me thinking of relative cost of finding/fixing these
bugs.  Testing for correctness is something we tend to automate very
well.  See unit testing for example.  But, testing for space leaks is
not something we have automated in my experience.  Take Don's comment
that you let it run for a while and then look at the profiling graph.
How do you automate this and make it part of your automatic test
suite?  It seems like something that requires manual testing.  So
maybe you do some QA prior to releases?  That sounds reasonable to me.
 The darcs team collects misbehaving repositories and has tools to
automate the collection of statistics so that at regular intervals
they can check the performance of darcs to make sure it's getting
better or staying the same with each release (in practice I think
these stats are collected by buildbots on each applied patch bundle).
So then, at some point we must have a bag of tricks for dealing with
these space leaks.  I want to talk about those tricks.  I'm not
talking about bugs in a specific program, but instead about techniques
and styles that are known to work well in practice.

I hope that clarifies things a bit.

Thanks,
Jason
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread Anton van Straaten

Jason Dagit wrote:

On Wed, May 6, 2009 at 3:54 PM, Anton van Straaten
an...@appsolutions.com wrote:

FWIW, I have an internal HAppS application that's been running continuously
since November last year, used daily, with stable memory usage.


Do you have advice about the way you wrote you app?  Things you
knowingly did to avoid space leaks?  Maybe a blog about your HAppS
app?


The app is written for a client under NDA, so a blog about it would have 
to be annoyingly vague.  But I don't think there's much mystery about 
why it doesn't leak:


The app does simulations.  Each simulation uses at least about 10MB of 
memory, more depending on parameters.  Typically a few thousand 
simulations are run successively, and the results are aggregated and 
analyzed.  The computation itself is purely functional - it takes some 
input parameters and produces results.  The results are written to a 
file.  Since each run of a set of simulations is essentially 
independent, there's not much risk of space leaks persisting across runs.


No doubt the potential for encountering space leaks goes up as one 
writes less pure code, persist more things in memory, and depend on more 
libraries.  My main point in mentioning my app is that long-running 
isn't really the issue - that's just a way of saying that an app has 
space leaks that are small enough not to be noticed until it's stressed.



In my experience, it's not hard to write stable long-running code
in good implementations of languages like Haskell, Scheme, Common Lisp, or
Java.


There are certainly cases where no automatic garbage collector could
know when it is safe to collect certain things.  


If there are bugs in the user's program, sure - but that still doesn't 
make it hard to write applications that don't leak, given a decent GC. 
 On the contrary, I'd say it's very easy, in the great majority of cases.



A quick google search
for java space leaks turned up this article:
http://www.ibm.com/developerworks/java/library/j-leaks/

I think wikipedia uses the term logical leak for the type of space
leak I'm thinking of.  The garbage collector thinks you care about an
object but in fact, you want it to be freed.  Yes, it's because of a
bug, but these are bugs that tend to be subtle and tedious.


The example given in the IBM article is quite typical, but isn't subtle 
at all - it was simply an object being added to a table and never being 
removed.  You can often find such bugs quite easily by searching the 
source tree, without touching a debugging tool.  It's also possible to 
prevent them quite easily, with good coding practices (e.g. centralize 
uses of long-lived tables) and some simple code auditing practices.


If you're dealing with code that's complex enough to involve the kinds 
of non-trivial mutually dependent references that you need in order to 
encounter truly subtle instances of these bugs, the increased difficulty 
of memory management comes with the territory, i.e. it's harder because 
the application is harder.



The ambiguity is me thinking of relative cost of finding/fixing these
bugs.  


To put this back into context, I was objecting to your having extended 
the space leak worrying to all GC'd languages.  I'm saying that it isn't 
hard, using most decent language implementations, to avoid space leaks. 
 For trivial cases such as the IBM example, it should be no harder in 
Haskell, either - possibly easier, since use of things like mutable 
tables is more controlled, and may be rarer.


However, Haskell does theoretically introduce a new class of dangers for 
space leaks, I'm not denying that.  Being pure and lazy introduces its 
own set of space leak risks.  But on that front, I was disturbed by the 
vagueness of the claims about long-running apps.  I haven't seen any 
solid justification for scaring people off about writing long-running 
apps in Haskell.  If there is such a justification, it needs to be more 
clearly identified.



Testing for correctness is something we tend to automate very
well.


How do you automate testing for performance under load?  Space usage is 
a similar kind of dynamic issue, in general.



So then, at some point we must have a bag of tricks for dealing with
these space leaks.  I want to talk about those tricks.  I'm not
talking about bugs in a specific program, but instead about techniques
and styles that are known to work well in practice.


OK.  That's a bit different from FFT's original contention, hard to 
contain a long-running Haskell application in a finite amount of 
memory.  For my own part, I'm at least as non-strict as Haskell, and 
that bag of tricks, for me, is a thunk that hasn't yet been forced.


Anton

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread FFT
On Wed, May 6, 2009 at 8:20 PM, Anton van Straaten
an...@appsolutions.com wrote:

 The app is written for a client under NDA, so a blog about it would have to
 be annoyingly vague.

 No doubt the potential for encountering space leaks goes up as one writes
 less pure code, persist more things in memory, and depend on more libraries.

Exactly. I'm worried about, e.g. needing to use something as simple as
a stream of prime numbers (see the recent thread about leaks there)

  My main point in mentioning my app is that long-running isn't really the
 issue - that's just a way of saying that an app has space leaks that are
 small enough not to be noticed until it's stressed.

An internal web site with few benign users is one thing, but if it's
an external web site, it might get stressed in ways different from
your expected usage scenarios, if you know what I mean.

 To put this back into context, I was objecting to your having extended the
 space leak worrying to all GC'd languages.

I agree.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-06 Thread wren ng thornton

FFT wrote:

Anton van Straaten wrote:

 The app is written for a client under NDA, so a blog about it would have to
 be annoyingly vague.

 No doubt the potential for encountering space leaks goes up as one writes
 less pure code, persist more things in memory, and depend on more libraries.

Exactly. I'm worried about, e.g. needing to use something as simple as
a stream of prime numbers (see the recent thread about leaks there)


The issues here are going to be the same in Haskell as in every other 
language. There's always a tradeoff between the memory of caching old 
results vs the time of recalculating them. At present no language's 
RTS/GC is smart enough to make that tradeoff for you, and so memoization 
must be done manually.


There are some tricks to help make this easier (e.g. weak pointers), but 
the problem will always remain. The only thing that makes this perhaps 
trickier than in other languages is that, AFAIK/R, the reflection API to 
ask the RTS how it's feeling about memory at any given point isn't 
terribly portable (between Haskell compilers) or polished/pretty. Then 
again, the other GC languages I've dealt with aren't much better and are 
often worse.


--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-04 Thread Chris Forno
I decided to find out for myself. You can find the results at
http://jekor.com/article/is-haskell-a-good-choice-for-web-applications

Included is the source code for the web application powering
http://www.vocabulink.com/

The source is roughly 2,000 lines of Haskell, along with some SQL and
JavaScript. It's written in literate style and includes a 75-page PDF.
It demonstrates and explains how to:

 * use FastCGI to communicate with a web server (nginx in this case)
 * move data to and from a PostgreSQL database (HDBC)
 * authenticate users with cookies
 * interact with users via asynchronous JavaScript and JSON (AJAJ)
 * collect data with HTML forms (formlets)
 * communicate with users via email
 * cache with memcached
 * implement a custom forums system (with threaded comments)

I make no claims that the code is elegant or idiomatic. It is however
real code that's running in the wild. And I hope it's useful to anyone
else considering doing web development in Haskell.

I welcome and encourage your feedback!

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-04 Thread Daniel Carrera

Hi Chris,

Thanks. This should be interesting. I currently work as a web developer 
and I've been wondering how easy or hard it would be to develop web 
applications with Haskell. So I'll be interested in reading our article.


On a separate topic, I also took a glance at vocabulink.com. I'm 
interested in languages (I am fluent in English and Spanish and speak 
advanced German). I will have to disagree a bit with your pedagogy:


1) You say that grammar doesn't matter. Well, for some languages it 
matters more than others. German, for example, has a very particular 
word order that takes some effort to learn, and if you get it wrong 
people really won't understand you. In German it's ok if you conjugate 
wrong, but it's not ok if you put words in the wrong place. Second, some 
people actually enjoy grammar better and find that grammar helps them 
understand the language. I am one of those people. Different people 
learn differently. I learn rules more easily than disconnected words. 
When I learn vocabulary I do better by learning word families, and so 
on. The Germanic languages rely heavily in word derivation (not so much 
English) so that can be important for learners like me.


2) Your analysis of word count is flawed. Sure, most of the words you 
read come from a very small vocabulary set, but most of the *meaning* in 
a sentence comes from the more obscure words. Imagine that you read this 
sentence: In the newspaper I read that the __ said that the 
problem is that the river has too much  .  In this sentence you 
can understand 90% of the words, but you have almost no idea of what's 
happening. What your word count test really shows is that human 
languages have a lot of redundancy. You could omit the word the from 
the above sentence and you would understand it almost as well. The word 
the is common and contains very little information.



That said, do you have any stories in German? I can't figure out where 
to get the stories.


Daniel.

Chris Forno wrote:

I decided to find out for myself. You can find the results at
http://jekor.com/article/is-haskell-a-good-choice-for-web-applications

Included is the source code for the web application powering
http://www.vocabulink.com/

The source is roughly 2,000 lines of Haskell, along with some SQL and
JavaScript. It's written in literate style and includes a 75-page PDF.
It demonstrates and explains how to:

 * use FastCGI to communicate with a web server (nginx in this case)
 * move data to and from a PostgreSQL database (HDBC)
 * authenticate users with cookies
 * interact with users via asynchronous JavaScript and JSON (AJAJ)
 * collect data with HTML forms (formlets)
 * communicate with users via email
 * cache with memcached
 * implement a custom forums system (with threaded comments)

I make no claims that the code is elegant or idiomatic. It is however
real code that's running in the wild. And I hope it's useful to anyone
else considering doing web development in Haskell.

I welcome and encourage your feedback!

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

2009-05-04 Thread Justin Bailey
Thanks for sharing your code and experience. Very interesting and a
good example of how to put the libraries together to build a real app.

On Mon, May 4, 2009 at 12:45 AM, Chris Forno je...@jekor.com wrote:
 I decided to find out for myself. You can find the results at
 http://jekor.com/article/is-haskell-a-good-choice-for-web-applications

 Included is the source code for the web application powering
 http://www.vocabulink.com/

 The source is roughly 2,000 lines of Haskell, along with some SQL and
 JavaScript. It's written in literate style and includes a 75-page PDF.
 It demonstrates and explains how to:

  * use FastCGI to communicate with a web server (nginx in this case)
  * move data to and from a PostgreSQL database (HDBC)
  * authenticate users with cookies
  * interact with users via asynchronous JavaScript and JSON (AJAJ)
  * collect data with HTML forms (formlets)
  * communicate with users via email
  * cache with memcached
  * implement a custom forums system (with threaded comments)

 I make no claims that the code is elegant or idiomatic. It is however
 real code that's running in the wild. And I hope it's useful to anyone
 else considering doing web development in Haskell.

 I welcome and encourage your feedback!

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe