Re: GADT Type Checking GHC 6.10 versus older GHC

2008-11-28 Thread Dominic Steinitz
Ignore my last email. I was accidentally using

 The Glorious Glasgow Haskell Compilation System, version 6.9.20080616

Mind you I am still having problems just not the same ones. I'll report
back later.

Dominic.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: GADT Type Checking GHC 6.10 versus older GHC

2008-11-28 Thread Simon Peyton-Jones
|  I also feel that the type errors given when working with existential
|  types, especially GADTs with existentials, are confusing.  I think
|
| I am using existential types to test GADT code. See
| http://www.haskell.org/haskellwiki/QuickCheck_/_GADT which no longer
| works with 6.10.1.

Really?  I've just compiled that entire page with 6.10.1, and it was fine, 
except that I had to add a type signature for prettyRep.  No problems there.

Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GADT Type Checking GHC 6.10 versus older GHC

2008-11-28 Thread Dominic Steinitz
Simon Peyton-Jones wrote:
 |  I also feel that the type errors given when working with existential
 |  types, especially GADTs with existentials, are confusing.  I think
 |
 | I am using existential types to test GADT code. See
 | http://www.haskell.org/haskellwiki/QuickCheck_/_GADT which no longer
 | works with 6.10.1.
 
 Really?  I've just compiled that entire page with 6.10.1, and it was fine, 
 except that I had to add a type signature for prettyRep.  No problems there.
 
 Simon
 
 
Sorry I inadvertently used an old copy of ghc.

Dominic.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: GADT Type Checking GHC 6.10 versus older GHC

2008-11-28 Thread Simon Peyton-Jones
|  arbitrarySeq :: Sequence a - Gen RepSeqVal
|  arbitrarySeq Nil =
| return (RepSeqVal Nil Empty)
|  arbitrarySeq (Cons (CTMandatory (NamedType n i t)) ts) =
| do u - arbitraryType t
|us - arbitrarySeq ts
|case u of
|   RepTypeVal a v -
|  case us of
| RepSeqVal bs vs -
|return (RepSeqVal (Cons (CTMandatory (NamedType n i a)) 
bs) (v:*:vs))
|
|
|  QuickTest.lhs:240:13:
|  GADT pattern match in non-rigid context for `Nil'
|Solution: add a type signature
|  In the pattern: Nil
|  In the definition of `arbitrarySeq':
|  arbitrarySeq Nil = return (RepSeqVal Nil Empty)

That looks odd to me.  But it's hard to help without having the code. If you 
send it I'll try to help.

|  Did you try giving a type signature to the (entire) case expression,
|  as I suggested?  That should do it.
| 
|
| I'm not sure what this means or how to do it. Can you give an example or
| is it buried in some earlier email? I will go and have another look.

I mean replace
(case blah of { ... })
by
(case blah of { ... }) :: type-sig

That is, attach a type signature to the case expression itself.  Does that help 
at least explain what the sentence means? If so would you like to clarify the 
wiki advice?

Thanks

Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GADT Type Checking GHC 6.10 versus older GHC

2008-11-28 Thread Dominic Steinitz
Simon Peyton-Jones wrote:
 |  arbitrarySeq :: Sequence a - Gen RepSeqVal
 |  arbitrarySeq Nil =
 | return (RepSeqVal Nil Empty)
 |  arbitrarySeq (Cons (CTMandatory (NamedType n i t)) ts) =
 | do u - arbitraryType t
 |us - arbitrarySeq ts
 |case u of
 |   RepTypeVal a v -
 |  case us of
 | RepSeqVal bs vs -
 |return (RepSeqVal (Cons (CTMandatory (NamedType n i a)) 
 bs) (v:*:vs))
 |
 |
 |  QuickTest.lhs:240:13:
 |  GADT pattern match in non-rigid context for `Nil'
 |Solution: add a type signature
 |  In the pattern: Nil
 |  In the definition of `arbitrarySeq':
 |  arbitrarySeq Nil = return (RepSeqVal Nil Empty)
 
 That looks odd to me.  But it's hard to help without having the code. If you 
 send it I'll try to help.
 
 |  Did you try giving a type signature to the (entire) case expression,
 |  as I suggested?  That should do it.
 | 
 |
 | I'm not sure what this means or how to do it. Can you give an example or
 | is it buried in some earlier email? I will go and have another look.
 
 I mean replace
 (case blah of { ... })
 by
 (case blah of { ... }) :: type-sig
 
 That is, attach a type signature to the case expression itself.  Does that 
 help at least explain what the sentence means? If so would you like to 
 clarify the wiki advice?
 
 Thanks
 
 Simon
 
 
Simon,

I'm sorry to have put you to so much trouble by accidentally using an
old version of ghc I must have had lying around. As penance, I will go
and update the wiki with what I have learnt.

Dominic.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: ghc 6.10.1 on freebsd 7 amd64 - ghci problems

2008-11-28 Thread Simon Marlow

Markus Barenhoff wrote:

On Thu 27.11 09:49, Simon Marlow wrote:

Brandon S. Allbery KF8NH wrote:

On 2008 Nov 26, at 9:30, Markus Barenhoff wrote:

Because the ports seem not to get updated, I tried to compile ghc 6.10.1
under freebsd 7 on amd64 myself. For compiling I first used the ports ghc
The tree's not being updated because 64-bit on freebsd doesn't work yet, 
as you found.  I believe a fix for the mmap() problem has been committed 
for the upcoming 6.10.2.

Yes.

http://hackage.haskell.org/trac/ghc/ticket/2063

The patches haven't been merged into stable yet, but you can grab a HEAD 
snapshot and try that instead - we'd appreciate the testing.


I checked out and translated the head version of ghc today from darcs.
It compiled fine. When I now try to start the ghci I get the following:

 snip 
GHCi, version 6.11.20081126: http://www.haskell.org/ghc/  :? for help
ghc: internal error: loadObj: failed to mmap() memory below 2Gb; asked for 626688 
bytes at 0x4000, got 0x801635000.  Try specifying an address with +RTS 
-xmaddr -RTS
(GHC version 6.11.20081126 for x86_64_unknown_freebsd)
Please report this as a GHC bug:  
http://www.haskell.org/ghc/reportabug
Abort (core dumped) 
 snip 

If it helps somehow, you can find the core dump here:
http://www.alios.org/~alios/ghc.core.bz2


That's odd, because 6.8.3 is using 0x4000 on FreeBSD and is working 
fine (or is it?).


Ideally I need to find out what the memory map is for GHCi.  I posted some 
instructions for doing this on Linux/Xen, maybe you can adapt these to work 
on FreeBSD:


http://www.haskell.org/pipermail/glasgow-haskell-users/2008-November/016091.html

The bit that needs to change is 'cat /proc/pid/maps' - does FreeBSD have 
something similar?


Also you could try doing as the error message suggests, and specify a 
different address.  e.g.


  ghci +RTS -xm3000
  ghci +RTS -xm5000

for 0.75 and 1.25GB respectively.

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHCi debugger status

2008-11-28 Thread Simon Marlow

Peter Hercek wrote:

Simon Marlow wrote:
A similar argument applies to keeping the dynamic stack.  The problem 
with the dynamic stack is that it doesn't look much like you expect, 
due to tail-calls.


Do you think people expect the tail-calls to add a stack frame to the 
dynamic stack or is there something more complicated?


Right, I think they expect exactly that, and it'll confuse people that some 
stack frames are missing.  Often it's not clear which calls are 
tail-calls and which are not.  Mind you, I think the fact that it's a 
dynamic call stack rather than a lexical call stack is likely to confuse 
the same set of users even more.


I would expect a tail-call to overwrite the last stack frame on the 
dynamic stack - just like imperative loops, which is what they 
correspond to. Dynamic stack should correspond closely to the stack 
which overflows when we get the stack overflow exception. That is what 
I would expect.


Fair enough - that at least is a data point suggesting that providing the 
dynamic call stack with no special provision for tail-calls would be useful 
to some people.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-28 Thread Neil Mitchell
Hi

I've talked to John a bit, and discussed test cases etc. I've tracked
this down a little way.

Given the attached file, compiling witih SHORT_EXPORT_LIST makes the
code go _slower_. By exporting the print_lines function the code
doubles in speed. This runs against everything I was expecting, and
that Simon has described.

Taking a look at the .hi files for the two alternatives, there are two
differences:

1) In the faster .hi file, the body of print_lines is exported. This
is reasonable and expected.

2) In the faster .hi file, there are additional specialisations, which
seemingly have little/nothing to do with print_lines, but are omitted
if it is not exported:

SPEC = [GHC.IOBase.IO] ALWAYS forall @ el
 $dMonad :: GHC.Base.Monad GHC.IOBase.IO
  Sound.IterateeM.= @ GHC.IOBase.IO @ el $dMonad
  = Sound.IterateeM.a
  `cast`
(forall el1 a b.
 Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO a
 - (a - Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO b)
 - trans
(sym ((GHC.IOBase.:CoIO)
  (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO b)))
(sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO b)))
  @ el
SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO] ALWAYS forall @ el
 $dMonad ::
GHC.Base.Monad GHC.IOBase.IO
  Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
  = Sound.IterateeM.$s$f2 @ el
SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO] ALWAYS forall @ el
 $dMonad ::
GHC.Base.Monad GHC.IOBase.IO
  Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
  = Sound.IterateeM.$s$f21 @ el
SPEC Sound.IterateeM.liftI [GHC.IOBase.IO] ALWAYS forall @ el
   @ a
   $dMonad ::
GHC.Base.Monad GHC.IOBase.IO
  Sound.IterateeM.liftI @ GHC.IOBase.IO @ el @ a $dMonad
  = Sound.IterateeM.$sliftI @ el @ a
SPEC return [GHC.IOBase.IO] ALWAYS forall @ el
$dMonad :: GHC.Base.Monad
GHC.IOBase.IO
  Sound.IterateeM.return @ GHC.IOBase.IO @ el $dMonad
  = Sound.IterateeM.a7
  `cast`
(forall el1 a.
 a
 - trans
(sym ((GHC.IOBase.:CoIO)
  (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO a)))
(sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO a)))
  @ el

My guess is that these cause the slowdown - but is there any reason
that print_lines not being exported should cause them to be omitted?

All these tests were run on GHC 6.10.1 with -O2.

Thanks

Neil


On Fri, Nov 21, 2008 at 10:33 AM, Simon Peyton-Jones
[EMAIL PROTECTED] wrote:
 | This project is based on Oleg's Iteratee code; I started using his
 | IterateeM.hs and Enumerator.hs files and added my own stuff to
 | Enumerator.hs (thanks Oleg, great work as always).  When I started
 | cleaning up by moving my functions from Enumerator.hs to MyEnum.hs, my
 | minimal test case increased from 19s to 43s.
 |
 | I've found two factors that contributed.  When I was cleaning up, I
 | also removed a bunch of unused functions from IterateeM.hs (some of
 | the test functions and functions specific to his running example of
 | HTTP encoding).  When I added those functions back in, and added
 | INLINE pragmas to the exported functions in MyEnum.hs, I got the
 | performance back.
 |
 | In general I hadn't added export lists to the modules yet, so all
 | functions should have been exported.

 I'm totally snowed under with backlog from my recent absence, so I can't look 
 at this myself, but if anyone else wants to I'd be happy to support with 
 advice and suggestions.

 In general, having an explicit export list is good for performance. I typed 
 an extra section in the GHC performance resource 
 http://haskell.org/haskellwiki/Performance/GHC to explain why.  In general 
 that page is where we should document user advice for performance in GHC.

 I can't explain why *adding* unused functions would change performance though!

 Simon


 ___
 Glasgow-haskell-users mailing list
 Glasgow-haskell-users@haskell.org
 http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

{-# LANGUAGE CPP #-}

{- This file was originally take from http://okmij.org/ftp/Haskell/Iteratee/,
and modified to suit this project
-}

-- #define SHORT_EXPORT_LIST 1

module Sound.IterateeM (
  StreamG (..),
  IterateeG (..),
  IterateeGM (..),
  liftI,
  (==),
  (==),
  stream2list,
  sbreak,
  sdropWhile,
  snext,
  speek,
  skip_till_eof,
  sdrop,
  stake,
  map_stream,
  EnumeratorGM,
  enum_eof,
  enum_err,
  (.),
  enum_pure_1chunk,
  enum_pure_nchunk,
  enum_h,
  enum_file,
#ifdef SHORT_EXPORT_LIST
#else
  print_lines,
#endif
)

{-
-- #ifdef SHORT_EXPORT_LIST
-- #else 
--   Iteratee,
--   IterateeM,
--   Stream,
--   Line,
--   line,
--   print_lines,
--   enum_lines,
--   

Re: GHCi debugger status

2008-11-28 Thread Peter Hercek

Simon Marlow wrote:

Peter Hercek wrote:

Simon Marlow wrote:
A similar argument applies to keeping the dynamic stack.  The problem 
with the dynamic stack is that it doesn't look much like you expect, 
due to tail-calls.


Do you think people expect the tail-calls to add a stack frame to the 
dynamic stack or is there something more complicated?


Right, I think they expect exactly that, and it'll confuse people that 
some stack frames are missing.  Often it's not clear which calls are 
tail-calls and which are not.  Mind you, I think the fact that it's a 
dynamic call stack rather than a lexical call stack is likely to confuse 
the same set of users even more.


That is a good point, I might not see at the first look whether it is a 
tail call or not. Which reminds me that if it is implemented the way I 
expected then stack frames which are tail calls should be marked that 
way so that it is possible to see at the first look whether the given

stack frame is a tail-call or not.

If it will be a lexical call stack I'm curious how the pruning will be 
done so that we do not miss stack frames when we return from some code 
which corresponds to an imperative loop. Maybe a top limit on the number 
of stored lexical frames in one imperative (call-recursive) frame? From 
my point of view this could work well enough if it can print something 
like and here there were some lexical frames pruned and we are going 
one dynamic frame higher.


My reasons why I want to see it with tail-calls collapsed into one stack 
frame is that I need debugger to figure out why something does not work 
so I should see what it looks like close to the execution model where 
the bugs actually present themselves. I believe that collapsed 
tail-calls is not such a big deal if there is a way to filter trace 
history (like tracelocal idea or something similar) or maybe having a 
really long trace history. Hmmm, maybe it would be even possible to 
recover last part of the lexical stack from the dynamic stack and the 
trace history.


I discussed a bit with Pepe Iborra about how to build the dynamic (lazy) 
stack from a trace on the fly. Something like whenever we reduce an 
expression we would prune the corresponding node in the trace. Such a 
pruned trace should correspond to the dynamic stack. (If I do not miss 
something which I probably do.) And moreover if we record the 
expressions (their source code range) we just pruned and the result they 
 reduced to then we can show it with some command like 
:showexpressionresults. This would provide access to unnamed values 
which could have been sent to a lower level calls as inputs. And that is 
 part of the problem we discussed in this thread.


Anyway thank you, Clause Reinke and Pepe Iborra for all the great help 
with ghci ... I'm still learning how to script ghci debugger better. I 
hope I can make it better than printf debugging with the scripts :-)


Peter.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-28 Thread John Lato
Neil, thank you very much for taking the time to look at this; I
greatly appreciate it.

One thing I don't understand is why the specializations are caused by
print_lines.  I suppose the optimizer can infer something which it
couldn't otherwise.

If I read this properly, the functions being specialized are liftI,
(=), return, and $f2.  One thing I'm not sure about is when INLINE
provides the desired optimal behavior, as opposed to SPECIALIZE.  The
monad functions are defined in the Monad instance, and thus aren't
currently INLINE'd or SPECIALIZE'd.  However, if they are separate
functions, would INLINE be sufficient?  Would that give the optimizer
enough to work with the derive the specializations on its own?  I'll
have some time to experiment with this myself tomorrow, but I'd
appreciate some direction (rather than guessing blindly).

What is $f2?  I've seen that appear before, but I'm not sure where
it comes from.

Thanks,
John

On Fri, Nov 28, 2008 at 10:31 AM, Simon Peyton-Jones
[EMAIL PROTECTED] wrote:
 The specialisations are indeed caused (indirectly) by the presence of 
 print_lines.  If print_lines is dead code (as it is when print_lines is not 
 exported), then there are no calls to the overloaded functions at these 
 specialised types, and so you don't get the specialised versions.  You can 
 get specialised versions by a SPECIALISE pragma, or SPECIALISE INSTANCE

 Does that make sense?

 Simon

 | -Original Message-
 | From: Neil Mitchell [mailto:[EMAIL PROTECTED]
 | Sent: 28 November 2008 09:48
 | To: Simon Peyton-Jones
 | Cc: John Lato; glasgow-haskell-users@haskell.org; Don Stewart
 | Subject: Re: cross module optimization issues
 |
 | Hi
 |
 | I've talked to John a bit, and discussed test cases etc. I've tracked
 | this down a little way.
 |
 | Given the attached file, compiling witih SHORT_EXPORT_LIST makes the
 | code go _slower_. By exporting the print_lines function the code
 | doubles in speed. This runs against everything I was expecting, and
 | that Simon has described.
 |
 | Taking a look at the .hi files for the two alternatives, there are two
 | differences:
 |
 | 1) In the faster .hi file, the body of print_lines is exported. This
 | is reasonable and expected.
 |
 | 2) In the faster .hi file, there are additional specialisations, which
 | seemingly have little/nothing to do with print_lines, but are omitted
 | if it is not exported:
 |
 | SPEC = [GHC.IOBase.IO] ALWAYS forall @ el
 |  $dMonad :: GHC.Base.Monad 
 GHC.IOBase.IO
 |   Sound.IterateeM.= @ GHC.IOBase.IO @ el $dMonad
 |   = Sound.IterateeM.a
 |   `cast`
 | (forall el1 a b.
 |  Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO a
 |  - (a - Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO b)
 |  - trans
 | (sym ((GHC.IOBase.:CoIO)
 |   (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO b)))
 | (sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO b)))
 |   @ el
 | SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO] ALWAYS forall @ el
 |  $dMonad ::
 | GHC.Base.Monad GHC.IOBase.IO
 |   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
 |   = Sound.IterateeM.$s$f2 @ el
 | SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO] ALWAYS forall @ el
 |  $dMonad ::
 | GHC.Base.Monad GHC.IOBase.IO
 |   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
 |   = Sound.IterateeM.$s$f21 @ el
 | SPEC Sound.IterateeM.liftI [GHC.IOBase.IO] ALWAYS forall @ el
 |@ a
 |$dMonad ::
 | GHC.Base.Monad GHC.IOBase.IO
 |   Sound.IterateeM.liftI @ GHC.IOBase.IO @ el @ a $dMonad
 |   = Sound.IterateeM.$sliftI @ el @ a
 | SPEC return [GHC.IOBase.IO] ALWAYS forall @ el
 | $dMonad :: GHC.Base.Monad
 | GHC.IOBase.IO
 |   Sound.IterateeM.return @ GHC.IOBase.IO @ el $dMonad
 |   = Sound.IterateeM.a7
 |   `cast`
 | (forall el1 a.
 |  a
 |  - trans
 | (sym ((GHC.IOBase.:CoIO)
 |   (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO a)))
 | (sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO a)))
 |   @ el
 |
 | My guess is that these cause the slowdown - but is there any reason
 | that print_lines not being exported should cause them to be omitted?
 |
 | All these tests were run on GHC 6.10.1 with -O2.
 |
 | Thanks
 |
 | Neil
 |
 |
 | On Fri, Nov 21, 2008 at 10:33 AM, Simon Peyton-Jones
 | [EMAIL PROTECTED] wrote:
 |  | This project is based on Oleg's Iteratee code; I started using his
 |  | IterateeM.hs and Enumerator.hs files and added my own stuff to
 |  | Enumerator.hs (thanks Oleg, great work as always).  When I started
 |  | cleaning up by moving my functions from Enumerator.hs to MyEnum.hs, my
 |  | minimal test case increased from 19s to 43s.
 | 

RE: cross module optimization issues

2008-11-28 Thread Simon Peyton-Jones
The $f2 comes from the instance Monad (IterateeGM ...).
print_lines uses a specialised version of that instance, namely
Monad (IterateeGM el IO)
The fact that print_lines uses it makes GHC generate a specialised version of 
the instance decl.

Even in the absence of print_lines you can generate the specialised instance 
thus

instance Monad m = Monad (IterateeGM el m) where
{-# SPECIALISE instance Monad (IterateeGM el IO) #-}
... methods...

does that help?

Simon

| -Original Message-
| From: John Lato [mailto:[EMAIL PROTECTED]
| Sent: 28 November 2008 12:07
| To: Simon Peyton-Jones
| Cc: Neil Mitchell; glasgow-haskell-users@haskell.org; Don Stewart
| Subject: Re: cross module optimization issues
|
| Neil, thank you very much for taking the time to look at this; I
| greatly appreciate it.
|
| One thing I don't understand is why the specializations are caused by
| print_lines.  I suppose the optimizer can infer something which it
| couldn't otherwise.
|
| If I read this properly, the functions being specialized are liftI,
| (=), return, and $f2.  One thing I'm not sure about is when INLINE
| provides the desired optimal behavior, as opposed to SPECIALIZE.  The
| monad functions are defined in the Monad instance, and thus aren't
| currently INLINE'd or SPECIALIZE'd.  However, if they are separate
| functions, would INLINE be sufficient?  Would that give the optimizer
| enough to work with the derive the specializations on its own?  I'll
| have some time to experiment with this myself tomorrow, but I'd
| appreciate some direction (rather than guessing blindly).
|
| What is $f2?  I've seen that appear before, but I'm not sure where
| it comes from.
|
| Thanks,
| John
|
| On Fri, Nov 28, 2008 at 10:31 AM, Simon Peyton-Jones
| [EMAIL PROTECTED] wrote:
|  The specialisations are indeed caused (indirectly) by the presence of 
print_lines.  If
| print_lines is dead code (as it is when print_lines is not exported), then 
there are no calls
| to the overloaded functions at these specialised types, and so you don't get 
the specialised
| versions.  You can get specialised versions by a SPECIALISE pragma, or 
SPECIALISE INSTANCE
| 
|  Does that make sense?
| 
|  Simon
| 
|  | -Original Message-
|  | From: Neil Mitchell [mailto:[EMAIL PROTECTED]
|  | Sent: 28 November 2008 09:48
|  | To: Simon Peyton-Jones
|  | Cc: John Lato; glasgow-haskell-users@haskell.org; Don Stewart
|  | Subject: Re: cross module optimization issues
|  |
|  | Hi
|  |
|  | I've talked to John a bit, and discussed test cases etc. I've tracked
|  | this down a little way.
|  |
|  | Given the attached file, compiling witih SHORT_EXPORT_LIST makes the
|  | code go _slower_. By exporting the print_lines function the code
|  | doubles in speed. This runs against everything I was expecting, and
|  | that Simon has described.
|  |
|  | Taking a look at the .hi files for the two alternatives, there are two
|  | differences:
|  |
|  | 1) In the faster .hi file, the body of print_lines is exported. This
|  | is reasonable and expected.
|  |
|  | 2) In the faster .hi file, there are additional specialisations, which
|  | seemingly have little/nothing to do with print_lines, but are omitted
|  | if it is not exported:
|  |
|  | SPEC = [GHC.IOBase.IO] ALWAYS forall @ el
|  |  $dMonad :: GHC.Base.Monad 
GHC.IOBase.IO
|  |   Sound.IterateeM.= @ GHC.IOBase.IO @ el $dMonad
|  |   = Sound.IterateeM.a
|  |   `cast`
|  | (forall el1 a b.
|  |  Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO a
|  |  - (a - Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO b)
|  |  - trans
|  | (sym ((GHC.IOBase.:CoIO)
|  |   (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO b)))
|  | (sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO b)))
|  |   @ el
|  | SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO] ALWAYS forall @ el
|  |  $dMonad ::
|  | GHC.Base.Monad GHC.IOBase.IO
|  |   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
|  |   = Sound.IterateeM.$s$f2 @ el
|  | SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO] ALWAYS forall @ el
|  |  $dMonad ::
|  | GHC.Base.Monad GHC.IOBase.IO
|  |   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
|  |   = Sound.IterateeM.$s$f21 @ el
|  | SPEC Sound.IterateeM.liftI [GHC.IOBase.IO] ALWAYS forall @ el
|  |@ a
|  |$dMonad ::
|  | GHC.Base.Monad GHC.IOBase.IO
|  |   Sound.IterateeM.liftI @ GHC.IOBase.IO @ el @ a $dMonad
|  |   = Sound.IterateeM.$sliftI @ el @ a
|  | SPEC return [GHC.IOBase.IO] ALWAYS forall @ el
|  | $dMonad :: GHC.Base.Monad
|  | GHC.IOBase.IO
|  |   Sound.IterateeM.return @ GHC.IOBase.IO @ el $dMonad
|  |   = Sound.IterateeM.a7
|  |   

Re: cross module optimization issues

2008-11-28 Thread pepe


On 28/11/2008, at 15:46, Simon Peyton-Jones wrote:


The $f2 comes from the instance Monad (IterateeGM ...).
print_lines uses a specialised version of that instance, namely
   Monad (IterateeGM el IO)
The fact that print_lines uses it makes GHC generate a specialised  
version of the instance decl.


Even in the absence of print_lines you can generate the specialised  
instance thus


instance Monad m = Monad (IterateeGM el m) where
   {-# SPECIALISE instance Monad (IterateeGM el IO) #-}
   ... methods...

does that help?



Once Simon and Neil dig the issue and analyze it, the reason seems  
evident.
But this thread reminds of why writing high performance Haskell code  
is regarded as a black art outside the community (well, and sometimes  
inside too).


Wouldn't a JIT version of GHC be a great thing to have?
Or would a backend for LLVM be already beneficial enough?


Cheers
pepe
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-28 Thread John Lato
Yes, this does help, thank you.  I didn't know you could generate
specialized instances.  In fact, I was so sure that this was some
arcane feature I immediately went to the GHC User Guide because I
didn't believe it was documented.

I immediately stumbled upon Section 8.13.9.

Thanks to everyone who helped me with this.  I think I've achieved a
small bit of enlightenment.

Cheers,
John

On Fri, Nov 28, 2008 at 2:46 PM, Simon Peyton-Jones
[EMAIL PROTECTED] wrote:
 The $f2 comes from the instance Monad (IterateeGM ...).
 print_lines uses a specialised version of that instance, namely
Monad (IterateeGM el IO)
 The fact that print_lines uses it makes GHC generate a specialised version of 
 the instance decl.

 Even in the absence of print_lines you can generate the specialised instance 
 thus

 instance Monad m = Monad (IterateeGM el m) where
{-# SPECIALISE instance Monad (IterateeGM el IO) #-}
... methods...

 does that help?

 Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-28 Thread Neil Mitchell
Hi

 instance Monad m = Monad (IterateeGM el m) where
   {-# SPECIALISE instance Monad (IterateeGM el IO) #-}

 does that help?

Yes. With that specialise line in, we get identical performance
between the two results.

So, in summary:

The print_lines function uses the IterateeGM with IO as the underlying
monad, which causes GHC to specialise IterateeGM with IO. If
print_lines is not exported, then it is deleted as dead code, and the
specialisation is never generated. The specialisation is crucial for
performance later on. In this way, by keeping unused code reachable,
GHC does better optimisation.

 Once Simon and Neil dig the issue and analyze it, the reason seems evident.
 But this thread reminds of why writing high performance Haskell code is
 regarded as a black art outside the community (well, and sometimes inside
 too).

 Wouldn't a JIT version of GHC be a great thing to have?
 Or would a backend for LLVM be already beneficial enough?

I don't think either would have the benefits offered by
specialisation. If GHC exported more information about instances, it
could do more specialisations later, but it is a trade off. If you ran
GHC in some whole-program mode, then you wouldn't have the problem,
but would gain additional problems.

I always hoped Supero (http://www-users.cs.york.ac.uk/~ndm/supero/)
would remove some of the black art associated with program
optimisation - there are no specialise pragmas, and I'm pretty sure in
the above example it would have done the correct thing. In some ways,
whole-program and fewer special cases gives a much better mental model
of how optimisation might effect a program. Of course, its still a
research prototype, but perhaps one day...

Thanks

Neil
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHCi debugger status

2008-11-28 Thread Isaac Dupree

Simon Marlow wrote:

Claus Reinke wrote:

Then how about my suggestion for selectively adding lexical scope to
breakpoints? I'd like to be able to say

   :break loc/name {names}

and have GHCi make the necessary changes to keep {names} available
for inspection when it hits that breakpoint. 


The only easy way to do that is to recompile the module that contains 
the breakpoint.  To do it without recompiling is about as hard as doing 
what I suggested above, because it involves a similar mechanism (being 
able to selectively retain the values of free variables).


sure, but GHCi recompiling the module maybe takes less than 
a second, whereas going and modifying the source-code in an 
editor takes orders of magnitude more time!  Is there 
something fundamentally wrong with recompiling an 
interpreted module?


-Isaac
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users