Re: Is there a non-blocking version of hGetArray?

2004-10-03 Thread Peter Simons
Simon Marlow writes:

  I'm surprised if pointer access to memory is slower
  than unsafeRead.

You were right. Now that I have made some tests, the
problem turned out to be elsewhere. Pointer access is
not to blame. ;-)

Peter

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Bools are not unboxed

2004-10-03 Thread Tomasz Zielonka
Hello!

I was playing with monadic looping a'la replicateM_ and I created this
function:

for :: Int - IO () - IO ()
for 0 _ = return ()
for n x = x  for (n - 1) x

Compiled with -O2, it is really fast and makes no unnecessary
allocations. Tested with this main

main = for 1000 (return ())

it gives the following stats

ghc: 1024 bytes, 0 GCs, 0/0 avg/max bytes residency (0 samples),
1M in use, 0,00 INIT (0,00 elapsed), 0,33 MUT (0,33 elapsed), 0,00 GC
(0,00 elapsed) :ghc

Cool! 

( this is still 10 times slower than g++ -O3, but a similar pure function
  is only 3 times slower, and I am satisfied with such results (at this
  moment ;) )

Unfortunately, the program I was playing with could call 'for' with
negative n, for which it was supposed to make 0 iterations, and this
version definitely makes too many iterations.

So I made another version:

for :: Int - IO () - IO ()
for n x | n  0 = x  for (n - 1) x
| otherwise = return ()

To my surprise, it was much slower and made many allocations:

ghc: 240927488 bytes, 920 GCs, 1036/1036 avg/max bytes residency (1
samples), 1M in use, 0,00 INIT (0,00 elapsed), 2,48 MUT (2,50 elapsed),
0,04 GC (0,05 elapsed) :ghc

I checked in -ddump-simpl that Ints are getting unboxed in both
versions. 

Then I noticed the cause: 
GHC.Prim.# returns a boxed, heap allocated Bool, and so do other
primitive comparison operators.

Would it be difficult to add Bool unboxing to GHC?
Maybe it would suffice to use preallocated False and True?

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Bools are not unboxed

2004-10-03 Thread Tomasz Zielonka
On Sun, Oct 03, 2004 at 03:07:01PM +0200, Tomasz Zielonka wrote:
 Then I noticed the cause: 
 GHC.Prim.# returns a boxed, heap allocated Bool, and so do other
 primitive comparison operators.
 
 Would it be difficult to add Bool unboxing to GHC?
 Maybe it would suffice to use preallocated False and True?

I forgot about some questions:

Do you think that many applications could benefit from such an improvement?

IMO, yes, for example, there are many Int comparisons waiting for this
optimisation in io, networking and posix libraries. But I am not sure
how big would that benefit be in a non-toy application.

Best regards,
Tom

-- 
.signature: Too many levels of symbolic links
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Bools are not unboxed

2004-10-03 Thread Carsten Schultz
Hi Tomasz!

On Sun, Oct 03, 2004 at 03:07:01PM +0200, Tomasz Zielonka wrote:
 Hello!
 
 I was playing with monadic looping a'la replicateM_ and I created this
 function:
 
 for :: Int - IO () - IO ()
 for 0 _ = return ()
 for n x = x  for (n - 1) x
 
 Compiled with -O2, it is really fast and makes no unnecessary
 allocations.

Yes, good code:

T.$wfor =
\r [ww w w1]
case ww of ds {
  __DEFAULT -
  case w w1 of wild {
GHC.Prim.(#,#) new_s a41 -
case -# [ds 1] of sat_s1ZG {
  __DEFAULT - T.$wfor sat_s1ZG w new_s;
};
  };
  0 - GHC.Prim.(#,#) [w1 GHC.Base.()];
};
SRT(T.$wfor): []
T.for =
\r [w w1 w2] case w of w3 { GHC.Base.I# ww - T.$wfor ww w1 w2; };
SRT(T.for): []

 So I made another version:
 
 for :: Int - IO () - IO ()
 for n x | n  0 = x  for (n - 1) x
   | otherwise = return ()
 
 To my surprise, it was much slower and made many allocations:
[...
 Then I noticed the cause: 
 GHC.Prim.# returns a boxed, heap allocated Bool, and so do other
 primitive comparison operators.

That's not really the cause.  A function returning a boxed value does
not necessarily have to allocate it, it is just a vectored return
afaik.

The code is:

T.$wfor' =
\r [ww w]
case # [ww 0] of wild {
  GHC.Base.True -
  let {
k = \u []
case -# [ww 1] of sat_s1Z9 {
  __DEFAULT - T.$wfor' sat_s1Z9 w;
}; } in
  let {
sat_s20d =
\r [eta]
case w eta of wild1 
{ GHC.Prim.(#,#) new_s a41 - k new_s; };
  } in  sat_s20d;
  GHC.Base.False - lvl4;
};
SRT(T.$wfor'): []
T.for' =
\r [w w1] case w of w2 { GHC.Base.I# ww - T.$wfor' ww w1; };
SRT(T.for'): []

The culprit is `let { k = \u ... }'.  The cause seems to be that eta
expansion is done at the wrong place, I do not know why.  The code we
would want is

T.$wfor4 =
\r [ww w w1]
case # [ww 0] of wild {
  GHC.Base.True -
  case w w1 of wild1 {
GHC.Prim.(#,#) new_s a41 -
case -# [ww 1] of sat_s1Y0 {
  __DEFAULT - T.$wfor4 sat_s1Y0 w new_s;
};
  };
  GHC.Base.False - GHC.Prim.(#,#) [w1 GHC.Base.()];
};
SRT(T.$wfor4): []
T.for4 =
\r [w w1 w2] case w of w3 { GHC.Base.I# ww - T.$wfor4 ww w1 w2; };
SRT(T.for4): []

(Notice that $wfor again take three arguments, the last one being the
state.)

Actually, this is produced by the following, although I have no idea
why.  Just the optimizer working unpredictably, I guess.

for4 :: Int - IO () - IO ()
for4 n x = if n `gt` 0 == 0 then return () else x  (for4 (n-1) x)

gt :: Int - Int - Int
gt x y = if x  y then 1 else 0

If you test it, it should be fast.

BTW, although counting upwards (and not solving the problem
generally), the following is ok too:

for2 :: Int - IO () - IO ()
for2 n x = sequence_ [x | i - [1..n]]

T.lvl = \r [s] GHC.Prim.(#,#) [s GHC.Base.()];
SRT(T.lvl): []
T.$wfor2 =
\r [ww w]
case # [1 ww] of wild {
  GHC.Base.True - T.lvl;
  GHC.Base.False -
  let {
go10 =
\r [x1 eta]
case w eta of wild1 {
  GHC.Prim.(#,#) new_s a41 -
  case ==# [x1 ww] of wild11 {
GHC.Base.True - 
GHC.Prim.(#,#) [new_s GHC.Base.()];
GHC.Base.False -
case +# [x1 1] of sat_s1XA {
  __DEFAULT - go10 sat_s1XA new_s;
};
  };
};
  } in  go10 1;
};
SRT(T.$wfor2): []

T.for2 =
\r [w w1] case w of w2 { GHC.Base.I# ww - T.$wfor2 ww w1; };
SRT(T.for2): []


Playing with the code generated by ghc is a great way to waste time
for me.  Wait until you have found the RULES-pragma :-)

Have fun,

Carsten

-- 
Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin
http://carsten.codimi.de/
PGP/GPG key on the pgp.net key servers, 
fingerprint on my home page.


pgpbPCeYXZmU3.pgp
Description: PGP signature
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Bools are not unboxed

2004-10-03 Thread Tomasz Zielonka
On Sun, Oct 03, 2004 at 04:03:55PM +0200, Carsten Schultz wrote:
 Hi Tomasz!

Hi Carsten!

  To my surprise, it was much slower and made many allocations:
 [...
  Then I noticed the cause: 
  GHC.Prim.# returns a boxed, heap allocated Bool, and so do other
  primitive comparison operators.

I should have asked one fundamental question first: am I right? ;)

 That's not really the cause.  A function returning a boxed value does
 not necessarily have to allocate it, it is just a vectored return
 afaik.

I haven't heard about 'vectored return' before. I will try to find
more information about it. Maybe you can recommend something for me
to read?

 (Notice that $wfor again take three arguments, the last one being the
 state.)

Hmmm, I noticed that the number of arguments differed, there were even
some quiet alarm bells in my head, but I ignored it.

 for4 :: Int - IO () - IO ()
 for4 n x = if n `gt` 0 == 0 then return () else x  (for4 (n-1) x)
 
 gt :: Int - Int - Int
 gt x y = if x  y then 1 else 0
 
 If you test it, it should be fast.

It is even slightly faster than my fastest version :)

 BTW, although counting upwards (and not solving the problem
 generally), the following is ok too:
 
 for2 :: Int - IO () - IO ()
 for2 n x = sequence_ [x | i - [1..n]]

This one is amazing. It's 3 times faster than the previous one in spite
of being written in high level style.

I guess it's worth checking idiomatic Haskell style first, because there
is a big chance that GHC was optimised for it :)

However, it would be nice if all versions were as efficient... 

 Playing with the code generated by ghc is a great way to waste time
 for me.

Well, but you seem to be very good at it. Maybe it won't be such a waste
of time in the long term :)

 Wait until you have found the RULES-pragma :-)

I've already found it some time ago. I even tried to use them to
optimise vector/matrix expressions (to eliminate intermediate vectors),
but I remember that sometimes the rules didn't fire and I didn't
understand why.

 Have fun,
 
 Carsten

Hope this will teach me to avoid premature conclusions :-/

Thanks,
Tom

-- 
.signature: Too many levels of symbolic links
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


hWaitForInput and timeouts

2004-10-03 Thread Peter Simons
Hi,

I have another I/O problem. I need to time out when a Handle
blocks forever. I am using hWaitForInput anyway, so that
shouldn't be a problem, but the documentation says that
using this feature will block all IO threads? Is it much
work to fix this? I _could_ forkIO a racer thread myself, of
course, but it feels wrong to do that around a function that
has an explicit timeout argument. :-)

Peter

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users