Re: ghc and signal processing

2004-02-23 Thread Jeremy Shaw
Hrm,

Okay, it seems that my problems maybe be due to using ghc 6.3.

Here are the results of running test under different compiler versions
(see end of message for code):


Athlon 600MHz + FreeBSD + GHC 6.0.1

real0m0.414s
user0m0.361s
sys 0m0.016s

Athlon 600MHz + FreeBSD + GHC 6.3 (built from CVS HEAD on Feb 15, 2004)

real0m2.517s
user0m2.289s
sys 0m0.069s

Pentium III 1.13GHz + Debian + GHC 6.2

real0m0.305s
user0m0.196s
sys 0m0.027s

Pentium III 1.13GHz + Debian + GHC 6.3 (built from CVS HEAD on Feb 1, 2004)


real0m1.302s
user0m1.196s
sys 0m0.044s


So it seems like maybe GHC 6.3's performance for this particular test
is around 3-5 slower?

Jeremy Shaw.


module Main where

import Data.Array
import Data.Array.IO

import System.IO

main = do h <- openFile "test.b" WriteMode
  a <- newArray_ (1,180)
  b <- mapArray id a
  c <- mapArray id b
  hPutArray h c 180


At Mon, 23 Feb 2004 13:37:45 -0800,
Mike Gunter wrote:
> 
> 
> Hmmm.  With -O2 on GHC 6.2, I get 0.177s, 0.217s, and 0.348s for your
> three Haskell examples and 0.187s (with gcc -O2) for your C example.
> The output of -ddump-simpl for the looks perfect for the second
> Haskell example.  My GHC seems to be doing a bang-up job here.  What's
> wrong with yours?  (For the third example GHC's code could be improved
> by additional inlining or hoisting of a constant array outside of the
> loop.)
> 
>   mike
> 
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: optimization question

2004-02-23 Thread ajb
G'day all.

Quoting Peter Simons <[EMAIL PROTECTED]>:

> If you don't mind using FFI, the tool of choice would
> probably be .

Perfect hash functions are actually not that much better than
"imperfect" hash functions for the case where you have keys to
search for which are not in the set.

"Imperfect" hashing requires that a key be scanned at least twice:
Once to compute the hash and once for the final equality test.  The
equality test may need to be performed more than once if two keys
hash to the same value.  A good rule of thumb is that hash collisions
are unlikely until you reach around sqrt(N) keys, where N is the size
of the hash space.  So, for example, for 32-bit hash values, you
almost certainly won't get a collision unless you insert 65,000 or so
keys, which is much more than Hal's 1,300 or so.

In the absence of hash collisions "perfect" hashing is pretty much the
same.  The only differences are a) the hash function is more expensive to
compute, and b) the equality test is guaranteed to happen at most once.

Perfect hashing is best when you know that you're not going to search for
keys that are not in the set.  For example, an algorithm which requires
two passes over the words in some set of documents could easily benefit
from perfect hashing, since the words that you'll find in the second pass
will only be those found in the first pass.

Radix-based searching, on the other hand, requires only one pass through
the key and no arithmetic.

Cheers,
Andrew Bromage
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: optimization question

2004-02-23 Thread ajb
G'day.

Quoting Hal Daume III <[EMAIL PROTECTED]>:

> Finally, I implemented a version which reads data into a finitemap.

I'd be curious about the relative performance in using a ternary
search trie:

http://cvs.sourceforge.net/viewcvs.py/hfl/hfl/edison/Assoc/

Cheers,
Andrew Bromage
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: ghc and signal processing

2004-02-23 Thread Jeremy Shaw
Hrm,

I am going to do some new test tonight. I think my test environment
may have been bad...

Jeremy Shaw.


At Mon, 23 Feb 2004 13:37:45 -0800,
Mike Gunter wrote:
> 
> 
> Hmmm.  With -O2 on GHC 6.2, I get 0.177s, 0.217s, and 0.348s for your
> three Haskell examples and 0.187s (with gcc -O2) for your C example.
> The output of -ddump-simpl for the looks perfect for the second
> Haskell example.  My GHC seems to be doing a bang-up job here.  What's
> wrong with yours?  (For the third example GHC's code could be improved
> by additional inlining or hoisting of a constant array outside of the
> loop.)
> 
>   mike
> 
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: ghc and signal processing

2004-02-23 Thread Mike Gunter

Hmmm.  With -O2 on GHC 6.2, I get 0.177s, 0.217s, and 0.348s for your
three Haskell examples and 0.187s (with gcc -O2) for your C example.
The output of -ddump-simpl for the looks perfect for the second
Haskell example.  My GHC seems to be doing a bang-up job here.  What's
wrong with yours?  (For the third example GHC's code could be improved
by additional inlining or hoisting of a constant array outside of the
loop.)

mike

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Hugs98 - GHC

2004-02-23 Thread gaby82
Hi,
I've an application in Hugs98 and i've to integrate it with another application
using GHC.
I don´t know how to do this. I tried to compiled the hugs98 application
using GHC but i couldn´t
Another question: can i compiled many hs at the same time?
Thanks a lot
Gabriela

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: optimization question

2004-02-23 Thread Peter Simons
Max Kirillov writes:

 >> [...] I will generate large case statements like the
 >> first form and want to know if I should bother
 >> pre-optimizing it to the second.

 > I suppose such things should be made by flex-style
 > generators. 

If you don't mind using FFI, the tool of choice would
probably be .

Peter

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: optimization question

2004-02-23 Thread Hal Daume III
I have some numbers on this.

I have a list of first names for males from the census data.  I have a 
function 'male :: String -> Maybe Double' which returns (maybe) the 
probability of a person being given that name.  I have two versions, one 
based on string matching, the other based on building the data into a trie 
and then converting the trie into haskell source and the using that to 
match).  For example, we have:

> module Male where
> 
> male "james" = Just 3.318
> male "john" = Just 3.271
> male "robert" = Just 3.143
> male "michael" = Just 2.629
> male "william" = Just 2.451
> male "david" = Just 2.363
> male "richard" = Just 1.703
> ...
> male _ = Nothing

and in the other version, we have

> male = male_start
>   where
> male_start ('a':xs) = male_66 xs
> male_start ('b':xs) = male_67 xs
> male_start ('c':xs) = male_68 xs
> male_start ('d':xs) = male_69 xs
> ...
> male_start _ = Nothing
> male_66 ('a':xs) = male_66_66 xs
> male_66 ('b':xs) = male_66_67 xs
> male_66 ('d':xs) = male_66_69 xs
> male_66 ('g':xs) = male_66_72 xs
> ...


Finally, I implemented a version which reads data into a finitemap.

the original database contains 1219 names.

i test this by taking all male names, all female names, randomizing them, 
and repeating this data 20 times.  this leads to 109880 runs.

i ran each implementation 5 times; the results are:

using trie:

0.890u 0.020s 0:00.90 101.1%0+0k 0+0io 327pf+0w
0.910u 0.000s 0:00.90 101.1%0+0k 0+0io 327pf+0w
0.870u 0.030s 0:00.90 100.0%0+0k 0+0io 327pf+0w
0.910u 0.020s 0:00.93 100.0%0+0k 0+0io 327pf+0w
0.920u 0.020s 0:00.95 98.9% 0+0k 0+0io 327pf+0w

using string-matching:

10.280u 0.060s 0:10.51 98.3%0+0k 0+0io 280pf+0w
10.340u 0.030s 0:10.86 95.4%0+0k 0+0io 279pf+0w
10.310u 0.040s 0:10.72 96.5%0+0k 0+0io 281pf+0w
10.330u 0.040s 0:10.55 98.2%0+0k 0+0io 280pf+0w
10.420u 0.020s 0:10.63 98.2%0+0k 0+0io 280pf+0w

for finitemap:

1.110u 0.020s 0:01.14 99.1% 0+0k 0+0io 195pf+0w
1.110u 0.010s 0:01.14 98.2% 0+0k 0+0io 195pf+0w
1.100u 0.030s 0:01.14 99.1% 0+0k 0+0io 195pf+0w
1.120u 0.010s 0:01.15 98.2% 0+0k 0+0io 195pf+0w
1.190u 0.010s 0:01.24 96.7% 0+0k 0+0io 195pf+0w

so string-matching is terribly slow; using the finitemap is actually 
surprisingly fast, though still about 30% slower than the trie version.

perhaps a better-optimized trie version would do better, but it's hard to 
say.

On Mon, 23 Feb 2004, Simon Peyton-Jones wrote:

> The trouble is that you probably *don't* want to expand this
>   case x of { "foogle" -> e1; _ -> e2 }
> to this
> 
>   case x of
>  c1:x1 -> case c1 of
>   'f' -> case x1 of
>   c2:x2 -> case c2 of 
>   'o' -> of 
> 
> So GHC generates a series of equality tests instead.  A decent
> alternative might be:
>   
>   generate case expressions when there is more
>   than one string in the list, otherwise use an equality test
> 
> That would not be hard to do.  If it becomes important to you, I'd have
> a go. But before doing so, could you do the work by hand and see if it
> makes a useful performance difference?
> 
> Simon
> 
> | -Original Message-
> | From: [EMAIL PROTECTED]
> [mailto:glasgow-haskell-users-
> | [EMAIL PROTECTED] On Behalf Of Sven Panne
> | Sent: 22 February 2004 15:32
> | To: John Meacham
> | Cc: [EMAIL PROTECTED]
> | Subject: Re: optimization question
> | 
> | John Meacham wrote:
> | > I was wondering if:
> | >
> | > case x of
> | > "foo" -> Foo
> | > "bar" -> Bar
> | > "fuzz" -> Fuzz
> | > "fuzo" -> Fuzo
> | > x -> other .. thing
> | >
> | > would optimize to
> | >
> | > let z = other .. thing in
> | > case x of
> | > ('f':x) -> case x of
> | > ('u':'z': x) ->
> | > "z" -> Fuzz
> | > "o" -> Fuzo
> | > _ -> z
> | > "oo" -> Foo
> | > _ -> z
> | > "bar" -> Bar
> | > _ -> z
> | 
> | String literals are handled in a special way in GHC, so your example
> is
> | essentially converted into an if-cascade, which is not what you want.
> | OTOH, if you write the strings in their expanded form like
> ['f','o','o'],
> | you get your optimized version automatically. Perhaps Simon^2 can
> comment
> | on the rationale behind this, I can't remember the reason...
> | 
> | Cheers,
> | S.
> | 
> | ___
> | Glasgow-haskell-users mailing list
> | [EMAIL PROTECTED]
> | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
> ___
> Glasgow-haskell-users mailing list
> [EMAIL PROTECTED]
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
> 

-- 
 Hal Daume III   | [EMAIL PROTECTED]
 "Arrest this man, he talks in maths."  

Re: optimization question

2004-02-23 Thread Lennart Augustsson
Simon Peyton-Jones wrote:
generate case expressions when there is more
than one string in the list, otherwise use an equality test
Oh, you mean like hbc does? ;-)
Sorry, couldn't resist.
	-- Lennart

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: optimization question

2004-02-23 Thread Max Kirillov
On Sun, Feb 22, 2004 at 12:20:35AM -0800, John Meacham wrote:
> case x of 
> "foo" -> Foo
> "bar" -> Bar
> "fuzz" -> Fuzz
> "fuzo" -> Fuzo
> x -> other .. thing

> The reason I ask is I am writing someting which will generate large case
> statements like the first form and want to know if I should bother
> pre-optimizing it to the second.
> John

I suppose such things should be made by flex-style generators. Haskell
has one (called Happy); I wonder how efficient it is compared to
hand-written "case .. of" or "if .. = .." scanners. I've heard flex is
at least not worse than hand-written C code.

-- 
Max
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GC behaviour in GHC

2004-02-23 Thread Carsten Schultz
On Sat, Feb 21, 2004 at 08:43:56PM -0300, Nivia Q. wrote:
> I'm a Computational Engineering student in the University of
> Pernambuco (Brazil). In my research, I'm working with applications
> written in Haskell , where time is a critical factor. But there is a
> pretty high GC overhead I can't understand . Can someone indicate
> some paper or reference about the GC behaviour in GHC?

Just let me add to the good answers that before understanding GC
behaviour, it is important to understand the space behaviour of a
program.  I think that it is possible that your problems are more in
this area.

Greetings,

Carsten

-- 
Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin
http://carsten.codimi.de/
PGP/GPG key on the pgp.net key servers, 
fingerprint on my home page.


pgp0.pgp
Description: PGP signature
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GC behaviour in GHC

2004-02-23 Thread Andrew Cheadle
Nivia,

Don has mentioned most of the documentation about GC work in GHC except
for:

Generational garbage collection for Haskell -  Sansom, Peyton Jones
http://citeseer.nj.nec.com/sansom93generational.html

There isn't really a paper on the performance of GHC's current
generational / compacting collector. I do have one in progress
which relates to further work I've been doing on the implementation of
the incremental collector. (The incremental collector is not
currently available in a GHC release).

If you have specific questions feel free to ask and I'll try to answer
the to my best ability.

What kind of applications are you working with that make them time
critical - Do you want to increase user responsiveness and reduce pause
times or have the programs run as fast as possible?

I'm assuming you want the latter. Have you tried playing with GHC's
various tunable GC parameters?

Cheers

Andy

On Sun, 22 Feb 2004, Donald Bruce Stewart wrote:

>niviaquental:
>>
>>Hello, GHC users,
>>
>>I'm a Computational Engineering student in the University of
>>Pernambuco (Brazil). In my research, I'm working with
>>applications written in Haskell , where time is a critical
>>factor. But there is a pretty high GC overhead I can't
>>understand . Can someone indicate some paper or
>>reference about the GC behaviour in GHC?
>
>Well, I know of a couple:
>
>The workings of the incremental garbage collector:
>http://research.microsoft.com/~simonpj/Papers/inc-gc.htm
>
>The runtime system document also mentions GC:
>http://www.haskell.org/ghc/docs/papers/run-time-system.ps.gz
>
>There are probably more references at:
>http://www.haskell.org/ghc/documentation.html
>
>-- Don
>___
>Glasgow-haskell-users mailing list
>[EMAIL PROTECTED]
>http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>

*
*  Andrew Cheadleemail:  [EMAIL PROTECTED] *
*  Department of Computing   http://www.doc.ic.ac.uk/~amc4/ *
*  Imperial College *
*  University of London *
*
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: ghc and signal processing

2004-02-23 Thread Wolfgang Thaller
On 23.02.2004, at 13:32, MR K P SCHUPKE wrote:

b <- mapArray id a
The reason it is slow is because the array type is copied every time
a member is assigned.
The array in question is already a mutable array, and even for 
non-mutable arrays, mapArray would use mutable arrays internally.

The problem here is that mapArray's implementation isn't perfect, and 
that GHC doesn't generate perfect code for it.
I was able to get a 16% performance increase by using the following 
instead of GHC's built-in mapArray:

import Data.Array.Base
myMapArray :: (MArray a e' m, MArray a e m, Ix i) => (e' -> e) -> a i 
e' -> m (a i e)
myMapArray f marr = case Data.Array.IO.bounds marr of
  (l,u) -> do
marr' <- newArray_ (l,u)
let loop i n | i == n = return ()
 | otherwise = do
e <- unsafeRead marr i
unsafeWrite marr' i (f e)
loop (i+1) n
loop 0 (rangeSize (l,u))
return marr'

The difference is that I use an explicit loop rather than an 
intermediate list of indices ([1 .. rangeSize (l,u) - 1]) that GHC 
fails to optimize away.

There are two solutions:

1) Use a mutable-array in the StateMonad then freeze it.
This won't help in this case (see above).

2) In this particular case where processing is sequential (IE you
are only altering values based on *nearby* values, you can use streams.
One of the nicest features of Haskell is how lists (being lazy) operate
just like streams...
[...]
This should be fast, and also use very little memory.
I second that. You might need to use arrays for input and output, but 
for everything in between, lists are probably a very good choice. Using 
lists this way in Haskell is probably more efficient that using an 
equivalent data structure in C.
When you use arrays, GHC often ends up using lists internally, as we 
saw above. So the luxury of using lists really shouldn't cost too much.

Cheers,

Wolfgang

___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Array optimisation...

2004-02-23 Thread Josef Svenningsson
On Mon, 23 Feb 2004, MR K P SCHUPKE wrote:

>
> Was just thinking about GHC's implementation of arrays, and their
> poor performance. I know little about GHC's internal workings, but
> I was thinking about how array performance could be improved.
>
> What if when writing an array you instead construct a function:
>
> f :: (Ix x,Ix y) => Array a -> Ix x -> a -> Ix y -> a
> f a x b y | x==y = b
>   | otherwise = a!y
>
> Then the update in place operator // becomes a curried application
> of 'f' above.
>
> You could then define a a series of 'overlays' for a base array.
> The clever bit would be to get the garbage collector to merge
> the two as soon as any reference to the original array is
> discarded.
>
> Does GHC already do anything like this?
>
No it doesn't. But if you want this behaviour you should look at
Data.Array.Diff . I think that library does what you want.

Cheers,

/Josef
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Array optimisation...

2004-02-23 Thread MR K P SCHUPKE

Was just thinking about GHC's implementation of arrays, and their 
poor performance. I know little about GHC's internal workings, but
I was thinking about how array performance could be improved.

What if when writing an array you instead construct a function:

f :: (Ix x,Ix y) => Array a -> Ix x -> a -> Ix y -> a
f a x b y | x==y = b
  | otherwise = a!y

Then the update in place operator // becomes a curried application
of 'f' above.

You could then define a a series of 'overlays' for a base array.
The clever bit would be to get the garbage collector to merge
the two as soon as any reference to the original array is
discarded.

Does GHC already do anything like this?

Regards,
Keean Schupke.
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: ghc and signal processing

2004-02-23 Thread MR K P SCHUPKE
>b <- mapArray id a

The reason it is slow is because the array type is copied every time
a member is assigned.

There are two solutions:

1) Use a mutable-array in the StateMonad then freeze it.

2) In this particular case where processing is sequential (IE you
are only altering values based on *nearby* values, you can use streams.
One of the nicest features of Haskell is how lists (being lazy) operate
just like streams... 

So read the data into a list and define a filter of the type

filter :: [a] -> [a]

Your program then becomes:

main :: IO ()
main = do
s <- getMyDataAsList()
putMyList (filter s)

where the commands getMyDataAsList and putMyList have the types:

getMyDataAsList :: IO [a]

putMyList :: [a] -> IO ()


This should be fast, and also use very little memory.

Regards,
Keean Schupke.
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: optimization question

2004-02-23 Thread Simon Peyton-Jones
The trouble is that you probably *don't* want to expand this
case x of { "foogle" -> e1; _ -> e2 }
to this

  case x of
 c1:x1 -> case c1 of
'f' -> case x1 of
c2:x2 -> case c2 of 
'o' -> of 

So GHC generates a series of equality tests instead.  A decent
alternative might be:

generate case expressions when there is more
than one string in the list, otherwise use an equality test

That would not be hard to do.  If it becomes important to you, I'd have
a go. But before doing so, could you do the work by hand and see if it
makes a useful performance difference?

Simon

| -Original Message-
| From: [EMAIL PROTECTED]
[mailto:glasgow-haskell-users-
| [EMAIL PROTECTED] On Behalf Of Sven Panne
| Sent: 22 February 2004 15:32
| To: John Meacham
| Cc: [EMAIL PROTECTED]
| Subject: Re: optimization question
| 
| John Meacham wrote:
| > I was wondering if:
| >
| > case x of
| > "foo" -> Foo
| > "bar" -> Bar
| > "fuzz" -> Fuzz
| > "fuzo" -> Fuzo
| > x -> other .. thing
| >
| > would optimize to
| >
| > let z = other .. thing in
| > case x of
| > ('f':x) -> case x of
| > ('u':'z': x) ->
| > "z" -> Fuzz
| > "o" -> Fuzo
| > _ -> z
| > "oo" -> Foo
| > _ -> z
| > "bar" -> Bar
| > _ -> z
| 
| String literals are handled in a special way in GHC, so your example
is
| essentially converted into an if-cascade, which is not what you want.
| OTOH, if you write the strings in their expanded form like
['f','o','o'],
| you get your optimized version automatically. Perhaps Simon^2 can
comment
| on the rationale behind this, I can't remember the reason...
| 
| Cheers,
| S.
| 
| ___
| Glasgow-haskell-users mailing list
| [EMAIL PROTECTED]
| http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
___
Glasgow-haskell-users mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users