Re: darcs switchover

2006-01-18 Thread Malcolm Wallace
John Goerzen [EMAIL PROTECTED] writes:

  Meanwhile, I noted that the HaXml repo on darcs.haskell.org seems
  to be a verbatim copy of the darcs repo at York.
 
 Ahh.  You are correct.
 
 Re-converting now, since you've presumably committed patches to the
 darcs side, is probably not going to be practical.

Actually, the way I have been working is to commit changes to CVS
first, then to (forget to) propagate them into darcs.  AFAIK, the
two repositories are in synch right now, but if there is ever any
discrepancy, I always treat the CVS one as correct.

Thus, now would be an excellent time to re-convert, and I would
then throw away my own darcs repo and switch to treating the
darcs.haskell.org repo as the master.

I can't remember where the rest of the ghc conversion process has
reached - is it also about ready to switch over to darcs-as-master yet?

Regards,
Malcolm
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


GHC vs. GCC on raw vector addition

2006-01-18 Thread Sven Moritz Hallberg
Hi List,

I'm running GHC and GCC head-to-head on the task of adding a bunch of
long IOUArray-Vectors really fast. My machine is a Linux-ppc PowerBook
and gets a runtime for the GHC-compiled binary that's about 10x as long
as for GCC. Simon M. tells me this should be much better. Here are the
precise command lines:

$ gcc -O3 -o b-addvec-gcc b-addvec-c.c
$ ghc -O2 -fasm --make -o b-addvec-ghc b-addvec-hs.hs

$ time ./b-addvec-gcc 10 elements 1000

real0m5.130s
user0m4.466s
sys 0m0.061s

$ time ./b-addvec-ghc 10 elements 1000

real0m49.701s
user0m43.466s
sys 0m0.586s

(compiling with -fvia-C -optc-O3 runs only about 1 second longer)


Can somebody shed some light on this?

Regards,
Sven Moritz

#include stdio.h
#include stdlib.h

int main(int argc, char **argv)
{
int nelems, niterations;
int i;
double *x, *v;

if(strcmp(argv[2], elements) == 0) {
nelems = atoi(argv[1]);
niterations = atoi(argv[3]);
}
else if(strcmp(argv[2], iterations) == 0) {
nelems = atoi(argv[3]);
niterations = atoi(argv[1]);
}
else {
exit(-1);
}

x = malloc(nelems * sizeof(double));
v = malloc(nelems * sizeof(double));
srand(time());
for(i=0; inelems; i++) {
x[i] = 0.0;
v[i] = ((double)rand()/RAND_MAX)*2 - 1.0;
}

for(i=0; initerations; i++) {
int j;
for(j=0; jnelems; j++)
x[j] = x[j] + v[j];
}

/*
for(i=0; inelems; i++)
printf(%f , x[i]);
printf(\n);
*/

return 0;
}

import Data.Array.IO
import System.Environment (getArgs)
import System.Random

type Vector = IOUArray Int Double

main = do  (n:f:m:_) - getArgs
   let  (nelems,niterations) = case f of
elements- (read n, read m)
iterations  - (read m, read n)

   x - newArray (0,nelems-1) 0 :: IO Vector
   v - newArray_ (0,nelems-1) :: IO Vector

   for 0 nelems $ \i -
 do  r - randomRIO (-1,1)
 writeArray v i r
   
   for 0 niterations $ \_ -
 for 0 nelems $ \i -
   do  xi - readArray x i
   vi - readArray v i
   writeArray x i (xi+vi)
   
   --for 0 nelems $ \i -
   --  do  xi - readArray x i
   --  putStr (show xi)
   --  putChar ' '
   --putChar '\n'

for :: Int - Int - (Int - IO a) - IO ()
--for i n f
--  | i  n  = do  f i
-- for (i+1) n f
--  | otherwise  = return ()
for i n f = mapM_ f [i..n-1]


signature.asc
Description: OpenPGP digital signature
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread Malcolm Wallace
Sven Moritz Hallberg [EMAIL PROTECTED] writes:

 I'm running GHC and GCC head-to-head on the task of adding a bunch of
 long IOUArray-Vectors really fast. My machine is a Linux-ppc PowerBook
 and gets a runtime for the GHC-compiled binary that's about 10x as long
 as for GCC.

Is it possible that gcc is making use of the ppc AltiVec instructions,
and ghc is not?

Regards,
Malcolm
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Sven,

Wednesday, January 18, 2006, 3:33:40 PM, you wrote:

SMH and gets a runtime for the GHC-compiled binary that's about 10x as long
SMH as for GCC. Simon M. tells me this should be much better. Here are the

attached version is only 5 times slower :)  please note that

1) unsafeRead/Write indexes from 0 and don't checks bounds (just C-like :)
2) generating random values takes about 1.5-2 seconds by itself.
Haskell's RNG is very different from C's one



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

new.hs
Description: Binary data
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread Simon Marlow

Bulat Ziganshin wrote:


Wednesday, January 18, 2006, 3:33:40 PM, you wrote:

SMH and gets a runtime for the GHC-compiled binary that's about 10x as long
SMH as for GCC. Simon M. tells me this should be much better. Here are the

attached version is only 5 times slower :)  please note that

1) unsafeRead/Write indexes from 0 and don't checks bounds (just C-like :)
2) generating random values takes about 1.5-2 seconds by itself.
Haskell's RNG is very different from C's one


I squeezed a bit more out (see attached).  I think the main bottleneck 
is now the random number generator, in particular it is supplying boxed 
Doubles which have to be unboxed again before storing in the array.


Cheers,
Simon
import Data.Array.IO
import Data.Array.Base
import System.Environment (getArgs)
import System.Random

type Vector = IOUArray Int Double

main = do  (n:f:m:_) - getArgs
   let  (nelems,niterations) = case f of
elements- (read n, read m)
iterations  - (read m, read n)

   x - newArray (0,nelems-1) 0 :: IO Vector
   v - newArray_ (0,nelems-1) :: IO Vector

   x `seq` v `seq` return ()

   for 0 nelems $ \i -
 do  r - randomRIO (-1,1)
 unsafeWrite v i r

   for 0 niterations $ \_ -
 for 0 nelems $ \i -
   do  xi - unsafeRead x i
   vi - unsafeRead v i
   unsafeWrite x i (xi+vi)

   --for 0 nelems $ \i -
   --  do  xi - unsafeRead x i
   --  putStr (show xi)
   --  putChar ' '
   --putChar '\n'

for :: Int - Int - (Int - IO a) - IO ()
-- Faster equivalent of mapM_ action [from..to-1]
for from to action | from `seq` to `seq` False = undefined
for from to action  = go from
  where
go i | i=to  = return ()
 | otherwise = do action i
  go $! (i+1)
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re[2]: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Malcolm,

Wednesday, January 18, 2006, 4:22:23 PM, you wrote:
 I'm running GHC and GCC head-to-head on the task of adding a bunch of
 long IOUArray-Vectors really fast. My machine is a Linux-ppc PowerBook
 and gets a runtime for the GHC-compiled binary that's about 10x as long
 as for GCC.

MW Is it possible that gcc is making use of the ppc AltiVec instructions,
MW and ghc is not?

:)  even C version performs only 20 millions of additions in one second
because this program is most limited by memory throughput - it access
to 24 memory bytes per each addition. GHC just can't produce simple
loops even for imperative code. JHC can be much better in that area,
i strongly recommend Sven to try it


-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Lexically scoped type variables

2006-01-18 Thread Christian Maeder

Simon Peyton-Jones wrote:

I'm very interested to know whether you like it or hate it.
In the latter case, I'd also like to know whether you also 
have programs that will be broken by the change.


I don't use GADTs yet and I assume this change will not (seriously) 
break our code, but let me/us know which compiler we should use for testing.


Cheers Christian
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re[2]: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Simon,

Wednesday, January 18, 2006, 5:31:25 PM, you wrote:

 2) generating random values takes about 1.5-2 seconds by itself.
 Haskell's RNG is very different from C's one

SM I squeezed a bit more out (see attached).
   x `seq` v `seq` return ()

it's new trick for me :)   now the difference is less than 3x

btw, i also use return $! length xs trick to ensure that all xs
elements will be evaluated

for from to action | from `seq` to `seq` False = undefined

and this changes nothing, at least with 6.4.1/mingw32

btw, using mapM_ action [n..m] is very common operation. can it be
automatically substituted with my code by using some RULE pragmas in ghc
libraries? that will automatically improve many ghc-compiled programs


too, i use the following code instead of replicateM:

myReplicateM n action = if (n=5*10^4)
  then sequence (replicate n action)
  else goLarge n [] = return.reverse
  where
goLarge 0 xs = return xs
goLarge n xs = do x - action
  (goLarge $! n-1) $! x:xs


it doesn't overflow stack and works much faster for the large n. that
is my testbed for this function:

import Control.Monad
main = do a - replicateM 1 $ myReplicateM (1*10^6) (return 1)
  return $! sum (map last a)


and also, how about adding to GHC strictness annotations?

x - newArray (0,nelems-1) 0 :: IO !Vector
v - newArray_ (0,nelems-1) :: IO !Vector
for :: !Int - !Int - (!Int - IO a) - IO ()

it's SO common source of performance problems...


SM I think the main bottleneck 
SM is now the random number generator, in particular it is supplying boxed 
SM Doubles which have to be unboxed again before storing in the array.

as i say, it uses 1.5-2 seconds, i.e. only 10% of time when you run
1000 iterations (may be you not noticed that it used only in
initialization?). so, while RNG itself runs 150 times slower (!), it
doesn't make so much difference when you run 1000 iterations after
initial filling the array


and about using Altivec instructions. the code produced for new.hs
contains only one `fadd` operation, so it is easy to find entire cycle
as it is compiled by GHC. that is one: 

movl(%ebp), %eax
cmpl12(%esi), %eax
jge L81
movl8(%esi), %edx
leal8(%edx,%eax,8), %eax
movl(%eax), %edx
movl%edx, 16(%esp)
movl4(%eax), %eax
movl%eax, 20(%esp)
fldl16(%esp)
fstpl   24(%esp)
fldl24(%esp)
fstpl   48(%esp)
movl(%ebp), %eax
movl4(%esi), %edx
leal8(%edx,%eax,8), %eax
movl(%eax), %edx
movl%edx, 8(%esp)
movl4(%eax), %eax
movl%eax, 12(%esp)
fldl8(%esp)
fstpl   24(%esp)
fldl24(%esp)
fstpl   40(%esp)
fldl48(%esp)
faddl   40(%esp)
fstpl   32(%esp)
movl(%ebp), %ecx
movl8(%esi), %eax
leal8(%eax,%ecx,8), %ecx
fldl32(%esp)
fstpl   24(%esp)
movl24(%esp), %eax
movl28(%esp), %edx
movl%eax, (%ecx)
movl%edx, 4(%ecx)
incl(%ebp)
movl$_s3IY_info, %eax
L85:
jmp *%eax
L81:


good work, yes? ;-)  the C source is also amateur :)

IF_(s3IY_entry) {
W_ _c3MF;
StgDouble _s3IP;
StgDouble _s3IQ;
StgDouble _s3IS;
W_ _s3IW;
FB_
_c3MF = (I_)(*Sp) = (I_)(R1.p[3]);
if (_c3MF = 0x1U) goto _c3MI;
_s3IP = PK_DBL((P_)(((R1.p[2]) + 0x8U) + ((*Sp)  0x3U)));
_s3IQ = PK_DBL((P_)(((R1.p[1]) + 0x8U) + ((*Sp)  0x3U)));
_s3IS = _s3IP + _s3IQ;
ASSIGN_DBL((P_)(((R1.p[2]) + 0x8U) + ((*Sp)  0x3U)),_s3IS);
_s3IW = (*Sp) + 0x1U;
*Sp = _s3IW;
JMP_((W_)s3IY_info);
_c3MI:
R1.p = (P_)(W_)GHCziBase_Z0T_closure;
Sp=Sp+1;
JMP_(*Sp);
FE_
}

the only cause that this code is only 3 times slower is that C version
is really limited by memory speed. when tested on 1000-element
arrays, it is 20 times slower. i'm not yet tried SSE optimization for
gcc ;)

-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Error in GHC

2006-01-18 Thread Tays Soares
I'm trying to run the following sequence on ghc 6.4:   ghc -fglasgow-exts --make Main ghc -o exec Main.o Exemplo1.oBut I always get this error message after the second command:/usr/lib/ghc-6.4/libHSrts.a(Main.o)(.text+0xe): In function `main':: undefined reference to `__stginit_ZCMain'/usr/lib/ghc-6.4/libHSrts.a(Main.o)(.text+0x28): In function `main':: undefined reference to `ZCMain_main_closure'collect2: ld returned 1 exit statusPlease, does anybody know what can I do to fix it?ThanksTays Cristina do Amaral Pales Soares
		 
Yahoo! doce lar. Faça do Yahoo! sua homepage.___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re[3]: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Bulat,

Wednesday, January 18, 2006, 8:34:54 PM, you wrote:

BZ the only cause that this code is only 3 times slower is that C version
BZ is really limited by memory speed. when tested on 1000-element
BZ arrays, it is 20 times slower. i'm not yet tried SSE optimization for
BZ gcc ;)

sorry, with the gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops
the C version is 50 times faster than best Haskell one... it's the
loop from C version:

L18:
fldl (%edx)
faddl (%ecx)
fstpl (%edx)
fldl 8(%edx)
faddl 8(%ecx)
fstpl 8(%edx)
fldl 16(%edx)
faddl 16(%ecx)
fstpl 16(%edx)
fldl 24(%edx)
faddl 24(%ecx)
addl $4,%ebx
addl $32,%ecx
fstpl 24(%edx)
addl $32,%edx
cmpl -4(%ebp),%ebx
jl L18



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Error in GHC

2006-01-18 Thread Lemmih
On 1/18/06, Tays Soares [EMAIL PROTECTED] wrote:
  I'm trying to run the following sequence on ghc 6.4:
   ghc -fglasgow-exts --make Main

  ghc -o exec Main.o Exemplo1.o

 But I always get this error message after the second command:
 /usr/lib/ghc-6.4/libHSrts.a(Main.o)(.text+0xe): In function `main':
 : undefined reference to `__stginit_ZCMain'
 /usr/lib/ghc-6.4/libHSrts.a(Main.o)(.text+0x28): In function `main':
 : undefined reference to `ZCMain_main_closure'
 collect2: ld returned 1 exit status

 Please, does anybody know what can I do to fix it?

Does your main module contain a function called main? Does ghc
-fglasgow-exts --make Main -o exec work?

--
Friendly,
  Lemmih
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread John Meacham
On Wed, Jan 18, 2006 at 06:18:29PM +0300, Bulat Ziganshin wrote:
 :)  even C version performs only 20 millions of additions in one second
 because this program is most limited by memory throughput - it access
 to 24 memory bytes per each addition. GHC just can't produce simple
 loops even for imperative code. JHC can be much better in that area,
 i strongly recommend Sven to try it

Jhc doesn't have 'true' arrays yet, partially because I have not decided
how points-to analysis should work for them. (I will probably just union
all their points-to information since they most likely will be filled
by the same routine). 

GHCs indirect calls are really killing its performance in tight loops. I
think there is room for collaboration between the various compilers
there, since we are all moving to a c-- back end (in theory) we could
work on a common c-- - C translator that searches out such uneeded
indirections and zaps them before they get to gcc which doesn't handle
them well at all.

John

-- 
John Meacham - ⑆repetae.net⑆john⑈ 
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread John Meacham
On Wed, Jan 18, 2006 at 08:54:43PM +0300, Bulat Ziganshin wrote:
 sorry, with the gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops
 the C version is 50 times faster than best Haskell one... it's the
 loop from C version:

I believe something similar to what I noted here is the culprit:
http://www.haskell.org//pipermail/glasgow-haskell-users/2005-October/009174.html

it is fixable, but not without modifying ghc.
John

-- 
John Meacham - ⑆repetae.net⑆john⑈ 
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users