Here is haskell version that is faster than python, almost as fast as c++.You 
need to install bytestring-lexing package for readDouble.
bmaxa@maxa:~/haskell$ time ./printMatrixDecay - < output.txtread 16384 matrix 
elements (128x128 = 16384)[0.00e0, 1.00e-8) = 0 (0.00%) 0[1.00e-8, 1.00e-7) = 0 
(0.00%) 0[1.00e-7, 1.00e-6) = 0 (0.00%) 0[1.00e-6, 1.00e-5) = 0 (0.00%) 
0[1.00e-5, 1.00e-4) = 1 (0.01%) 1[1.00e-4, 1.00e-3) = 17 (0.10%) 18[1.00e-3, 
1.00e-2) = 155 (0.95%) 173[1.00e-2, 1.00e-1) = 1434 (8.75%) 1607[1.00e-1, 
1.00e0) = 14777 (90.19%) 16384[1.00e0, 2.00e0) = 0 (0.00%) 16384
real    0m0.031suser    0m0.028ssys     0m0.000sbmaxa@maxa:~/haskell$ time 
./printMatrixDecay.py - < output.txt(-) read 16384 matrix elements (128x128 = 
16384)[0.00e+00, 1.00e-08) = 0 (0.00%) 0[1.00e-08, 1.00e-07) = 0 (0.00%) 
0[1.00e-07, 1.00e-06) = 0 (0.00%) 0[1.00e-06, 1.00e-05) = 0 (0.00%) 0[1.00e-05, 
1.00e-04) = 1 (0.00%) 1[1.00e-04, 1.00e-03) = 17 (0.00%) 18[1.00e-03, 1.00e-02) 
= 155 (0.00%) 173[1.00e-02, 1.00e-01) = 1434 (0.00%) 1607[1.00e-01, 1.00e+00) = 
14777 (0.00%) 16384[1.00e+00, 2.00e+00) = 0 (0.00%) 16384
real    0m0.081suser    0m0.080ssys     0m0.000s
Program follows...
import System.Environmentimport Text.Printfimport Text.Regex.PCREimport 
Data.Maybeimport Data.Array.IOimport Data.Array.Unboxedimport qualified 
Data.ByteString.Char8 as Bimport Data.ByteString.Lex.Double (readDouble)
strataBounds :: UArray Int DoublestrataBounds = listArray (0,10) [ 0.0, 1.0e-8, 
1.0e-7, 1.0e-6, 1.0e-5, 1.0e-4, 1.0e-3, 1.0e-2, 1.0e-1, 1.0, 2.0 ]
newStrataCounts :: IO(IOUArray Int Int)newStrataCounts = newArray (bounds 
strataBounds) 0
main = do    l <- B.getContents    let a = B.lines l    strataCounts <- 
newStrataCounts    n <- calculate strataCounts a 0    let        
printStrataCounts :: IO ()        printStrataCounts = do            let s = 
round $ sqrt (fromIntegral n::Double) :: Int            printf "read %d matrix 
elements (%dx%d = %d)\n" n s s n            printStrataCounts' 0 0        
printStrataCounts' :: Int -> Int -> IO ()        printStrataCounts' i total     
        | i < (snd $ bounds strataBounds) = do                count <- 
readArray strataCounts i                let                     p :: Double     
               p = (100.0*(fromIntegral count) :: Double)/(fromIntegral n :: 
Double)                printf "[%1.2e, %1.2e) = %i (%1.2f%%) %i\n" 
(strataBounds ! i) (strataBounds ! (i+1))                                       
                          count p (total + count)                
printStrataCounts' (i+1) (total+count)            | otherwise = return ()    
printStrataCounts
calculate :: IOUArray Int Int -> [B.ByteString] -> Int -> IO Intcalculate _ [] 
n = return ncalculate counts (l:ls) n = do    let         a = case 
getAllTextSubmatches $ l =~ B.pack "matrix.*= ([0-9eE.+-]+)$" :: [B.ByteString] 
of                [_,v] -> Just (readDouble v) :: Maybe (Maybe 
(Double,B.ByteString))                _ -> Nothing        b = 
(fst.fromJust.fromJust) a        loop :: Int -> IO()        loop i            | 
i < (snd $ bounds strataBounds) =                 if (b >= (strataBounds ! i)) 
&& (b < (strataBounds ! (i+1)))                then do                    c <- 
readArray counts i                    writeArray counts i (c+1)                
else                     loop (i+1)            | otherwise = return ()    if 
isNothing a        then             calculate counts ls n        else do        
    loop 0            calculate counts ls (n+1)

From: nicolasb...@gmail.com
Date: Fri, 8 Feb 2013 12:26:09 -0700
To: haskell-cafe@haskell.org
Subject: [Haskell-cafe] performance question

Hi list,
I wrote a script that reads matrix elements from standard input, parses the 
input using a regular expression, and then bins the matrix elements by 
magnitude. I wrote the same script in python (just to be sure :) ) and find 
that the python version vastly outperforms the Haskell script.


To be concrete:
$ time ./createMatrixDump.py -N 128 | ./printMatrixDecayreal    0m2.655s
user    0m2.677ssys     0m0.095s


$ time ./createMatrixDump.py -N 128 | ./printMatrixDecay.py -
real    0m0.445suser    0m0.615ssys     0m0.032s
The Haskell script was compiled with "ghc --make printMatrixDecay.hs".


Could you have a look at the script and give me some pointers as to where I 
could improve it, both in terms of performance and also generally, as I am very 
new to Haskell.


Thanks already,
nick


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe                            
          
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to