[Haskell-cafe] GHCi (7.4.2) is working on ARM

2013-02-06 Thread kenny lu
I've got a Cubieboard a few weeks back. I started it off with Linaro
(Ubuntu) 12.06.
Today I started to upgrade the OS to 12.11, which surprisingly came with
ghc 7.4.2.

And ghci magically works too.

Here are the links

http://www.cubieboard.org

sudo apt-get install update-manager-core
sudo apt-get update
sudo apt-get upgrade
sudo do-release-upgrade

Regards,
Kenny
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Regular Expression Parsing via derivatives

2011-09-02 Thread kenny lu
Hi,

In regex-pderiv, we gave a practical implementation of regex matching using
partial derivative.
But of course we could easy write one which using brzozoski's derivative,
but some simplification is required other wise
there are infinitely many states. Probably Martin has an implementation
somewhere.

Regards,
Kenny


On Tue, Aug 2, 2011 at 12:38 AM, Alex Clemmer
clemmer.alexan...@gmail.comwrote:

 Hmm. Not sure how I missed that. And, I also inquired about developing a
 core featre instead of a library -- implying disparity where in retrospect
 there doesn't appear to be any.

 That's too bad, but thanks for the helpful response!


 On Mon, Aug 1, 2011 at 12:26 PM, Antoine Latter aslat...@gmail.comwrote:

 On Mon, Aug 1, 2011 at 10:51 AM, Alex Clemmer
 clemmer.alexan...@gmail.com wrote:
  Hi Haskell people,
 
  I've been snooping through various mailing lists and the current Haskell
  implementation of regular expressions and I was wondering if there has
 been
  a discussion about implementing regex parsing with derivatives. If so, I
  haven't seen it. If not, I'd like to have a discussion about it -- if
 for no
  other reason than to decide whether I should implement it as a library,
 or
  (to attempt to implement it) as a core feature.
 
  For those of you who don't know, recent work by Might and Darais
 indicates
  that parsing CFGs can be done better (i.e., significantly faster) than
 more
  traditional approaches. Might's presenting at ICFP later in September
  about it.
 
  I guess the first thing I should ask is, which mailing list is actually
 the
  right place to field this inquiry. I considered dropping it in the main
  haskell list, but wasn't sure how people would respond.
 

 This is probably the right list to ask.

 I don't know much about the topic, a a quick Google search turned up:

 http://hackage.haskell.org/package/regex-pderiv

 which has the right keywords.

 More discussion on related (or not!) here:


 http://www.google.com/search?q=Regular+Expression+derivative+haskellie=utf-8oe=utf-8aq=trls=org.mozilla:en-US:unofficialclient=firefox-a

 Antoine

  --
  Alex
 
 
  ___
  Haskell-Cafe mailing list
  Haskell-Cafe@haskell.org
  http://www.haskell.org/mailman/listinfo/haskell-cafe
 
 




 --
 Alex


 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] multi-line regex

2009-11-04 Thread kenny lu
Michael,

Here is how I do it.


 module Main where

 import Text.Regex.Posix.ByteString
 import Data.Maybe
 import qualified Data.ByteString.Char8 as S

 text = S.pack 11\n abcd \n22
 p = S.pack 11\n(.*)\n22


 main :: IO ()
 main =
  do  { (Right pat) - compile compExtended execBlank p
  ; res - regexec pat text
  ; case res of
  { (Right (Just (_,_,_,m))) - putStrLn (show m)
  ; _- putStrLn not matched.
  }
  }

You may swap out ByteString with String,
PCRE should be similar, too.

Regards,
Kenny


On Wed, Nov 4, 2009 at 2:04 PM, Michael Mossey m...@alumni.caltech.eduwrote:



 kenny lu wrote:

 Hi Michael,

 Could you give an example of what patterns you want to write?

 Regards,
 Kenny


 Something like

 text = 11\n abcd \n22
 answer = text =~ 11.*22 :: various possibilities

 and have it find the entire string. The default behavior is to stop
 matching when it encounters a newline. There is mention in the
 Text.Regex.Posix docs of a flag to control this behavior, but it is not easy
 to figure out from the docs how to provide flags. The left-hand side of the
 =~ is a very complex type.





 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] multi-line regex

2009-11-03 Thread kenny lu
Hi Michael,

Could you give an example of what patterns you want to write?

Regards,
Kenny

On Wed, Nov 4, 2009 at 1:35 PM, Michael Mossey m...@alumni.caltech.eduwrote:

 I have some very simple regex-matching needs, and Text.Regex.Posix will
 work fine, EXCEPT I need to match multi-line patterns, and/or find all
 occurrences of text that may occur several times on different lines. So I
 need to turn on some kind of flag. Can someone show me how to do that? I
 have worked the examples in RWH so I basically know how to run the thing.
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Singapore Functional Programmer Group First Meetup

2009-10-30 Thread kenny lu
Sorry for the late notice.

We are organizing an informal meeting for the Functional Programmer Group in
Singapore on 2 Nov 2009.

Since this is the first meeting, the theme will be mainly 'meet and greet'
and
discuss our interests in functional programming languages.

We have participants coming from both the industry and the academia.

If you happen to be in Singapore and interested in sharing your experience
and interests, please contact Kenny at luzhuomi_AT_gmail_DOT_com.


Best Regards,
Kenny Zhuo Ming Lu
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Network.Socket error in MacOS 10.5?

2009-08-26 Thread kenny lu
Hi,

I encountered a problem with Network.Socket in MacOS 10.5
Here is the code that I am testing,

-
-
module Main where

import qualified Network.Socket as Socket

main :: IO ()
main =
do { (hostname, _) - Socket.getNameInfo [] True False
(Socket.SockAddrUnix localhost)
   -- (hostname, _) - Socket.getNameInfo [] True False
(Socket.SockAddrInet 9000  (127 + 0 * 256 + 0 * 256^2 + 1 * 256^3))
   ; putStrLn (show hostname)
   }


Running the above code yields the following error
ghc --make -O2 TestSocket.hs
[1 of 1] Compiling Main ( TestSocket.hs, TestSocket.o )
Linking TestSocket ...
$ ./TestSocket
TestSocket: getNameInfo: does not exist (ai_family not supported)

If I switch to SockAddrInet instead, the error is gone.

I am using GHC 6.10.3 and Network 2.2.1

Regards,
Kenny
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Network.Socket error in MacOS 10.5?

2009-08-26 Thread kenny lu
Thanks for the pointers. I will take a look.

Kenny

On Thu, Aug 27, 2009 at 2:20 AM, Ross Mellgren rmm-hask...@z.odi.ac wrote:

 I don't think getNameInfo should work for for AF_UNIX -- the name given to
 SockAddrUnix is a file path, there is no name resolution. From the man page
 for getnameinfo(3) on OS X:

 NAME
   getnameinfo -- socket address structure to hostname and service name

 ...

 DESCRIPTION

 ...

   The sockaddr structure sa should point to either a sockaddr_in or
 sockaddr_in6 structure
 (for IPv4 or IPv6 respectively) that is salen bytes long.



 Similarly, from the man page for getnameinfo on my linux box:

 ...

 The sa argument is a pointer to a generic socket address structure (of type
 sockaddr_in or sockaddr_in6) of size salen that holds the input IP address
 and port number.

 -Ross
 On Aug 26, 2009, at 2:07 PM, Johan Tibell wrote:

  On Wed, Aug 26, 2009 at 6:33 PM, kenny luhaskellm...@gmail.com wrote:

 Hi,

 I encountered a problem with Network.Socket in MacOS 10.5
 Here is the code that I am testing,

 -
 -
 module Main where

 import qualified Network.Socket as Socket

 main :: IO ()
 main =
do { (hostname, _) - Socket.getNameInfo [] True False
 (Socket.SockAddrUnix localhost)
   -- (hostname, _) - Socket.getNameInfo [] True False
 (Socket.SockAddrInet 9000  (127 + 0 * 256 + 0 * 256^2 + 1 * 256^3))
   ; putStrLn (show hostname)
   }


 Running the above code yields the following error
 ghc --make -O2 TestSocket.hs
 [1 of 1] Compiling Main ( TestSocket.hs, TestSocket.o )
 Linking TestSocket ...
 $ ./TestSocket
 TestSocket: getNameInfo: does not exist (ai_family not supported)

 If I switch to SockAddrInet instead, the error is gone.

 I am using GHC 6.10.3 and Network 2.2.1


 Is SockAddrUnix supposed to work on Mac OS X? Could you test it by
 e.g. writing a small C program that uses it?

 -- Johan
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] A problem with bytestring 0.9.1.4 hGetBuf: invalid argument

2009-08-04 Thread kenny lu
Hi all,

I've recently came across a problem when processing a large text file
(around 2G in size).

I wrote a Haskell program to count the number of lines in the file.


module Main where

import System
import qualified Data.ByteString.Char8 as S
-- import Prelude as S

main :: IO ()
main = do { args - getArgs
  ; case args of
  { [ filename ] -
do { content - S.readFile filename
   ; let lns = S.lines content
   ; putStrLn (show $ length lns)
   }
  ; _ - error Usage : Wc file
  }
  }


I get this error, if I use the ByteString module,
./Wc a.out
Wc: {handle: a.out}: hGetBuf: invalid argument (illegal buffer size
(-1909953139))
Otherwise, it returns me the result.

Another observation is that if I reduce the size of the file, the ByteString
version works too.

Is it a known limitation?

Regards,
Kenny


A generator program that generate large file. (Warning, it is very slow, I
don't know how to speed it up)

-- generate a file

module Main where

import System
import qualified Data.ByteString.Char8 as S


l :: S.ByteString
l = S.pack All work, no fun, make Kenny a dull boy. 

main :: IO ()
main = do { args - getArgs
  ; case args of
  { [ n, fn ] - do { let i = read n
; mapM_ (\s - S.appendFile fn s) (take i $
repeat l)
}
  ; _ - return ()
  }
  }
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A problem with bytestring 0.9.1.4 hGetBuf: invalid argument

2009-08-04 Thread kenny lu
Oh right. Thanks for pointing out. :)

On Wed, Aug 5, 2009 at 10:06 AM, Don Stewart d...@galois.com wrote:

 haskellmail:
  Hi all,
 
  I've recently came across a problem when processing a large text file
 (around
  2G in size).
 
  I wrote a Haskell program to count the number of lines in the file.
 
 
  module Main where
 
  import System
  import qualified Data.ByteString.Char8 as S
  -- import Prelude as S
 
  main :: IO ()
  main = do { args - getArgs
; case args of
{ [ filename ] -
  do { content - S.readFile filename
 ; let lns = S.lines content
 ; putStrLn (show $ length lns)
 }
; _ - error Usage : Wc file
}
}
 
 
  I get this error, if I use the ByteString module,
  ./Wc a.out
  Wc: {handle: a.out}: hGetBuf: invalid argument (illegal buffer size
  (-1909953139))
  Otherwise, it returns me the result.
 
  Another observation is that if I reduce the size of the file, the
 ByteString
  version works too.
 
  Is it a known limitation?
 

 Yes, you need to use Data.ByteString.Lazy.Char8 to process files larger
 than this on a 32 bit machine (you'll have more space on a 64 bit
 machine).

 -- Don

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] exporting Data.ByteString functions via FFI

2009-06-12 Thread kenny lu
Hi,

I was trying to write a FFI wrapper for my Haskell program which manipulates

ByteString. But I am unable to compile/link it.


Here is the toy program.

{-# LANGUAGE ForeignFunctionInterface #-}

module B where

import Foreign.C.Types
import Foreign.C.String
import qualified Data.ByteString as BS

rev :: BS.ByteString - BS.ByteString
rev bstr = BS.reverse bstr

rev_hs :: CString - IO CString
rev_hs cstr =
do { bstr - BS.packCString cstr
   ; let bstr' = rev bstr
   ; cstr' - newCString (show bstr')
   ; return cstr'
   }

foreign export ccall rev_hs :: CString - IO CString


And here is the C counter-part.

#include B_stub.h
#include stdio.h

int main(int argc, char *argv[]) {
  char *str;
  hs_init(argc, argv);

  str = rev_hs(it works.);
  printf(Rev: %s\n, str);

  hs_exit();
  return 0;
}

Compiling B.hs alone seems fine, but errors popped up when I was trying to
compile/link it with C.

$ ghc -c -O B.hs

$ ghc -optc-O test_b.c B.o B_stub.o -o test_b
Undefined symbols:
  ___stginit_bytestringzm0zi9zi1zi4_DataziByteString_, referenced from:
  ___stginit_Lib_ in B.o
  _bytestringzm0zi9zi1zi4_DataziByteString_zdwreverse_info, referenced
from:
  _s19w_info in B.o
  _bytestringzm0zi9zi1zi4_DataziByteStringziInternal_zdwshowsPrec_info,
referenced from:
  _s19v_info in B.o
  _bytestringzm0zi9zi1zi4_DataziByteStringziInternal_zdwshowsPrec_closure,
referenced from:
  _Lib_zdwa_srt in B.o
  _bytestringzm0zi9zi1zi4_DataziByteString_zdwa4_info, referenced from:
  _Lib_zdwa_info in B.o
  _bytestringzm0zi9zi1zi4_DataziByteString_reverse_info, referenced from:
  _Lib_rev_info in B.o
ld: symbol(s) not found
collect2: ld returned 1 exit status


If I replace ByteString with the ordinary String, the above programs can be
compiled and linked.

Can someone tell me what I did wrong here?

-Kenny
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] exporting Data.ByteString functions via FFI

2009-06-12 Thread kenny lu
It works indeed. Thanks.

-Kenny

On Fri, Jun 12, 2009 at 3:50 PM, Jochem Berndsen joc...@functor.nl wrote:

 kenny lu wrote:
  Hi,
 
  I was trying to write a FFI wrapper for my Haskell program which
 manipulates
 
  ByteString. But I am unable to compile/link it.
 
 
  Here is the toy program.
 
  {-# LANGUAGE ForeignFunctionInterface #-}
 
  module B where
 
  import Foreign.C.Types
  import Foreign.C.String
  import qualified Data.ByteString as BS
 
  rev :: BS.ByteString - BS.ByteString
  rev bstr = BS.reverse bstr
 
  rev_hs :: CString - IO CString
  rev_hs cstr =
  do { bstr - BS.packCString cstr
 ; let bstr' = rev bstr
 ; cstr' - newCString (show bstr')
 ; return cstr'
 }
 
  foreign export ccall rev_hs :: CString - IO CString
 
 
  And here is the C counter-part.
 
  #include B_stub.h
  #include stdio.h
 
  int main(int argc, char *argv[]) {
char *str;
hs_init(argc, argv);
 
str = rev_hs(it works.);
printf(Rev: %s\n, str);
 
hs_exit();
return 0;
  }
 
  Compiling B.hs alone seems fine, but errors popped up when I was trying
 to
  compile/link it with C.
 
  $ ghc -c -O B.hs
 
  $ ghc -optc-O test_b.c B.o B_stub.o -o test_b
  Undefined symbols:
___stginit_bytestringzm0zi9zi1zi4_DataziByteString_, referenced from:
___stginit_Lib_ in B.o
_bytestringzm0zi9zi1zi4_DataziByteString_zdwreverse_info, referenced
  from:
_s19w_info in B.o
_bytestringzm0zi9zi1zi4_DataziByteStringziInternal_zdwshowsPrec_info,
  referenced from:
_s19v_info in B.o
 
 _bytestringzm0zi9zi1zi4_DataziByteStringziInternal_zdwshowsPrec_closure,
  referenced from:
_Lib_zdwa_srt in B.o
_bytestringzm0zi9zi1zi4_DataziByteString_zdwa4_info, referenced from:
_Lib_zdwa_info in B.o
_bytestringzm0zi9zi1zi4_DataziByteString_reverse_info, referenced
 from:
_Lib_rev_info in B.o
  ld: symbol(s) not found
  collect2: ld returned 1 exit status
 
 
  If I replace ByteString with the ordinary String, the above programs can
 be
  compiled and linked.
 
  Can someone tell me what I did wrong here?

 Add -package bytestring to the ghc command line options. I believe that
 adding --make also may work.

 Regards,
 --
 Jochem Berndsen | joc...@functor.nl
 GPG: 0xE6FABFAB

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] implementing python-style dictionary in Haskell

2008-11-18 Thread kenny lu
Dear all,

I am trying to implement the python-style dictionary in Haskell.

Python dictionary is a data structure that maps one key to one value.
For instance, a python dictionary
d = {'a':1, 'b':2 }
maps key 'a' to 1, 'b' to 2.
Python dictionary allows for update. e.g. the statement
d['a'] = 3
changes the value pointed by 'a' from 1 to 3.

Internally, python dictionary is implemented using hash table.

My first attempt is to use Data.HashTable. However it was immediately
abandoned, as I realize the memory usage is unreasonably huge.


== SECOND ATTEMPT ==
My second attempt is to use Data.Map


{-# OPTIONS_GHC -fglasgow-exts #-}
module Main where

import qualified Data.HashTable as HT
import qualified Data.IntMap as IM
import qualified Data.Map as DM
import qualified Data.ByteString.Char8 as S
import Data.Char


-- the Dict type class
class Dict d k v | d - k v where
empty :: d
insert :: k - v - d - d
lookup :: k - d - Maybe v
update :: k - v - d - d

-- Let's use string as key
type Key = String

-- insert key-value pairs into a dictionary
fromList :: Dict d k a = [(k,a)] - d
fromList l =
foldl (\d (key,val) - insert key val d) empty l


instance Dict (DM.Map S.ByteString a) Key a where
empty = DM.empty
insert key val dm =
let packed_key = S.pack key
in DM.insert packed_key val dm
lookup key dm =
let packed_key = S.pack key
in DM.lookup packed_key dm
update key val dm =
let packed_key = S.pack key
in DM.update (\x - Just val) packed_key dm


Which kinda works, however since Map is implemented using a balanced tree,
therefore,
when as the dictionary grows, it takes a long time to insert new key-value
pair.


== THIRD ATTEMPT ==
My third attempt is to use Data.IntMap

-- an implementation of Dict using IntMap
instance Dict (IM.IntMap a) Key a where
empty = IM.empty
insert key val im =
let int_key = fromIntegral (HT.hashString key)
in IM.insert int_key val im
lookup key im =
let int_key = fromIntegral (HT.hashString key)
in IM.lookup int_key im
update key val im =
let int_key = fromIntegral (HT.hashString key)
in IM.update (\x - Just val) int_key im


This implementation is faster than the Map approach, however this
implementation
can't handle collision well, two keys which are hashed into the same integer
will overwrite each other.


== FOURTH ATTEMPT ==

My fourth implementation is to use Trie. The idea is to split a string (a
key) into
a list of 4-character chunks. Each chunk can be mapped into a 32-bit integer
without collision.
We then insert the value with this list of chunks into the Trie.


-- an implementation of Dict using Trie
instance Dict (Trie a) Key a where
empty = emptyTrie
insert key val trie =
let key_chain = chain key
in insertTrie key_chain val trie
lookup key trie =
let key_chain = chain key
in lookupTrie key_chain trie
update key val trie =
let key_chain = chain key
in updateTrie key_chain val trie


-- an auxillary function that splits string into small pieces,
-- 4 characters per piece, 4 chars = 32 bit
chain :: Key - [Key]
chain k | length k  4 = let (k',ks) = splitAt 4 k
 in (k':chain ks)
| otherwise= [k]

-- a collision-free hash function which turns four chars into Int32
safehash :: [Char] - Int
safehash cs | length cs  4 = error safehash failed.
| otherwise =
sum [ (ord c)*(256^i)   | (c,i) - zip cs [0..3] ]


-- a trie datatype
data Trie a = Trie [a] (IM.IntMap (Trie a))

-- the empty trie
emptyTrie = Trie [] (IM.empty)

-- insert value into the trie
insertTrie :: [String] - a - Trie a - Trie a
insertTrie [] i (Trie is maps) = Trie (i:is) maps
insertTrie (word:words) i (Trie is maps) =
let key = safehash word
in case IM.lookup key maps of
 { Just trie - let trie' = insertTrie words i trie
maps' = IM.update (\x - Just trie') key maps
in Trie is maps'
 ; Nothing - let trie = emptyTrie
  trie' = insertTrie words i trie
  maps' = IM.insert key trie' maps
  in Trie is maps'
 }

-- lookup value from the trie
lookupTrie :: [String] - Trie a - Maybe a
lookupTrie [] (Trie vs _) =
case vs of
  [] - Nothing
  (x:_) - Just x
lookupTrie (word:words) (Trie is maps) =
let key = safehash word
in case IM.lookup key maps of
   Just trie - lookupTrie words trie
   Nothing   - Nothing

-- update the trie with the given value.
updateTrie :: [String] - a - Trie a - Trie a
-- we only update the first value and leave the rest unchanged.
updateTrie [] y (Trie (x:xs) maps) = Trie (y:xs) maps
updateTrie (word:words) v  (Trie is maps) =
let key = safehash word
in case IM.lookup key maps of
   Just trie - let trie' = updateTrie words v trie
maps'  = IM.update (\x - Just trie') key maps
in Trie is maps'
   Nothing   - Trie is maps



== BENCH MARK ==

I have a 

Re: [Haskell-cafe] implementing python-style dictionary in Haskell

2008-11-18 Thread kenny lu
Hi Bulat,



 1. why you think that your code should be faster? pythob
 implementation is probably written in C ince it's one of its core data
 structures


I am not hoping that my code should be faster, but at least not as slow as
what it gets.
Basically I am looking for an implementation which is close to the one in
python.



 2. you can solve IntMap problem by storing list of values with the
 same hash in tree's nodes


Yeah, that would probably speed up the building time of the dictionary.
However, storing the list of values in the tree nodes requires storing their
original keys,
so that it maintains the one-key-to-one-value semantics. This would takes up
more space
compared to the Trie approach.

Regards,
Kenny
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] implementing python-style dictionary in Haskell

2008-11-18 Thread kenny lu
Dear Don,
I am using GHC 6.8.1

Regards,
Kenny

On Tue, Nov 18, 2008 at 11:33 PM, Don Stewart [EMAIL PROTECTED] wrote:

 Which version of GHC and which version of the Data.ByteString library?
 There was an inlining bug related to Data.Map /Data.IntMap performance
 fixed between the 6.8.x release and the current bytestring release.

 In testing, Data.Map with strict bytestring keys matched the python (C
 implemented) dictionary, after I fixed the inlining for word lookups.

 You'll need to be using bytestring 0.9.1.x though.

 -- Don

 haskellmail:
 Dear all,
 
 I am trying to implement the python-style dictionary in Haskell.
 
 Python dictionary is a data structure that maps one key to one value.
 For instance, a python dictionary
 d = {'a':1, 'b':2 }
 maps key 'a' to 1, 'b' to 2.
 Python dictionary allows for update. e.g. the statement
 d['a'] = 3
 changes the value pointed by 'a' from 1 to 3.
 
 Internally, python dictionary is implemented using hash table.
 
 My first attempt is to use Data.HashTable. However it was immediately
 abandoned, as I realize the memory usage is unreasonably huge.
 
 == SECOND ATTEMPT ==
 My second attempt is to use Data.Map
 
 {-# OPTIONS_GHC -fglasgow-exts #-}
 module Main where
 
 import qualified Data.HashTable as HT
 import qualified Data.IntMap as IM
 import qualified Data.Map as DM
 import qualified Data.ByteString.Char8 as S
 import Data.Char
 
 -- the Dict type class
 class Dict d k v | d - k v where
 empty :: d
 insert :: k - v - d - d
 lookup :: k - d - Maybe v
 update :: k - v - d - d
 
 -- Let's use string as key
 type Key = String
 
 -- insert key-value pairs into a dictionary
 fromList :: Dict d k a = [(k,a)] - d
 fromList l =
 foldl (\d (key,val) - insert key val d) empty l
 
 instance Dict (DM.Map S.ByteString a) Key a where
 empty = DM.empty
 insert key val dm =
 let packed_key = S.pack key
 in DM.insert packed_key val dm
 lookup key dm =
 let packed_key = S.pack key
 in DM.lookup packed_key dm
 update key val dm =
 let packed_key = S.pack key
 in DM.update (\x - Just val) packed_key dm
 
 Which kinda works, however since Map is implemented using a balanced
 tree,
 therefore,
 when as the dictionary grows, it takes a long time to insert new
 key-value
 pair.
 
 == THIRD ATTEMPT ==
 My third attempt is to use Data.IntMap
 
 -- an implementation of Dict using IntMap
 instance Dict (IM.IntMap a) Key a where
 empty = IM.empty
 insert key val im =
 let int_key = fromIntegral (HT.hashString key)
 in IM.insert int_key val im
 lookup key im =
 let int_key = fromIntegral (HT.hashString key)
 in IM.lookup int_key im
 update key val im =
 let int_key = fromIntegral (HT.hashString key)
 in IM.update (\x - Just val) int_key im
 
 This implementation is faster than the Map approach, however this
 implementation
 can't handle collision well, two keys which are hashed into the same
 integer will overwrite each other.
 
 == FOURTH ATTEMPT ==
 
 My fourth implementation is to use Trie. The idea is to split a string
 (a
 key) into
 a list of 4-character chunks. Each chunk can be mapped into a 32-bit
 integer without collision.
 We then insert the value with this list of chunks into the Trie.
 
 -- an implementation of Dict using Trie
 instance Dict (Trie a) Key a where
 empty = emptyTrie
 insert key val trie =
 let key_chain = chain key
 in insertTrie key_chain val trie
 lookup key trie =
 let key_chain = chain key
 in lookupTrie key_chain trie
 update key val trie =
 let key_chain = chain key
 in updateTrie key_chain val trie
 
 -- an auxillary function that splits string into small pieces,
 -- 4 characters per piece, 4 chars = 32 bit
 chain :: Key - [Key]
 chain k | length k  4 = let (k',ks) = splitAt 4 k
  in (k':chain ks)
 | otherwise= [k]
 
 -- a collision-free hash function which turns four chars into Int32
 safehash :: [Char] - Int
 safehash cs | length cs  4 = error safehash failed.
 | otherwise =
 sum [ (ord c)*(256^i)   | (c,i) - zip cs [0..3] ]
 
 -- a trie datatype
 data Trie a = Trie [a] (IM.IntMap (Trie a))
 
 -- the empty trie
 emptyTrie = Trie [] (IM.empty)
 
 -- insert value into the trie
 insertTrie :: [String] - a - Trie a - Trie a
 insertTrie [] i (Trie is maps) = Trie (i:is) maps
 insertTrie (word:words) i (Trie is maps) =
 let key = safehash word
 in case IM.lookup key maps of
  { Just trie - let trie' = insertTrie words i trie