Re: [Haskell-cafe] fundata1 -- Karmic Social Capital Benchmark and Shootout

2010-10-30 Thread Johan Tibell
On Fri, Oct 29, 2010 at 1:17 PM, Daryoush Mehrtash dmehrt...@gmail.com wrote:
 In the lessons you say:

 Haskell proved too slow with String Map, so we ended up interning strings
 and working with an IntMap and a dictionary to disintern back to strings as
 a last step.  Daniel Fisher was instrumental in bringing Haskell up to speed
 with OCaml and then beating it.  Don Stewart provided awesome leadership and
 amazing modification of Haskell's core data structured before your very
 eyes.

A quick tip for anyone who needs to work on large maps with string
keys in the future: try Milan Straka's hashmap package. String
comparison is expensive and using a size balanced tree causes O(log n)
of them. The HashMap data type uses an IntMap internally and typically
(in the absence of collisions) needs to traverse the string only once,
to compute the hash, and can then perform O(log n) cheap Int
comparisons.

I'm working on a more efficient hash function for ByteString and Text:
an implementation of MurmurHash. The HashMap data type and the need
hash function together should give us good performance for maps with
string keys.

Johan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] fundata1 -- Karmic Social Capital Benchmark and Shootout

2010-10-29 Thread Daryoush Mehrtash
In the lessons you say:

Haskell proved too slow with String Map, so we ended up interning strings
 and working with an IntMap and a dictionary to disintern back to strings as
 a last step.  Daniel Fisher was instrumental in bringing Haskell up to speed
 with OCaml and then beating it.  Don Stewart provided awesome leadership and
 amazing modification of Haskell's core data structured before your very
 eyes.



Can you elaborate on this?

and What do you mean by: modification of Haskell's core data structured  ?

Daryoush

On Thu, Oct 28, 2010 at 5:53 PM, Alexy Khrabrov alexy.khrab...@gmail.comwrote:

 I am happy to announce fundata1 -- the largest-ever program per RAM
 allocation in Haskell, originally implemented in Clojure and then OCaml and
 Haskell for social network modeling.

 http://github.com/alexy/fundata1

 It has now become the first large-scale social networking benchmark with a
 real dynamic social graph built from the actual Twitter gardenhose, with the
 data OK'd by Twitter and supplied along with the benchmark.

 I wrote three reference implementations, all on github as well.  Clojure
 and OCaml are quite basic, while Haskell community had a chance to optimize
 its data structures and in fact fix a GC integer overflow while working on
 it.  You're welcome to fork and improve all of these implementations, and
 supply others!

 There's a Google Group,

 http://groups.google.com/group/fundata/

 to discuss the shootout.  There's also a blog about it and other functional
 things at

 http://functional.tv/

 Let the fun begin!

 -- Alexy Khrabrov
 firstname.lastnameATgmaildotcom

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe




-- 
Daryoush

Weblog:  http://perlustration.blogspot.com/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] fundata1 -- Karmic Social Capital Benchmark and Shootout

2010-10-29 Thread Don Stewart
dmehrtash:
 In the lessons you say:
 
 
 Haskell proved too slow with String Map, so we ended up interning strings
 and working with an IntMap and a dictionary to disintern back to strings 
 as
 a last step.  Daniel Fisher was instrumental in bringing Haskell up to
 speed with OCaml and then beating it.  Don Stewart provided awesome
 leadership and amazing modification of Haskell's core data structured
 before your very eyes.
 
 
 
 Can you elaborate on this?
 
 and What do you mean by: modification of Haskell's core data structured  ?

I think he means some of the stuff Daniel and I did with specializing
data structures (like IntMap) to their monomorphic key / elem types.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] fundata1 -- Karmic Social Capital Benchmark and Shootout

2010-10-28 Thread Alexy Khrabrov
I am happy to announce fundata1 -- the largest-ever program per RAM allocation 
in Haskell, originally implemented in Clojure and then OCaml and Haskell for 
social network modeling.  

http://github.com/alexy/fundata1

It has now become the first large-scale social networking benchmark with a real 
dynamic social graph built from the actual Twitter gardenhose, with the data 
OK'd by Twitter and supplied along with the benchmark.

I wrote three reference implementations, all on github as well.  Clojure and 
OCaml are quite basic, while Haskell community had a chance to optimize its 
data structures and in fact fix a GC integer overflow while working on it.  
You're welcome to fork and improve all of these implementations, and supply 
others!

There's a Google Group, 

http://groups.google.com/group/fundata/

to discuss the shootout.  There's also a blog about it and other functional 
things at

http://functional.tv/

Let the fun begin!

-- Alexy Khrabrov
firstname.lastnameATgmaildotcom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] fundata1 -- Karmic Social Capital Benchmark and Shootout

2010-10-28 Thread Johan Tibell
Great stuff! I have an improvements to HashMaps that I'm working on
that will hopefully work well here.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe