Re: [Haskell-cafe] Optimizing cellular automata evaluation (round 2)

2007-11-30 Thread Jon Harrop
On Friday 30 November 2007 00:31, Justin Bailey wrote:
 I represent the automata as an array of integers, where each bit
 represents a cell.

Mathematica uses a single arbitrary-precision integer to represent each 
generation of a 1D automaton. The rules to derive the next generation are 
compiled into arithmetic operations on the integer. The offloads all such 
work onto your big number library and, with GMP, will be as fast in Haskell 
as most other languages.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimizing cellular automata evaluation (round 2)

2007-11-30 Thread Luke Palmer
On Nov 30, 2007 6:03 PM, Justin Bailey [EMAIL PROTECTED] wrote:
 On Nov 29, 2007 9:11 PM, Jon Harrop [EMAIL PROTECTED] wrote:
  Mathematica uses a single arbitrary-precision integer to represent each
  generation of a 1D automaton. The rules to derive the next generation are
  compiled into arithmetic operations on the integer. The offloads all such
  work onto your big number library and, with GMP, will be as fast in Haskell
  as most other languages.

 Does GHC already use the GMP library for Integer? It looks that way
 but I'm not positive. That'd be ironic, if the higher-level Integer
 representation is faster than a low-level bitwise one ... Still, I
 suspect accessing individual bits might kill me if I'm not able to
 move most of the calculation into a call to the library.

 Do you mind elaborating on how rules are compiled into 'arithmetic'
 operations or providing a link?

Integer is an instance of Bits, so you can just use bitshifts and
bitops to do any 1-D rule.  There might be a more efficient way.

For rule 110:

 111 110 101 100 011 010 001 000
  0   1   1   0   1   1   1   0

Algebraically, that's not (a  b  c || a  not b  not c || not a
 not b  not c)

So just translate that to:

rule z =
 complement ((a .. b .. c) .|. (a .. b' .. c') .|. (a' .. b' .. c'))
  where
  a = z `shift` (-1)
  b = z
  c = z `shift` 1
  a' = complement a
  b' = complement b
  c' = complement c

Modulo some algebra to make it even better, but this should be pretty
darn fast...

Luke
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimizing cellular automata evaluation (round 2)

2007-11-30 Thread Justin Bailey
On Nov 29, 2007 9:11 PM, Jon Harrop [EMAIL PROTECTED] wrote:
 Mathematica uses a single arbitrary-precision integer to represent each
 generation of a 1D automaton. The rules to derive the next generation are
 compiled into arithmetic operations on the integer. The offloads all such
 work onto your big number library and, with GMP, will be as fast in Haskell
 as most other languages.

Does GHC already use the GMP library for Integer? It looks that way
but I'm not positive. That'd be ironic, if the higher-level Integer
representation is faster than a low-level bitwise one ... Still, I
suspect accessing individual bits might kill me if I'm not able to
move most of the calculation into a call to the library.

Do you mind elaborating on how rules are compiled into 'arithmetic'
operations or providing a link?

Justin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimizing cellular automata evaluation (round 2)

2007-11-30 Thread Justin Bailey
On Nov 29, 2007 4:45 PM, Felipe Lessa [EMAIL PROTECTED] wrote:
 Why don't you use an UArray of Bools? They're implemented as bit
 arrays internally, AFAIK (e.g. see
 http://www.haskell.org/haskellwiki/Shootout/Nsieve ). And then you
 would get rid of a lot of shifts and masks in your code -- clearer and
 faster, the Haskell Way (TM).

fill, fillE and fillS are all looking at a integer value, and not
doing any array access. I'm skeptical that an array of bools, even if
its internally a mask on an int, could be faster.

 Well, I guess it would be faster, but I can't really tell. Could you
 provide a main function implementing an use case for benchmark?

Sure, here is one

 http://hpaste.org/4151#a1

Compile with:

  ghc -O2 --make Test.hs

It runs a random rule for 1000 steps.

Justin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimizing cellular automata evaluation (round 2)

2007-11-30 Thread Laurent Deniau

Justin Bailey wrote:

On Nov 29, 2007 9:11 PM, Jon Harrop [EMAIL PROTECTED] wrote:

Mathematica uses a single arbitrary-precision integer to represent each
generation of a 1D automaton. The rules to derive the next generation are
compiled into arithmetic operations on the integer. The offloads all such
work onto your big number library and, with GMP, will be as fast in Haskell
as most other languages.


Does GHC already use the GMP library for Integer? It looks that way
but I'm not positive. That'd be ironic, if the higher-level Integer
representation is faster than a low-level bitwise one ... Still, I
suspect accessing individual bits might kill me if I'm not able to
move most of the calculation into a call to the library.

Do you mind elaborating on how rules are compiled into 'arithmetic'
operations or providing a link?


This would mean that you are able to extract a global rule from your 
local rules (plural if you include topological dependencies) which is 
not possible in the general case. Global behavior was characterized in 
term of information propagation (classification), which was far not 
enough to deduce a global rule. But I haven't look at the domain for a 
decade now ;-)


a+, ld.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Optimizing cellular automata evaluation (round 2)

2007-11-29 Thread Justin Bailey
I posted awhile back asking for help improving my cellular automata
program. I am competing with a C program which evolves CAs using a
fairly simple genetic algorithm. The algorithm involves evaluating 100
rules on 100 CAs, 100 times. In C this takes about 1 second. In my
Haskell version, it takes about 20 minutes. That's an improvement from
my first attempts, which took 7 hours.

All the time is spent in the function that evaluates one iteration of
a given CA. It examines each cell and its neighbors and, based on the
values found, updates the cell to be white or black. Here is the code,
followed by an explanation of what it's doing. I'm looking for any
suggestions on making this faster. Prettier helps too but not at the
expense of speed.

  http://hpaste.org/4151#a0

I represent the automata as an array of integers, where each bit
represents a cell. Automata 'wrap' around, so marching through the
array and determing the value of all neighbors is tricky around the
edges. My goal was to minimize conditionals as much as possible and to
avoid any modulo arithmetic. I wanted to use straightforward array
access.

My implemention uses three functions: fill, fillS and fillE. Each
evalutaion of fill will determine the new values for cells represented
by the current array element. fill handles two conditions - when the
array element being examined is the last element or when it is not. It
calls fillS with appropriate arguments, the results of which become
the value for the integer in the current array position.

fillS then handles two conditions - when the neighborhood of the bit
being examined crosses to the next array element and when it does
not. When the neighborhood crosses to the next array element, fillE
is called. fillE finishes determining the values for the remaining
cells in the integer and returns, ending the current evaluation.

There are some other details in the code (for example firstRule) which
could be made faster and/or cleaner, but their contribution is far
outweighed by the fill function.

If there is one thing I'm happy about, its the memory performance of
this code. It runs in a constant 7 - 10MB of memory, even when I'm
doing 100 generations with 100 rules and 100 CAs. Each CA is evaluated
for 250 steps, which means every generation does 2.5M steps. That kind
of memory usage is pretty sweet.

Thanks in advance to those willing to take the time to look at this code!

Justin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimizing cellular automata evaluation (round 2)

2007-11-29 Thread Felipe Lessa
On Nov 29, 2007 10:31 PM, Justin Bailey [EMAIL PROTECTED] wrote:
 I represent the automata as an array of integers, where each bit
 represents a cell.

Why don't you use an UArray of Bools? They're implemented as bit
arrays internally, AFAIK (e.g. see
http://www.haskell.org/haskellwiki/Shootout/Nsieve ). And then you
would get rid of a lot of shifts and masks in your code -- clearer and
faster, the Haskell Way (TM).

Well, I guess it would be faster, but I can't really tell. Could you
provide a main function implementing an use case for benchmark?

Good luck.

-- 
Felipe.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe