Thanks again to all for your great advice. On Devon's suggestion, I made a stochastic version. For the empty game it calculates all relevant outcomes for 800000 random shuffles of 13 cards in 117 seconds. I then got it working with four simultaneous processes (my processor has four cores), each doing 200000 random shuffles, and the whole thing finishes in 30 seconds, roughly 3.9 times faster. I didn't use any kind of locking or mutex. I made a shared jmf with an array of boxes, one for each process to put its result in. I then had a main process launch the worker processes and check every second whether every box had been filled. I don't know how safe this is, but it seems to work. No two processes ever write to the same place in the array, so I thought it would be okay.
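For anyone who wants to see the shape of that scheme, here is a minimal sketch of the same pattern written in Python rather than J. It is not my actual code. The statistic it counts (shuffles that leave no card in its starting position) is only a hypothetical stand-in for the real game outcomes, and names such as `run_parallel`, `worker` and `EMPTY` are made up for illustration. It also assumes a POSIX system where the "fork" start method is available. Each worker writes only to its own slot of a shared array, and the main process polls until every slot has been filled.

```python
import multiprocessing as mp
import random
import time

EMPTY = -1  # sentinel meaning "this worker has not reported yet"


def count_derangements(n_shuffles, n_cards, seed):
    """Hypothetical stand-in statistic: shuffles leaving no card in place."""
    rng = random.Random(seed)
    deck = list(range(n_cards))
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(deck)
        if all(card != pos for pos, card in enumerate(deck)):
            hits += 1
    return hits


def worker(slot, results, n_shuffles, n_cards, seed):
    # Each worker writes only to its own slot, so no two writers collide.
    results[slot] = count_derangements(n_shuffles, n_cards, seed)


def run_parallel(n_workers=4, n_shuffles=2000, n_cards=13, poll=0.01):
    ctx = mp.get_context("fork")  # assumption: POSIX system with fork
    # Unlocked shared array of 64-bit integers, one slot per worker.
    results = ctx.Array("q", [EMPTY] * n_workers, lock=False)
    procs = [
        ctx.Process(target=worker,
                    args=(i, results, n_shuffles, n_cards, 1000 + i))
        for i in range(n_workers)
    ]
    for p in procs:
        p.start()
    # Main process polls until every slot has been filled.
    while any(v == EMPTY for v in results):
        if any(p.exitcode not in (None, 0) for p in procs):
            raise RuntimeError("a worker died before reporting")
        time.sleep(poll)
    for p in procs:
        p.join()
    return list(results)


if __name__ == "__main__":
    print(run_parallel())
```

The sketch only shows the one-writer-per-slot idea. It does not settle whether the same approach is safe with a jmf in J, which is still my open question.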
I have a copy of Mathematica 8, which has a package called CUDALink. It makes it easy to run computations on an NVIDIA graphics card. It comes with a lot of built-in operations, such as dot products on arrays and folding and mapping with simple arithmetic, and it also lets you write your own C libraries and run them. What's more, it comes with some good random number generation facilities that run on the graphics card. It would be nice if J provided something I could run on my graphics card, so that I would not have to port things to C. I don't know much about this subject, or about C in general. Has anyone had experience running J on graphics cards, or getting J to produce code that I could use for this purpose?
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
