Scrabble seems trivially parallelizable (one thread per sim iteration - if you want to sim 1000 iterations, you can make perfect use of 1000 cores). I don't understand your statement regarding dependent memory accesses (a GPU is really good at scattered reads, even dependent ones, btw). At least in the case of generating 10 candidate moves and simming them, I don't see how any of them are interdependent. Am I misunderstanding how Quackle works?
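To make the "trivially parallelizable" claim concrete, here is a minimal sketch (in Python, not Quackle's actual C++ API - `simulate_iteration` is a hypothetical stand-in for one Monte Carlo playout): each iteration gets its own seed and shares no state, so the iterations can be scheduled in any order across any number of workers.

```python
# Illustrative sketch, NOT Quackle code: each sim iteration is seeded
# independently, so iterations share no state and can be farmed out
# one-per-core (or one-per-GPU-thread).
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_iteration(seed):
    """Stand-in for one Monte Carlo playout: deterministic given its seed."""
    rng = random.Random(seed)          # private RNG: no shared state
    return sum(rng.randint(1, 10) for _ in range(7))  # toy "score"

def simulate(iterations, workers=4):
    # Because iterations are independent, any scheduling gives the same
    # results as a serial loop -- the hallmark of a trivially parallel
    # workload.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_iteration, range(iterations)))

serial = [simulate_iteration(s) for s in range(1000)]
parallel = simulate(1000)
assert serial == parallel  # identical results regardless of parallelism
```

The same independence argument is what would let each GPU thread run its own playout; the open question raised below is whether the dictionary lookups inside each playout are fast enough there.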
________________________________
From: David Jones <[email protected]>
To: [email protected]
Sent: Thursday, April 9, 2009 11:36:25 AM
Subject: Re: [quackle] GPU acceleration for Quackle.

Scrabble is memory-intensive due to dictionary lookup. Furthermore, each memory access is dependent on the previous access, so parallelization isn't really possible. Far better is to code up a DAWG version of the algorithm such that the move generator and its working set can fit entirely within the L2 data cache of your processor (could be as small as 512KB). Then run this on a multi-core processor (both AMD and Intel are shipping quad-core processors) with one thread per processor. If you can keep the cores from overwhelming the L3/DRAM controller (DAWG...) then you may actually get some performance.

On Wed, Apr 8, 2009 at 11:11 AM, John O'Laughlin <[email protected]> wrote:

Zax,

General purpose computing on GPUs is interesting, but even if it were possible to apply it to Quackle it sounds painful. To the extent that I'm working on Scrabble programming these days, I'm more interested in getting the right answer slowly than the wrong answer quickly. I rarely play against the simming computer players, and I prefer to restrict my analysis to tougher positions where I wouldn't mind running a smarter version of Quackle overnight to get something useful.

John

On Wed, Apr 8, 2009 at 10:22 AM, <[email protected]> wrote:
>
> Anyone familiar with NVIDIA Cuda? Seems like a crazy way to accelerate
> simulation.
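The dependent-access pattern David describes can be seen in a word lookup itself. Below is a minimal sketch of a trie-style dictionary walk (the node layout is illustrative, not Quackle's actual DAWG format): the address of each node is only known after the previous node has been read, so a single lookup is a serial pointer chase that the hardware cannot prefetch ahead of.

```python
# Illustrative trie lookup, NOT Quackle's DAWG: each step's load depends
# on the previous step's result, so one lookup is inherently serial --
# the latency argument in the email above.
def make_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})   # next node depends on this one
        node["$"] = True                     # end-of-word marker
    return root

def contains(trie, word):
    node = trie
    for ch in word:
        node = node.get(ch)                  # dependent load: can't prefetch
        if node is None:
            return False
    return "$" in node

trie = make_trie(["QI", "QAT", "QUACK"])
assert contains(trie, "QAT")
assert not contains(trie, "QA")              # prefix, not a word
```

Note that the dependency is *within* one lookup; lookups for different sim iterations are still independent of each other, which is where the two emails above are talking past each other.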
