[computer-go] Lockless hash table and other parallel search ideas

Rémi Coulom Fri, 21 Mar 2008 14:48:53 -0700

Hi,

I have got a lockless hash table to work, and I thought I'd share the results.

A lockless hash table is very important, because the usual approach of holding a global lock during tree search and update does not scale well, especially on 9x9. But it is possible to create a completely lockless hash table data structure that works with multiple threads.

Here are some links that give some indications of how such a thing can be done:

http://video.google.com/videoplay?docid=2139967204534450862
http://blogs.azulsystems.com/cliff/2007/03/a_nonblocking_h.html
http://www.cs.rug.nl/~wim/mechver/hashtable/index.html

Some of those links may look intimidating, but that is because the resize part of the algorithm is complicated. In my implementation, I do not resize the table, so it is very simple. Also, I update the counters in each node with atomic increments and decrements (no need to lock).
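To make the idea concrete, here is a minimal C++ sketch of a fixed-size table along these lines. It is not my actual code: the Node layout, the names (LocklessTable, find_or_insert, record_playout), the use of key 0 as the "empty" marker, and linear probing are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical node layout: a 64-bit position hash plus playout statistics.
struct Node {
    std::atomic<std::uint64_t> key{0};  // 0 marks an empty slot (assumption)
    std::atomic<std::int32_t> visits{0};
    std::atomic<std::int32_t> wins{0};
};

// Fixed-capacity, open-addressing table. It never resizes, so claiming a
// slot needs only a single compare-and-swap on the key.
template <std::size_t Capacity>
class LocklessTable {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Returns the node for `key`, inserting it if absent.
    // Returns nullptr when every slot is taken by other keys.
    Node* find_or_insert(std::uint64_t key) {
        assert(key != 0);  // 0 is reserved for empty slots
        for (std::size_t probe = 0; probe < Capacity; ++probe) {
            Node& n = nodes_[(key + probe) & (Capacity - 1)];
            std::uint64_t seen = n.key.load(std::memory_order_acquire);
            if (seen == 0) {
                // Try to claim the empty slot. If another thread wins the
                // race, `seen` is updated to the key that thread stored.
                if (n.key.compare_exchange_strong(seen, key,
                                                  std::memory_order_acq_rel,
                                                  std::memory_order_acquire)) {
                    return &n;
                }
            }
            if (seen == key) return &n;
        }
        return nullptr;
    }

private:
    Node nodes_[Capacity];
};

// Record one playout result in a node without taking any lock.
inline void record_playout(Node& n, bool win) {
    n.visits.fetch_add(1, std::memory_order_relaxed);
    if (win) n.wins.fetch_add(1, std::memory_order_relaxed);
}
```

Each search thread would call find_or_insert with the hash of the position it reaches, then record_playout on the nodes along its path. A real table would be much larger and allocated statically or on the heap rather than on the stack.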

Here is some preliminary experimental data for 9x9 up to 8 cores, running 840,000 playouts, from a tactical middle-game position:

(Playouts per second)

Cores   Spinlock    Lockless hash table
1       22,477.9     22,447.9
2       37,769.8     43,076.9
3       55,888.2     60,825.5
4       61,448.4     79,470.2
5       64,665.1     95,346.2
6       62,407.1    110,092.0
7       66,508.3    130,638.0
8       59,196.6    146,341.0

BTW, using a pthread mutex is a lot worse than a spinlock, in my experience. I used the fair spinlock from the Linux kernel. But any implementation should work.

So, it is pretty cool. This was measured on only one run. Since it is not deterministic, performance may vary from one run to the other (especially since it does not always select the same best move). But it still clearly shows the superiority of the lockless hash table, and seems to indicate that it would still scale beyond 8 cores.

I believe I could improve further by reducing the number of atomic operations. Also, thinking about how to reduce atomic operations led me to imagine a scheme that may work as a distributed hash table over a network of PCs.

A simple scheme that would work on a single PC would consist in storing one counter per thread in the table. This way, it would not be necessary to use atomic operations to increment counters, and the cache coherency mechanism of the CPUs would handle sending data from core to core. The cost would be that in order to get the node counters, it would be necessary to add N values. Also, some information may arrive a little late to some threads (but I believe it is better to go run a playout than to wait for data).
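A rough C++ sketch of such a per-thread counter could look like the following. The names (PerThreadCounter, kMaxThreads) and the limit of 8 threads are assumptions. Since plain unsynchronized variables would be a data race in C++, the sketch uses relaxed atomic loads and stores instead; on x86 these compile to ordinary moves, with no locked read-modify-write instruction.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxThreads = 8;  // hypothetical thread limit

// One slot per thread. Each slot has exactly one writer, so incrementing
// it needs no atomic read-modify-write, only a load followed by a store.
class PerThreadCounter {
public:
    PerThreadCounter() {
        for (auto& s : slots_) s.store(0, std::memory_order_relaxed);
    }

    // Must only be called by the thread that owns `thread_id`.
    void increment(std::size_t thread_id) {
        assert(thread_id < kMaxThreads);
        std::atomic<std::uint32_t>& s = slots_[thread_id];
        s.store(s.load(std::memory_order_relaxed) + 1,
                std::memory_order_relaxed);
    }

    // Any thread may read the total; it costs N loads and may lag
    // slightly behind increments made by other threads.
    std::uint32_t total() const {
        std::uint32_t sum = 0;
        for (const auto& s : slots_) sum += s.load(std::memory_order_relaxed);
        return sum;
    }

private:
    std::atomic<std::uint32_t> slots_[kMaxThreads];
};
```

The trade-off is visible directly: increment touches one slot, while total has to read all kMaxThreads of them.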

This scheme is roughly equivalent to using a separate hash table for each thread, and could be generalized to a distributed hash table over a network: each PC would have its own hash table, and each node of the tree would contain two counters: my_counter and other_counter. Every now and then, for instance when my_counter reaches a threshold, this PC would broadcast my_counter to the whole network, so that everybody updates other_counter.
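Since I have not written this part, the following C++ sketch only illustrates the bookkeeping for one node on one PC, with the network transport left out. Apart from my_counter and other_counter, every name here is hypothetical (SharedNodeCount, my_broadcast, add_local, on_remote_update), and so is the choice to broadcast the increase since the last broadcast rather than the absolute value.

```cpp
#include <cassert>
#include <cstdint>

// Per-node counters held by one PC (single-threaded per PC in this sketch).
struct SharedNodeCount {
    std::uint32_t my_counter = 0;     // playouts counted on this PC
    std::uint32_t my_broadcast = 0;   // part of my_counter already sent (hypothetical)
    std::uint32_t other_counter = 0;  // playouts reported by the other PCs

    // Best local estimate of the node's global count.
    std::uint32_t total() const { return my_counter + other_counter; }

    // Count one local playout. Returns the number of playouts to broadcast
    // once `threshold` unsent playouts have accumulated, and 0 otherwise.
    std::uint32_t add_local(std::uint32_t threshold) {
        assert(threshold > 0);
        ++my_counter;
        const std::uint32_t pending = my_counter - my_broadcast;
        if (pending < threshold) return 0;
        my_broadcast = my_counter;
        return pending;
    }

    // Apply a broadcast received from another PC.
    void on_remote_update(std::uint32_t delta) { other_counter += delta; }
};
```

When add_local returns a non-zero value, the caller would send it to every other PC, and each receiver would pass it to on_remote_update for the same node. Between broadcasts, the other PCs simply work with slightly stale totals.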


I have not implemented this yet, but I will probably try it.

Right now, I will test the lockless hash table more, and will probably connect to 9x9 CGOS with that machine sometime during the week-end.

If you wish to implement your own lockless hash table, you should read Intel's documentation about its memory architecture. It can be found here:

http://www.intel.com/products/processor/manuals/index.htm

In particular, it is important to read "Architecture Memory Ordering White Paper", and about the lock prefix, the cmpxchg operation, sfence, lfence, and mfence.

The primitives I used in my algorithm are a store fence, atomic increment, atomic decrement, and atomic compare-and-swap. If you understand what they do, you should be able to make your own lockless hash table.
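For those who prefer portable code to raw instructions, here is a sketch of how these four primitives might be expressed with C++ std::atomic. The wrapper names are hypothetical, and the mapping is an approximation: in particular, a release fence is only the closest portable counterpart of a store fence, not an exact equivalent of sfence.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomic increment; on x86 this typically becomes a lock-prefixed add.
inline void atomic_increment(std::atomic<std::int32_t>& x) {
    x.fetch_add(1, std::memory_order_relaxed);
}

// Atomic decrement; on x86 this typically becomes a lock-prefixed sub.
inline void atomic_decrement(std::atomic<std::int32_t>& x) {
    x.fetch_sub(1, std::memory_order_relaxed);
}

// Compare-and-swap: stores `desired` only if x still holds `expected`.
// Returns true on success. On x86 this typically becomes lock cmpxchg.
inline bool compare_and_swap(std::atomic<std::uint64_t>& x,
                             std::uint64_t expected,
                             std::uint64_t desired) {
    return x.compare_exchange_strong(expected, desired);
}

// Approximation of a store fence: earlier writes may not be reordered
// after later stores, as seen by threads that synchronize with them.
inline void store_fence() {
    std::atomic_thread_fence(std::memory_order_release);
}

// Small single-threaded walk through the four primitives.
inline void primitives_demo() {
    std::atomic<std::int32_t> counter{0};
    atomic_increment(counter);
    atomic_increment(counter);
    atomic_decrement(counter);
    assert(counter.load() == 1);

    std::atomic<std::uint64_t> slot{0};
    assert(compare_and_swap(slot, 0, 7));   // slot was 0, now 7
    assert(!compare_and_swap(slot, 0, 9));  // slot is 7, so this fails
    store_fence();
    assert(slot.load() == 7);
}
```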


Have fun,

Rémi
