[Computer-go] Scaling, randomness and long thinking times Part II

valkyria Tue, 21 Jun 2011 02:09:50 -0700

In the first part I tried to explain what problems I see with searchin Valkyria playing 9x9 games. It may or may not be valid for otherprograms. And some of these solutions may not work so well for 19x19.

I do not consider Valkyria to have buggy playouts when it plays 9x9.Only in one game on Little Golem has it had severe problems evaluatingthe game correctly (note I might be blind to the fact that theopponents have not been able to punish mistakes by Valkyria)


****

In part I I explained that Valkyria simply prune the oldest nodes inthe tree, but then longer search tends to revisit parts of the treethat has been pruned putting a severe limit on how effective theprogram can search.

MCTS needs a lot of memory for every position expanded to store AMAFinformation and Win Rate information for the children.

But if we come back to this position after pruning it, the mostimportant information is which child was the best move.

So Valkyria has a hash table which only stores the best move for theposition with a 64 bit hash (32 bits are not enough!) and log2 of thenumber of visits (measuring quality of move) and log2 of the searchdepth. Entries in the table is overwritten if quality is bad and depthis deep. There are 4 million entries in the table and written to afile the size is 50 MB.

When a node is revisited and created fresh again the best move in thetable get a gentle prior bias to guide search in that direction.

Still the selective search will get stuck on a single move too long inmany parts of the tree. If reset the program will search completelydifferent lines.

Therefore I use a "iterative deepening" approach. The idea is that aslong as the search hit a lot of old nodes the search continues, butwhen the expanded nodes are all almost always not in the hash tablethen it stops search and deletes the tree. But the best moves are allstored in in the table so in a research it will search old portions ofthe tree much more effectively. So far my heuristics for starting newiterations are pretty crude. One could also consider having a fixedscheme effectively always doubling the number of playouts for example.


Here is an example


***

Fixed Playouts=100000000
Iterative Deepening is activated
Threads=4

..0.. <d6/484/3855> Hits: 313 5462/5462 Misses: 2508
..1.. <f6/455/9042> Hits: 1197 21479/26941 Misses: 9585
..2.. <d5/457/24193> Hits: 1558 42968/69909 Misses: 19069
..3.. <c7/457/43613> Hits: 3825 85948/155857 Misses: 38202
..4.. <d6/471/61782> Hits: 5714 171907/327764 Misses: 76177
..5.. <h6/458/138701> Hits: 11967 343857/671621 Misses: 151474
..6.. <d6/466/376412> Hits: 17061 687722/1359343 Misses: 301809
..7.. <e6/456/671038> Hits: 49457 1375474/2734817 Misses: 606451
..8.. <d6/457/847556> Hits: 96751 2750953/5485770 Misses: 1215890
..9.. <d6/470/5261864> Hits: 152031 5501923/10987693 Misses: 2428660
..10.. <d6/466/10592489> Hits: 284004 11003862/21991555 Misses: 4907851
..11.. <d6/467/21085360> Hits: 693586 22007730/43999285 Misses: 9615665
..12.. <d6/471/42756103> Hits: 1169491 44015484/88014769 Misses: 19772148

***

In this case the search converges on d6. In the last iteration itsearched over 42 million playouts. And in this search 1 169 491 bestmoves were found in the hash table from previous iterations although19 772 148 moves were not found.


The principal variation look like this

>d6 >D5 >e6 !F6 >c6 !F3 >f4 >G3 >f7 >C5 >e4 !B6 -e3 >C7 >d8 >G7 >b7!B4 !d7 E2 d2 F2 h3 H2 h4 G1

A move was found in the hash table and it is still the best move.

! A move that is better than the move in the hash table has been found.

- A move may be better than the move in the has table but has not beenvisited as much as the current PV.

The moves with no prefix was not found in the table.

I am not claiming that this solution is the most effective way ofsearching go. But at least I think it plays very strongly and it doesscale well in situation where 1 Gb of memory would prevent Valkyriafrom searching deep. (V. is a 32-bit program). But I think thesemethods are good for creating opening books where one focuses a lot ofcomputational power on specific moves.

The hash table in itself is very good for 9x9 opening play becausedeeply searched moves near the root is stored permanently in a file.So it searches common opening lines stronger every time it is played.(I guess chess programs do this all the time so it is not so original).

Valkyria3.5.9_P4Bx is an old version of Valkyria on CGOS which hasplayed 36150 games now. It has a ranking of 2678 and it is running onsingle core P4 which is 10 times slower than a modern 4 core computer.

I am thinking of running the same version again without the table andsee how it does.


It would also be interesting to know if other people do something similar.


Best
Magnus


_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

[Computer-go] Scaling, randomness and long thinking times Part II

Reply via email to