Re: [Computer-go] Scaling, randomness and long thinking times Part II

Brian Sheppard Fri, 24 Jun 2011 22:44:53 -0700

I implemented a similar system in Pebbles. I call it a "memento" system,
because "memento" means "an object that reminds you of something."


Same idea as yours: when you repurpose a node then you memorize something
inexpensive (the memento) so that you can reconstruct the tree quickly.
Store mementos between searches to develop an opening library. Use a
symmetry invariant hash function, etc.

I actually disable mementos in Pebbles. They are too effective. An opening
library with many million positions is easily possible, and they gain
strength rapidly. (My opinion: much, much more rapidly than the Mogo opening
library system calculated using "grid" computing, but the Mogo system is
more scalable because it is database-bound rather than memory bound. You can
use both.) A program using mementos wins too often. It sounds odd. Let me
explain.

My goal differs from yours. You want to play 9x9 perfectly, so mementos fit.
I view 9x9 as an experimental platform from which I want to learn as much as
possible, and I learn more when Pebbles loses. So I deliberately use a
simplistic opening strategy that plays "well enough" and creates some
variety.

(BTW, I would activate mementos if I believed that move-by-move opening
libraries were beneficial in 19x19.)

Brian

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of [email protected]
Sent: Wednesday, June 22, 2011 4:46 AM
To: [email protected]
Subject: Re: [Computer-go] Scaling,randomness and long thinking times Part
II

Quoting Ingo Althöfer <[email protected]>:
> Did you check that at 12-hour searches?

Here are some statistic from 3 different tables. For the first two tables I
cleaned them out so they were empty and then I ran 1M and 100M playouts on
the same position and printed the debug information for the tables. The last
table is the current CGOS table that has been filled for over 6 months of
so. basically all versions of Valkyria having _4cx has been using this
table.

How to read the debug dumps of the hash tables below:
"Level 4 15564" means that there are 15564 entries in the table of positions
that were searched at least 2^4=2*2*2*2=16 times. I do not store lover level
entries because I think it will just be noise or simply reflect my usual
move ordering anyway so storing them would not give any benefits.
"Depth 4 15900" means that 15900 entries was searched to a depth between 2^3
to 2^4 counting from the empty board. This information is only used to
overwrite entries that are too deep to be encountered ever again. I am using
(Level - Depth) as the criterion for which entries should be overwritten.

****************

With 1M playouts only 30k entries out 4M were used. This is equivalent of
about 5 minutes search which is a typical game on CGOS. To fill the table I
thus have to to play 138 games on CGOS! And most of that information will be
from the middle gamed and endgame which will never be encountered again.


Fixed Playouts=1000000
Iterative Deepening is activated

   ***  300.9 s  *** 1000053 T

Free 4164275 Taken 30029
Level 4 15564
Level 5 7320
Level 6 3572
Level 7 1801
Level 8 866
Level 9 439
Level 10 218
Level 11 130
Level 12 50
Level 13 30
Level 14 23
Level 15 3
Level 16 9
Level 17 1
Level 18 3
Depth 2 30
Depth 3 14090
Depth 4 15900
Depth 5 9


*********

With 100M playouts the table almost filled. Which means that for every long
search all nodes visited are also stored in the table. Note that it does
overwrite low level entries first. So often a Level 4 entry will overwrite a
Level 4 entry that already exist and was just entered. The hash table looks
up 20 entries to see what it will replace.

Now I think this is absolutely perfect for the Little Golem games!  
Basically all the information of best moves during the 100M search is stored
in the table!
This search took over 7 hours with 4 threads to run. Also because of 32bit
integers Valkyria cannot search more than 1000M playouts. So searching more
than 100M there will be losses of information but because the table to 50%
contains Level 4 entries I really do not think it it will affect playing
strength to a degree that could be even measured were I given the largest
computer cluster on earth to test it the difference in playing strength.
Someone else have to prove me wrong on that :)

To conclude: all moves with high quality information is easily stored in the
table even if the search time is very long. Also very little of this
information will be searched again. The very point of using the table is to
sort out the main lines near the root of the search and overcome the fact
that the selective search often visit weird sequences. And if it works most
of the stored low level moves will never be visited again!

I think the difference here to chess is that chess evaluations are to some
degree quite accurate with low search depth. The entries in this table only
get good when a move has been searched at least 40 times or higher. With 100
playouts Valkyria play as strong as Gnugo on 9x9 which then is Level 6
already.
Thus given the kind of information I store in the table and at the rate the
program produces this information (many orders of magnitude less than a
chess program do) a 4 million entries table is good enough. Thinking of it I
think it is even too large for correspondence games. I keep it large because
CGOS games will slowly fill it up.

Fixed Playouts=100000000
Iterative Deepening is activated

   ***  26048.0 s  *** 100000049 T

Free 8749 Taken 4185555
Level 4 2409398
Level 5 933871
Level 6 417182
Level 7 207551
Level 8 105182
Level 9 53171
Level 10 27845
Level 11 14612
Level 12 7878
Level 13 4089
Level 14 2201
Level 15 1210
Level 16 654
Level 17 340
Level 18 196
Level 19 87
Level 20 56
Level 21 8
Level 22 21
Level 24 2
Level 25 1
Depth 2 36
Depth 3 160408
Depth 4 3073259
Depth 5 951469
Depth 6 383

**************

This is a the current table for CGOS which have been running using 4 threads
on I7-860 for many versions of Valkyria for about half a year.

There are two bumps (local maximum of entries) in the distribution here.
Level 4 is one bump which are entries entered in the last few games played.
But note that there is a much larger bump around Level 14. For level
14 and larger the entries have never been overwritten so this is high
quality move ordering information that has been stored from several 1000s of
games on CGOS.
This means that Valkyria as soon as leaves it book in a place where it left
the book many times can search that position very effectively and can
explore new variation deeper.

Level 14 means that at least 2^14=16384 playouts has been searched for this
positions. Valkyria with 4000 playouts is rated 2400 ELO so the table will
provide a lot

Table Size = 50331648 bytes
# Positions = 4194304
Free 0 Taken 4194304
Level 4 256279
Level 5 140378
Level 6 91368
Level 7 68036
Level 8 56307
Level 9 54079
Level 10 62383
Level 11 125751
Level 12 374840
Level 13 753358
Level 14 806033
Level 15 632319
Level 16 559939
Level 17 193557
Level 18 19108
Level 19 539
Level 20 20
Level 21 10
Depth 0 22
Depth 1 2471
Depth 2 114001
Depth 3 1054120
Depth 4 2206622
Depth 5 768505
Depth 6 48554
Depth 7 9

Is the table useful?
I picked the a position I thought would be a little odd but not completely
odd:


After 10 playouts
> f6 >F5 >g7 >C6 >e6 >D5 >g4 >G3 >d8 >B7 >d2 >B3 >b4 >B2 >c2 >B1 >e2
> >H6 >h7 >A5 >a4  C3

this basically probably the entire principle variation from last time this
position was searched by Valkyria on CGOS and it is 22 ply deep!

A few hundred playouts later however the PV is different and two alternative
lines (marked with *) seems better than the PV (one line is the previous PV
it has now not been search as deep but the win RATE for the move BC6 is
better than -H6). The second alternative line is completely new after We7.

  >f6 >F5 >g7 -H6 -h7  F7  e6  G4  f8  D5  c5  D6

*>f6 >F5 >g7 BC6 ?e6 ?D5 >g4 >G3 >d8 >B7 >d2 >B3 >b4 >B2 >c2 >B1 >e2
 >H6 >h7 >A5 >a4  C3

*>f6 >F5 >g7 -H6 We7  E6  f7  C6  h5

I show this because there are almost always a lot of hits in the table when
the programs leaves the opening book. And for several moves it will benefit
from it. But it will never contain all position that could be searched. For
CGOS I see it as a boost to the opening library and a gentle way of learning
that does not learn bad moves that trick weak programs, which automatic book
learning tend to do. It is an improvement to search but I admit I do not how
big it is and the parameters I use are sort of arbitrary.

In the example above we also saw that the search do not stubbornly stick to
moves in the table but actually discover and search new line all the time.
just by watching how the PV changes over the course of a long search I get a
strong feeling that the program really explore all possibilities. That is
the selective search in different iteration try out new things and learn
from them and this exploration is near the root which is a good thing. And
when all moves has been explored it kind of continues exploring moves even
deeper in the main variations.

This is my subjective experience of how this system works.

As said earlier this may not be the most effective way of searching, but I
am pretty sure it overcomes at least the "limited memory threat"  
to scaling, which is a problem for Valkyria which is programmed in Delphi
and has to be ported to 64-bit compiler like Free Pascal to overcome the 1GB
memory barrier for 32-bit programs. Unfortunately Valkyria has been using
libraries which has not been ported to Free Pascal so it would be painful to
move the code.

For LG tournament games I also run several copies of Valkyria in parallel so
each program cannot use much memory anyway (I got 6GB on my best machine).
Next time I buy a computer this will probably be no problem.

Best
Magnus
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Scaling, randomness and long thinking times Part II

Reply via email to