Re: [computer-go] The effect of the UCT-constant on Valkyria

2008-05-04 Thread Michael Williams

David Fotland wrote:

So I'm curious then.  With simple UCT (no rave, no priors, no progressive
widening), many people said the best constant was about 0.45.  What are the
new concepts that let you avoid the constant?


Actually it's closer to 0.46.

Just kidding, I have no idea.  But great questions.  Looking forward to the 
answers.




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] The effect of the UCT-constant on Valkyria

2008-05-04 Thread Gunnar Farnebäck

David Fotland wrote:
 So I'm curious then.  With simple UCT (no rave, no priors, no progressive
 widening), many people said the best constant was about 0.45.  What are the
 new concepts that let you avoid the constant?

Whatever concepts are used it must indirectly be a question of
improved move ordering. The better the move ordering, the smaller the
need to do exploration.

 Is it RAVE, because the information gathered during the search lets you
 focus the search accurately without the UCT term?  Many people have said
 that RAVE has no benefit for them.

 Do most of the strongest programs use RAVE?  I think from Crazystone's
 papers, that it does not use RAVE.  Gnugomc does not use rave.

I've never had success with RAVE but I might make a new attempt for
GNU Go some time.

 Is it the prior values from go knowledge, like opening books, reading
 tactics before the search etc?  Do all of the top programs have opening
 books now?  I know mogo does.

The MonteGNU account on CGOS (9x9) has a self-learnt opening book with
currently slightly more than 16000 moves. Over the last 1000 games it
has played on average 4 moves (own moves that is, opponent moves not
counted) from the book. The record is 22 moves from book.

 Do most of the top programs read tactics before the search?  I know Aya
 does.

GNU Go in Monte Carlo mode reads lots of tactics before the MC search.
But it doesn't use the tactics for the MC search. :-/

 Does it matter how prior values are used to guide the search?  I think mogo
 uses prior knowledge to initialize the RAVE values.  Do other programs
 include it some other way, by initializing the FPU value, or by initializing
 the UCT visits and confidence, or some extra, prior term in the equation?


 Are there other techniques (not RAVE) that people are using to get
 information from the search to guide the move ordering?  I think crazystone
 estimates ownership of each point and uses it to set prior values in some
 way.

GNU Go uses a global move ordering shared by all nodes in the tree and
initialized from the results of the normal move generation.

/Gunnar


[computer-go] The effect of the UCT-constant on Valkyria

2008-05-03 Thread Magnus Persson
I have already posted the following results. The results show the
winrates of Valkyria 3.2.0 against gnugo at default strength.



512 Simulations per move
UCT_K   Winrate   SERR
0       58.8      2.1   (Winrate only)
0.01    56.8      2.2
0.1     60.9      2.2
0.5     54.2      2.2
1       50.6      2.2

With 512 simulations there is not much work done in the tree. So I
extended the test to 2048 simulations and also added the parameter value
2 to see what happens when the search gets really wide.


2048 Simulations per move
UCT_K   Winrate   SERR
0       80.7      2.3   (Winrate only)
0.01    83.3      2.2
0.1     83.7      2.1
0.5     77.3      2.4
1       71.3      3
2       62        4.9

The number of games is 300 for parameter values 0 to 0.5 and a little
less for parameter values 1 and 2.
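
The SERR column looks consistent with the usual binomial standard error
of a winrate. The following is a minimal sketch of that check, assuming
each game is an independent win/loss trial; the function name is made up
for illustration and is not taken from Valkyria.

```python
import math

def winrate_standard_error(winrate_percent: float, games: int) -> float:
    """Binomial standard error of a winrate, in percentage points."""
    p = winrate_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / games)

# The 2048-simulation rows that were played with 300 games each.
for uct_k, winrate in [(0, 80.7), (0.01, 83.3), (0.1, 83.7), (0.5, 77.3)]:
    serr = winrate_standard_error(winrate, 300)
    print(uct_k, winrate, round(serr, 1))
```

With 300 games this gives 2.3, 2.2, 2.1 and 2.4 for the four rows, which
matches the table.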


The results confirm that Valkyria still benefits from using confidence  
bounds with UCT, although the effect is really small.
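
For readers unfamiliar with the constant being tuned, here is a minimal
sketch of textbook UCT child selection. It assumes the plain UCB1 form
(winrate plus UCT_K times an exploration term); the exact formula inside
Valkyria is not given in this thread, and the names below are
hypothetical.

```python
import math

def uct_value(wins: float, visits: int, parent_visits: int, uct_k: float) -> float:
    """Winrate plus an exploration bonus scaled by uct_k (UCB1 form)."""
    winrate = wins / visits
    exploration = uct_k * math.sqrt(math.log(parent_visits) / visits)
    return winrate + exploration

def select_child(children, uct_k: float):
    """children: list of (move, wins, visits), every child visited at least once."""
    parent_visits = sum(visits for _, _, visits in children)
    best = max(children, key=lambda c: uct_value(c[1], c[2], parent_visits, uct_k))
    return best[0]

children = [("A", 60, 100), ("B", 5, 10)]
print(select_child(children, 0.0))  # winrate only: picks the better-looking move
print(select_child(children, 0.1))  # small bonus: same choice in this example
print(select_child(children, 1.0))  # large bonus: favors the less-visited move
```

With UCT_K = 0 the selection is driven by winrate only, as in the first
row of each table; larger values make the search wider.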


Also the effect of the constant might be a little greater with 2048
simulations than with 512. Still, the curves look more or less the
same. Does anyone have experience doing a test with different numbers
of simulations where the best parameter value depends on the number of
simulations?


I prefer to use a low number of simulations since it is simply faster,
and also if the winrate of Valkyria gets too close to 100% it becomes
harder to measure the effect of different parameter settings. Maybe I
should quit testing against gnugo, and try something stronger.


-Magnus




Re: [computer-go] The effect of the UCT-constant on Valkyria

2008-05-03 Thread Olivier Teytaud
The results confirm that Valkyria still benefits from using confidence bounds 
with UCT, although the effect is really small.


The standard deviation is a bit large for concluding.
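
As a rough illustration of this point, the difference between two rows
of the posted tables can be compared against its combined standard
error. This is only a sketch, assuming the two winrates are independent
and approximately normal; the function name is hypothetical.

```python
import math

def z_score(winrate_a: float, serr_a: float, winrate_b: float, serr_b: float) -> float:
    """Difference of two winrates in units of their combined standard error."""
    return (winrate_a - winrate_b) / math.sqrt(serr_a ** 2 + serr_b ** 2)

# UCT_K = 0.1 versus UCT_K = 0 at 2048 simulations per move.
z_2048 = z_score(83.7, 2.1, 80.7, 2.3)
# UCT_K = 0.1 versus UCT_K = 0 at 512 simulations per move.
z_512 = z_score(60.9, 2.2, 58.8, 2.1)
print(round(z_2048, 2), round(z_512, 2))
```

Both values come out below 1, well short of the roughly 1.96 needed for
significance at the usual 5% level.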

I'll try to get similar numbers for mogo. For the moment
everything leads to 0 as the best constant, but perhaps
it will be different with larger numbers of sims/second.
Olivier


RE: [computer-go] The effect of the UCT-constant on Valkyria

2008-05-03 Thread David Fotland
So I'm curious then.  With simple UCT (no rave, no priors, no progressive
widening), many people said the best constant was about 0.45.  What are the
new concepts that let you avoid the constant?

Is it RAVE, because the information gathered during the search lets you
focus the search accurately without the UCT term?  Many people have said
that RAVE has no benefit for them.

Do most of the strongest programs use RAVE?  I think from Crazystone's
papers, that it does not use RAVE.  Gnugomc does not use rave.

Is it the prior values from go knowledge, like opening books, reading
tactics before the search etc?  Do all of the top programs have opening
books now?  I know mogo does.

Do most of the top programs read tactics before the search?  I know Aya
does.

Does it matter how prior values are used to guide the search?  I think mogo
uses prior knowledge to initialize the RAVE values.  Do other programs
include it some other way, by initializing the FPU value, or by initializing
the UCT visits and confidence, or some extra, prior term in the equation?

Are there other techniques (not RAVE) that people are using to get
information from the search to guide the move ordering?  I think crazystone
estimates ownership of each point and uses it to set prior values in some
way.

Regards,

David 



RE: [computer-go] The effect of the UCT-constant on Valkyria

2008-05-03 Thread Magnus Persson

Quoting David Fotland:


So I'm curious then.  With simple UCT (no rave, no priors, no progressive
widening), many people said the best constant was about 0.45.  What are the
new concepts that let you avoid the constant?

Is it RAVE, because the information gathered during the search lets you
focus the search accurately without the UCT term?  Many people have said
that RAVE has no benefit for them.


Yes, it is RAVE, and more specifically as it was presented here
recently on the mailing list by the Mogo team, and not how it was
originally presented in the mogo paper. Also there may be several
minor details that are peculiar to my implementation. Actually I did
not understand some aspects of the Mogo method mailed here and just
guessed some details. It suddenly worked and I could feel that the
search was unusually strong and selective, and since then I have just
adjusted some parameters.


I used to do progressive widening but that is now turned off. RAVE is  
free to pick any move that is not pruned right away.


Currently I believe that RAVE is only effective if one gets other
parameters right. For me it meant changing the uct parameter from 0.8
to 0.1. I also know of many pathological situations where Valkyria
currently will not find the best move, but rather the second best. It
is possible that other programs suffer even more than Valkyria from
similar problems and that this to some extent is because the nature of
the playouts may interfere with AMAF. For example, V either plays
forced moves or picks uniformly at random among moves that are not
pruned. Other programs may rely on patterns to pick all moves in the
playouts and this might be bad for AMAF (this is wild speculation).
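
For context, here is a minimal sketch of an AMAF (all-moves-as-first)
update, which shows why the playout policy matters: every move the
player makes later in a playout is credited as if it had been played
first. This follows the common textbook variant, not Valkyria's actual
code; the class and argument names are hypothetical.

```python
from collections import defaultdict

class AmafStats:
    """Per-node AMAF counters, keyed by board point."""

    def __init__(self):
        self.wins = defaultdict(float)
        self.visits = defaultdict(int)

    def update(self, playout_moves, player, player_won):
        """playout_moves: (color, point) pairs in the order played after this node."""
        seen = set()
        for color, point in playout_moves:
            if point in seen:
                continue  # only the first stone played on a point counts
            seen.add(point)
            if color != player:
                continue  # the opponent got there first: no credit
            self.visits[point] += 1
            if player_won:
                self.wins[point] += 1.0

    def winrate(self, point):
        return self.wins[point] / self.visits[point]

stats = AmafStats()
stats.update([("B", "C3"), ("W", "D4"), ("B", "D4"), ("B", "E5")], "B", True)
stats.update([("B", "C3")], "B", False)
print(stats.winrate("C3"), stats.visits["E5"], stats.visits["D4"])
```

If the playout policy picks moves in a strongly patterned way, the set
of points that get credited is no longer a fair sample, which may be one
reading of the speculation above.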



Do most of the strongest programs use RAVE?  I think from Crazystone's
papers, that it does not use RAVE.  Gnugomc does not use rave.


You might not need it if you have strong pattern matching priors for
the tree part similar to Crazystone. RAVE makes it possible to ignore
most bad moves in a given position. The weakness is that often some
good moves (with a chance of being the best possible move) are also
ignored completely.



Is it the prior values from go knowledge, like opening books, reading
tactics before the search etc?  Do all of the top programs have opening
books now?  I know mogo does.


Valkyria has just 4 moves in a hardcoded opening book. Previous
versions used a book with several thousand positions that was both
self-learned and modified by hand, but as long as the program changes
the book tends to become inaccurate, so right now I do not use it and
am planning to write something more efficient than the old one, which
kept each position as a file on the hard drive.



Do most of the top programs read tactics before the search?  I know Aya
does.


Valkyria only does some simple tactics in the playouts. It is stronger  
than anything I ever programmed (on 9x9 at least) so currently I  
cannot see how to integrate precomputed tactical results in the later  
search. I think Aya is special because it was very strong doing search  
before it went MC.



Does it matter how prior values are used to guide the search?  I think mogo
uses prior knowledge to initialize the RAVE values.  Do other programs
include it some other way, by initializing the FPU value, or by initializing
the UCT visits and confidence, or some extra, prior term in the equation?


Right now Valkyria sets priors for AMAF so that moves that are a good
local response to the last move have a prior 100% winrate with 20-100
visits depending on the priority of the triggered pattern. I think
Mogo has a fixed number of visits for the priorities but modifies the
winrate, but I never saw this described in a way that made it clear.
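
A minimal sketch of what such a prior could look like: the AMAF counters
start with some virtual visits that are all wins. The priority scale
(1 to 10) and the linear mapping onto 20-100 visits are assumptions made
for illustration only; the thread does not say how Valkyria maps pattern
priority to visit counts.

```python
def prior_visits(priority: int, max_priority: int = 10) -> int:
    """Map a pattern priority (hypothetical 1..max_priority scale) to 20..100 virtual visits."""
    priority = max(1, min(priority, max_priority))
    return 20 + round(80 * (priority - 1) / (max_priority - 1))

def init_amaf_prior(priority: int) -> dict:
    """Start the AMAF counters at a 100% winrate over the virtual visits."""
    visits = prior_visits(priority)
    return {"wins": float(visits), "visits": visits}

def amaf_winrate(entry: dict) -> float:
    return entry["wins"] / entry["visits"]

entry = init_amaf_prior(1)   # lowest priority: 20 virtual wins out of 20
entry["visits"] += 20        # 20 real AMAF samples arrive, all of them losses
print(amaf_winrate(entry))   # the prior has been diluted to 0.5
```

A higher-priority pattern gets more virtual visits, so its prior takes
longer to be overridden by real playout results.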


Previously I biased the UCT values after everything else was computed
but found that this led to some bad behavior. By biasing the AMAF
values instead, these biases become less influential as visits
accumulate, because the true winrate then carries more weight than the
AMAF scores.
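
The fading influence of an AMAF prior can be sketched with a weighted
blend of the two estimates. The schedule sqrt(k / (3n + k)) used here is
one commonly cited form from the RAVE literature; whether Valkyria uses
this schedule, and the value k = 1000, are assumptions, and the function
names are hypothetical.

```python
import math

def rave_beta(visits: int, k: float = 1000.0) -> float:
    """Weight given to the AMAF estimate; 1.0 at zero visits, shrinking as visits grow."""
    return math.sqrt(k / (3.0 * visits + k))

def blended_value(wins: float, visits: int,
                  amaf_wins: float, amaf_visits: int, k: float = 1000.0) -> float:
    """Mix the true winrate with the AMAF winrate (amaf_visits must be positive)."""
    amaf_rate = amaf_wins / amaf_visits
    if visits == 0:
        return amaf_rate
    beta = rave_beta(visits, k)
    return (1.0 - beta) * (wins / visits) + beta * amaf_rate

# AMAF stuck at a biased 100% prior, true winrate 40%.
early = blended_value(4, 10, 20.0, 20)
late = blended_value(4000, 10000, 20.0, 20)
print(early, late)
```

In this toy case the blended value starts close to the biased AMAF
score and moves toward the true winrate as real visits accumulate.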




Are there other techniques (not RAVE) that people are using to get
information from the search to guide the move ordering?  I think crazystone
estimates ownership of each point and uses it to set prior values in some
way.


I used to do that a long time ago in Viking (the precursor to
Valkyria), which used alphabeta + MC-eval. As I remember it, it had a
great impact on move ordering, which was otherwise quite bad (or even
nonexistent) in Viking.


I have tried it in Valkyria but was never able to see an improvement.  
But I did not try hard enough to tell for sure. Both ownership and  
AMAF use the same information (playouts), so trying to use it twice is  
perhaps partially a waste of effort.
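
To make the overlap concrete, here is a minimal sketch of an ownership
estimate built from the same playouts: for each point, the fraction of
playouts that end with Black owning it. The data layout and names are
hypothetical, and how such a map would be turned into prior values is
not described in this thread.

```python
def ownership_map(final_boards, size: int) -> dict:
    """final_boards: non-empty list of dicts mapping (x, y) -> 'B' or 'W' (owner at playout end)."""
    counts = {(x, y): 0 for x in range(size) for y in range(size)}
    for board in final_boards:
        for point, owner in board.items():
            if owner == "B":
                counts[point] += 1
    total = len(final_boards)
    return {point: count / total for point, count in counts.items()}

playouts = [
    {(0, 0): "B", (0, 1): "B", (1, 0): "W", (1, 1): "W"},
    {(0, 0): "B", (0, 1): "W", (1, 0): "W", (1, 1): "B"},
]
black_ownership = ownership_map(playouts, 2)
print(black_ownership[(0, 0)], black_ownership[(0, 1)], black_ownership[(1, 0)])
```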


-Magnus
