Re: [computer-go] A problem with selectivity and AMAF bias

2008-04-11 Thread steve uurtamo
magnus,

I hate to ask the obvious, but how much time does each simulation
take?

If 100 simulations would be enough information to get it out of its
rut, why not just check every legal move for (say) 100 simulations before doing
anything else?

on another note, i think that it's cool that you have a board situation
that exhibits such borderline behavior (i.e. that it takes relatively few
simulations to fix the problem, but that the code as it stood would never
find the problem on its own).

steve.

On Fri, Apr 11, 2008 at 4:25 AM, Magnus Persson [EMAIL PROTECTED] wrote:
 Quoting Michael Williams [EMAIL PROTECTED]:

  Magnus Persson wrote:

   Yesterday I noticed an odd phenomenon in Valkyria which was caused by
   high selectivity and AMAF.

   Then, since there is nothing in the AMAF scores to indicate that move
   2 is any good, it is never searched, since the search is so selective
   that the bounds will not grow large enough to compensate for the bias
   in a reasonable time.

  Isn't this fixed by never straying from a move in the tree until it
  loses, and then trying an untried move? Or something like that. It
  wasn't my idea and I don't remember the details, but it seems like it
  fixes what you describe.

  No, it does not, because the AMAF score for move 2 is strictly lower
 than the evaluation for move 1 and all other moves, for some reason. It
 will try other moves deeper in the tree instead, and the position is
 sufficiently complex to generate a very large tree before it gives up
 on the move at the root level. The thing is that if it searches move 2
 for at least 100 simulations, it will discover that it is a good move.
 But because the AMAF score is so low, and all other moves are indeed
 losing moves, it sticks to move 1, because that at least gets it into a
 fight, although at bad odds.

  Otherwise I am quite happy with the current implementation, since it
 is strong in testing. This only happens when there are two hot
 candidates, the first one is searched first because of a limitation in
 move ordering, and a particularly strong bias works against the second
 best.

  The annoying thing is that it can suddenly lose a game it was winning.

  But I found a better fix. I tried entering my pattern priorities as
 priors on the AMAF scores, and I now strongly believe that this is
 better in general, not just for this particular situation. In the
 position I wrote about yesterday, this version gets the right move
 almost immediately. I tested it as Valkyria 3.2.0 overnight on CGOS and
 it seems to be just as strong as the previous version, and I can still
 tune its parameters.

  This means that the search of a position will be guided, in order, by:

  1) My pattern priorities, disguised as priors on the AMAF scores.
  2) Then the AMAF scores take over.
  3) Once a move has been searched, the true winrates for those moves
 are used, and there is no bias from the pattern priorities except, very
 weakly, through the AMAF scores.
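
  For illustration, here is a minimal Python sketch of what such a
 scheme could look like: pattern priorities seeded as virtual AMAF
 experience, with a RAVE-style blend so the true winrate takes over as
 real visits accumulate. All names and constants are illustrative
 assumptions, not Valkyria's actual code.

class MoveStats:
    def __init__(self, pattern_prior):
        # Stage 1: seed AMAF with virtual games proportional to the
        # pattern priority, so patterns guide the earliest selections.
        self.amaf_wins = pattern_prior * 10.0   # assumed scaling
        self.amaf_visits = 10.0                 # assumed prior weight
        self.wins = 0.0
        self.visits = 0.0

    def value(self, equivalence=1000.0):
        # Stages 2-3: the AMAF weight beta shrinks as real visits grow,
        # so the true winrate eventually dominates, with only a weak
        # residual bias from the pattern-seeded AMAF term.
        beta = self.amaf_visits / (
            self.amaf_visits + self.visits
            + self.visits * self.amaf_visits / equivalence)
        amaf = self.amaf_wins / self.amaf_visits
        mc = self.wins / self.visits if self.visits > 0 else 0.5
        return beta * amaf + (1.0 - beta) * mc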

  -Magnus



  --
  Magnus Persson
  Berlin, Germany




Re: [computer-go] A problem with selectivity and AMAF bias

2008-04-11 Thread Magnus Persson


Quoting steve uurtamo [EMAIL PROTECTED]:

 magnus,

 I hate to ask the obvious, but how much time does each simulation
 take?

 If 100 simulations would be enough information to get it out of its
 rut, why not just check every legal move for (say) 100 simulations
 before doing anything else?

My program does perhaps 2000 simulations per second in the endgame.
Your suggestion would kill performance completely, since it would
apply to all nodes in the tree. If there are 16 candidate moves in a
node, it would spend 1600 simulations just to search the root, and one
would get only 2-3 ply with normal thinking times; this almost reverts
to simple MC evaluation without tree search.
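
As a back-of-the-envelope check in Python, using only the numbers from
this message (2000 playouts/s, 16 candidates, 100 simulations each):

sims_per_second = 2000
candidates_per_node = 16
sims_per_candidate = 100

cost_per_node = candidates_per_node * sims_per_candidate  # 1600 playouts
print(cost_per_node / sims_per_second)  # 0.8 s just to open one node
# At a few seconds per move only a handful of nodes can be opened,
# i.e. a 2-3 ply tree: essentially plain MC evaluation.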


This is how my first MC program worked before UCT: it did alpha-beta
search, evaluating each move with a constant number of simulations
unless a cutoff occurred early.


Here is an example of how wonderfully selective the search is right now:

After 500 simulations it has switched to the correct move 2, which has
been searched 292 times, whereas move 1 was searched 184 times. With my
first fix the switch would have taken at least 2 seconds.


The program was allowed to search for 15 seconds but stopped after 2.8
seconds because of the superiority of the best move. Here is how the
search tree looks:


Root:
Move 1 was searched 6163 times with WR=67%, move 2 575 times with
WR=48%; all 16 moves that were not pruned at move generation were
searched 2-14 times each.


1 ply 3568/1149/785/272/181/156/... and then 1-7 times
2 ply 2432/641/297/59/48/39/16/... and then 1-3 times
3 ply 1016/799/501/98/... and then 1-2 times
4 ply 766/166/25/16/... and then 1-5 times

For each ply I listed the number of simulations spent on all candidate
moves that got any attention.


The principal variation is longer than this, but my program only prints
those positions that have been searched above some threshold. Also, the
move ordering along this principal variation is very strong.


As you can see, the tree, which is something like 16x15x14x13x12x...,
has been reduced to something like 2x3x3x2, which is highly selective.
And the quality of the moves is still excellent. This is of course only
possible in the endgame; in the opening the width of the search is
reduced from about 20-25 candidate moves to 3-7 moves.
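
A quick arithmetic check of that reduction, in Python, using only the
rough widths quoted above:

full = 16 * 15 * 14 * 13    # ~43680 four-ply lines at full width
selective = 2 * 3 * 3 * 2   # 36 lines at the observed selectivity
print(full // selective)    # -> 1213, a roughly 1200x smaller tree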


I probably still have some strong biases in AMAF in my program, and I
guess there will be more positions with bad behavior, but from now on
this will do for me, because as long as it does not lock into searching
only one move, the bias is probably mixed up and cannot become that
extreme.


But any ideas on how to remove bad bias from AMAF are welcome.

The only thing I do right now is use just the first 70% of the moves
played in each simulation for AMAF, following the idea that moves near
the end of a simulation are probably only noise anyway.
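
A minimal sketch in Python of that 70% cutoff: only moves from the
first 70% of each playout feed the AMAF statistics. The data layout and
names here are assumptions for illustration, not my actual
implementation.

def update_amaf(amaf_stats, playout_moves, winner, cutoff=0.7):
    """amaf_stats: dict mapping (color, move) -> [wins, visits];
    playout_moves: list of (color, move) pairs in the order played."""
    limit = int(len(playout_moves) * cutoff)
    seen = set()
    for color, move in playout_moves[:limit]:
        if (color, move) in seen:   # count each pair once per playout
            continue
        seen.add((color, move))
        entry = amaf_stats.setdefault((color, move), [0, 0])
        entry[1] += 1
        if color == winner:
            entry[0] += 1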


-Magnus


--
Magnus Persson
Berlin, Germany


[computer-go] Scalability study request

2008-04-11 Thread Gian-Carlo Pascutto

Hi all,

the results of the scalability study at

http://cgos.boardspace.net/study/13/index.html

look a lot like two parallel lines over the entire range, which I find
very surprising, since I'd have expected at least some differences
caused by the different playout strategies.


I am now wondering if scalability could be unaffected by playouts (just 
adding a constant offset) and only depend on the UCT/search 
implementation. From the publications of the MoGo team it seems likely 
that the programs are very similar there.
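
For reference, the generic UCT selection rule (the "search side" that
may be dominating scaling here) looks like this in a Python sketch;
this is textbook UCB1 applied to the tree, not Leela's or MoGo's exact
formula:

import math

def uct_select(children, parent_visits, c=1.0):
    def score(ch):
        if ch.visits == 0:
            return float("inf")   # expand unvisited children first
        exploit = ch.wins / ch.visits
        explore = c * math.sqrt(math.log(parent_visits) / ch.visits)
        return exploit + explore
    return max(children, key=score)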


I am able to provide a Leela version that is identical to the one 
currently used in the study, but with light (uniform random) playouts 
(and being about 3 times as fast).
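
A uniform random ("light") playout is roughly the following Python
sketch: random legal moves, avoiding filling single-point eyes, until
two consecutive passes. The Board interface here is a made-up stand-in,
not Leela's actual API.

import random

def light_playout(board, max_moves=400):
    passes = 0
    for _ in range(max_moves):
        moves = [m for m in board.legal_moves()
                 if not board.is_own_eye(m)]
        if not moves:
            board.play_pass()
            passes += 1
            if passes == 2:
                break
            continue
        passes = 0
        board.play(random.choice(moves))
    return board.winner()   # result fed back up the tree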


I think it would be very interesting to see how this behaves. IMHO it
would provide a deeper understanding of how the combination of MC/UCT
works. The critical question is whether we would get another parallel
line, indicating that the search is the key to further progress, or a
line with a different steepness, indicating that the playouts are the
key. Or maybe the two are more strongly linked and both contribute.
Either way, I believe more data would be useful.


--
GCP


Re: [computer-go] Scalability study request

2008-04-11 Thread Olivier Teytaud
I am now wondering if scalability could be unaffected by playouts (just 
adding a constant offset) and only depend on the UCT/search implementation. 
From the publications of the MoGo team it seems likely that the programs are 
very similar there.


Leela and MoGo are probably quite similar.
On the other hand, we know that some playouts scale differently from
others:

- the nakade patch on the playouts does not change anything for small
  numbers of simulations per move;
- it is very efficient for high numbers of simulations per move.

(By the way, this implies that optimizing parameters on small numbers of
simulations is perhaps not always a good idea.)



Re: [computer-go] Scalability study request

2008-04-11 Thread terry mcintyre
If we start up another scalability test, I'd be
delighted to offer up a few computer cores.

It would be really nice to have not only the
light-playout variant of Leela, but perhaps also the
nakade-patch version of MoGo and maybe even some third
program. ( if wishes were horses ... )


Terry McIntyre <[EMAIL PROTECTED]>

“Wherever is found what is called a paternal government, there is found state 
education. It has been discovered that the best way to insure implicit 
obedience is to commence tyranny in the nursery.”

Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]



Re: [computer-go] Scalability study request

2008-04-11 Thread Gian-Carlo Pascutto

Olivier Teytaud wrote:
I am now wondering if scalability could be unaffected by playouts 
(just adding a constant offset) and only depend on the UCT/search 
implementation. From the publications of the MoGo team it seems likely 
that the programs are very similar there.


Leela and MoGo are probably quite similar.
On the other hand, we know that some playouts scale differently from
others:

- the nakade patch on the playouts does not change anything for small
  numbers of simulations per move;
- it is very efficient for high numbers of simulations per move.

(By the way, this implies that optimizing parameters on small numbers of
simulations is perhaps not always a good idea.)


Is the patch in some way parameterized by the number of simulations?

--
GCP


Re: [computer-go] Scalability study request

2008-04-11 Thread Olivier Teytaud

(By the way, this implies that optimizing parameters on small numbers of
simulations is perhaps not always a good idea.)


Is the patch in some way parameterized by the number of simulations?


No. Perhaps it should be :-)


Re: [computer-go] Scalability study request

2008-04-11 Thread terry mcintyre

--- Olivier Teytaud [EMAIL PROTECTED] wrote:

  light-playout variant of leela, but perhaps the
  nakade-patch version of mogo and maybe even some third

 no problem for the nakade-patch version of mogo, but results
 are only known in 9x9, no idea for 13x13. Maybe it is better,
 maybe it is worse :-)

Good to find out, no?




Re: [computer-go] Scalability study request

2008-04-11 Thread Olivier Teytaud

light-playout variant of leela, but perhaps the
nakade-patch version of mogo and maybe even some third


no problem for the nakade-patch version of mogo, but results
are only known in 9x9, no idea for 13x13. Maybe it is better,
maybe it is worse :-)


Re: [computer-go] Scalability study request

2008-04-11 Thread Olivier Teytaud

Good to find out, no?


We have validated that:
- it is good in 9x9;
- it is bad in 19x19 (except perhaps for very large numbers of
  simulations).

We did not have a look at 13x13.


Re: [computer-go] Scalability study request

2008-04-11 Thread Gian-Carlo Pascutto

Olivier Teytaud wrote:

light-playout variant of leela, but perhaps the
nakade-patch version of mogo and maybe even some third


no problem for the nakade-patch version of mogo, but results
are only known in 9x9, no idea for 13x13. Maybe it is better,
maybe it is worse :-)


At 9x9 you already see diminishing returns in the previous study,
perhaps because you search so deep without the UCT exploration term.


So in this case I'm not surprised that improving the playouts changes 
the steepness.


It's not so clear to me what happens in the region where the search
itself is still scaling well (as is the case in the 13x13 study range).


My playouts should be quite different from yours, so the parallel lines 
in the study were very surprising to me.


--
GCP


Re: [computer-go] Scalability study request

2008-04-11 Thread Don Dailey
If we can get some consensus on what to test, we can do more. Or we can
add one program version to the current study.

Any ideas? (Or we could do a 19x19 study.)

- Don


terry mcintyre wrote:
 If we start up another scalability test, I'd be
 delighted to offer up a few computer cores. 

 It would be really nice to have not only the
 light-playout variant of Leela, but perhaps also the
 nakade-patch version of MoGo and maybe even some third
 program. ( if wishes were horses ... )




Re: [computer-go] Scalability study request

2008-04-11 Thread Don Dailey
Gian-Carlo,

We could probably add this new version to the mix and extend the study.
But what kind of data has your own testing produced? Do you have an
indication that it is roughly as strong at the same basic time setting
(because of its being about 3X faster)?

Even if it isn't, it would still be interesting to see if the line is
parallel. It might indicate, for instance, that some simplified
hardware implementation of play-outs could be competitive.

- Don


Gian-Carlo Pascutto wrote:
 Hi all,

 the results of the scalability study at

 http://cgos.boardspace.net/study/13/index.html

 look a lot like two parallel lines over the entire range, which I find
 very surprising, since I'd have expected at least some differences
 caused by the different playout strategies.

 I am now wondering if scalability could be unaffected by playouts
 (just adding a constant offset) and only depend on the UCT/search
 implementation. From the publications of the MoGo team it seems likely
 that the programs are very similar there.

 I am able to provide a Leela version that is identical to the one
 currently used in the study, but with light (uniform random) playouts
 (and being about 3 times as fast).

 I think it would be very interesting to see how this behaves. IMHO it
 would provide a deeper understanding of how the combination of MC/UCT
 works. The critical question is whether we would get another parallel
 line, indicating that the search is the key to further progress, or a
 line with a different steepness, indicating that the playouts are the
 key. Or maybe the two are more strongly linked and both contribute.
 Either way, I believe more data would be useful.
