[computer-go] Call for participation: Next Computer Olympiad in Beijing

2008-05-13 Thread Chaslot G (MICC)
Dear Go programmers,

The 13th Computer Olympiad (CO) will be held from September 28 to
October 5 in Beijing, China.

This event will be held together with the Conference on Computers and
Games 2008 (CG 2008) (September 29 - October 1), and the 16th World
Computer-Chess Championship (WCCC) (September 28 - October 5) 

Location: the Beijing Golden Century Golf club, Fangshan, Beijing,
China.

For more information:
http://www.grappa.univ-lille3.fr/icga/event_info.php?id=18

I hope to meet you there,

Guillaume

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] 10k UCT bots

2008-05-13 Thread Jason House
I'm testing my bot on CGOS using pure UCT, no pondering, and 10,000  
playouts per move. Can someone put up a comparable bot?


A while back, someone else made a similar request, and I discovered  
that my bot had somehow broken. I've scoured for bugs and I believe I  
have a functional implementation.


There are two hb-739-UCT-10k-# bots because an external library leaks  
memory and it's the easiest way to make cgosGtp restart the bot  
periodically. My bot may still be a bit weaker than 10k playouts would  
imply since it's slow and may be cutting back from time restrictions.


Sent from my iPhone
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk


On May 13, 2008, at 7:25 AM, Jason House wrote:
I'm testing my bot on CGOS using pure UCT, no pondering, and 10,000  
playouts per move. Can someone put up a comparable bot?




I will re-start 'myCtest-10k-UCT' later today.

Christoph
 
___

computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué
On Tue, May 13, 2008 at 1:04 PM, Jason House
[EMAIL PROTECTED] wrote:
 [,,,]
  I have a list of empty points. I pick one at random and then scan until I
 find a legal one. Others reduce the list size (swap to end?) and repick.

What others do is the right thing to do. Your method will introduce
some biases. I think libego used to do it your way but was then fixed.

Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey



Christoph Birk wrote:


On May 13, 2008, at 10:04 AM, Jason House wrote:
On May 13, 2008, at 12:57 PM, Carter Cheng [EMAIL PROTECTED] 
wrote:


I have a list of empty points. I pick one at random and then scan 
until I find a legal one.


That's not random.
Yes, it's not random at all.   The points near the end of the list are 
much less likely to be chosen for instance.


I have a list of empty points and I choose from this list at random and 
temporarily put away any non-legal choices and try again until I get a 
legal move.


- Don




Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk


On May 13, 2008, at 10:00 AM, Jason House wrote:
On May 13, 2008, at 12:00 PM, David Fotland [EMAIL PROTECTED] 
games.com wrote:


When you say pure uct, what is the playout policy?  Pure random  
moves except

don't fill one point eyes?


That's exactly what I meant. I'd also assume other stuff like the  
UCB1 formula, no RAVE, and no initial move bias. I'm not opposed to  
other variants to see the effects, but I do want to ensure my  
implementation is correct.


'my-Ctest-10k-UCT' uses none of these, except guiding the search by UCT.

Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey
It's not clear how bad Jason's method is however.   The points near the 
end of the list are LESS likely to be chosen but probably not much less 
likely and this method is probably pretty fast.I wonder how bad it 
really is?   

The point after an illegal move is quite a bit more likely to be 
selected.   If the list had just 1 illegal point, then the point after 
it in the list is twice as likely to be selected as any other point.
Perhaps if you added a little noise after each move (perhaps swapping to 
points at random)  you could minimize any seriously bad effects?  


- Don


Don Dailey wrote:



Christoph Birk wrote:


On May 13, 2008, at 10:04 AM, Jason House wrote:
On May 13, 2008, at 12:57 PM, Carter Cheng [EMAIL PROTECTED] 
wrote:


I have a list of empty points. I pick one at random and then scan 
until I find a legal one.


That's not random.
Yes, it's not random at all.   The points near the end of the list are 
much less likely to be chosen for instance.


I have a list of empty points and I choose from this list at random 
and temporarily put away any non-legal choices and try again until I 
get a legal move.


- Don




Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon


On 13-mei-08, at 14:10, Álvaro Begué wrote:


What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same as  
Jason. I did consider the possibility of a bias but couldn't  
immediately think of one.


What good does moving it to the end of the list do? Next time around  
it's just as likely to be picked as when left in place. Or do you  
only process the 'end' after the 'front' moves are all tried?


Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué
On Tue, May 13, 2008 at 1:51 PM, Mark Boon [EMAIL PROTECTED] wrote:

 On 13-mei-08, at 14:10, Álvaro Begué wrote:


 What others do is the right thing to do. Your method will introduce

 some biases.
 Could you elaborate what bias it could lead to? I also do the same as Jason.
 I did consider the possibility of a bias but couldn't immediately think of
 one.

This has been explained already. After the first eye appears on the
board, the first empty point after the eye has a probability of being
picked that is twice the probability of any other empty point.

 What good does moving it to the end of the list do? Next time around it's
 just as likely to be picked as when left in place. Or do you only process
 the 'end' after the 'front' moves are all tried?

You move it to the end and you reduce the number of candidates by one.
The code is roughly this:

int pick(Move *empties, int num_empties) {
  int num_candidates = num_empties;
  int picked;

  do {
picked = rand()%num_candidates;
num_candidates--;
  } while(!acceptable(empties[picked]);

  return picked;
}


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey

Hi Mark,

Did you read my last email post?   Using Jason's method,  the point 
immediately AFTER an illegal point (perhaps an eye space) is TWICE as 
likely to be selected because you are scanning sequentially forward.   
Hitting on either point is going to lead to the same move selection.



Occasionally you could get 2 or 3 or more illegal points in a row, 
especially near the end of the game.  In such a case, you could be 3 or 
more times as likely to get the first point immediately after this 
sequence selected as your move choice. 



If your empty list is actually scrambled after every move, I think this 
is uniformly random except that it would be expensive.  



- Don


Mark Boon wrote:


On 13-mei-08, at 14:10, Álvaro Begué wrote:


What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same as 
Jason. I did consider the possibility of a bias but couldn't 
immediately think of one.


What good does moving it to the end of the list do? Next time around 
it's just as likely to be picked as when left in place. Or do you only 
process the 'end' after the 'front' moves are all tried?


Mark




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué
Ooops! I hit sent before I finished writing the pseudo code. Sorry.

int pick(Move *empties, int num_empties) {
 int num_candidates = num_empties;
 int picked;

 while(1) {
   picked = rand()%num_candidates;
   if(!acceptable(empties[picked])) {
 num_candidates--;
 swap(empties[picked],empties[num_candidates]);
   }
   else
 break;
 }
 return picked;
}



On Tue, May 13, 2008 at 1:57 PM, Álvaro Begué [EMAIL PROTECTED] wrote:
 On Tue, May 13, 2008 at 1:51 PM, Mark Boon [EMAIL PROTECTED] wrote:
  
   On 13-mei-08, at 14:10, Álvaro Begué wrote:
  
  
   What others do is the right thing to do. Your method will introduce
  
   some biases.
   Could you elaborate what bias it could lead to? I also do the same as 
 Jason.
   I did consider the possibility of a bias but couldn't immediately think of
   one.

  This has been explained already. After the first eye appears on the
  board, the first empty point after the eye has a probability of being
  picked that is twice the probability of any other empty point.


   What good does moving it to the end of the list do? Next time around it's
   just as likely to be picked as when left in place. Or do you only process
   the 'end' after the 'front' moves are all tried?

  You move it to the end and you reduce the number of candidates by one.
  The code is roughly this:

  int pick(Move *empties, int num_empties) {
   int num_candidates = num_empties;
   int picked;

   do {
 picked = rand()%num_candidates;
 num_candidates--;
   } while(!acceptable(empties[picked]);

   return picked;
  }


  Álvaro.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon


On 13-mei-08, at 14:15, Don Dailey wrote:

Yes, it's not random at all.   The points near the end of the list  
are much less likely to be chosen for instance.


OK, I'm not very good at statistics, but I don't see how the last  
points are much less likely to be picked. At best they are a  
little less likely to be picked but I actually don't see that  
either. I would like to see some logical argument supporting that claim.


Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Jason House

On May 13, 2008, at 1:51 PM, Mark Boon [EMAIL PROTECTED] wrote:



On 13-mei-08, at 14:10, Álvaro Begué wrote:

What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same  
as Jason. I did consider the possibility of a bias but couldn't  
immediately think of one.


And I was thinking let's not repeat this topic...

The probability of picking a point is proportional to 1 + number of  
illegal points before it.


In practice, illegal moves are pretty rare until endgame. At that  
point, it's a trade off between bias and speed. Random number  
generation is not cheap. I have yet to see empirical proof that the  
pick and scan is bad. I've tried both methods in the past and saw no  
measurable difference in strength.



What good does moving it to the end of the list do? Next time around  
it's just as likely to be picked as when left in place. Or do you  
only process the 'end' after the 'front' moves are all tried?


The range of the random number is reduced by one after each failed  
lookup. Shuffled data has no impact on future use of the array of  
empty points.






Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon


On 13-mei-08, at 15:08, Jason House wrote:

The range of the random number is reduced by one after each failed  
lookup. Shuffled data has no impact on future use of the array of  
empty points.




OK, I understand now why a point at the end (or beginning) is a  
little less likely to be picked. Although I still have doubts whether  
that will lead to a noticable bias, I'll try to think about it.


I would imagine moving an illegal point towards the end and only  
start including it when the other 'legal' moves run out can lead to  
terrible bias however because they may not remain illegal for very  
long and actually become important points to play. A ko-point is  
probably the most extreme example of that.


Anyway, I don't bother to order the empty-point-list or scramble them  
in any way prior to the game. So the first point is the 1-1 point and  
the last is the 19-19 point (or whatever boardsize you're playing) so  
I have no qualms about those moves being a little less likely to be  
played. Or even a lot less. I think it would actually be beneficial.


If this asymmetry really bothers you, you could very easily fix this  
by wrapping the search around. There's no asymmetry in a circle.


Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey



Mark Boon wrote:


On 13-mei-08, at 15:08, Jason House wrote:

The range of the random number is reduced by one after each failed 
lookup. Shuffled data has no impact on future use of the array of 
empty points.




OK, I understand now why a point at the end (or beginning) is a little 
less likely to be picked. Although I still have doubts whether that 
will lead to a noticable bias, I'll try to think about it.


I don't believe there is any bias based on where a good point is, only 
where the bad points are.   Because you can think of this list of empty 
points as a circular array,  where no particular location is any more 
significant than any other.




I would imagine moving an illegal point towards the end and only start 
including it when the other 'legal' moves run out can lead to terrible 
bias however because they may not remain illegal for very long and 
actually become important points to play. A ko-point is probably the 
most extreme example of that.
I had a bug in my program that may have been an actual improvement.   
I did not always resurrect empty points that were eyes.   I went ahead 
and fixed this and didn't try to assess whether it was a good bug or 
not but I didn't notice any improvement.  The net effect was that if a 
move was an eye move that you shouldn't move to,   and the opponent 
moved to a diagonal,  the program still would not move to it.   That's 
probably sound in some cases and not in others.




Anyway, I don't bother to order the empty-point-list or scramble them 
in any way prior to the game. So the first point is the 1-1 point and 
the last is the 19-19 point (or whatever boardsize you're playing) so 
I have no qualms about those moves being a little less likely to be 
played. Or even a lot less. I think it would actually be beneficial.


If this asymmetry really bothers you, you could very easily fix this 
by wrapping the search around. There's no asymmetry in a circle.


Mark




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey



Don Dailey wrote:



Jason House wrote:

On May 13, 2008, at 1:51 PM, Mark Boon [EMAIL PROTECTED] wrote:



On 13-mei-08, at 14:10, Álvaro Begué wrote:

What others do is the right thing to do. Your method will introduce
some biases.


Could you elaborate what bias it could lead to? I also do the same 
as Jason. I did consider the possibility of a bias but couldn't 
immediately think of one.


And I was thinking let's not repeat this topic...

The probability of picking a point is proportional to 1 + number of 
illegal points before it.


In practice, illegal moves are pretty rare until endgame. At that 
point, it's a trade off between bias and speed. Random number 
generation is not cheap. I have yet to see empirical proof that the 
pick and scan is bad. I've tried both methods in the past and saw no 
measurable difference in strength.
I may perhaps take a look at this myself.I think there is a way to 
deal with this issue and get the best of both worlds or at least 
approach it.   If this method does not degrade the quality, it's a 
moot point.   Otherwise, here is an idea:


  1. For each move list size, pre-compute N tables of move list 
traversal orderings.

  2. At move selection time alternative between them.

So you would have something in the C language like: 



int *order[S][C]


I forget to mention that S is the move list size (number of empty points 
you are interested in) and C is the number of list orderings you want to 
alternate between.   And a list is returns that is of size S-1 (since 
your first move is selected by calling the random number generater.)   A 
number like 8 or 16 is good (a power of two) because each time you using 
the list to select a move, you can increment a counter so that you use a 
different move list ordering the next time you are confronted with a 
move list of this same exact size.



Your first move is selected randomly, then from then on you use the 
array returned to form an offset from this.
This would pay off if you really wanted something very close to 
uniform random and your random number generator was clearly expensive.
I think there is a great deal of chaos that would make this almost 
perfectly uniform.   The first move would be selected as you already 
do, using the random number generator but after this you traverse the 
list in random order as specified by this pre-computed table. Even 
though the bias is still there for any given move,  it's basically 
non-predictable.   On the very next move you will using a different 
table which you choose from in round robin fashion. 
Don't know how flawed this idea is.   I believe it is would come close 
to uniformly random but wouldn't be perfect.  There would be a minor 
bias when measured over a lot of games,  it would probably occur that 
a certain move was being chosen slightly more often,  but nothing like 
the bias in the current method. 
Is it worth it?   Probably not,  but just a crazy idea.   Likely there 
are simpler ways to eliminate most of the bias while still minimizing 
the number of random move generations.

- Don
 







What good does moving it to the end of the list do? Next time around 
it's just as likely to be picked as when left in place. Or do you 
only process the 'end' after the 'front' moves are all tried?


The range of the random number is reduced by one after each failed 
lookup. Shuffled data has no impact on future use of the array of 
empty points.






Mark

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey





If this asymmetry really bothers you, you could very easily fix this by
wrapping the search around. There's no asymmetry in a circle.


That doesn't fix anything.


Why not? The whole argument is about a bias against points towards the 
end. In a circular list there is no 'end'.


I missed this from you.   I assumed that you did this anyway.   If you 
choose a random point and then traverse linearly to the end,  what do 
you do when you reach the end? Do you just pass?I assumed you 
viewed the empty point list as a circular queue.


- Don




Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey




I don't care much about it being noticeable. This thread is about
putting bots on CGOS that use a reproducible algorithm, to help people
detect bugs in their implementations. As part of specifying what these
bots do, we should all pick the next move in a playout using the same
criteria. If we agree to use uniform distribution among empty
non-eyeish points, that's what should be implemented.

I agree with this.   If we get too clever we lose confidence that we 
are comparing apples to apples by risking bugs and making assumptions 
that may not hold.



I take an opposite approach: the data will tell us when we are 
comparing apples and oranges. It will help us identify the major 
factors leading to strength. There's probably more subtle variation 
than we realize...


But when someone is explicitly asking for a direct comparison,   you 
don't want to just throw in your own magic and then try to sort it out 
later. You are probably looking for implementation bugs, not 
implementation differences.   That has it's place but not in this context.


- Don








- Don





I would imagine moving an illegal point towards the end and only start
including it when the other 'legal' moves run out can lead to 
terrible bias

however because they may not remain illegal for very long and actually
become important points to play. A ko-point is probably the most 
extreme

example of that.



I don't think you understood the algorithm. The eyeish point is
removed from the lottery only for picking this particular move, not
for the rest of the playout.


Anyway, I don't bother to order the empty-point-list or scramble 
them in any
way prior to the game. So the first point is the 1-1 point and the 
last is
the 19-19 point (or whatever boardsize you're playing) so I have no 
qualms
about those moves being a little less likely to be played. Or even 
a lot

less. I think it would actually be beneficial.



Reproducibility was the point, not strength of the bot.


If this asymmetry really bothers you, you could very easily fix 
this by

wrapping the search around. There's no asymmetry in a circle.



That doesn't fix anything.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Álvaro Begué
On Tue, May 13, 2008 at 3:08 PM, Mark Boon [EMAIL PROTECTED] wrote:

  On 13-mei-08, at 15:44, Álvaro Begué wrote:


  On Tue, May 13, 2008 at 2:28 PM, Mark Boon [EMAIL PROTECTED]
 wrote:
 
  
   On 13-mei-08, at 15:08, Jason House wrote:
  
   The range of the random number is reduced by one after each failed
 lookup.
   Shuffled data has no impact on future use of the array of empty points.
  
   OK, I understand now why a point at the end (or beginning) is a little
 less
   likely to be picked. Although I still have doubts whether that will lead
 to
   a noticable bias, I'll try to think about it.
  
 
  I don't care much about it being noticeable. This thread is about
  putting bots on CGOS that use a reproducible algorithm, to help people
  detect bugs in their implementations. As part of specifying what these
  bots do, we should all pick the next move in a playout using the same
  criteria. If we agree to use uniform distribution among empty
  non-eyeish points, that's what should be implemented.
 
 

  Indeed it seems I misunderstood. I thought the general algorithm needed to
 be the same, not necessarily the exact implementation. Why not publish some
 pseudo-code with the exact statements then? Seems a bit less prone to errors
 then by talking about it.

No, we don't need to all use the same implementations. You can
implement uniformly picking a random non-eyeish point any way you
like. I actually don't use the method we've talked about. The problem
with the other method is that it implements something else.

   I would imagine moving an illegal point towards the end and only start
   including it when the other 'legal' moves run out can lead to terrible
 bias
   however because they may not remain illegal for very long and actually
   become important points to play. A ko-point is probably the most extreme
   example of that.
  
 
  I don't think you understood the algorithm. The eyeish point is
  removed from the lottery only for picking this particular move, not
  for the rest of the playout.
 

  Yes, again I misunderstood. So you do another random lookup each time you
 hit an illegal point? That could turn out very expensive, especially by the
 end of the game.

You can try it. It's actually not that expensive. A long time ago I
modified libego to use this method and it didn't slow it down much
(sorry, I can't remember the fraction of speed lost).

   Anyway, I don't bother to order the empty-point-list or scramble them in
 any
   way prior to the game. So the first point is the 1-1 point and the last
 is
   the 19-19 point (or whatever boardsize you're playing) so I have no
 qualms
   about those moves being a little less likely to be played. Or even a lot
   less. I think it would actually be beneficial.
  
 
  Reproducibility was the point, not strength of the bot.
 

  I'd say, share the source if that's the objective. I doubt you'll get real
 reproducibility any other way.

Why not? The only things we haven't clearly specified are the UCT
formula to use and how we end playouts (although there is only one
natural choice for this last part, I believe).

   If this asymmetry really bothers you, you could very easily fix this by
   wrapping the search around. There's no asymmetry in a circle.
  
 
  That doesn't fix anything.
 

  Why not? The whole argument is about a bias against points towards the end.
 In a circular list there is no 'end'.

No, the whole argument is about bias towards points that happen to be
immediately after eyeish or illegal points.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Christoph Birk

On Tue, 13 May 2008, Mark Boon wrote:

If this asymmetry really bothers you, you could very easily fix this by
wrapping the search around. There's no asymmetry in a circle.


That doesn't fix anything.


Why not? The whole argument is about a bias against points towards the end. 
In a circular list there is no 'end'.


No, it was a bias towards moves behind illegal moves.
Those moves are twice as likely to be played than other moves. 
Consider a list with 5 moves:


[Move1] [Move2] [Move3] [Move4] [Move5]

You create a random number between 1 and 5. If Move2 is illegeal
for example, then you will play
 Move1 if random#=1
 Move3 if random#=2 or 3,
 Move4   =4
 Move5   =5

Move3 is twice as likely to be played. Even if you make a circular
list.

Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Mark Boon


On 13-mei-08, at 16:17, Don Dailey wrote:

I missed this from you.   I assumed that you did this anyway.   If  
you choose a random point and then traverse linearly to the end,   
what do you do when you reach the end? Do you just pass?I  
assumed you viewed the empty point list as a circular queue.


No I didn't. When I  reach the end I traverse in the opposite  
direction, starting from the originally chosen location. Using a  
circular list may actually be simpler.


Mark


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey



Mark Boon wrote:


On 13-mei-08, at 16:17, Don Dailey wrote:

I missed this from you.   I assumed that you did this anyway.   If 
you choose a random point and then traverse linearly to the end,  
what do you do when you reach the end? Do you just pass?I 
assumed you viewed the empty point list as a circular queue.


No I didn't. When I  reach the end I traverse in the opposite 
direction, starting from the originally chosen location. Using a 
circular list may actually be simpler.
It's probably simpler.   You have to choose between using branching or 
division:


   nextLoc = curLoc % LIST_SIZE;

or

  if (nextLoc  MAX_ELEMENT) nextLoc = firstLoc;

- Don




Mark





___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread dhillismail
For those currently coding this up, I think the most important thing about this 
playout algorithm is that it is *temporary*. You will almost certainly 
be?replacing it with something different and better just a little bit down the 
road.

Creating an MC-UCT bot has a well worn path and its kind of an ontology 
recapitulates phylogeny deal. First you implement light playouts (the random 
non-eyefilling moves people are describing), then you implement UCT, then you 
throw away the light playouts and replace them?with heavy playouts, then you 
start extreme modifications to UCT...

So you probably don't want to worry about hair-splitting tweaks except as an 
academic exercise.

- Dave Hillis


-Original Message-
From: Christoph Birk [EMAIL PROTECTED]
To: computer-go computer-go@computer-go.org
Sent: Tue, 13 May 2008 3:40 pm
Subject: Re: [computer-go] 10k UCT bots


On Tue, 13 May 2008, Mark Boon wrote:?
 If this asymmetry really bothers you, you could very easily fix this by?
 wrapping the search around. There's no asymmetry in a circle.?
  That doesn't fix anything.?
?
 Why not? The whole argument is about a bias against points towards the end.  
 In a circular list there is no 'end'.?
?
No, it was a bias towards moves behind illegal moves.?
Those moves are twice as likely to be played than other moves. Consider a list 
with 5 moves:?
?
[Move1] [Move2] [Move3] [Move4] [Move5]?
?
You create a random number between 1 and 5. If Move2 is illegeal?
for example, then you will play?
?Move1 if random#=1?
?Move3 if random#=2 or 3,?
?Move4 =4?
?Move5 =5?
?
Move3 is twice as likely to be played. Even if you make a circular?
list.?
?
Christoph?
?
___?
computer-go mailing list?
[EMAIL PROTECTED]
http://www.computer-go.org/mailman/listinfo/computer-go/?

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

[computer-go] a few more questions

2008-05-13 Thread Carter Cheng
Thanks for all the comments so far. Hopefully you don't mind a few more 
questions. 

1) Do UCT bots check for atari and urgency? my understanding was that first 
generation Mogo did this to some extent IIRC. I am curious if anyone does this 
it seems like it might be important but so far I cannot think of a fast 
detection scheme. 

2) When generating random variables for the case where the values of placing 
a stone on different points on the board are different. Are there good ways to 
throw and determine which point has been selected for the next move by the 
random player? 

Thanks again,

Carter.

  


  
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Don Dailey



[EMAIL PROTECTED] wrote:

For those currently coding this up, I think the most important thing about this 
playout algorithm is that it is *temporary*. You will almost certainly 
be?replacing it with something different and better just a little bit down the 
road.

Creating an MC-UCT bot has a well worn path and its kind of an ontology 
recapitulates phylogeny deal. First you implement light playouts (the random 
non-eyefilling moves people are describing), then you implement UCT, then you 
throw away the light playouts and replace them?with heavy playouts, then you 
start extreme modifications to UCT...

So you probably don't want to worry about hair-splitting tweaks except as an 
academic exercise.
  
Yes,  I agree.   Also  my hair brained scheme of pre-generated tables of 
list traversal orderings was just an academic exercise as you say.  


- Don




- Dave Hillis


-Original Message-
From: Christoph Birk [EMAIL PROTECTED]
To: computer-go computer-go@computer-go.org
Sent: Tue, 13 May 2008 3:40 pm
Subject: Re: [computer-go] 10k UCT bots


On Tue, 13 May 2008, Mark Boon wrote:?
  

If this asymmetry really bothers you, you could very easily fix this by?
wrapping the search around. There's no asymmetry in a circle.?


That doesn't fix anything.?
  

?
Why not? The whole argument is about a bias against points towards the end.  
In a circular list there is no 'end'.?


?
No, it was a bias towards moves behind illegal moves.?
Those moves are twice as likely to be played than other moves. Consider a list 
with 5 moves:?
?
[Move1] [Move2] [Move3] [Move4] [Move5]?
?
You create a random number between 1 and 5. If Move2 is illegeal?
for example, then you will play?
?Move1 if random#=1?
?Move3 if random#=2 or 3,?
?Move4 =4?
?Move5 =5?
?
Move3 is twice as likely to be played. Even if you make a circular?
list.?
?
Christoph?
?
___?
computer-go mailing list?
[EMAIL PROTECTED]
http://www.computer-go.org/mailman/listinfo/computer-go/?


  



___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Heikki Levanto
On Tue, May 13, 2008 at 01:34:37PM -0400, Don Dailey wrote:
 The point after an illegal move is quite a bit more likely to be 
 selected.   If the list had just 1 illegal point, then the point after 
 it in the list is twice as likely to be selected as any other point.
 Perhaps if you added a little noise after each move (perhaps swapping to 
 points at random)  you could minimize any seriously bad effects?  

I think the problem becomes more pronounced near the end of the simulation,
when most of the boars is filled with stones or 1-point eyes. If, at that
time, there are two legal moves left, one near the beginning of the list, and
one near the end, then the second one can end up as much more likely to be
chosen. And if the first one is the key point to the position, it may never
even be noticed.

No idea how likely this scenario is in real games, but at least it is
possible. And if your search tree is large enough, once-in-a-thousand
situations occur all the time...

-H

-- 
Heikki Levanto   In Murphy We Turst heikki (at) lsd (dot) dk

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] a few more questions

2008-05-13 Thread Gunnar Farnebäck

Álvaro Begué wrote:
 On Tue, May 13, 2008 at 4:22 PM, Carter Cheng 
[EMAIL PROTECTED] wrote:

 2) When generating random variables for the case where the values
of placing a stone on different points on the board are
different. Are there good ways to throw and determine which
point has been selected for the next move by the random player?

 I can answer this. The best way I know of getting a random point with
 different weights (proportional to probabilities) for each point is to
 keep a sum of weights per row and a total sum of weights for the
 entire board. These are easy to update dynamically when an individual
 weight changes. In order to pick a point, start with a random number
 between 0 and the total sum. Go down row by row subtracting the weight
 of each row, until you would get a negative number. Then walk the row
 subtracting the weight of each point until you would get a negative
 number. The place where you stop is the selected point.

 This method may not be very robust if you use floating-point numbers
 to represent weights (because you can't rely on associativity of
 addition), but it works fine if you use an integer type.

 If the description is not good enough, I can write some C++ code for you.

Or you can read the source code of GNU Go 3.7.12, lines 1629-1658 of
engine/montecarlo.c.

The only difference is that GNU Go doesn't bother about actual rows
but instead makes a two level structure where it uses the N lowest
bits of the board coordinate to determine a partition number. N varies
from 1 to 4 depending on the maximum board size GNU Go is compiled for
(3 for 9x9, 4 for 19x19).

And I agree, don't even think of doing this with floating point
numbers.

/Gunnar
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] a few more questions

2008-05-13 Thread Zach Wegner
This could be extended rather easily to an n-ary tree. With 9x9 a
natural choice is 3, but unfortunately 19 is prime.

It's basically a tradeoff between how many adds and how many compares
you want to do. I suppose you would do one update for every pick
(unless you pick an illegal point and want to remain uniformly random,
hehe). So it seems that it would be best to make n as small as
possible, at least theoretically...
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] a few more questions

2008-05-13 Thread Weston Markham
On Tue, May 13, 2008 at 7:08 PM, Gunnar Farnebäck [EMAIL PROTECTED] wrote:
  And I agree, don't even think of doing this with floating point
  numbers.

This is a bit tangential to computer go.  But you have piqued my curiosity

Offhand, that sounds rather extreme to me.  Perhaps I haven't really
thought this through completely, but it seems as if there is a simple
modification to the stated algorithm, that would ensure that it works
fine with floating point numbers.  Or at least well enough for any
conceivable purpose:

Make sure that you select each random number from a slightly larger
range than prescribed by the exact totals.  (For example, always round
up while calculating the totals.  If you do not have control over the
rounding mode, just add on some small fraction of the largest weight.)
 Then, in the event that subtracting the weights of the
points/rows/buckets never reduced the selected number to a
non-positive value, just select a new random number and start over.

Would this work fine, or is there some flaw in my thinking that Álvaro
and Gunnar have already considered?

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] a few more questions

2008-05-13 Thread Álvaro Begué
On Tue, May 13, 2008 at 8:10 PM, Weston Markham
[EMAIL PROTECTED] wrote:
 On Tue, May 13, 2008 at 7:08 PM, Gunnar Farnebäck [EMAIL PROTECTED] wrote:
And I agree, don't even think of doing this with floating point
numbers.

  This is a bit tangential to computer go.  But you have piqued my 
 curiosity

  Offhand, that sounds rather extreme to me.  Perhaps I haven't really
  thought this through completely, but it seems as if there is a simple
  modification to the stated algorithm, that would ensure that it works
  fine with floating point numbers.  Or at least well enough for any
  conceivable purpose:

  Make sure that you select each random number from a slightly larger
  range than prescribed by the exact totals.  (For example, always round
  up while calculating the totals.  If you do not have control over the
  rounding mode, just add on some small fraction of the largest weight.)
   Then, in the event that subtracting the weights of the
  points/rows/buckets never reduced the selected number to a
  non-positive value, just select a new random number and start over.

  Would this work fine, or is there some flaw in my thinking that Álvaro
  and Gunnar have already considered?

John Tromp and I spent quite a bit of time patching the floating-point
implementation and hunting down the subsequent bugs. I am not saying
that making it robust is impossible, but I ended up so frustrated that
I don't think I will ever try again. Integers are much better behaved
and are sufficient for everything we wanted to do.

As to the optimal branching factor for the tree (assuming there are
many many leaves), I believe it's n=3, if all you care about is
picking random points fast.

Outline of the proof:
  On a particular level of the tree you will make about n/2
comparisons before you pick your branch and descend. The number of
leaves is l = pow(n,k), so the number of levels is k = log(l)/log(n).
The total number of comparisons will be something like
(log(l)/2)*(n/log(n)). The first part is fixed, so we need to minimize
n/log(n). That function reaches its minimum at n=e, but unfortunately
n has to be an integer value. Among the integers, n=3 is the best,
with n=2 and n=4 tied at as close seconds.

Of course, in practice you will also need to do updates to the weights
(actually more often than picking random points), so the number of
levels in the tree becomes more relevant, which favors larger values
of n. My original implementation had n=2 and the tree was beautifully
implemented as an array, in the same style as the usual implementation
of a heap. However, Rémi pointer out that cheaper updates were
important. I think the two-level solution using rows is easy to
implement and probably close to optimal.


Álvaro.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] 10k UCT bots

2008-05-13 Thread Hideki Kato

Álvaro Begué: [EMAIL PROTECTED]:
Ooops! I hit sent before I finished writing the pseudo code. Sorry.

int pick(Move *empties, int num_empties) {
 int num_candidates = num_empties;
 int picked;

 while(1) {
   picked = rand()%num_candidates;

This code introduces few bias unless num_candidates is a power of two,
though.
#Assuming rand() returns [0 .. power of two).

-Hideki

   if(!acceptable(empties[picked])) {
 num_candidates--;
 swap(empties[picked],empties[num_candidates]);
   }
   else
 break;
 }
 return picked;
}



On Tue, May 13, 2008 at 1:57 PM, Álvaro Begué [EMAIL PROTECTED] wrote:
 On Tue, May 13, 2008 at 1:51 PM, Mark Boon [EMAIL PROTECTED] wrote:
  
   On 13-mei-08, at 14:10, Álvaro Begué wrote:
  
  
   What others do is the right thing to do. Your method will introduce
  
   some biases.
   Could you elaborate what bias it could lead to? I also do the same as 
 Jason.
   I did consider the possibility of a bias but couldn't immediately think of
   one.

  This has been explained already. After the first eye appears on the
  board, the first empty point after the eye has a probability of being
  picked that is twice the probability of any other empty point.


   What good does moving it to the end of the list do? Next time around it's
   just as likely to be picked as when left in place. Or do you only process
   the 'end' after the 'front' moves are all tried?

  You move it to the end and you reduce the number of candidates by one.
  The code is roughly this:

  int pick(Move *empties, int num_empties) {
   int num_candidates = num_empties;
   int picked;

   do {
 picked = rand()%num_candidates;
 num_candidates--;
   } while(!acceptable(empties[picked]);

   return picked;
  }


  Álvaro.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
--
[EMAIL PROTECTED] (Kato)
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/