[computer-go] Negative result on using MC as a predictor

2009-06-05 Thread Brian Sheppard
In a paper published a while ago, Remi Coulom showed that 64 MC trials
(i.e., just random, no tree) was a useful predictor of move quality.

In particular, Remi counted how often each point ended up in possession
of the side to move. He then measured the probability of being the best
move as a function of the frequency of possession. Remi found that if
the possession frequency was around 1/3 then the move was most likely
to be best, with decreasing probabilities elsewhere.
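The ownership statistic Remi used is just a frequency count over the final boards of the playouts. A minimal sketch in Python (the point names and toy data here are placeholders, not anything from the paper):

```python
import random

def ownership_frequency(playout_boards, side_to_move):
    """Fraction of playouts in which each point ends up owned by side_to_move.

    playout_boards: list of dicts mapping point -> final owner ('B' or 'W').
    """
    counts = {}
    for board in playout_boards:
        for point, owner in board.items():
            counts[point] = counts.get(point, 0) + (owner == side_to_move)
    n = len(playout_boards)
    return {point: c / n for point, c in counts.items()}

# Toy data standing in for 64 random playouts on a tiny "board".
random.seed(0)
playouts = [{p: random.choice('BW') for p in ('A1', 'B1', 'C1')}
            for _ in range(64)]
freq = ownership_frequency(playouts, 'B')
```

Remi's observation is then that moves whose frequency lands near 1/3 are the likeliest candidates, so the frequency can be mapped to a prior that peaks there.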

I have been trying to extract more information from each trial, since
it seems to me that we are discarding useful information when we use
only the result of a trial. So I tried to implement Remi's idea in a UCT
program.

This is very different from Remi's situation, in which the MC trials are
done before the predictor is used in a tree search. Here, we will have
a tree search going on concurrently with collecting data about point
ownership.

My implementation used the first N trials of each UCT node to collect
point ownership information. After the first M trials, it would use that
information to bias the RAVE statistics. That is, in the selectBest
routine I had an expression like this:

   for all moves {
      // Get the observed RAVE values:
      nRAVE = RAVETrials[move];
      wRAVE = RAVEWins[move];

      // Dynamically adjust according to point ownership:
      if (trialCount < M) {
           ; // Do nothing.
      }
      else if (Ownership[move] < 0.125) {
           nRAVE += ownershipTrialsParams[0];
           wRAVE += ownershipWinsParams[0];
      }
      else if (Ownership[move] < 0.250) {
           nRAVE += ownershipTrialsParams[1];
           wRAVE += ownershipWinsParams[1];
      }
      else if (Ownership[move] < 0.375) {
           nRAVE += ownershipTrialsParams[2];
           wRAVE += ownershipWinsParams[2];
      }
      else if (Ownership[move] < 0.500) {
           nRAVE += ownershipTrialsParams[3];
           wRAVE += ownershipWinsParams[3];
      }
      else if (Ownership[move] < 0.625) {
           nRAVE += ownershipTrialsParams[4];
           wRAVE += ownershipWinsParams[4];
      }
      else if (Ownership[move] < 0.750) {
           nRAVE += ownershipTrialsParams[5];
           wRAVE += ownershipWinsParams[5];
      }
      else if (Ownership[move] < 0.875) {
           nRAVE += ownershipTrialsParams[6];
           wRAVE += ownershipWinsParams[6];
      }
      else {
           nRAVE += ownershipTrialsParams[7];
           wRAVE += ownershipWinsParams[7];
      }

      // Now use nRAVE and wRAVE to order the moves for expansion
   }
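The eight-way ladder is a uniform bucketing of ownership into 1/8-wide bins, each adding virtual trials and wins to the RAVE statistics. The same lookup can be sketched compactly in Python (the parameter values here are placeholders, not Pebbles's actual settings):

```python
def rave_with_ownership_bias(rave_trials, rave_wins, ownership, trial_count,
                             m, trials_params, wins_params):
    """Add virtual RAVE trials/wins chosen by which 1/8-wide ownership
    bucket the move falls into (mirrors the if/else ladder above)."""
    n_rave, w_rave = rave_trials, rave_wins
    if trial_count >= m:
        bucket = min(int(ownership * 8), 7)  # [0, 0.125) -> 0, ..., [0.875, 1] -> 7
        n_rave += trials_params[bucket]
        w_rave += wins_params[bucket]
    return n_rave, w_rave

# Placeholder parameters that favor moves with ownership near 1/3 (bucket 2).
trials = [8] * 8
wins = [1, 3, 6, 4, 3, 2, 1, 1]
n, w = rave_with_ownership_bias(40, 22, 0.30, 100, 50, trials, wins)
assert (n, w) == (48, 28)
```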

The bottom line is that the result was negative. In the test period, Pebbles
won 69% (724 out of 1039) of CGOS games when not using this feature and
less than 59% when using this feature. I tried a few parameter settings.
Far from exhaustive, but mostly in line with Remi's paper.
The best parameter settings showed 59% (110 out of 184, which is 2.4
standard deviations lower). But maybe you can learn from my mistakes
and figure out how to make it work.
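The size of the drop can be put on a standard-deviation scale with the normal approximation to the binomial. This two-sample sketch is one plausible reading of the 2.4-sigma figure, not necessarily the exact computation behind it:

```python
import math

def z_drop(wins_on, n_on, wins_off, n_off):
    """Standard score of the drop in win rate from the feature-off sample
    to the feature-on sample (two-sample normal approximation)."""
    p_off = wins_off / n_off
    p_on = wins_on / n_on
    se = math.sqrt(p_off * (1 - p_off) / n_off + p_on * (1 - p_on) / n_on)
    return (p_off - p_on) / se

z = z_drop(110, 184, 724, 1039)
# Lands in the same ballpark as the reported 2.4 standard deviations;
# the exact value depends on which variance estimate is used.
```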

I have no idea why this implementation doesn't work. Maybe RAVE does a
good job already of determining where to play, so ownership information
is redundant. Maybe different parameter settings would work. Maybe just
overhead (but I doubt that; the overhead wouldn't account for such a
significant drop).

Anyway, if you try something like this, please let me know how it works out.
Or if you have other ideas about how to extract more information from
trials.

Best,
Brian

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Negative result on using MC as a predictor

2009-06-05 Thread Magnus Persson

Hi Brian!

In my tests with Valkyria I have something like a 4-5% improvement in  
winrate against GnuGo using ownership. But I think you need to be much  
more careful in how you test these things.


Testing on CGOS is a no-no for me, because the opposition changes from  
hour to hour, so unless there are large effects in playing strength it  
is very hard to detect them on CGOS. I am currently testing against  
GnuGo or Fuego or Valkyria itself using twogtp.jar from GoGui. The  
nice thing with testing MC programs is that one can set the number of  
playouts low and play a lot of games.


You might also consider simplifying your code. Just doing something  
simple like this:


CombinedWinrate = AMAFWinRate + k*OwnerShip.

I would then vary k over 0, 0.001, 0.01, 0.1, and 1.

Here I am expecting really bad performance for k=1, but I always try  
to include some extreme values so that I know for sure that there is  
no bug and the results make sense.
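The blend is easy to sketch; the numbers below are illustrative, not Valkyria's:

```python
def combined_winrate(amaf_winrate, ownership, k):
    """Linear blend of the AMAF (RAVE) win rate and point ownership."""
    return amaf_winrate + k * ownership

# Sweep k over the suggested values; k=0 must reduce to the plain AMAF
# score, which is exactly the kind of sanity check described above.
for k in (0, 0.001, 0.01, 0.1, 1):
    print(k, combined_winrate(0.52, 0.33, k))

assert combined_winrate(0.52, 0.33, 0) == 0.52
```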


I will run 500-2500 games per parameter because often the effects are  
really small and need tons of data to be detected. I learned the hard  
way that it is too tempting to draw quick conclusions from insufficient  
data.


For every test I also need to think hard about how fast the programs  
should play. Fast play gives more data, but may not generalize to slow  
play on CGOS for example.


When you know how this works, you can start experimenting with more  
complex code and more parameters.


That is the philosophy I try to follow for my testing of Valkyria, and  
I hope some of it could be helpful.



Re: [computer-go] Negative result on using MC as a predictor

2009-06-05 Thread Peter Drake

On Jun 5, 2009, at 6:39 AM, Brian Sheppard wrote:


In a paper published a while ago, Remi Coulom showed that 64 MC trials
(i.e., just random, no tree) was a useful predictor of move quality.

In particular, Remi counted how often each point ended up in possession
of the side to move.


Which paper was this?

Peter Drake
http://www.lclark.edu/~drake/



Re: [computer-go] Negative result on using MC as a predictor

2009-06-05 Thread Don Dailey
When I complete the new server, I hope that it will be easier to collect
larger samples of games.   I think this will help the situation a little.

There will be multiple time controls, but they will be in sync, so that your
program can always play in a shorter time control game without missing a
game at the longer time control. The idea is to keep your bot busy while
waiting for future rounds. You play in the longest time control, but when
you are finished you can play fast games while waiting. I will have 2 or 3
levels of this; I haven't decided yet. If I have 3 levels, the slowest
time control will probably need to be a little slower than CGOS uses now.

I will also have a test mode for new bots.  The server itself will play test
games with your bot while you debug it.

I haven't decided if each time control should be rated separately,  but I'm
leaning in favor of doing it this way.

- Don



Re: [computer-go] Negative result on using MC as a predictor

2009-06-05 Thread dhillismail

I took a look at this once, testing how well ownership maps predicted the moves 
chosen in a large set of pro games. Ownership maps have some tricky artifacts, 
especially for forced moves. 

Consider a position, with white to move, where black's previous move put a 
white group in atari, and white has one rescuing move. If you run a bunch of 
heavy playouts, the rescuing move will (almost) always be played and white will 
have a very strong ownership percentage for that space... seemingly indicating 
low urgency for white to move there. A map from light playouts might seem to 
show that the move would be hopeless. What happens for an internal node depends 
on the details of progressive widening, etc.

If you generate an ownership map using light playouts and compare it with one 
using heavy playouts, the heavy one will usually show spaces as being more 
strongly owned by one color or the other. The light maps will look fuzzier; the 
heavy maps will look crisper. Maps for internal nodes, using UCT, are crisper 
still. You need to pick your threshold with care.
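The fuzzy-versus-crisp distinction can be put on a number. One simple statistic (my own illustration, not from the post) is the mean distance of ownership frequencies from 0.5:

```python
def crispness(ownership_map):
    """Mean distance of ownership frequencies from 0.5: fuzzy maps score
    near 0, crisply-owned maps near 0.5. (Illustrative metric only.)"""
    values = list(ownership_map.values())
    return sum(abs(v - 0.5) for v in values) / len(values)

# Made-up maps of point -> ownership frequency for the same position.
light = {'A1': 0.55, 'B1': 0.48, 'C1': 0.60}   # light playouts: fuzzy
heavy = {'A1': 0.90, 'B1': 0.10, 'C1': 0.85}   # heavy playouts: crisp
assert crispness(heavy) > crispness(light)
```

A fixed threshold tuned on light-playout maps would misclassify most points on the crisper heavy-playout or in-tree maps, which is the thresholding hazard described above.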

- Dave Hillis


[computer-go] Negative result on using MC as a predictor

2009-06-05 Thread Brian Sheppard
Which paper was this?

Computing Elo Ratings of Move Patterns in the Game of Go
ICGA Journal, Vol. 30, No. 4. (December 2007), pp. 198-208.

http://remi.coulom.free.fr/Amsterdam2007/MMGoPatterns.pdf




Re: [computer-go] Negative result on using MC as a predictor

2009-06-05 Thread Mark Boon
I've also tried a variety of ways to use point-ownership in
combination with RAVE. By no means was it an exhaustive study, but I
failed to find an intuitive way to improve play this way.

I didn't try enough to be able to come to hard conclusions, but at the
very least it didn't turn out to be obvious.

Mark

