Re: [computer-go] Odd results on 19x19

2008-01-07 Thread Vlad Dumitrescu
On Jan 6, 2008 11:37 PM, Don Dailey [EMAIL PROTECTED] wrote:
> > I'm not sure I get the whole picture regarding multi-dimensional
> > ratings. How can you compare two players with a 2-dimensional rating?
> > You can't, so how would one use this rating? In my book, a rating's
> > goal is to make things comparable...
>
> A 2-dimensional or more rating would be used to predict the winner.
> You would be able to say that a given player will beat another specified
> player some percentage of the time. With more than one dimension
> perhaps the formula would be a better predictor since it could take
> playing styles into consideration.
>
> It could not be used to rank players in the sense of putting them on a
> scale such as CGOS uses.  Since there is an inherently intransitive
> relationship, you cannot rank players in strict order with more than 1
> dimension.

Thanks all who answered!

I see the point. The confusion in my mind was caused by the use of the
word "ranking", which for me does imply an ordering.

best regards,
Vlad
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] Odd results on 19x19

2008-01-06 Thread David Fotland
The styles of CS (CS-9-17-10k-1CPU), MFGO (mfgo12exp-15), and GNUGO
(gnugo3.7.10_10) are different, and it's generating some odd results.

Many Faces beats GnuGo 70%.  There are not many games, but this is
consistent with over 100 test games I've run.
CS beats GnuGo 55%.  Over 100 games played.
CS beats Many Faces 90%.  Only 20 games, but consistent with earlier
results.

If we look at results against GnuGo, Many Faces seems stronger than CS, but
in games against CS, Many Faces is much weaker.

Many Faces plays a fighting style, and CS plays a territorial style, but I'm
still surprised at the difference.

David
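
A rough sketch, not from the original thread, of why these three figures look odd on a single rating scale. It assumes the standard logistic Elo model (400 points per factor of ten in the odds); the function names are illustrative only. Under that assumption the two results against GnuGo predict that CS should win only about a third of its games against Many Faces, rather than the 90% reported.

```python
import math

def elo_diff(win_rate):
    """Elo difference implied by a win rate under the logistic Elo model."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))

def win_rate(diff):
    """Expected win rate for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

mf_vs_gnu = elo_diff(0.70)   # Many Faces relative to GnuGo
cs_vs_gnu = elo_diff(0.55)   # CS relative to GnuGo

# On a single scale, CS relative to Many Faces is the difference of the two.
cs_vs_mf = cs_vs_gnu - mf_vs_gnu
predicted = win_rate(cs_vs_mf)

print(f"MF - GNU: {mf_vs_gnu:+.0f} Elo")
print(f"CS - GNU: {cs_vs_gnu:+.0f} Elo")
print(f"predicted CS vs MF: {predicted:.0%} (reported: 90%)")
```

The gap between the predicted and reported figures is what a one-dimensional rating cannot absorb, whatever its cause.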



Re: [computer-go] Odd results on 19x19

2008-01-06 Thread steve uurtamo
did you optimize parameters in MFGO by playing against
gnugo?

that'd do it.

s.



Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Rémi Coulom


I noticed that too. My feeling is that this is because MF is a classical
program with a global search, GNU a classical program with no global 
search, and Crazy Stone a MC program. MF beats GNU thanks to global 
search. But MF's strength without the global search (whatever that would 
mean) is inferior to that of GNU. CS also has a global search, so MF's 
global-search advantage does not work against CS.


I guess that KCC Igo had the same problem as MF against Crazy Stone.

I thought about a model for multi-dimensional Elo ratings once (don't 
give only one value to each player, but two or three, with an 
appropriate formula for predicting game outcome). Maybe I'll try it on 
CGOS data when I have time. This would not rate players along a 
one-dimensional line. Here is a reference to a similar idea:


http://dx.doi.org/10.1016/j.jspi.2004.05.008


 Abstract

The Bradley–Terry model is widely and often beneficially used to rank 
objects from paired comparisons. The underlying assumption that makes 
ranking possible is the existence of a latent linear scale of merit or 
equivalently of a kind of transitiveness of the preference. However, in 
some situations such as sensory comparisons of products, this assumption 
can be unrealistic. In these contexts, although the Bradley–Terry model 
appears to be significantly interesting, the linear ranking does not 
make sense. Our aim is to propose a 2-dimensional extension of the 
Bradley–Terry model that accounts for interactions between the compared 
objects. From a methodological point of view, this proposition can be 
seen as a multidimensional scaling approach in the context of a logistic 
model for binomial data. Maximum likelihood is investigated and 
asymptotic properties are derived in order to construct confidence 
ellipses on the diagram of the 2-dimensional scores. It is shown by an 
illustrative example based on real sensory data on how to use the 
2-dimensional model to inspect the lack-of-fit of the Bradley–Terry model.


Rémi



Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Rémi Coulom

steve uurtamo wrote:
> did you optimize parameters in MFGO by playing against
> gnugo?
>
> that'd do it.
>
> s.


Well, I don't know about David, but I do _all_ my testing and optimizing 
against GNU.


Rémi


Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Don Dailey
My guess is that this is a combination of some intransitivity and low
sample size.  100 games isn't very much data for CS vs MFGO.

As far as intransitivity goes, perhaps Crazy Stone has some particular
strength that works very well against a weakness in MFGO.  The
values do not make a great deal of sense, but there are a lot of
unknown parameters too, such as which levels are being played by each
program.  Perhaps we are not comparing apples to apples?

- Don
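
A small sketch, not from the original thread, of how much uncertainty the sample sizes leave. It uses the Wilson score interval at roughly 95% confidence. The win counts are assumptions read off the reported percentages (18 of 20 for "90%, only 20 games", 55 of 100 for "55%, over 100 games"); the actual counts were not posted.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a binomial win rate (z=1.96 is about 95%)."""
    p = wins / games
    denom = 1.0 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# Assumed counts, inferred from the percentages quoted in the thread.
low_cs_mf, high_cs_mf = wilson_interval(18, 20)
low_cs_gnu, high_cs_gnu = wilson_interval(55, 100)

print(f"CS vs MF : {low_cs_mf:.2f} .. {high_cs_mf:.2f}")
print(f"CS vs GNU: {low_cs_gnu:.2f} .. {high_cs_gnu:.2f}")
```

Under these assumed counts the CS vs MF interval is wide but stays well above 50%, while the CS vs GNU interval still includes 50%.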




Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Don Dailey
Rémi,

The idea of a rating model with more than one dimension is interesting.
If you decide to pursue this, I can give you the CGOS data in a compact
format, 1 line per result.

I thought of this idea too, but I didn't try to produce a model.  It
would be easier to build and test such a model, however, if you
synthesized the players artificially.  You could purposely build a
rock/scissors/paper style into a dozen different players or more.
Randomly give them different strengths (using straightforward Elo
ratings) but also give them one of 3 playing styles (rock, scissors,
paper), so that their actual performance against a given opponent is
bumped up or down 100 Elo or so depending on whether they have
conflicting styles.  So a paper might still beat a scissors, but
it would be more difficult than their base Elo ratings would suggest.

Then you can play hundreds of thousands of simulated games in just
seconds and generate data and see if your model can predict the results
reliably. 
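
A minimal sketch of the synthetic experiment described above, under stated assumptions: the Elo range, the 100-point style bonus, the player count, and all names (`make_players`, `effective_gap`, `simulate`) are illustrative choices, not anything specified in the thread beyond "a dozen players" and "100 ELO or so".

```python
import random

STYLES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
STYLE_BONUS = 100.0  # assumed Elo shift for a favourable style matchup

def effective_gap(a, b):
    """Elo gap of player a over player b after the style adjustment."""
    gap = a["elo"] - b["elo"]
    if BEATS[a["style"]] == b["style"]:
        gap += STYLE_BONUS
    elif BEATS[b["style"]] == a["style"]:
        gap -= STYLE_BONUS
    return gap

def make_players(n, rng):
    """Random base strength (assumed range) plus a random style per player."""
    return [{"name": f"p{i}",
             "elo": rng.uniform(1500.0, 2100.0),
             "style": rng.choice(STYLES)} for i in range(n)]

def simulate(players, games, rng):
    """Return a list of (winner_name, loser_name), one entry per game."""
    results = []
    for _ in range(games):
        a, b = rng.sample(players, 2)
        p_a = 1.0 / (1.0 + 10.0 ** (-effective_gap(a, b) / 400.0))
        if rng.random() < p_a:
            results.append((a["name"], b["name"]))
        else:
            results.append((b["name"], a["name"]))
    return results

rng = random.Random(0)
players = make_players(12, rng)
results = simulate(players, 100_000, rng)
for winner, loser in results[:3]:
    print(winner, loser)  # one line per result
```

A candidate multi-dimensional model could then be fitted to `results` and judged by how well it recovers the hidden styles and predicts held-out games.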

Another approach I thought of is to take a very simple game (such as
tic-tac-toe) and create many players that play by simple rules but where
significant intransitivities might exist.  You would not want the rules
to be deterministic or the games would all play the same, but the rules
could be probabilistic.

It would be remarkable if you could capture strength characteristics
with just 2 or 3 numbers instead of one.   I would guess that 2 numbers
might be far more accurate than 1,  but with quickly diminishing returns
for additional parameters.  Of course it might require a huge amount
of data in order to zero in on a player's characteristics statistically.

- Don






Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Vlad Dumitrescu
On Jan 6, 2008 11:00 PM, Don Dailey [EMAIL PROTECTED] wrote:
> The idea of a non one dimension rating model is interesting.  If you
> decide to pursue this I can give you the CGOS data in a compact format,
> 1 line per result.

Hi all,

I'm not sure I get the whole picture regarding multi-dimensional
ratings. How can you compare two players with a 2-dimensional rating?
You can't, so how would one use this rating? In my book, a rating's
goal is to make things comparable...

best regards,
Vlad


Re: [computer-go] Odd results on 19x19

2008-01-06 Thread steve uurtamo
you can use a multi-d ranking system to predict
the outcome of a contest between two players.

this is good for handicapping, for instance.

this will not necessarily create a linear ordering
of the players, as you've mentioned, but it is still
quite useful, and radically more efficient and useful
than storing the n^2-n per-pair results.

s.
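
A tiny sketch, not from the original thread, of the storage comparison above. It counts n^2-n ordered pairs as in the message, against n players times a small number of rating dimensions; three dimensions is an assumed example.

```python
def pairwise_entries(n):
    """Number of ordered player pairs, the n^2-n figure from the message."""
    return n * n - n

def rating_entries(n, dims):
    """Numbers stored when each player has a dims-dimensional rating."""
    return n * dims

for n in (10, 100, 1000):
    print(n, pairwise_entries(n), rating_entries(n, 3))
```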



Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Rémi Coulom



The idea is that players would not be ranked on a linear scale, but we 
would have a formula to estimate the probability of winning between any 
pair of players.


For instance, if player A has rating (A1, A2, A3) and player B has 
rating (B1, B2, B3)


Delta = ((A1-B1)^3 + (A2-B2)^3 + (A3-B3)^3) / ((A1-B1)^2 + (A2-B2)^2 + (A3-B3)^2)

P(A beats B) = 1 / (1 + exp(-Delta))

if A1 = A2 = A3 and B1 = B2 = B3, this is the usual Bradley-Terry model. 
But with 3 dimensions, it is possible to get a cycle for instance with:

A=(1, -1, 0)
B=(-1, 0, 1)
C=(0, 1, -1)

With these ratings and the formula above, P(A beats B) > 0.5, P(B beats
C) > 0.5, and P(C beats A) > 0.5.
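
A short sketch that evaluates the formula above on the three example ratings. The function names are illustrative, and the handling of two identical ratings (where the formula is 0/0) as an even game is an assumption, not something stated in the message.

```python
import math

def delta(a, b):
    """Sum of cubed coordinate differences over sum of squared differences."""
    diffs = [x - y for x, y in zip(a, b)]
    denom = sum(d * d for d in diffs)
    if denom == 0:
        return 0.0  # assumption: identical ratings are treated as an even game
    return sum(d ** 3 for d in diffs) / denom

def p_win(a, b):
    """P(a beats b) = 1 / (1 + exp(-Delta))."""
    return 1.0 / (1.0 + math.exp(-delta(a, b)))

A = (1, -1, 0)
B = (-1, 0, 1)
C = (0, 1, -1)

for name, x, y in (("A beats B", A, B), ("B beats C", B, C), ("C beats A", C, A)):
    print(f"P({name}) = {p_win(x, y):.3f}")

# With equal coordinates the model collapses to one dimension:
# Delta is just the difference of the two scalar ratings.
print(delta((2.0, 2.0, 2.0), (0.5, 0.5, 0.5)))
```

For each of the three pairings Delta works out to 6/6 = 1, so every probability in the cycle is 1 / (1 + exp(-1)), about 0.73.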


It is exactly the same principle as the basic Bradley-Terry model. The 
very big difficulty is finding the maximum-a-posteriori of the ratings 
from the observation of game results. There is no easy optimization 
algorithm like for the one-dimensional model. The probability 
distribution has many local optima, so it is tricky.
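
A hedged sketch of the objective such a fit would maximize. It shows only the log-likelihood (no prior, so this is not the full maximum-a-posteriori problem), and the game results are made up for illustration. It also shows one simple source of non-uniqueness: shifting every coordinate by a constant, or permuting the coordinates the same way for all players, leaves every Delta and hence the likelihood unchanged. This does not demonstrate the local optima mentioned above; it only illustrates that the solution is not unique.

```python
import math

def p_win(a, b):
    """P(a beats b) under the 3-dimensional formula given above."""
    diffs = [x - y for x, y in zip(a, b)]
    denom = sum(d * d for d in diffs)
    delta = sum(d ** 3 for d in diffs) / denom if denom else 0.0
    return 1.0 / (1.0 + math.exp(-delta))

def log_likelihood(ratings, results):
    """ratings: name -> rating tuple; results: list of (winner, loser)."""
    return sum(math.log(p_win(ratings[w], ratings[l])) for w, l in results)

# Made-up results with a cycle: A mostly beats B, B mostly beats C,
# C mostly beats A.
results = ([("A", "B")] * 7 + [("B", "A")] * 3 +
           [("B", "C")] * 7 + [("C", "B")] * 3 +
           [("C", "A")] * 7 + [("A", "C")] * 3)

cyclic = {"A": (1.0, -1.0, 0.0), "B": (-1.0, 0.0, 1.0), "C": (0.0, 1.0, -1.0)}
flat = {name: (0.0, 0.0, 0.0) for name in cyclic}
shifted = {k: tuple(x + 5.0 for x in v) for k, v in cyclic.items()}
rotated = {k: (v[1], v[2], v[0]) for k, v in cyclic.items()}

for label, r in (("cyclic", cyclic), ("flat", flat),
                 ("shifted", shifted), ("rotated", rotated)):
    print(label, round(log_likelihood(r, results), 3))
```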


Rémi


Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Don Dailey


A 2-dimensional or more rating would be used to predict the winner.   
You would be able to say that a given player will beat another specified
player some percentage of the time. With more than one dimension
perhaps the formula would be a better predictor since it could take
playing styles into consideration.

It could not be used to rank players in the sense of putting them on a
scale such as CGOS uses.   Since there is an inherently intransitive
relationship,  you cannot rank players in strict order with more than 1
dimension.

- Don




Re: [computer-go] Odd results on 19x19

2008-01-06 Thread dhillismail
I've mentioned this before, but hopefully not recently enough to make this 
annoying. Computer go people and corewars people overlap somewhat. 
Intransitivity is extremely important for corewars, making corewars a good
domain to study it.

Here is an example of a nice graphical way to visualize intransitivities 
between corewars programs.
http://www.koth.org/lcgi-bin/hugetable.pl?hill94x

In corewars, you might look at a table like this and say "Oh, I'm losing too
many games against x-type strategy, I need to concentrate on that aspect of my
bot." Or you might say, "I'm doing great against y-type strategy, and I'm
betting there will be a lot of those in the upcoming tournament." It's more
than predicting the outcome of a particular matchup, or a ranking in a static
field.

We haven't seen strong statistical evidence for intransitivity in computer go, 
but I don't think anyone has looked very hard yet.

- Dave Hillis
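
A rough sketch, not from the original thread, of one way to look for intransitivity in a table of pairwise win rates: list every triple in which a beats b, b beats c, and c beats a more than half the time. The function name and the rock/paper/scissors figures are hypothetical; the first table uses the three percentages David reported. It ignores sample size entirely, so a real test would need confidence intervals as well.

```python
from itertools import permutations

def intransitive_triples(win_rate):
    """win_rate[(a, b)] is the fraction of games a won against b."""
    def beats(a, b):
        if (a, b) in win_rate:
            return win_rate[(a, b)] > 0.5
        if (b, a) in win_rate:
            return win_rate[(b, a)] < 0.5
        return False  # pairing never played

    players = sorted({p for pair in win_rate for p in pair})
    cycles = set()
    for a, b, c in permutations(players, 3):
        # Report each cycle once, starting from its alphabetically first member.
        if a == min(a, b, c) and beats(a, b) and beats(b, c) and beats(c, a):
            cycles.add((a, b, c))
    return sorted(cycles)

reported = {("MFGO", "GNUGO"): 0.70, ("CS", "GNUGO"): 0.55, ("CS", "MFGO"): 0.90}
rps = {("rock", "scissors"): 0.8, ("scissors", "paper"): 0.8, ("paper", "rock"): 0.8}

print(intransitive_triples(reported))
print(intransitive_triples(rps))
```

On the reported figures this finds no strict cycle, since CS beats both other programs and Many Faces beats GnuGo; the oddity there is in the size of the margins rather than in the ordering.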


Re: [computer-go] Odd results on 19x19

2008-01-06 Thread Don Dailey
> We haven't seen strong statistical evidence for intransitivity in
> computer go, but I don't think anyone has looked very hard yet.


It seems like it probably exists to some degree - It would be
interesting to study this. 


- Don



