Re: [Computer-go] CGOS source on github

2021-01-23 Thread Brian Lee
DeepMind has published a number of papers on how to stabilize RL strategies
in a landscape of nontransitive cycles. See
https://papers.nips.cc/paper/2018/file/cdf1035c34ec380218a8cc9a43d438f9-Paper.pdf

I haven't fully digested the paper, but what I'm getting from it is that if
you want your evaluation environment to be more independent of the
population of agents that you're evaluating against, you should first
compute a max-entropy Nash equilibrium of agents, and evaluate against this
equilibrium distribution.

To give a concrete example from the paper, imagine the CRPSS - the Computer
Rock Paper Scissors Server. Imagine there are currently 4 bots connected: a
Rock-only bot, a Paper-only bot, and two Scissors-only bots. The max-entropy
Nash equilibrium is 1/3, 1/3, 1/6, 1/6. So the duplicated Scissors bots are
naturally detected, and their impact on the rating distribution is negated.
With CGOS's current evaluation scheme, the Rock bot would appear to have a
higher Elo score, because it has more opportunities to beat up on the two
Scissors bots.
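
To make that concrete in code, here is a small illustrative sketch (assuming
numpy; it is not code from the paper or from CGOS) that checks the claimed
equilibrium and shows how naive round-robin averaging rewards Rock:

    import numpy as np

    # Antisymmetric payoff matrix for the 4-bot example:
    # Rock, Paper, Scissors1, Scissors2. Entry [i, j] is +1 if bot i
    # beats bot j, -1 if it loses, 0 for a tie (mirror matches).
    R, P, S1, S2 = 0, 1, 2, 3
    A = np.zeros((4, 4))
    A[R, S1] = A[R, S2] = 1.0   # Rock beats both Scissors bots
    A[S1, R] = A[S2, R] = -1.0
    A[P, R] = 1.0               # Paper beats Rock
    A[R, P] = -1.0
    A[S1, P] = A[S2, P] = 1.0   # Scissors beats Paper
    A[P, S1] = A[P, S2] = -1.0

    nash = np.array([1/3, 1/3, 1/6, 1/6])

    # In a symmetric zero-sum game the value is 0. At a Nash equilibrium no
    # pure strategy scores above the value against the equilibrium mixture:
    print(A @ nash)        # -> [0. 0. 0. 0.], so the mixture is an equilibrium

    # Naive "everyone plays everyone equally" scoring (roughly what a
    # round-robin Elo does) rewards Rock for the duplicated Scissors bots:
    print(A.mean(axis=1))  # -> [ 0.25 -0.25  0.    0.  ]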

The paper also proposes a vector extension to Elo that can better predict
outcomes under these nontransitive cycles.
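
My rough reading of that extension, sketched below (the names, the
2k-dimensional vector, and the Elo-style scaling here are my own paraphrase,
not the paper's exact formulation): each agent keeps a scalar rating plus a
small vector, and an antisymmetric pairing of the vectors adds a cyclic term
that plain Elo cannot express.

    import numpy as np

    def predicted_win_prob(r_i, r_j, c_i, c_j):
        """Sketch of a vector-Elo prediction: scalar ratings r plus
        2k-dimensional vectors c whose antisymmetric pairing models
        rock-paper-scissors cycles."""
        k = len(c_i) // 2
        omega = np.zeros((2 * k, 2 * k))
        for l in range(k):
            omega[2 * l, 2 * l + 1] = 1.0
            omega[2 * l + 1, 2 * l] = -1.0
        x = (r_i - r_j) + c_i @ omega @ c_j
        return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

    # With equal scalar ratings, the cyclic term alone decides the prediction:
    rock, paper = np.array([0.0, 10.0]), np.array([10.0, 0.0])
    print(predicted_win_prob(1000.0, 1000.0, paper, rock))  # ~0.64: Paper favoured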

Given that what we have is (at a macro level) duplication of various bot
lineages, and (at a micro level) rock-paper-scissors relationships between
bots in sharp openings, this paper seems quite relevant.


On Sat, Jan 23, 2021 at 5:34 AM Darren Cook  wrote:

> > ladders, not just liberties. In that case, yes! If you outright tell the
> > neural net as an input whether each ladder works or not (doing a short
> > tactical search to determine this), or something equivalent to it, then the
> > net will definitely make use of that information, ...
>
> Each convolutional layer should spread the information across the board.
> I think alpha zero used 20 layers? So even 3x3 filters would tell you
> about the whole board - though the signal from the opposite corner of
> the board might end up a bit weak.
>
> I think we can assume it is doing that successfully, because otherwise
> we'd hear about it losing lots of games in ladders.
>
> > something the first version of AlphaGo did (before they tried to make it
> > "zero") and something that many other bots do as well. But Leela Zero and
> > ELF do not do this, because of attempting to remain "zero", ...
>
> I know that zero-ness was very important to DeepMind, but I thought the
> open source dedicated go bots that have copied it did so because AlphaGo
> Zero was stronger than AlphaGo Master after 21-40 days of training.
> I.e. in the rarefied atmosphere of super-human play that starter package
> of human expert knowledge was considered a weight around its neck.
>
> BTW, I agree that feeding the results of tactical search in would make
> stronger programs, all else being equal. But it is branching code, so
> much slower to parallelize.
>
> Darren
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-23 Thread Darren Cook
> ladders, not just liberties. In that case, yes! If you outright tell the
> neural net as an input whether each ladder works or not (doing a short
> tactical search to determine this), or something equivalent to it, then the
> net will definitely make use of that information, ...

Each convolutional layer should spread the information across the board.
I think AlphaZero used 20 layers? So even 3x3 filters would tell you
about the whole board - though the signal from the opposite corner of
the board might end up a bit weak.
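
A quick back-of-the-envelope check (only a sketch, assuming plain stride-1
3x3 convolutions):

    # Each stride-1 3x3 convolution widens the receptive field by 2 points.
    def receptive_field(layers, kernel=3):
        return 1 + layers * (kernel - 1)

    print(receptive_field(20))  # 41 -> covers the whole 19x19 board from any point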

I think we can assume it is doing that successfully, because otherwise
we'd hear about it losing lots of games in ladders.

> something the first version of AlphaGo did (before they tried to make it
> "zero") and something that many other bots do as well. But Leela Zero and
> ELF do not do this, because of attempting to remain "zero", ...

I know that zero-ness was very important to DeepMind, but I thought the
open source dedicated go bots that have copied it did so because AlphaGo
Zero was stronger than AlphaGo Master after 21-40 days of training.
I.e. in the rarefied atmosphere of super-human play that starter package
of human expert knowledge was considered a weight around its neck.

BTW, I agree that feeding the results of tactical search in would make
stronger programs, all else being equal. But it is branching code, so
much slower to parallelize.

Darren
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread uurtamo
also frankly not a problem for a rating system to handle.

a rating system shouldn't be tweaked to handle eccentricities of its
players, beyond the general assumptions about how a game's result is
determined (like, does it allow for "win", "draw" and "undetermined", or
just "win").

s.


On Fri, Jan 22, 2021 at 6:29 AM David Wu  wrote:

> On Fri, Jan 22, 2021 at 8:08 AM Rémi Coulom  wrote:
>
>> You are right that non-determinism and bot blind spots are a source of
>> problems with Elo ratings. I add randomness to the openings, but it is
>> still difficult to avoid repeating some patterns. I have just noticed that
>> the two wins of CrazyStone-81-15po against LZ_286_e6e2_p400 were caused by
>> very similar ladders in the opening:
>> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/73.sgf
>> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/733301.sgf
>> Such a huge blind spot in such a strong engine is likely to cause rating
>> compression.
>> Rémi
>>
>
> I agree, ladders are definitely the other most noticeable way that Elo
> model assumptions may be broken, since pure-zero bots have a hard time with
> them, and can easily cause difference(A,B) + difference(B,C) to be very
> inconsistent with difference(A,C). If some of A,B,C always handle ladders
> very well and some are blind to them, then you are right that probably no
> amount of opening randomization can smooth it out.
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
@Claude - Oh, sorry, I misread your message, you were also asking about
ladders, not just liberties. In that case, yes! If you outright tell the
neural net as an input whether each ladder works or not (doing a short
tactical search to determine this), or something equivalent to it, then the
net will definitely make use of that information. There are some bad side
effects even to doing this, but it helps in the most common case. This is
something the first version of AlphaGo did (before they tried to make it
"zero") and something that many other bots do as well. But Leela Zero and
ELF do not do this, because of attempting to remain "zero", i.e. free as
much as possible from expert human knowledge or specialized feature
crafting.


On Fri, Jan 22, 2021 at 9:26 AM David Wu  wrote:

> Hi Claude - no, generally feeding liberty counts to neural networks
> doesn't help as much as one would hope with ladders and sekis and large
> capturing races.
>
> The thing that is hard about ladders has nothing to do with liberties - a
> trained net is perfectly capable of recognizing the atari, this is
> extremely easy. The hard part is predicting if the ladder will work without
> playing it out, because whether it works depends extremely sensitively on
> the exact position of stones all the way on the other side of the board. A
> net that fails to predict this well might prematurely reject a working
> ladder (which is very hard for the search to correct), or be highly
> overoptimistic about a nonworking ladder (which takes the search thousands
> of playouts to correct in every single branch of the tree that it happens
> in).
>
> For large sekis and capturing races, liberties usually don't help as much
> as you would think. This is because approach liberties, ko liberties,
> big-eye liberties, shared versus unshared liberties, and throw-in
> possibilities all affect the "effective" liberty count significantly. Also,
> very commonly you have bamboo joints, simple diagonal or hanging
> connections, and other shapes where the whole group is not physically
> connected, again making the raw liberty count not so useful. The neural net
> still ultimately has to scan over the entire group anyway, computing these
> things.
>
> On Fri, Jan 22, 2021 at 8:31 AM Claude Brisson via Computer-go <
> computer-go@computer-go.org> wrote:
>
>> Hi. Maybe it's a newbie question, but since the ladders are part of the
>> well defined topology of the goban (as well as the number of current
>> liberties of each chain of stone), can't feeding those values to the
>> networks (from the very start of the self teaching course) help with large
>> shichos and sekis?
>>
>> Regards,
>>
>>   Claude
>> On 21-01-22 13 h 59, Rémi Coulom wrote:
>>
>> Hi David,
>>
>> You are right that non-determinism and bot blind spots are a source of
>> problems with Elo ratings. I add randomness to the openings, but it is
>> still difficult to avoid repeating some patterns. I have just noticed that
>> the two wins of CrazyStone-81-15po against LZ_286_e6e2_p400 were caused by
>> very similar ladders in the opening:
>> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/73.sgf
>> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/733301.sgf
>> Such a huge blind spot in such a strong engine is likely to cause rating
>> compression.
>>
>> Rémi
>>
>> ___
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>
>> ___
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
Hi Claude - no, generally feeding liberty counts to neural networks doesn't
help as much as one would hope with ladders and sekis and large capturing
races.

The thing that is hard about ladders has nothing to do with liberties - a
trained net is perfectly capable of recognizing the atari, this is
extremely easy. The hard part is predicting if the ladder will work without
playing it out, because whether it works depends extremely sensitively on
the exact position of stones all the way on the other side of the board. A
net that fails to predict this well might prematurely reject a working
ladder (which is very hard for the search to correct), or be highly
overoptimistic about a nonworking ladder (which takes the search thousands
of playouts to correct in every single branch of the tree that it happens
in).

For large sekis and capturing races, liberties usually don't help as much
as you would think. This is because approach liberties, ko liberties,
big-eye liberties, shared versus unshared liberties, and throw-in
possibilities all affect the "effective" liberty count significantly. Also,
very commonly you have bamboo joints, simple diagonal or hanging
connections, and other shapes where the whole group is not physically
connected, again making the raw liberty count not so useful. The neural net
still ultimately has to scan over the entire group anyway, computing these
things.
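
For reference, the "raw liberty count" being discussed is just the
flood-fill quantity sketched below (a minimal illustration, not any engine's
actual code); the point above is that this number by itself says little
about the real status of the fight:

    from collections import deque

    def chain_liberties(board, start):
        """Raw liberty count of the chain containing `start`.
        board[y][x] is 'b', 'w' or '.'; this ignores approach liberties,
        ko liberties, big-eye liberties and connections like bamboo joints."""
        x0, y0 = start
        colour = board[y0][x0]
        assert colour in ('b', 'w')
        size = len(board)
        seen, libs = {start}, set()
        queue = deque([start])
        while queue:
            x, y = queue.popleft()
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nx < size and 0 <= ny < size:
                    c = board[ny][nx]
                    if c == '.':
                        libs.add((nx, ny))
                    elif c == colour and (nx, ny) not in seen:
                        seen.add((nx, ny))
                        queue.append((nx, ny))
        return len(libs)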

On Fri, Jan 22, 2021 at 8:31 AM Claude Brisson via Computer-go <
computer-go@computer-go.org> wrote:

> Hi. Maybe it's a newbie question, but since the ladders are part of the
> well defined topology of the goban (as well as the number of current
> liberties of each chain of stone), can't feeding those values to the
> networks (from the very start of the self teaching course) help with large
> shichos and sekis?
>
> Regards,
>
>   Claude
> On 21-01-22 13 h 59, Rémi Coulom wrote:
>
> Hi David,
>
> You are right that non-determinism and bot blind spots are a source of
> problems with Elo ratings. I add randomness to the openings, but it is
> still difficult to avoid repeating some patterns. I have just noticed that
> the two wins of CrazyStone-81-15po against LZ_286_e6e2_p400 were caused by
> very similar ladders in the opening:
> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/73.sgf
> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/733301.sgf
> Such a huge blind spot in such a strong engine is likely to cause rating
> compression.
>
> Rémi
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
On Fri, Jan 22, 2021 at 3:45 AM Hiroshi Yamashita  wrote:

> This kind of joseki is not good for Zero-type programs. Ladders and
>   capturing races are intricately combined. In the published self-play
>   games of AlphaGo (both the AlphaGo Zero and Master versions), this
>   joseki is rare.
> -
>
> I found this joseki in kata1_b40s575v100 (black) vs LZ_286_e6e2_p400
> (white).
> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/22/733340.sgf
>

Hi Hiroshi - yep. This is indeed a joseki that was partly popularized by AI
and jointly explored with humans. It is probably fair to say that it is by
far the most complicated common joseki known right now, and more
complicated than either the avalanche or the taisha.

Some zero-trained bots will find and enter into this joseki, some won't.
The ones that don't play this joseki in self-play will have a significant
chance to be vulnerable to it if an opponent plays it against them, because
there are a large number of traps and blind spots that cannot be solved if
the net doesn't have experience with the position. And even having some
experience is not always enough. For example, ELF and Leela Zero have
learned some lines, but are far from perfect. There is a good chance that
AlphaGoZero or Master would have been vulnerable to it as well. KataGo at
the time of 1.3.5 was vulnerable to it too - it only rarely came up in
self-play, and therefore was never learned and correctly evaluated, so from
the 3-3 invader's side the joseki could be forced, and KataGo would likely
mess it up and be losing the game right at the start. (The most recent
KataGo nets are much less vulnerable now, though.)

The example you found is one where this has happened to Leela Zero. In the
game you linked, move 34 is a big mistake. Leela Zero underweights the
possibility of move 35, and then is blind to the seeming-bad-shape move of
37, and as a result, is in a bad position now. The current Leela Zero nets
consistently make this mistake, *and* consistently prefer playing down
this line, so against an opponent happy to play it with them, Leela Zero
will lose many games right in the opening, all in the same way.

Anyways, the reason this joseki is responsible for more such distortions
than other joseki seems to be that it is so sharp and, unlike most other
common joseki, contains at least 5-6 enormous blind spots in different
variations that zero-trained nets variously have trouble learning on their
own.

> > a very large sampling of positions from a wide range
> > of human professional games, from say, move 20, and have bots play starting
> > from these sampled positions, in pairs once with each color.
>
> This sounds interesting.
> I will think about another CGOS that handles this.


I'm glad you're interested. I don't know if move 20 is a good number (I
just threw it out there); maybe it should be varied - it might take
some experimentation. And I'm not sure it's worth doing, since it's still
probably only the smaller part of the problem in general - as Remi pointed
out, likely ladder handling will be a thing that always continues to
introduce Elo-nontransitivity, and probably all of this is less important
than generally having a variety of long-running bots to help stabilize the
system over time.
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
On Fri, Jan 22, 2021 at 8:08 AM Rémi Coulom  wrote:

> You are right that non-determinism and bot blind spots are a source of
> problems with Elo ratings. I add randomness to the openings, but it is
> still difficult to avoid repeating some patterns. I have just noticed that
> the two wins of CrazyStone-81-15po against LZ_286_e6e2_p400 were caused by
> very similar ladders in the opening:
> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/73.sgf
> http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/733301.sgf
> Such a huge blind spot in such a strong engine is likely to cause rating
> compression.
> Rémi
>

I agree, ladders are definitely the other most noticeable way that Elo
model assumptions may be broken, since pure-zero bots have a hard time with
them, and can easily cause difference(A,B) + difference(B,C) to be very
inconsistent with difference(A,C). If some of A,B,C always handle ladders
very well and some are blind to them, then you are right that probably no
amount of opening randomization can smooth it out.
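
To put rough numbers on that (the win probabilities below are made up,
purely to illustrate the inconsistency):

    import math

    def elo_diff(p):
        # Elo difference implied by a head-to-head expected score p
        return 400.0 * math.log10(p / (1.0 - p))

    # Hypothetical: A and B both read ladders well, C is blind to them.
    p_ab, p_bc, p_ac = 0.50, 0.55, 0.90

    print(elo_diff(p_ab) + elo_diff(p_bc))  # ~ +35, what additive Elo predicts for A over C
    print(elo_diff(p_ac))                   # ~ +382, what A actually scores against C
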
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread Claude Brisson via Computer-go
Hi. Maybe it's a newbie question, but since the ladders are part of the 
well defined topology of the goban (as well as the number of current 
liberties of each chain of stone), can't feeding those values to the 
networks (from the very start of the self teaching course) help with 
large shichos and sekis?


Regards,

  Claude

On 21-01-22 13 h 59, Rémi Coulom wrote:

Hi David,

You are right that non-determinism and bot blind spots are a source of 
problems with Elo ratings. I add randomness to the openings, but it is 
still difficult to avoid repeating some patterns. I have just noticed 
that the two wins of CrazyStone-81-15po against LZ_286_e6e2_p400 were 
caused by very similar ladders in the opening:

http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/73.sgf
http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/733301.sgf
Such a huge blind spot in such a strong engine is likely to cause 
rating compression.


Rémi

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread Rémi Coulom
Hi David,

You are right that non-determinism and bot blind spots are a source of
problems with Elo ratings. I add randomness to the openings, but it is
still difficult to avoid repeating some patterns. I have just noticed that
the two wins of CrazyStone-81-15po against LZ_286_e6e2_p400 were caused by
very similar ladders in the opening:
http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/73.sgf
http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/21/733301.sgf
Such a huge blind spot in such a strong engine is likely to cause rating
compression.

Rémi
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-22 Thread Hiroshi Yamashita

Hi,


The most noticeable case of this is with Mi Yuting's flying dagger joseki.


I'm not familiar with this.
I found an explanation by Hirofumi Ohashi, 6d pro, from half a year ago on the HCCL ML.
The following is a quote.
-
https://gokifu.net/t2.php?s=3591591539793593
It seems that it is called the flying dagger joseki in China.
This shape, the direct 3-3 invasion followed by the lower tsuke (Black's
 9th move, B6), has been researched jointly by humans and AI, but is still
 inconclusive. After the kiri (Black's 15th move, E4), the mainstream move
 is White A, but depending on the version of KataGo, White B may be
 recommended. By the way, the KataGo I'm using now is 1.3.5, which is from
 just a short time ago.

This kind of joseki is not good for Zero-type programs. Ladders and
 capturing races are intricately combined. In the published self-play games
 of AlphaGo (both the AlphaGo Zero and Master versions), this joseki is rare.
-

I found this joseki in kata1_b40s575v100 (black) vs LZ_286_e6e2_p400 (white).
http://www.yss-aya.com/cgos/viewer.cgi?19x19/SGF/2021/01/22/733340.sgf

Mi Yuting's Wikipedia page mentions this joseki.
https://zh.wikipedia.org/wiki/%E8%8A%88%E6%98%B1%E5%BB%B7
KataGo has a special option.
https://github.com/lightvector/KataGo/blob/4a79cde56e81209ce4e2fd231b0f2cbee3a8354b/cpp/neuralnet/nneval.cpp#L499


a very large sampling of positions from a wide range
of human professional games, from say, move 20, and have bots play starting
from these sampled positions, in pairs once with each color. 


This sounds interesting.
I will think about another CGOS that handles this.

Thanks,
Hiroshi Yamashita
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-21 Thread David Wu
One tricky thing is that there are some major nonlinearities between
different bots early in the opening that break Elo model assumptions quite
blatantly at these higher levels.

The most noticeable case of this is with Mi Yuting's flying dagger joseki.
I've noticed for example that in particular matchups between different
pairs of bots (e.g. one particular KataGo net as white versus ELF as black,
or one version of LZ as black versus some other version as white), maybe as
many as 30% of games will enter into this joseki, and the bots' preferences
may happen by chance to line up such that they consistently play down a path
where one side hits a blind spot and begins the game with an early
disadvantage. Each bot may have different preferences, so each possible
pairing more or less randomly runs into such a trap or not.

And, having significant early-game temperature in the bot itself doesn't
always help as much as you would think because this particular joseki is so
sharp that a particular bot could easily have such a strong preference for
one path or another (even when it is ultimately wrong) so as to override
any reasonable temperature. Sometimes adding temperature or extra
randomness only mildly changes the frequency of the sequence, or just
varies how long it takes before the joseki and its trap/blunder happen
anyway.
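
For instance (a toy sketch, not any particular bot's implementation):
sampling the root move from policy^(1/T) flattens the distribution, but a
net that puts ~97% of its mass on one line still follows it most of the time
even at T = 2:

    import numpy as np

    def sample_move(policy, temperature):
        # Flatten (or sharpen) a move distribution and sample from it.
        p = np.asarray(policy, dtype=float) ** (1.0 / temperature)
        p /= p.sum()
        return np.random.choice(len(p), p=p)

    policy = [0.97, 0.01, 0.01, 0.01]   # a very sharp preference for one line
    p = np.array(policy) ** 0.5
    print((p / p.sum())[0])             # ~0.77: still dominant at temperature 2
    move = sample_move(policy, temperature=2.0)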

If games are to begin from the empty board, I'm not sure there's an easy
way around this except having a very large variety of opponents.

One thing that I'm pretty sure would mostly "fix" the problem (in the sense
of producing a smoother metric of general strength in a variety of
positions not heavily affected by just a few key lines) would be to
semi-arbitrarily take a very large sampling of positions from a wide range
of human professional games, from say, move 20, and have bots play starting
from these sampled positions, in pairs once with each color. This would
still include many AI openings, because of the way human pros in the last
3-4 years have quickly integrated and experimented with them, but would
also introduce a lot more variety in general than would occur in any
head-to-head matchup.
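
The scheduling half of this is simple; here is a sketch (the helper and its
data format are hypothetical, just to pin down what "in pairs once with each
color" means - sampling good positions from pro games is the real work):

    import random

    def paired_games(openings, bot_a, bot_b, seed=0):
        """Play each sampled opening twice with colours swapped, so a
        one-sided opening penalizes both bots equally."""
        games = []
        for opening_id, setup_moves in openings:
            games.append({"opening": opening_id, "setup": setup_moves,
                          "black": bot_a, "white": bot_b})
            games.append({"opening": opening_id, "setup": setup_moves,
                          "black": bot_b, "white": bot_a})
        random.Random(seed).shuffle(games)
        return games

    # openings would be pairs like ("some-pro-game@move20", ["Q16", "D4", ...])
    # sampled from a large collection of professional games.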

This is almost surely a *smaller* problem than simply having enough games
mixing between different long-running bots to anchor the Elo system. And it
is not the only way major nontransitivities can show up, (e.g. ladders).
But to take a leaf from computer Chess, playing from sampled forced
openings seems to be a common practice there and maybe it's worth
considering in computer Go as well, even if it only fixes what is currently
the smaller of the issues.


On Thu, Jan 21, 2021 at 12:01 PM Rémi Coulom  wrote:

> Thanks for computing the new rating list.
>
> I feel it did not fix anything. The old Zen, cronus, etc. have almost no
> change at all.
>
> So it is not a good fix, in my opinion. No need to change anything to the
> official ratings.
>
> The fundamental problem seems to be that the Elo rating model is too wrong for
> this data, and there is no easy fix for that.
>
> Long ago, I had thought about using a more complex multi-dimensional Elo
> model. The CGOS data may be a good opportunity to try it. I will try when I
> have some free time.
>
> Rémi
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-21 Thread Rémi Coulom
Thanks for computing the new rating list.

I feel it did not fix anything. The old Zen, cronus, etc. have almost no
change at all.

So it is not a good fix, in my opinion. No need to change anything to the
official ratings.

The fundamental problem seems to be that the Elo rating model is too wrong for
this data, and there is no easy fix for that.

Long ago, I had thought about using a more complex multi-dimensional Elo
model. The CGOS data may be a good opportunity to try it. I will try when I
have some free time.

Rémi
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-21 Thread Hiroshi Yamashita

Hi,

This is the original BayesElo, which I updated manually. This is the latest:
http://www.yss-aya.com/cgos/19x19/bayes.html
CrazyStone-18.04    4065
CrazyStone-81b-TiV  4032
Zen-15.7-3c1g   3999
CrazyStone-57-TiV   3618

This renames CrazyStone-57-TiV to CrazyStone-18.04.
http://www.yss-aya.com/cgos/19x19/bayes_20210121_rename_CrazyStone-57-TiV_to_CrazyStone-18.04.html
CrazyStone-81b-TiV  4051
Zen-15.7-3c1g   3968
CrazyStone-18.04    3778

Would you like to keep renaming CrazyStone-57-TiV like this?
The drift looks fixed for CrazyStone, but only a little for the other programs.

Thanks,
Hiroshi Yamashita
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-21 Thread Rémi Coulom
I checked, and CrazyStone-57-TiV is using the same neural network and
hardware as CrazyStone-18.04. Batch size, cuDNN version, and time
management heuristics may have changed, but I expect that strength should
be almost identical. CrazyStone-57-TiV may be a little stronger.

So it seems that the rating drift over 3 years is about 450 Elo points, and
the "All Time Ranks" are a bit meaningless.

Can you produce a list where CrazyStone-57-TiV is renamed to
CrazyStone-18.04? It may be enough to fix the drift.

I need the machine for something else, so I disconnected the GPU version.
CrazyStone-81-15po is running 15 playouts per move on the CPU of a small
machine, and will stay.

Rémi
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-18 Thread Hiroshi Yamashita

Hi,


The Elo statistical model is wrong when different kinds of programs play against


I have a similar experience.
I calculated ratings for Japanese shogi women pros before.
The strongest woman, Ichiyo Shimizu, has a rating of 1578 Elo.
Her win rate against men pros is 18% (163 games), and against women pros it
is 65% (523 games).
Her rating computed without the women-pro games is 1286 Elo.
That is a difference of 292 (= 1578 - 1286) Elo.
It is because women pros usually play against women pros; games between
women pros and men pros are rare.

I think a similar thing happens on CGOS. There are three eras: Zen,
LeelaZero and KataGo.
There are only a few Zen vs LeelaZero games.
CrazyStone-18.04's rating may depend on Zen-15.7-3c1g.
http://www.yss-aya.com/cgos/19x19/cross/CrazyStone-18.04.html

Zen's absence may be a reason for this drift.

Thanks,
Hiroshi Yamashita
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-18 Thread uurtamo
It's a relative ranking versus who you actually get to play against.

Sparsity of actual skill will lead to that kind of clumping.

A rating can climb meaningfully by playing gnugo or your direct peers only
exponentially slowly -- you'd need to lose to gnugo half as often (or win
every game over twice as many games) to gain more points. So although it
would eventually increase, it would flatten out pretty quickly.
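
Rough numbers behind that, under the standard logistic Elo model (my own
illustration): each halving of the loss rate against a fixed opponent like
gnugo is worth only ~120 more Elo, while needing many more games to measure
reliably.

    import math

    def elo_gap(p):
        # Elo gap implied by an expected score p against one fixed opponent
        return 400.0 * math.log10(p / (1.0 - p))

    for p in (0.95, 0.975, 0.9875):   # losing 5%, 2.5%, 1.25% of games to gnugo
        print(p, round(elo_gap(p)))
    # 0.95 512, 0.975 636, 0.9875 759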

Good point about mcmc. A more dramatic approach would be to remove gnugo
altogether.


On Mon, Jan 18, 2021, 6:41 AM Rémi Coulom  wrote:

> Hi,
>
> Thanks to you for taking care of CGOS.
>
> I have just connected CrazyStone-57-TiV. It is not identical, but should
> be similar to the old CrazyStone-18.04. CrazyStone-18.04 was the last
> version of my program that used tensorflow. CrazyStone-57 is the first
> neural network that did not use tensorflow, running with my current code.
> So it should be stronger than CrazyStone-18.04, and I expect it will get a
> much lower rating.
> A possible explanation for the rating drift may be that most of the old MC
> programs have disappeared. They won easily against GNU Go, and were easily
> beaten by the CNN programs. The Elo statistical model is wrong when
> different kinds of programs play against each other. When the CNN programs
> had to get a rating by playing directly against GNU Go, they did not manage
> to climb as high as when they had the MC programs between them and GNU Go.
> I'll try to investigate this hypothesis more with the data.
>
> Rémi
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go


Re: [Computer-go] CGOS source on github

2021-01-18 Thread Rémi Coulom
Hi,

Thanks to you for taking care of CGOS.

I have just connected CrazyStone-57-TiV. It is not identical, but should be
similar to the old CrazyStone-18.04. CrazyStone-18.04 was the last version
of my program that used tensorflow. CrazyStone-57 is the first neural
network that did not use tensorflow, running with my current code. So it
should be stronger than CrazyStone-18.04, and I expect it will get a much
lower rating.
A possible explanation for the rating drift may be that most of the old MC
programs have disappeared. They won easily against GNU Go, and were easily
beaten by the CNN programs. The Elo statistical model is wrong when
different kinds of programs play against each other. When the CNN programs
had to get a rating by playing directly against GNU Go, they did not manage
to climb as high as when they had the MC programs between them and GNU Go.
I'll try to investigate this hypothesis more with the data.

Rémi
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go