Re: [computer-go] Re: fuego strength

Christian Nentwich Tue, 23 Jun 2009 23:34:51 -0700

Don,

you might have your work cut out if you try to control inflation directly; that can turn into a black art very quickly. Multiple anchors would be preferable. An offline playoff of X * 1000 games between gnugo and another candidate anchor would be enough to fix their rating difference. If their head-to-head results drift apart during continuous play, the anchor rating could be tweaked.
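
To illustrate how a long playoff pins down a rating difference, here is a minimal sketch using the standard logistic Elo model. It is an illustration only, not CGOS code; the function names and the sample scores are hypothetical.

```python
import math


def elo_difference(wins: int, losses: int) -> float:
    """Elo gap implied by a head-to-head score (no draws, as in Go)."""
    if wins <= 0 or losses <= 0:
        raise ValueError("need at least one win and one loss to estimate a gap")
    p = wins / (wins + losses)
    return 400.0 * math.log10(p / (1.0 - p))


def elo_standard_error(wins: int, losses: int) -> float:
    """Approximate standard error of that gap (delta method on the win rate)."""
    games = wins + losses
    p = wins / games
    se_p = math.sqrt(p * (1.0 - p) / games)
    # Derivative of 400*log10(p/(1-p)) with respect to p.
    slope = 400.0 / (math.log(10) * p * (1.0 - p))
    return se_p * slope


if __name__ == "__main__":
    # Hypothetical playoff: candidate anchor scores 760-240 against gnugo.
    print(round(elo_difference(760, 240), 1), round(elo_standard_error(760, 240), 1))
```

Under this model the uncertainty shrinks with the square root of the number of games, which is why a playoff of a few thousand games fixes the gap far more tightly than a few dozen.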

I'm not sure if the worries voiced on this list about anchors are not somewhat overdone. Other bots, with improvements, may do just as well against an old version of Fuego as the full Fuego does; we don't know. Maybe they would do better than new versions of Fuego. All this reliance on gnugo introduces bias, too, and after all the anchor player is not a single control variable that determines the destiny of the server. Players will still play many different opponents. If Fuego keeps beating the anchor player but losing to everybody else, it still won't get a higher rank.

For me, gnugo as an anchor is fine, as I am still experimenting around a low ELO level. For authors of strong programs: I am quite surprised that you are not insisting on a much more highly rated anchor. I remember when KGS was anchored in the kyu ranks, many years ago. I found myself 7 dan one day, until somebody intervened and reanchored the server. The territory far above a single anchor player is unsafe.


Christian



On 24/06/2009 05:28, Don Dailey wrote:

From what I have discovered so far, there is no compelling reason to change anchors. What I really was hoping we could do is UPGRADE the anchor, since many programs are now far stronger than 1800.

Fuego is pretty strong, but not when it plays at the same CPU intensity as gnugo. I went up to 5000 simulations and the match is fairly close and the time is about the same. Going from 3000 to 5000 was quite a remarkable jump in strength and no doubt we could run at 10,000 and have substantial superiority - but that's not really what I had in mind.

So I think I agree with all the comments I have received so far, and with my own observations and testing: there is no compelling reason to change.

Now if fuego were substantially stronger using fewer resources, I would be more eager to change after carefully calibrating the difference, but that is not the case, at least not at 19x19.

There is another way to keep ratings stable and that is to monitor key players over time and build a deflation/inflation mechanism into the server to keep it in tune. For instance, if there were no anchors, the server could monitor gnugo and if it were to gradually drop in rating, I could make minor adjustments to the ratings of winners and losers to compensate gradually over time. For example, the winner could get 1% more ELO and the loser could lose 1% less ELO when in inflation mode and just the opposite when in deflation mode. In this way I could feed points into the rating pool, or gradually extract them as needed. I don't plan to do this, but there is more than one way to skin a cat.
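
The mechanism described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the server's code: the K factor of 16, the 1800 target, the 10-point tolerance and all function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard logistic Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def choose_mode(reference_rating: float, target: float = 1800.0,
                tolerance: float = 10.0) -> str:
    """Pick a mode from how far the monitored player has drifted (assumed rule)."""
    if reference_rating < target - tolerance:
        return "inflate"   # monitored player sank: feed points into the pool
    if reference_rating > target + tolerance:
        return "deflate"   # monitored player rose: extract points
    return "neutral"


def update(winner: float, loser: float, mode: str = "neutral",
           k: float = 16.0, bias: float = 0.01) -> tuple[float, float]:
    """Elo update where the winner's gain and loser's loss are skewed by 1%."""
    delta = k * (1.0 - expected_score(winner, loser))
    if mode == "inflate":
        gain, loss = delta * (1.0 + bias), delta * (1.0 - bias)
    elif mode == "deflate":
        gain, loss = delta * (1.0 - bias), delta * (1.0 + bias)
    elif mode == "neutral":
        gain = loss = delta
    else:
        raise ValueError(f"unknown mode: {mode}")
    return winner + gain, loser - loss


if __name__ == "__main__":
    mode = choose_mode(1780.0)           # monitored player has dropped 20 points
    print(mode, update(1800.0, 1800.0, mode))
```

In neutral mode the update is zero-sum; in the other two modes each game adds or removes a small number of points from the pool, so the correction is gradual.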

If we use more than one player as anchors, I would still pick one player as the standard, and periodically adjust the "other" anchors based on their global performance rating - since they will all tend to drift around relative to each other and I would not want to make any assumptions about what the other anchors should be. We cannot just say gnugo is 1800, fuego is 2000, etc. because we don't really know the exact difference between the two. But we could refine this over time.
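
One way such a periodic adjustment might look is sketched below. It is only an illustration: the bisection-based performance rating, the 10% step size and the function names are assumptions, not something the server does.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard logistic Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def performance_rating(results: list[tuple[float, float]],
                       lo: float = 0.0, hi: float = 4000.0) -> float:
    """Rating at which the expected total score equals the actual total.

    results holds (opponent_rating, score) pairs, score 1.0 for a win
    and 0.0 for a loss. Solved by bisection on [lo, hi].
    """
    total = sum(score for _, score in results)
    if not 0.0 < total < len(results):
        raise ValueError("performance rating is undefined for all wins or all losses")
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if sum(expected_score(mid, opp) for opp, _ in results) < total:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def adjust_secondary_anchor(current: float, results: list[tuple[float, float]],
                            step: float = 0.1) -> float:
    """Move a secondary anchor a fraction of the way toward its performance."""
    return current + step * (performance_rating(results) - current)


if __name__ == "__main__":
    # Hypothetical: a secondary anchor fixed at 2000 went 3-1 against 1800 players.
    games = [(1800.0, 1.0), (1800.0, 1.0), (1800.0, 1.0), (1800.0, 0.0)]
    print(round(adjust_secondary_anchor(2000.0, games), 2))
```

The small step keeps the secondary anchors from jumping around on noisy data while still letting them converge on the standard anchor's scale over time.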


- Don

On Tue, Jun 23, 2009 at 11:34 PM, David Fotland <[email protected]> wrote:


    I'd also prefer to use gnugo as an anchor.  Since fuego is under
    development, new versions will be playing with an older version of
    itself.  Fuego will win more often against its old version.  I don't
    care about it distorting Fuego's rating, but it will distort the
    rating system.  If new fuego is on with few other programs it will
    gain rating points, then when other programs come new fuego will give
    them to the other programs as its rating drops.  The effect will be
    to make the rating system less stable, so it's hard to use cgos to
    evaluate new versions of programs to see if they are stronger.

    I think it's best to use an anchor that's not under active
    development.  I like gnugo since there are lots of published results
    against it, and it is not changing rapidly.  Also it has a different
    style than the monte carlo programs, so it's more likely to expose
    bugs in the monte carlo programs.

    David

    > -----Original Message-----
    > From: [email protected] On Behalf Of Hideki Kato
    > Sent: Tuesday, June 23, 2009 5:15 PM
    > To: computer-go
    > Subject: [computer-go] Re: fuego strength
    >
    > I'm running Fatman1, GNU Go and GNU Go MC version for 9x9 and two
    > instances of GNU Go for 13x13, five programs in total, on a
    > dual-core Athlon at home.
    >
    > I strongly believe current anchors are resource friendly enough for
    > older Pentium 3, 4 or even Celeron processors and need not be
    > changed.
    >
    > Changing anchors is a big problem, similar to changing the
    > International prototypes.  Also, GNU Go is used as a reference in
    > almost all computer-go research these days.
    >
    > I'm against that idea, especially for 19x19.
    >
    > Hideki
    >
    > Don Dailey <[email protected]>:
    > >I'm trying now to get a rough idea about the strength of fuego and its
    > >suitability as the anchor player.
    > >
    > >Right now the numbers are very rough as I need more samples.   I'm
    > >currently looking at:
    > >
    > >  1.  9x9 fuego at 1000 simulations
    > >
    > >  2. 19x19 fuego at 3000 simulations.
    > >
    > >
    > >I'm testing against the current CGOS anchors,  so FatMan vs fuego at
    > >9x9 and gnugo-3.7.10 at 19x19.
    > >
    > >
    > >At 9x9 fuego appears to be substantially stronger than FatMan, perhaps
    > >100-200 ELO.   It also is far faster at 1000 simulations than FatMan,
    > >which requires many more simulations to reach anchor strength.   So
    > >there is no question about fuego being a capable anchor for small
    > >boards.  At this level on 9x9 FatMan is also stronger than gnugo, so
    > >fuego is far stronger than gnugo on 9x9 and is very resource friendly
    > >too.
    > >
    > >At 19x19 the story is a bit different.  gnugo appears to be
    > >significantly stronger, but about twice as slow.   There is not enough
    > >data to narrow this down much, but fuego appears to be over 200 ELO
    > >weaker at this level.
    > >
    > >Since fuego is using only about half the CPU resources of gnugo,  I
    > >can increase the level.    I've only played 30 games at 19x19, so this
    > >conclusion is subject to significant error, but it's enough to
    > >conclude that it's almost certainly weaker at this level but perhaps
    > >not when run at the same CPU intensity as gnugo.
    > >
    > >Of course at higher levels yet, fuego would be far stronger than
    > >gnugo-3.7.10 as seen in the 19x19 cgos tables.   But I'm hoping not to
    > >push the anchors too hard - hopefully they can be run on someone's
    > >older spare computer or set unobtrusively in the background on
    > >someone's desktop machine.
    > >
    > >
    > >- Don
    > >---- inline file
    > >_______________________________________________
    > >computer-go mailing list
    > >[email protected]
    > >http://www.computer-go.org/mailman/listinfo/computer-go/
    > --
    > [email protected] (Kato)

