RE: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Denis fidaali


 I agree that the experiment is interesting in itself.
 I also agree that it's hard to draw any conclusion
 from it :) Running the games to the end would probably
 give a near 0% win rate for the AMAF bot.

 Running the 5k bot against the 100k bot is certainly
 something you would want to do if you were to argue
 that 5k is indeed stronger. Although it might also
 be that for some reason the 5k bot is better at
 the opening. The 5k bot has a wider range of choices
 while playing than the 100k bot, so it's easy
 to imagine that it plays the good (opening) moves more often.

 All in all, trying to assess the strength of a bot
 is awfully hard. It can make very good moves
 and yet be very weak. It can have good global
 perception, or good move ordering, and be very
 weak. It can predict pro moves with incredible
 accuracy, and still be very weak. (Although you
 would then be able to use this prediction feature
 in a Monte Carlo bot - CrazyStone.)

 I guess any hard data will always be welcome. Your
 experiment was very original, in that few people
 would have tried it. I have no idea what one should
 conclude from it. But it certainly can't hurt our
 understanding :) (or un-understanding). Maybe some day
 someone will look back at this particular experiment
 and come up with the next computer-go revolution :)

 Date: Mon, 15 Dec 2008 21:10:07 -0200
 From: tesujisoftw...@gmail.com
 To: computer-go@computer-go.org
 Subject: Re: [computer-go] RefBot (thought-) experiments
 
 Weston,
 
 Although those results sound intriguing, it also looks like a
 convoluted experiment. I wouldn't call gnu-go an expert judge,
 although it is an impartial one. The fact that it says that the 5K
 ref-bot is ahead after 10 moves 46% of the time alone makes it suspect
 in my eyes. But it is curious it consistently shows a much lower
 percentage for the bot with more playouts.
 
 It would have been much more persuasive if you had simply run a 5K
 playout bot against a 100K bot and see which wins more. It shouldn't
 take much more than a day to gather a significant number of games.
 twogtp is perfect for this. Or connect both to CGOS and see which ends
 up with a higher rating. But in that case it will take a week or more
 before you get conclusive data. Unless the difference is really clear.
 
 I did in fact put up a 100K+ ref-bot on CGOS for a little while, and
 it ended up with a rating slightly (possibly insignificantly) higher
 than the 2K ref-bot. Maybe I didn't put it there long enough,
 certainly not for thousands of games. But it didn't look anywhere near
 to supporting your findings.
 
 I say 100K+ because I didn't set it to a specific number, just run as
 many as it could within time allowed. Generally it would reach well
 over 100K per move, probably more like 250K-500K. That should only
 make things worse according to your hypothesis.
 
 So although I think the result of your experiment is very curious, I
 think it might be a bit hasty to draw your conclusion.
 
 Mark
 
 
 On Mon, Dec 15, 2008 at 8:30 PM, Weston Markham
 weston.mark...@gmail.com wrote:
  Hi.  This is a continuation of a month-old conversation about the
  possibility that the quality of AMAF Monte Carlo can degrade, as the
  number of simulations increases:
 
  Me:  running 10k playouts can be significantly worse than running 5k 
  playouts.
 
  On Tue, Nov 18, 2008 at 2:27 PM, Don Dailey drdai...@cox.net wrote:
  On Tue, 2008-11-18 at 14:17 -0500, Weston Markham wrote:
  On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams
  michaelwilliam...@gmail.com wrote:
   It doesn't make any sense to me from a theoretical perspective.  Do you 
   have
   empirical evidence?
 
  I used to have data on this, from a program that I think was very
  nearly identical to Don's reference spec.  When I get a chance, I'll
  try to reproduce it.
 
  Unless the difference is large, you will have to run thousands of games
  to back this up.
 
  - Don
 
  I am comparing the behavior of the AMAF reference bot with 5000
  playouts against the behavior with 100,000 playouts, and I am only
  considering the first ten moves (five from each player) of the (9x9)
  games.  I downloaded a copy of Don's reference bot, as well as a copy
  of Mogo, which is used as an opponent for each of the two settings.
  gnugo version 3.7.11 is also used, in order to judge which side won
  (jrefgo or mogo) after each individual match.  gnugo was used because
  it is simple to set it up for this sort of thing via command-line
  options, and it seems plausible that it should give a somewhat
  realistic assessment of the situation.
 
  jrefgo always plays black, and Mogo plays white.  Komi is set to 0.5,
  so that jrefgo has a reasonable number of winning lines available to
  it, although the general superiority of Mogo means that egregiously
  bad individual moves will be punished.
 
  In the games played, Mogo would occasionally crash.  (This was run
  under Windows Vista; perhaps there is some incompatibility of the
  

Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Jason House
When thinking about the apparent strength loss, I came up with a
potential theory: consistency. With more simulations, noise has less
of an impact. I'm going to guess that the known bias of AMAF leads to
a blunder that is played more consistently. Bots with fewer simulations
would make the blunder too, but would also pick sub-optimal moves due to
evaluation noise.
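
A back-of-the-envelope sketch of that theory (not the refbot's code; the win
rates, the bias value, and the playout counts below are all invented purely
for illustration):

import math
import random

def pick_move(n_playouts, true_rates, bias):
    # Each move's estimated win rate is its true rate plus binomial sampling
    # noise (approximated here by a Gaussian).  Move 0 also carries a constant
    # systematic bias, standing in for an AMAF-style evaluation error.
    est = [p + random.gauss(0.0, math.sqrt(p * (1 - p) / n_playouts))
           for p in true_rates]
    est[0] += bias
    return max(range(len(est)), key=lambda i: est[i])

true_rates = [0.45, 0.50]   # move 1 is actually better...
bias = 0.10                 # ...but move 0 looks better to the biased evaluator
for n in (100, 5000, 100000):
    picks = sum(pick_move(n, true_rates, bias) == 0 for _ in range(1000))
    print(n, "playouts: biased move chosen", picks, "out of 1000 times")

With few playouts the noise occasionally steers the bot away from the biased
move (while also causing unrelated errors); with many playouts the biased move
is chosen essentially every time.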


Sent from my iPhone

On Dec 16, 2008, at 3:48 AM, Denis fidaali denis.fida...@hotmail.fr  
wrote:





[computer-go] 9x9 MoGo vs human

2008-12-16 Thread Olivier Teytaud
In the computer-Go event of Clermont-Ferrand,
MoGo played four 9x9 games, plus blitz games,
against Motoki Noguchi (chinese rules, komi 7.5);
the result is a draw - the games are presented and discussed in
http://www.lri.fr/~teytaud/crClermont/cr.pdf

Best regards,
Olivier

Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Mark Boon
On Tue, Dec 16, 2008 at 12:20 PM, Jason House
jason.james.ho...@gmail.com wrote:
 When thinking about the apparent strength loss, I came up with a potential
 theory: consistency. With more simulations, noise has less of an impact. I'm
 going to guess that the known bias of AMAF leads to blunder that is played
 more consistently. Bots with fewer simulations would make the blunder too,
 but also pick sub-optimal moves due to evaluation noise.

This is something I noticed while watching a few games on CGOS. The
higher the number of playouts, the more often it plays the first moves
exactly the same. That may lead to skewed results to an individual
opponent. For example, if it always plays the same losing sequence,
the loss ratio against that opponent becomes larger than normal. This
gets averaged out with a large number of opponents, but CGOS has just
a few participants.
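
A trivial numeric sketch of that effect (the per-opponent win rates below are
invented purely for illustration):

def overall_win_rate(per_opponent_rates):
    # Average win rate over an opponent pool, each opponent weighted equally.
    return sum(per_opponent_rates) / len(per_opponent_rates)

varied_openings = [0.50, 0.50, 0.50, 0.50]  # noisy openings: roughly even everywhere
fixed_opening   = [0.50, 0.50, 0.50, 0.05]  # deterministic opening one opponent refutes
many_opponents  = [0.50] * 19 + [0.05]      # same refuted opening, 20 opponents
print(overall_win_rate(varied_openings))    # 0.50
print(overall_win_rate(fixed_opening))      # ~0.39: one matchup drags the average down
print(overall_win_rate(many_opponents))     # ~0.48: with many opponents it washes out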

Mark


Re: [computer-go] 9x9 MoGo vs human

2008-12-16 Thread Nick Wedd
In message
aa5e3c330812160812k2d6c9de0k52672a41a464e...@mail.gmail.com, Olivier
Teytaud teyt...@lri.fr writes
In the computer-Go event of Clermont-Ferrand,
MoGo played four 9x9 games, plus blitz games,
against Motoki Noguchi (chinese rules, komi 7.5);
the result is a draw - the games are presented and discussed in 
http://www.lri.fr/~teytaud/crClermont/cr.pdf

Thank you for writing this very interesting report.  But it's a 40Mb pdf
file, my Internet Explorer can't handle it at all, and my FireFox only
with difficulty.  A more accessible version, perhaps without the
high-resolution pictures, might reach more readers.

Best wishes,
  Nick
-- 
Nick Wedd    n...@maproom.co.uk


RE: Results of the 2nd UEC Cup (Re: [computer-go] UEC cup)

2008-12-16 Thread David Fotland
Thank you for the results.  Thank you for providing a machine and letting
Many Faces participate, even though I could not travel to Japan.

Is it true that the final was a single elimination tournament, and not a
Swiss tournament?  It seems that Many Faces never played Fudo Go.  In future
tournaments, please consider using the Swiss tournament system.  Most people
would agree that it gives more accurate results.

Regards,

David

 -Original Message-
 From: computer-go-boun...@computer-go.org [mailto:computer-go-
 boun...@computer-go.org] On Behalf Of Hideki Kato
 Sent: Tuesday, December 16, 2008 4:16 AM
 To: computer-go
 Subject: Results of the 2nd UEC Cup (Re: [computer-go] UEC cup)
 
 Official results (only Japanese right now) are at:
 http://jsb.cs.uec.ac.jp/~igo/2008/result.html (first day)
 http://jsb.cs.uec.ac.jp/~igo/2008/result2.html (second day; final)
 
 1. Crazy Stone   (invited, first seed)
 2. Fudo Go
 3. Many Faces of Go
 4. Katsunari  (second seed)
 5. Aya   (fourth seed)
 6. RGO
 7. Gogonomitan
 8. agouti
 9. Boozer
 10. martha
 11. caren
 12. kinoa igo
 13. MC_ark
 14. Igoppi
 15. Kasumi
 16. MoGo  (invited, third seed)
 
 You can download the game record of the exhibition match from
 http://jsb.cs.uec.ac.jp/~igo/2008/kifu/aoba-crazystone.sgf
 
 Hideki
 --
 g...@nue.ci.i.u-tokyo.ac.jp (Kato)



Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Don Dailey
On Mon, 2008-12-15 at 17:30 -0500, Weston Markham wrote:
 Out of 3637 matches using 5k playouts, jrefgo won (i.e., was ahead
 after 10 moves, as estimated by gnugo) 1688 of them.  (46.4%)
 Out of 2949 matches using 100k playouts, jrefgo won 785.  (26.6%)
 
 It appears clear to me that increasing the number of playouts from 5k
 to 100k certainly degrades the performance of jrefgo.  Below, I am
 including the commands that I used to run the tests and tally the
 results.

Sometimes, it's possible for a bot to make a good move for the wrong
reason.  In other words, it doesn't understand the position.  It's also
possible to go the other way: a better program will play a weaker move
in a particular situation.

I remember once being shown a chess position where a master claimed
that weaker players are more likely to play the correct move - but it
went against a lot of the knowledge and patterns that strong players
have.

A Monte Carlo bot like the refbot is, in most positions, going to converge on
some specific move.   I think in the starting position it wants to
play e5, and it is going to play e5 with an infinite number of playouts,
whether that is the best move or not.   There will be many situations
where the move it wants to play is not the best, and so you can
surmise that it's more likely to play a good move with fewer playouts.

However, that by no means implies that it will play better with fewer
playouts.   It may play the worst move on the board too - the chances of
that happening increase as the number of playouts drops.   So this cuts
both ways.
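
For reference, the kind of pure-AMAF move selection being described looks
roughly like this (a hedged sketch, not Don's exact reference spec;
run_playout is an assumed helper that plays one uniformly random game from
the given position and reports which points each side occupied and who won):

from collections import defaultdict

def amaf_genmove(position, to_move, n_playouts, run_playout):
    wins = defaultdict(int)
    visits = defaultdict(int)
    for _ in range(n_playouts):
        black_points, white_points, black_won = run_playout(position)
        my_points = black_points if to_move == "black" else white_points
        my_win = black_won if to_move == "black" else not black_won
        for pt in my_points:      # all-moves-as-first: every point our side
            visits[pt] += 1       # occupied in the playout shares the result
            if my_win:
                wins[pt] += 1
    # As n_playouts grows these ratios converge, so the argmax - the bot's
    # choice in a given position - becomes effectively fixed.
    return max(visits, key=lambda pt: wins[pt] / visits[pt])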

With only 10 moves into the game, it's a possibility that a common trap,
sequence or pattern is being played and together the three programs
(refbot, mogo and gnugo) have conspired to make your results happen.
It could be that refbot really does play a better move, but the better
move is something that mogo is particularly good at handling.   This is
just speculation - but I think you have produced something that is like
a bad pseudo random number generator that has obvious patterns and
glitches.

Another way to look at this is that playing better may not correlate
well with increasing your chances of playing the best move - it's more
about increasing your chances of playing a good move,  or in my own
personal opinion it's about avoiding bad moves.



- Don






 



Results of the 2nd UEC Cup (Re: [computer-go] UEC cup)

2008-12-16 Thread Hideki Kato
Official results (only Japanese right now) are at:
http://jsb.cs.uec.ac.jp/~igo/2008/result.html (first day)
http://jsb.cs.uec.ac.jp/~igo/2008/result2.html (second day; final)

1. Crazy Stone  (invited, first seed)
2. Fudo Go
3. Many Faces of Go
4. Katsunari  (second seed)
5. Aya  (fourth seed)
6. RGO
7. Gogonomitan
8. agouti
9. Boozer
10. martha
11. caren
12. kinoa igo
13. MC_ark
14. Igoppi
15. Kasumi
16. MoGo  (invited, third seed)

You can download the game record of the exhibition match from
http://jsb.cs.uec.ac.jp/~igo/2008/kifu/aoba-crazystone.sgf

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)


Re: [computer-go] 9x9 MoGo vs human

2008-12-16 Thread David Goh
If you are using Outlook Express or a similar email
client, you can just right-click on the link and
choose Save Target As... to save the whole PDF file on
your hard disk first, then open the PDF file.  This will
solve the viewing problem with IE or whatever.

Thanks for the nice report anyway.

David

- Original Message - 
From: Nick Wedd n...@maproom.co.uk
To: computer-go computer-go@computer-go.org
Sent: Wednesday, December 17, 2008 1:01 AM
Subject: Re: [computer-go] 9x9 MoGo vs human


 In message
 aa5e3c330812160812k2d6c9de0k52672a41a464e...@mail.gmail.com, Olivier
 Teytaud teyt...@lri.fr writes
In the computer-Go event of Clermont-Ferrand,
MoGo played four 9x9 games, plus blitz games,
against Motoki Noguchi (chinese rules, komi 7.5);
the result is a draw - the games are presented and discussed in 
http://www.lri.fr/~teytaud/crClermont/cr.pdf
 
 Thank you for writing this very interesting report.  But it's a 40Mb pdf
 file, my Internet Explorer can't handle it at all, and my FireFox only
 with difficulty.  A more accessible version, perhaps without the
 high-resolution pictures, might reach more readers.
 
 Best wishes,
  Nick
 -- 
 Nick Wedd    n...@maproom.co.uk



Re: [computer-go] 9x9 MoGo vs human

2008-12-16 Thread Olivier Teytaud


 Thank you for writing this very interesting report.  But it's a 40Mb pdf
 file, my Internet Explorer can't handle it at all, and my FireFox only
 with difficulty.  A more accessible version, perhaps without the
 high-resolution pictures, might reach more readers.


Sorry for that :-)

http://www.lri.fr/~teytaud/crClermont/cr.pdf
is much easier to download now (10 times smaller).

Thanks to Terry for comments.

Best regards,
Olivier

Re: [computer-go] UEC cup

2008-12-16 Thread Michael Goetze

dave.de...@planet.nl wrote:
Also, a 4p is not a 7p. The difference should be about one stone. 4p is 
equivalent to 8d EGF.


I wish people would stop spreading such incorrect information. The 
correlation between professional ranks and playing strength is quite 
bad, and EGF 7dans are not, generally speaking, professional strength. 
Also, please note that some professional associations have different 
rules for male and female players.


If you find a Japanese 7p who can give a Korean 1p 2 stones and win, I 
will eat my hat...


Regards,
Michael


Re: [computer-go] 9x9 MoGo vs human

2008-12-16 Thread Nick Wedd
In message aa5e3c330812161339g98f42d0lbcd3212893475...@mail.gmail.com, 
Olivier Teytaud teyt...@lri.fr writes


 Thank you for writing this very interesting report.  But it's a 40Mb
 pdf
 file, my Internet Explorer can't handle it at all, and my FireFox
 only
 with difficulty.  A more accessible version, perhaps without the
 high-resolution pictures, might reach more readers.

Sorry for that :-)

http://www.lri.fr/~teytaud/crClermont/cr.pdf
is much easier to download now (10 times smaller).


Thank you!  That is much better, it even still has all the pictures.  My 
favourite is of Huygens' cooling system.


Nick


Thanks to Terry for comments.
Best regards,
Olivier


--
Nick Wedd    n...@maproom.co.uk


Re: [computer-go] UEC cup

2008-12-16 Thread Darren Cook
 If you find a Japanese 7p who can give a Korean 1p 2 stones and win, I
 will eat my hat...

No one mentioned Korean professionals. But, as far as I know, a Japanese
7p should be able to give a Japanese 1p 2 stones and win 50% of the
time. Roughly.

Darren


-- 
Darren Cook, Software Researcher/Developer
http://dcook.org/mlsn/ (English-Japanese-German-Chinese-Arabic
open source dictionary/semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Mon, Dec 15, 2008 at 5:47 PM, Don Dailey drdai...@cox.net wrote:
 Is Jrefgo the pure version that does not use tricks like the futures
 map?   If you use things like that, all bets are off - I can't be sure
 this is not negatively scalable.

I don't know, although I was under the impression that I had
downloaded the pure version.  I found a reference to the source here
on the list, and downloaded and compiled that.  When I get back home,
how would I quickly determine which is the case?

 You cannot draw any reasonable conclusions by stopping after 10 moves
 and letting gnugo judge the game either.Why didn't you play complete
 games?

I think that complete games would have to be at least one of:
1.  Against a similarly weak opponent.  This casts doubt on whether
the results apply against other opponents.
2.  Unlikely to be won by an AMAF program.  This makes their
differences hard to measure.
3.  Played with handicap stones.  The granularity seems too coarse on
9x9.  Nevertheless, it might be worthwhile to try this.
4.  Played with a komi that is very far from an even game.  In 9x9,
this would mean that the better player must control the entire board
in order to win.  At that point, komi is no longer useful as a
means for providing a handicap.

Originally (about two years ago), I ran studies such as this in order
to tune parameters that affected the playouts, and that I thought
could probably have different optimum values at different points in
the game.  Playing against an opponent that is generally stronger
makes it more likely that the improvements I find will apply
to opponents in general, rather than simply tuning my program against
one particular opponent.  Playing against a close relative of the same
program (e.g., pitting 5k against 100k directly) gives misleading
results, in my experience.  Often, both programs will be blind to the
same lines of play, allowing genuinely bad moves to go unpunished.

On Mon, Dec 15, 2008 at 6:10 PM, Mark Boon tesujisoftw...@gmail.com wrote:
 It would have been much more persuasive if you had simply run a 5K
 playout bot against a 100K bot and see which wins more. It shouldn't
 take much more than a day to gather a significant number of games.

I may do that, although personally I would be far more cautious about
drawing conclusions from those matches, as compared to ones played
against a strong reference opponent.  But I guess other people feel
differently about this.  Anyway, the results would still be
interesting to me no matter which way they went, even if they failed
to convince me of anything.

 I did in fact put up a 100K+ ref-bot on CGOS for a little while, and
 it ended up with a rating slightly (possibly insignificantly) higher
 than the 2K ref-bot. Maybe I didn't put it there long enough,
 certainly not for thousands of games. But it didn't look anywhere near
 to supporting your findings.

That doesn't particularly disagree with my conclusions either.  For
example, I would guess that the best overall performance is somewhere
around 5k-10k, so a program with a setting in that range would obtain
a higher rating than either the 2K bot or your 100K+ one.  I could easily be
wrong about that, though.

 I say 100K+ because I didn't set it to a specific number, just run as
 many as it could within time allowed. Generally it would reach well
 over 100K per move, probably more like 250K-500K. That should only
 make things worse according to your hypothesis.

Yes, this is what sparked the conversation originally.  When you
reported that a while ago, my reaction was, "Of course that won't work
very well; you're running way too many simulations."  I was actually a
bit surprised that no one else thought that this was as bad as I think
it is.

 So although I think the result of your experiment is very curious, I
 think it might be a bit hasty to draw your conclusion.

Yes, it very well may be.  As I mentioned, I ran a number of similar
experiments a couple years ago, for which I unfortunately lost the
results.  My recollection is that they typically indicated the same
thing, across a number of variations on my own program.  Performance
would improve up to a point, then degrade as the program's behavior
became essentially deterministic.  But I may have made mistakes in
those tests, or I could be misremembering.


On Tue, Dec 16, 2008 at 12:20 PM, Don Dailey dailey@gmail.com wrote:
 A Monte Carlo bot like the refbot is, in most positions, going to converge on
 some specific move.   I think in the starting position it wants to
 play e5, and it is going to play e5 with an infinite number of playouts,
 whether that is the best move or not.   There will be many situations
 where the move it wants to play is not the best, and so you can
 surmise that it's more likely to play a good move with fewer playouts.

Incidentally, when I get home, I'll post the line of play that follows
those moves with the highest (asymptotic) Monte Carlo values,
according to jrefgo.  I have 

Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Tue, Dec 16, 2008 at 7:34 PM, Weston Markham
weston.mark...@gmail.com wrote:
 And I believe that current
 Monte Carlo methods only really manage to avoid the very worst of the
 bad moves, regardless of how many playouts they run.

Um, perhaps I should qualify that as "pure Monte Carlo", meaning
without any form of tree search.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Michael Williams

Weston Markham wrote:

I say 100K+ because I didn't set it to a specific number, just run as
many as it could within time allowed. Generally it would reach well
over 100K per move, probably more like 250K-500K. That should only
make things worse according to your hypothesis.


Yes, this is what sparked the conversation originally.  When you
reported that a while ago, my reaction was, "Of course that won't work
very well; you're running way too many simulations."  I was actually a
bit surprised that no one else thought that this was as bad as I think
it is.



It seems like you are the only one who believes in your hypothesis, even after 
your experiments.

I agree that a simpler test should be used.  Another option is to use 
GtpStatistics to gather move prediction numbers.
But you will need a large sample of games to get the noise small enough.  But 
that approach may also have issues.
I would definitely try 5k vs 100k and also put them both on CGOS.


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Darren Cook
 It would have been much more persuasive if you had simply run a 5K
 playout bot against a 100K bot and see which wins more. ...
 
 I may do that, although personally I would be far more cautious about
 drawing conclusions from those matches, as compared to ones played
 against a strong reference opponent.  ...

The thing is, the 100K bot should either win more, or be the same
strength. If you can show, with statistical confidence, that it is
actually weaker then people have to sit up and pay attention.

(If not you may still be on to something, but it will be harder to prove.)
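
For scale, here is the sort of confidence check meant by that, applied to the
counts Weston reported earlier in this thread (1688 gnugo-judged wins out of
3637 games at 5k, versus 785 out of 2949 at 100k).  A rough Python sketch: it
only measures sampling noise, and says nothing about whether gnugo's judgement
after 10 moves can be trusted.

import math

def two_proportion_z(w1, n1, w2, n2):
    # z-score for the difference between two independently sampled win rates.
    p1, p2 = w1 / n1, w2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

print(round(two_proportion_z(1688, 3637, 785, 2949), 1))
# roughly 17 standard errors, so the gap itself is not sampling noise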

I'd also like to second Mark Boon's statement that Gnugo is not an
expert judge, especially not after only 10 moves. One experiment I did,
a couple of years ago, was scoring lots of terminal or almost-terminal
9x9 positions with gnugo and crazy stone, and they disagreed a lot of
the time. (Sorry, I don't remember what a lot was, maybe 10% or so.)

Darren



Re: [computer-go] UEC cup

2008-12-16 Thread Hideki Kato
Darren Cook: 49483abe.9070...@dcook.org:
 If you find a Japanese 7p who can give a Korean 1p 2 stones and win, I
 will eat my hat...

No one mentioned Korean professionals. But, as far as I know, a Japanese
7p should be able to give a Japanese 1p 2 stones and win 50% of the
time. Roughly.

I don't agree.  Japanese professionals' ranks never decrease.  It's not
rare in general for a young 4p to be stronger than an aged 9p.  Aoba 4p
is not, however, such a case.

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 12:34 AM, Weston Markham
weston.mark...@gmail.com wrote:
 I don't know, although I was under the impression that I had
 downloaded the pure version.  I found a reference to the source here
 on the list, and downloaded and compiled that.  When I get back home,
 how would I quickly determine which is the case?

The program reports 081016-2022 to the GTP version command.


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Mark Boon
By the way, what does scratch100k.sh look like?


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 12:51 AM, Darren Cook dar...@dcook.org wrote:
 I'd also like to second Mark Boon's statement that Gnugo is not an
 expert judge, especially not after only 10 moves. One experiment I did,
 a couple of years ago, was scoring lots of terminal or almost-terminal
 9x9 positions with gnugo and crazy stone, and they disagreed a lot of
 the time. (Sorry, I don't remember what a lot was, maybe 10% or so.)

Hmm.  I agree as well.  I see this as the biggest weakness in the
experiment.  The weaknesses of gnugo could very well favor 5k's end
positions over 100k's, for irrelevant reasons.  The earlier
experiments I had run also used gnugo.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 1:32 AM, Mark Boon tesujisoftw...@gmail.com wrote:
 By the way, what does scratch100k.sh look like?

../gogui-1.1.3/bin/gogui-twogtp -auto -black "java -jar jrefgo.jar 10" \
  -games 1 -komi 0.5 -maxmoves 10 \
  -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" \
  -sgffile games/jr100k-v-mogo-10m -size 9 \
  -white "`cygpath -w /home/Experience/projects/go/MoGo_release3/mogo`"


Re: [computer-go] UEC cup

2008-12-16 Thread Darren Cook
 No one mentioned Korean professionals. But, as far as I know, a Japanese
 7p should be able to give a Japanese 1p 2 stones and win 50% of the
 time. Roughly.
 
 I don't agree.  Japanese professionals' ranks never decrease.

Hi,
Are we talking about different things? All I meant to say was that I
thought in Japanese professional ranks that one rank is worth a third of
a handicap stone. So when there are 6 ranks difference then two handicap
stones should give an even game.

I also think a 1-dan Japanese professional is equivalent to about a
7-dan amateur. So, going back to the original 7 handicap against a 4p
situation, then if it is an even game it implies black is about 1 dan
(Japanese).

With all the usual disclaimers about the large error margin on a sample
of just 1 game :-).

Darren


-- 
Darren Cook, Software Researcher/Developer
http://dcook.org/mlsn/ (English-Japanese-German-Chinese-Arabic
open source dictionary/semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Mark Boon

On Tue, Dec 16, 2008 at 11:35 PM, Weston Markham
weston.mark...@gmail.com wrote:
 On Wed, Dec 17, 2008 at 1:32 AM, Mark Boon tesujisoftw...@gmail.com wrote:
 By the way, what does scratch100k.sh look like?

 ../gogui-1.1.3/bin/gogui-twogtp -auto -black "java -jar jrefgo.jar 10" \
   -games 1 -komi 0.5 -maxmoves 10 \
   -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" \
   -sgffile games/jr100k-v-mogo-10m -size 9 \
   -white "`cygpath -w /home/Experience/projects/go/MoGo_release3/mogo`"

Thanks. I just realized that you set the komi to 0.5. That doesn't
sound like a good idea. I wanted to make sure you had the same for the
100k version. Were your earlier experiments also with 0.5 komi? MC
programs are highly sensitive to komi, so I'd use something more
reasonable, like 6.5 or 7.5.

Mark


Re: [computer-go] UEC cup

2008-12-16 Thread Mark Boon
On Tue, Dec 16, 2008 at 8:52 PM, Michael Goetze mgoe...@mgoetze.net wrote:
 I wish people would stop spreading such incorrect information. The
 correlation between professional ranks and playing strength is quite bad,
 and EGF 7dans are not, generally speaking, professional strength.

I'm not claiming to be an authority on the matter, but I beg to
differ. Name me an EGF 7-dan that's not professional level. And then
explain how come they are listed among players that are anywhere from
1p to 5p in different Asian countries. I used to be an EGF 6-dan and
have beaten top 9p players with 3 stones on occasion. For a while I
had a Japanese 2p teacher but stopped taking lessons when I started to
beat him on black pretty consistently. That was when I was still
5-dan. So I don't think it's so far off to say 7-dan amateur is pro
level.

The main problem is that ranks of different countries differ
considerably, even for professionals. I also think as an amateur my
chances would have been much lower had there been anything at stake
for the pro.

But it's little use to quibble about it. If CrazyStone is 1-dan or
more, it will become clear sooner or later. It's just a matter of time
and enough games.

Mark


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Don Dailey
On Tue, 2008-12-16 at 19:34 -0500, Weston Markham wrote:
 I may do that, although personally I would be far more cautious about
 drawing conclusions from those matches, as compared to ones played
 against a strong reference opponent.  But I guess other people feel
 differently about this.  Anyway, the results would still be
 interesting to me no matter which way they went, even if they failed
 to convince me of anything.

It's my opinion that it's bad to test against a single opponent.
Ideally, if it is possible to arrange, you want to test against a variety
of opponents that are genuinely different: not based on the same code, nor
even similar in design.   I don't think it's possible any longer to avoid
MCTS-based bots - but it is possible with the reference bot.

Also, you want to normalize the strength.  You want to find opponents
that play close to the same strength.   You may have to manipulate the
playing levels, of course, to achieve this.   It's basically a waste of
resources to play opponents that are going to beat you most of the time,
or that you are going to beat most of the time, because it takes more
games to zero in on your actual performance with any accuracy.   For
instance, if you can only win 1 out of 100 games, how much are you going
to learn from 100 games?   You will either lose all 100, or possibly win
1 or 2 games, and you cannot ascertain with much precision how strong the
program is.
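
As a rough illustration of that last point (plain arithmetic, not specific to
any bot): the +/- one-standard-error band on a win rate measured from 100
games, and the rating band it implies under the usual logistic Elo model, at a
few true win probabilities.

import math

def elo_diff(p):
    # Rating difference implied by win probability p under the logistic model.
    return 400.0 * math.log10(p / (1.0 - p))

games = 100
for p in (0.50, 0.25, 0.05, 0.01):
    se = math.sqrt(p * (1.0 - p) / games)
    lo = elo_diff(max(p - se, 1e-6))
    hi = elo_diff(min(p + se, 1.0 - 1e-6))
    print(p, "-> win-rate SE", round(se, 3),
          ", rating band roughly", round(lo), "to", round(hi))

Near 50% the band is a few tens of rating points wide; near 1% it spans many
hundreds, which is the sense in which lopsided pairings waste games.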

- Don





Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 2:07 AM, Mark Boon tesujisoftw...@gmail.com wrote:
 Thanks. I just realized that you set the komi to 0.5. That doesn't
 sound like a good idea. I wanted to make sure you had the same for the
 100k version. Were your earlier experiments also with 0.5 komi? MC
 programs are highly sensitive to komi, so I'd use something more
 reasonable, like 6.5 or 7.5.

Ah.  I'm sorry if I failed to draw attention to that in my original description.

I believe that some of my experiments were very similar to this one.
I don't really recall the details although I did pick the values used
here of 5k, 100k, and 10 moves based on my recollection.  I think that
originally I may have measured the difference between the two over
moves 5-10 or 5-15.  And of course the Monte Carlo program was my own,
which had some minor differences.  More significantly, I would
probably have used gnugo as the reference program for those
experiments.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Don Dailey
On Tue, 2008-12-16 at 19:34 -0500, Weston Markham wrote:
 I don't know, although I was under the impression that I had
 downloaded the pure version.  I found a reference to the source here
 on the list, and downloaded and compiled that.  When I get back home,
 how would I quickly determine which is the case?

Is it the java version?  I believe there is only one version of that and
it's the pure reference bot.   I did make modifications to a C version
but I think I kept that private.

- Don




Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Tue, Dec 16, 2008 at 7:34 PM, Weston Markham
weston.mark...@gmail.com wrote:
 Incidentally, when I get home, I'll post the line of play that follows
 those moves with the highest (asymptotic) Monte Carlo values,
 according to jrefgo.  I have about 18 moves calculated with high
 accuracy.

Here is a .sgf for the first 17 moves.   They are:

E5 D5 D6 E6 F6 E7 F5 D4 C6 F7 G7 G6 G5 D7 C7 C8 D8

The 18th move is very nearly a tie between E8 and F8, although I
think F8 eventually wins.

(;FF[4]CA[UTF-8]AP[GoGui:1.1.3]SZ[9]
KM[6.5]PB[JrefBot:081016-2022]PW[JrefBot:081016-2022]DT[2008-12-04]RE[B+34.5]
C[Black command: java -jar jrefgo.jar 10
White command: java -jar jrefgo.jar 10
Black version: 081016-2022
White version: 081016-2022
Opening: openings/opening2.sgf
Result[Black\]: ?
Result[White\]: ?
Referee: gnugo --level 0 --mode gtp --score estimate
--capture-all-dead --chinese-rules
Result[Referee\]: B+34.5
Host: localhost.localdomain (Pentium III (Coppermine))
Date: December 5, 2008 2:05:47 PM EST]
;B[ee];W[de];B[dd];W[ed];B[fd];W[ec];B[fe];W[df];B[cd];W[fc]
;B[gc];W[gd];B[ge];W[dc];B[cc];W[cb];B[db]
)


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 2:38 AM, Don Dailey dailey@gmail.com wrote:
 Is it the java version?  I believe there is only one version of that and
 it's the pure reference bot.   I did make modification to a C version
 but I think I kept that private.

Yes, it is the Java version.


Re: [computer-go] UEC cup

2008-12-16 Thread Hideki Kato

Darren Cook: 49485a64.5080...@dcook.org:
 No one mentioned Korean professionals. But, as far as I know, a Japanese
 7p should be able to give a Japanese 1p 2 stones and win 50% of the
 time. Roughly.
 
  I don't agree.  Japanese professionals' ranks never decrease.

Hi,
Are we talking about different things? All I meant to say was that I
thought in Japanese professional ranks that one rank is worth a third of
a handicap stone. So when there are 6 ranks difference then two handicap
stones should give an even game.

I also think a 1-dan Japanese professional is equivalent to about a
7-dan amateur. So, going back to the original 7 handicap against a 4p
situation, then if it is an even game it implies black is about 1 dan
(Japanese).

I'd like to say it's very hard, even almost impossible, to map the
ranks of Japanese professionals to the ranks of amateur players.

Japanese players usually estimate a player's rank from the game itself.  For
the 7-handicap Crazy Stone vs Aoba 4p game, all the strong amateurs (above
5d) who were watching the game said Crazy Stone was playing like 5d
or even 6d.

The mystery, however, is that it was the same executable as in the exhibition
match at FIT2008 in September this year, which was an 8-handicap game
against Aoba 4p, and there Crazy Stone did not play as excellently as this
time.
https://secure1.gakkai-web.net/gakkai/fit/program/html/event/event.html#6 
(in Japanese)

With all the usual disclaimers about the large error margin on a sample
of just 1 game :-).

Agree. :-)

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)


Re: Results of the 2nd UEC Cup (Re: [computer-go] UEC cup)

2008-12-16 Thread Hideki Kato
Hello David,

David Fotland: 00ca01c95fa2$5ee6bb50$1cb431...@com:
Thank you for the results.  Thank you for providing a machine and letting
Many Faces participate, even though I could not travel to Japan.

Thank you for participating, and for an excellent and interesting
game against Crazy Stone in the semi-final.

Although Professionals Tei and Aoba explained the match from the
front stage with a projection, the game was so complicated that I
couldn't see which side was winning until near the end.  The other semi-final
match, my Fudo Go vs Katsunari, was also shown on the screen, but in a
small picture in the upper right corner, and was explained only briefly.
Yes, all the people, including me :), were concentrating on your game
and excited by it.

Is it true that the final was a single elimination tournament, and not a
Swiss tournament?  It seems that Many Faces never played Fudo Go.  In future
tournaments, please consider using the Swiss tournament system.  Most people
would agree that it gives more accurate results.

You're right.  The final of the UEC Cup is not Swiss.

Thank you for the comments.  As I'm not on the staff of the tournament I
can't say anything about its future, but I'll pass your comments on to
the staff.

I agree it's less accurate, but personally I don't think it's better to
use the Swiss style for the final.

I guess the major reason for not using the Swiss style is the larger number
of participants (24 this year) and the shorter duration of the tournament
(two days, including an exhibition match) compared with, for example, the
Computer Olympiad.

As for accuracy, we have the Olympiad, which is Swiss or even round
robin this year.  I don't think we need other tournaments of the same
style.  I'd like to add that, even with single elimination, all the people
could see by watching the games that MFG was clearly stronger than Fudo Go
or Katsunari.

Also, this style, Swiss for the preliminary and single elimination for
the final, is common in Japan.  The World Computer Shogi Championship uses
the same style, for example.

Lastly, I (and probably most of the participants and spectators) found
it, more or less, exciting and interesting.  If you had been here,
I strongly believe you would have shared that feeling.

Again, I'm not on the staff of the UEC Cup, and the above are just my
personal opinions.

Regards,

Hideki

Regards,

David

 -Original Message-
 From: computer-go-boun...@computer-go.org [mailto:computer-go-
 boun...@computer-go.org] On Behalf Of Hideki Kato
 Sent: Tuesday, December 16, 2008 4:16 AM
 To: computer-go
 Subject: Results of the 2nd UEC Cup (Re: [computer-go] UEC cup)
 
 Official results (only Japanese right now) are at:
 http://jsb.cs.uec.ac.jp/~igo/2008/result.html (first day)
 http://jsb.cs.uec.ac.jp/~igo/2008/result2.html (second day; final)
 
 1. Crazy Stone   (invited, first seed)
 2. Fudo Go
 3. Many Faces of Go
 4. Katsunari (second seed)
 5. Aya   (fourth seed)
 6. RGO
 7. Gogonomitan
 8. agouti
 9. Boozer
 10. martha
 11. caren
 12. kinoa igo
 13. MC_ark
 14. Igoppi
 15. Kasumi
 16. MoGo (invited, third seed)
 
 You can download the game record of the exhibition match from
 http://jsb.cs.uec.ac.jp/~igo/2008/kifu/aoba-crazystone.sgf
 
 Hideki
 --
 g...@nue.ci.i.u-tokyo.ac.jp (Kato)

--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)