Re: [computer-go] Dynamic komi

2009-08-19 Thread Weston Markham
I'm curious to find out what is meant by "lazy".  If, as I am led to
believe by your report, Monte Carlo strategies applied to Double Step
Races are "lazy", yet they converge to perfect play, then I'm not sure
why we are meant to worry.  I certainly understand that the strategies
can converge faster in some cases than in others.  I would expect that
this could be used to one's advantage, by using Monte Carlo to
evaluate a related game, that has similar correct moves, but for which
the Monte Carlo evaluation is more accurate.  But it isn't clear how
one is supposed to transform games like his 6-vs-6 into 6-vs-5,
without already knowing enough to completely solve the game!
Specifically, is there a way to do this that doesn't _also_ convert
6-vs-5 into 6-vs-4?

Switching back to go, I would like to point out that it seems to me to
be a serious mistake to apply UCT to any game other than the one at
hand.  If you do some dynamic fiddling with komi, you really ought
to only do it in playouts, and you should ensure that as the length of
the playout decreases, the amount of fiddling approaches zero.  That
way, once UCT has expanded a deep line of play, it will be working
with accurate win/loss values, instead of some fantasy based on a
globally-applied dynamic komi.  This seems very much in line with what
others have already suggested regarding the insertion of passes or
other ways of degrading the play in the playouts.  I suspect that
adjusting komi in the above manner may be an even better solution.
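The tapering idea in the last paragraph can be sketched concretely.  This is only my illustration of one possible schedule (the function name, the linear fade, and the parameters are all assumptions, not anything from an actual engine): the komi offset handed to the playout scorer shrinks to zero as the playout starts closer to the end of the game, so deeply-expanded nodes are scored against the true komi.

```python
def playout_komi(true_komi, dynamic_offset, moves_played, expected_game_length):
    """Komi used to score a playout starting after `moves_played` moves.

    The dynamic offset (the "fiddling") fades linearly to zero as the
    remaining playout length shrinks, so positions deep in the tree are
    evaluated with accurate win/loss values at the real komi.  The
    linear schedule is an arbitrary choice; any schedule that reaches
    zero satisfies the argument above.
    """
    remaining = max(0, expected_game_length - moves_played)
    fade = remaining / float(expected_game_length)
    return true_komi + dynamic_offset * fade
```

For example, with a true komi of 7.5 and a hypothetical -20 point offset, root playouts are scored at -12.5 komi, while playouts launched at move 400 of a 400-move horizon are scored at the true 7.5.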

Weston


On Wed, Aug 19, 2009 at 1:45 PM, Ingo Althöfer 3-hirn-ver...@gmx.de wrote:
 Don wrote:
 But how do you create the required tension in a way that
 produces a program that plays the game better?

 At least in high handicap go on 19x19 (with the dynamic bot being
 the stronger player) it seems to work when the bot is kept
 in some 35-45 % corridor, as long as it is clearly behind.

 Ingo.


 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/



Re: [computer-go] New CGOS

2009-06-05 Thread Weston Markham
On Fri, Jun 5, 2009 at 5:59 PM, Christoph Birk b...@ociw.edu wrote:
 Handicap games are for humans ... they get frustrated losing
 over and over. Computers have no problems with that.

2009/6/5 terry mcintyre terrymcint...@yahoo.com:
 The purpose of a handicap games is to allow a 50% chance of either
 contestant winning.

A handicap system based on time controls could be appropriate for many
computer programs.  Not that I expect Don to implement any such thing
any time soon.  I just think that it is worth mentioning.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-12-17 Thread Weston Markham
On Mon, Dec 15, 2008 at 11:10 PM, Mark Boon tesujisoftw...@gmail.com wrote:
 It would have been much more persuasive if you had simply run a 5K
 playout bot against a 100K bot and see which wins more.

In 200 games, 100k beat 5k a total of 127 times.  So that's about a
63.5% win rate.


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Mon, Dec 15, 2008 at 5:47 PM, Don Dailey drdai...@cox.net wrote:
 Is Jrefgo the pure version that does not use tricks like the futures
 map?   If you use things like that, all bets are off - I can't be sure
 this is not negatively scalable.

I don't know, although I was under the impression that I had
downloaded the pure version.  I found a reference to the source here
on the list, and downloaded and compiled that.  When I get back home,
how would I quickly determine which is the case?

 You cannot draw any reasonable conclusions by stopping after 10 moves
and letting gnugo judge the game either.  Why didn't you play complete
 games?

I think that complete games would have to be at least one of:
1.  Against a similarly weak opponent.  This casts doubt on whether
the results apply against other opponents.
2.  Unlikely to be won by an AMAF program.  This makes their
differences hard to measure.
3.  Played with handicap stones.  The granularity seems too coarse on
9x9.  Nevertheless, it might be worthwhile to try this.
4.  Played with a komi that is very far from an even game.  In 9x9,
this would mean that the better player must control the entire board
in order to win.  At that point, komi is no longer useful as a
means of providing a handicap.

Originally, (about two years ago) I ran studies such as this in order
to tune parameters that affected the playouts, and that I thought
could probably have different optimum values at different points in
the game.  Playing against an opponent that is generally stronger
makes it more likely that the improvements I find will apply to
opponents in general, rather than simply tuning my program against
one particular opponent.  Playing against a close relative of the same
program (e.g., pitting 5k against 100k directly) gives misleading
results, in my experience.  Often, both programs will be blind to the
same lines of play, allowing genuinely bad moves to go unpunished.

On Mon, Dec 15, 2008 at 6:10 PM, Mark Boon tesujisoftw...@gmail.com wrote:
 It would have been much more persuasive if you had simply run a 5K
 playout bot against a 100K bot and see which wins more. It shouldn't
 take much more than a day to gather a significant number of games.

I may do that, although personally I would be far more cautious about
drawing conclusions from those matches, as compared to ones played
against a strong reference opponent.  But I guess other people feel
differently about this.  Anyway, the results would still be
interesting to me no matter which way they went, even if they failed
to convince me of anything.

 I did in fact put up a 100K+ ref-bot on CGOS for a little while, and
 it ended up with a rating slightly (possibly insignificantly) higher
 than the 2K ref-bot. Maybe I didn't put it there long enough,
 certainly not for thousands of games. But it didn't look anywhere near
 to supporting your findings.

That doesn't particularly disagree with my conclusions either.  For
example, I would guess that the best overall performance is somewhere
around 5k-10k, so a program with a setting in that range would obtain
a higher rating than either of 2K and your 100K+.  I could easily be
wrong about that, though.

 I say 100K+ because I didn't set it to a specific number, just ran as
 many as it could within the time allowed. Generally it would reach well
 over 100K per move, probably more like 250K-500K. That should only
 make things worse according to your hypothesis.

Yes, this is what sparked the conversation originally.  When you
reported that a while ago, my reaction was, "Of course that won't work
very well; you're running way too many simulations."  I was actually a
bit surprised that no one else thought that this was as bad as I think
it is.

 So although I think the result of your experiment is very curious, I
 think it might be a bit hasty draw your conclusion.

Yes, it very well may be.  As I mentioned, I ran a number of similar
experiments a couple years ago, for which I unfortunately lost the
results.  My recollection is that they typically indicated the same
thing, across a number of variations on my own program.  Performance
would improve up to a point, then degrade as the program's behavior
became essentially deterministic.  But I may have made mistakes in
those tests, or I could be misremembering.


On Tue, Dec 16, 2008 at 12:20 PM, Don Dailey dailey@gmail.com wrote:
 A monte carlo bot like refbot, in most positions is going to converge on
 some specific move.   I think in the starting position it wants to
 play e5 and it is going to play e5 with an infinite number of playouts,
 whether that is the best move or not.  There will be many situations
 where the move it wants to play is not the best, and so you can
 surmise that it's more likely to play a good move with fewer playouts.

Incidentally, when I get home, I'll post the line of play that follows
those moves with the highest (asymptotic) Monte Carlo values,
according to jrefgo.  I have 

Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Tue, Dec 16, 2008 at 7:34 PM, Weston Markham
weston.mark...@gmail.com wrote:
 And I believe that current
 Monte Carlo methods only really manage to avoid the very worst of the
 bad moves, regardless of how many playouts they run.

Um, perhaps I should qualify that as "pure" Monte Carlo, meaning
without any form of tree search.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 12:34 AM, Weston Markham
weston.mark...@gmail.com wrote:
 I don't know, although I was under the impression that I had
 downloaded the pure version.  I found a reference to the source here
 on the list, and downloaded and compiled that.  When I get back home,
 how would I quickly determine which is the case?

The program reports 081016-2022 to the GTP version command.


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 12:51 AM, Darren Cook dar...@dcook.org wrote:
 I'd also like to second Mark Boon's statement that Gnugo is not an
 expert judge, especially not after only 10 moves. One experiment I did,
 a couple of years ago, was scoring lots of terminal or almost-terminal
 9x9 positions with gnugo and crazy stone, and they disagreed a lot of
 the time. (Sorry, I don't remember what "a lot" was, maybe 10% or so.)

Hmm.  I agree as well.  I see this as the biggest weakness in the
experiment.  The weaknesses of gnugo could very well favor 5k's end
positions over 100k's, for irrelevant reasons.  The earlier
experiments I had run also used gnugo.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 1:32 AM, Mark Boon tesujisoftw...@gmail.com wrote:
 By the way, what does scratch100k.sh look like?

../gogui-1.1.3/bin/gogui-twogtp -auto -black "java -jar jrefgo.jar 10" \
  -games 1 -komi 0.5 -maxmoves 10 \
  -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" \
  -sgffile games/jr100k-v-mogo-10m -size 9 \
  -white "`cygpath -w /home/Experience/projects/go/MoGo_release3/mogo`"


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 2:07 AM, Mark Boon tesujisoftw...@gmail.com wrote:
 Thanks. I just realized that you set the komi to 0.5. That doesn't
 sound like a good idea. I wanted to make sure you had the same for the
 100k version. Were your earlier experiments also with 0.5 komi? MC
 programs are highly sensitive to komi, so I'd use something more
 reasonable, like 6.5 or 7.5.

Ah.  I'm sorry if I failed to draw attention to that in my original description.

I believe that some of my experiments were very similar to this one.
I don't really recall the details although I did pick the values used
here of 5k, 100k, and 10 moves based on my recollection.  I think that
originally I may have measured the difference between the two over
moves 5-10 or 5-15.  And of course the Monte Carlo program was my own,
which had some minor differences.  More significantly, I would
probably have used gnugo as the reference program for those
experiments.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Tue, Dec 16, 2008 at 7:34 PM, Weston Markham
weston.mark...@gmail.com wrote:
 Incidentally, when I get home, I'll post the line of play that follows
 those moves with the highest (asymptotic) Monte Carlo values,
 according to jrefgo.  I have about 18 moves calculated with high
 accuracy.

Here is a .sgf for the first 17 moves.   They are:

E5 D5 D6 E6 F6 E7 F5 D4 C6 F7 G7 G6 G5 D7 C7 C8 D8

The 18th move is very nearly a tie between E8 and F8, although I
think F8 eventually wins.

(;FF[4]CA[UTF-8]AP[GoGui:1.1.3]SZ[9]
KM[6.5]PB[JrefBot:081016-2022]PW[JrefBot:081016-2022]DT[2008-12-04]RE[B+34.5]
C[Black command: java -jar jrefgo.jar 10
White command: java -jar jrefgo.jar 10
Black version: 081016-2022
White version: 081016-2022
Opening: openings/opening2.sgf
Result[Black\]: ?
Result[White\]: ?
Referee: gnugo --level 0 --mode gtp --score estimate
--capture-all-dead --chinese-rules
Result[Referee\]: B+34.5
Host: localhost.localdomain (Pentium III (Coppermine))
Date: December 5, 2008 2:05:47 PM EST]
;B[ee];W[de];B[dd];W[ed];B[fd];W[ec];B[fe];W[df];B[cd];W[fc]
;B[gc];W[gd];B[ge];W[dc];B[cc];W[cb];B[db]
)


Re: [computer-go] RefBot (thought-) experiments

2008-12-16 Thread Weston Markham
On Wed, Dec 17, 2008 at 2:38 AM, Don Dailey dailey@gmail.com wrote:
 Is it the java version?  I believe there is only one version of that and
 it's the pure reference bot.   I did make modification to a C version
 but I think I kept that private.

Yes, it is the Java version.


Re: [computer-go] RefBot (thought-) experiments

2008-12-15 Thread Weston Markham
Hi.  This is a continuation of a month-old conversation about the
possibility that the quality of AMAF Monte Carlo can degrade, as the
number of simulations increases:

Me:  "running 10k playouts can be significantly worse than running 5k playouts."

On Tue, Nov 18, 2008 at 2:27 PM, Don Dailey drdai...@cox.net wrote:
 On Tue, 2008-11-18 at 14:17 -0500, Weston Markham wrote:
 On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams
 michaelwilliam...@gmail.com wrote:
  It doesn't make any sense to me from a theoretical perspective.  Do
  you have empirical evidence?

 I used to have data on this, from a program that I think was very
 nearly identical to Don's reference spec.  When I get a chance, I'll
 try to reproduce it.

 Unless the difference is large, you will have to run thousands of games
 to back this up.

 - Don

I am comparing the behavior of the AMAF reference bot with 5000
playouts against the behavior with 100,000 playouts, and I am only
considering the first ten moves (five from each player) of the (9x9)
games.  I downloaded a copy of Don's reference bot, as well as a copy
of Mogo, which is used as an opponent for each of the two settings.
gnugo version 3.7.11 is also used, in order to judge which side won
(jrefgo or mogo) after each individual match.  gnugo was used because
it is simple to set it up for this sort of thing via command-line
options, and it seems plausible that it should give a somewhat
realistic assessment of the situation.

jrefgo always plays black, and Mogo plays white.  Komi is set to 0.5,
so that jrefgo has a reasonable number of winning lines available to
it, although the general superiority of Mogo means that egregiously
bad individual moves will be punished.

In the games played, Mogo would occasionally crash.  (This was run
under Windows Vista; perhaps there is some incompatibility in the
binary I downloaded.)  I have discarded these games (about 1 out of 50,
I think) from the statistics gathered.  As far as I know, there would
be no reason to think that this would skew the comparison between 5k
playouts and 100k playouts.  Other than occasional crashes, the
behavior of Mogo seemed reasonable in other games that I observed.  I
have no reason to think that it was not playing at a relatively high
level in the retained results.

Out of 3637 matches using 5k playouts, jrefgo won (i.e., was ahead
after 10 moves, as estimated by gnugo) 1688 of them.  (46.4%)
Out of 2949 matches using 100k playouts, jrefgo won 785.  (26.6%)

It appears clear to me that increasing the number of playouts from 5k
to 100k certainly degrades the performance of jrefgo.  Below, I am
including the commands that I used to run the tests and tally the
results.

Weston


$ cat scratch5k.sh

../gogui-1.1.3/bin/gogui-twogtp -auto \
  -black "\"C:\Program Files\Java\jdk1.6.0_06\bin\java.exe\" -jar jrefgo.jar 5000" \
  -games 1 -komi 0.5 -maxmoves 10 \
  -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" \
  -sgffile games/jr5k-v-mogo -size 9 \
  -white C:\cygwin\home\Experience\projects\go\MoGo_release3\mogo.exe


$ grep B+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1688

$ grep W+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1949

$ grep B+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
785

$ grep W+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
2164
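For what it's worth, a quick two-proportion check (my addition, using a normal approximation; not part of the original tally) confirms that the 46.4% vs 26.6% gap over these sample sizes is far outside sampling noise:

```python
import math

def two_proportion_z(wins1, n1, wins2, n2):
    """z-statistic for the difference between two observed win rates."""
    p1, p2 = wins1 / n1, wins2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# 5k playouts: 1688 wins of 3637; 100k playouts: 785 wins of 2949
z = two_proportion_z(1688, 3637, 785, 2949)  # about 17 standard errors
```

A difference of roughly seventeen standard errors leaves essentially no room for the gap to be a sampling artifact, whatever one thinks of gnugo as a judge.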


Re: [computer-go] RefBot (thought-) experiments

2008-11-18 Thread Weston Markham
On Mon, Nov 17, 2008 at 11:13 PM, Michael Williams
[EMAIL PROTECTED] wrote:
 No one ever alleged that pure AMAF or pure MC was infinitely scalable.

My point is that in many cases, they don't even keep all of their
benefits, after some number of trials have been run.  So, running 10k
playouts can be significantly worse than running 5k playouts.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-11-18 Thread Weston Markham
On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams
[EMAIL PROTECTED] wrote:
 It doesn't make any sense to me from a theoretical perspective.  Do you have
 empirical evidence?

I used to have data on this, from a program that I think was very
nearly identical to Don's reference spec.  When I get a chance, I'll
try to reproduce it.

Weston


Re: [computer-go] RefBot (thought-) experiments

2008-11-17 Thread Weston Markham
On Mon, Nov 17, 2008 at 2:30 PM, Don Dailey [EMAIL PROTECTED] wrote:
 On Mon, 2008-11-17 at 16:04 -0200, Mark Boon wrote:
 On another note, as an experiment I have a bot running on CGOS that
 is the ref-bot but instead of using a fixed number of simulations I
 use a fixed amount of time that slowly diminishes towards the end of
 the game. The result is it does about 200K simulations per move for
 most of the game on a single processor. Its rating is currently
 stuck ...

 After 2k playouts you are already facing serious diminishing returns -
 but there is still a bit more to be had.  But after 5-10K playouts you
 get almost nothing.

I think that I have seen this sort of thing with Monte Carlo programs,
and I think it is possible to get even less than "almost nothing".
You may be getting overly-precise measurements of the Monte Carlo
values of the moves near the beginning of the game, so that the played
moves are biased toward lines of play where the Monte Carlo values are
unrealistically good.  (This could be thought of as being somewhat
analogous to a horizon effect.)

Put another way, the Monte Carlo values do tend to accurately
distinguish the relatively good moves from the relatively poor moves,
(which of course makes them very useful) but at any given position,
you can't expect them to give the best score to the best move, even in
the limit.  As you run additional playouts, you can be more and more
confident that your program has identified the move with the best
Monte Carlo value.  But suppose that there are other moves that are
equally good (or better) under perfect play (or against a particular
opponent).  Then any supposed superiority of the program's selected
move over those alternatives is entirely due to inaccuracies of the
Monte Carlo value.  So once you are running enough playouts to detect
those differences, it also becomes more likely that subsequent
positions will encounter these same inaccuracies.
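A toy model (mine, not from the thread) makes the point measurable: two moves are equally good under perfect play, but the playout policy inflates one move's Monte Carlo value by a couple of percent.  With more playouts the program locks onto the inflated move almost deterministically; the extra precision only buys more confidence in a biased estimate.

```python
import random

def pick_rate(n_playouts, trials=200, bias=0.02, seed=1):
    """Fraction of trials in which the move with the inflated MC value
    (true value 0.5, playout estimate centered on 0.5 + bias) scores
    strictly higher than its equally good rival."""
    rng = random.Random(seed)
    chosen = 0
    for _ in range(trials):
        a = sum(rng.random() < 0.5 + bias for _ in range(n_playouts))
        b = sum(rng.random() < 0.5 for _ in range(n_playouts))
        if a > b:
            chosen += 1
    return chosen / trials

# With 100 playouts the choice is still noisy; with 5000 it is nearly
# deterministic, always favoring the move whose MC value is inflated.
low, high = pick_rate(100), pick_rate(5000)
```

Both moves lead to the same game-theoretic result, so the growing preference for one of them is driven entirely by the inaccuracy of the Monte Carlo value, exactly as described above.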

Weston


Re: [computer-go] small study

2008-10-27 Thread Weston Markham
On Sun, Oct 26, 2008 at 4:22 PM, Don Dailey [EMAIL PROTECTED] wrote:
 I imagine that it approaches some hypothetical
 level in an asymptotic way.

For most board positions, it is reasonable to expect that there exists
a single move for which the asymptotic Monte Carlo value is higher
than the rest.  Even when this is not the case, the different moves
with the same value are generally symmetrical, and the only possible
strategic advantage of one of these moves over another would be due to
superko.  So I expect that the asymptotic behavior is very nearly
deterministic.

In the past, I have had difficulties in evaluating the overall
strength of such players, because I would not get a good variety of
positions in the games played.  If you continue with higher numbers of
playouts, you might want to either ensure that you either include at
least one strong player that plays a variety of openings, or else play
games with a variety of starting positions.

Weston


Re: [computer-go] From zero to playing on CGOS in 10 minutes

2008-10-23 Thread Weston Markham
On Thu, Oct 23, 2008 at 1:00 PM, Mark Boon [EMAIL PROTECTED] wrote:
 This is still something I don't understand. Are there others who implemented
 the same thing and got 111 moves per game on average? I tried to look
 through some posts on this list but didn't see any other numbers published.

111 matches my recollection from past experiments that I did with
playouts that probably match Don's.  I think that this number has been
posted previously on the mailing list.

Weston


Re: [computer-go] Go with modified scores

2008-10-09 Thread Weston Markham
In the context of Monte Carlo, a win or loss by a large margin is
quite likely (at least in any close game) to be due to large blunders.
 (For example, allowing a large, safe group of stones to be captured.)
 Given this, it does not make sense to weight it more strongly than
any other win or loss.  It would not even surprise me if a better
evaluation could be obtained by _penalizing_ wins by greater margins.
(Something like 100-k for a win and 100+k for a loss, exactly contrary
to what you propose.)

As a hopefully concrete example, consider the following 5x5 position,
with black to play:

.XO..
.XOO.
.XXOO
XXXO.
.XO.O

Komi is such that black can win by taking and filling the ko.

Black _could_ still capture the whole board, by playing on the
upper-right corner, but only if white fails to defend its group.  One
needs to avoid making this move appear overly attractive, because the
winning move does not win by such a large margin.

Weston

On Thu, Oct 9, 2008 at 3:23 AM, Ingo Althöfer [EMAIL PROTECTED] wrote:
 Some of you may want to stone me for this heresy,
 but read carefully before.

 When you have MCST-/UCT for Go that can work for
 real-valued scores (or at least a version that can
 work for three-valued scores: win, draw, loss),
 you may look at Go with different scoring systems.

 Example:
 A win by 0.5+k points is worth 100+k.
 A loss by 0.5+k points gives score -100-k.
 Let's call b=100 the base score.
 Question: How does the playing style change with b?
 (Of course, for very large b you should have almost the
 normal playing style.)

 Ingo.

 PS: Such evaluations may also help to reduce the problem
 of MC's laziness.



Re: [computer-go] a few more questions

2008-05-13 Thread Weston Markham
On Tue, May 13, 2008 at 7:08 PM, Gunnar Farnebäck [EMAIL PROTECTED] wrote:
  And I agree, don't even think of doing this with floating point
  numbers.

This is a bit tangential to computer go.  But you have piqued my curiosity.

Offhand, that sounds rather extreme to me.  Perhaps I haven't really
thought this through completely, but it seems as if there is a simple
modification to the stated algorithm, that would ensure that it works
fine with floating point numbers.  Or at least well enough for any
conceivable purpose:

Make sure that you select each random number from a slightly larger
range than prescribed by the exact totals.  (For example, always round
up while calculating the totals.  If you do not have control over the
rounding mode, just add on some small fraction of the largest weight.)
 Then, in the event that subtracting the weights of the
points/rows/buckets never reduced the selected number to a
non-positive value, just select a new random number and start over.
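As a concrete sketch of that modification (my reading of it; the size of the padding is an arbitrary choice):

```python
import random

def weighted_choice(weights, rng=random):
    """Select index i with probability proportional to weights[i].

    The draw is taken from a range padded by a small fraction of the
    largest weight, so accumulated floating-point rounding in the
    running totals can never push a valid draw past the last bucket.
    A draw that lands in the padding belongs to no bucket, so we
    simply redraw and start over.
    """
    pad = 1e-9 * max(weights)  # arbitrary small slack
    total = sum(weights) + pad
    while True:
        r = rng.random() * total
        for i, w in enumerate(weights):
            r -= w
            if r <= 0.0:
                return i
        # r stayed positive: the draw fell into the slack; retry
```

The restart happens with probability on the order of pad/total, so the cost is negligible and the selection probabilities are exact up to that same vanishing factor.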

Would this work fine, or is there some flaw in my thinking that Álvaro
and Gunnar have already considered?

Weston


Re: endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])

2008-03-06 Thread Weston Markham
On 3/6/08, Christoph Birk [EMAIL PROTECTED] wrote:
 On Thu, 6 Mar 2008, Don Dailey wrote:

  advantageous to give away stones that not.  Despite what many people
   believe,  MC programs don't normally believe it's better to win small
   and they are not hell-bent on giving away stones in order to try to make
   the score come out to be exactly 0.5 win.


 You are correct that it is not explicitly programmed into the MC
  programs to win by 0.5 pts, but since most of them don't care about
  the margin they in practice often do.

You are right, but I think that you may also be misconstruing the
nakade problem as a lack of concern about margin, when it is really a
fundamental failure to understand (i.e., failure to explore
sufficiently) the nakade shapes.  The tendency for your final scores
on lost games to be 0.5 really reflects a motivation on your own part
to lose by the smallest margin that you can.  After a point when the
game is resolved, this doesn't conflict with the program's goal, so it
lets you do just that.

As an interesting thought, I think that it might actually be
informative for people to try out programs that _do_ try to win by
exactly 0.5!  Not as a fundamental goal, but rather as a slight
preference for the endgame.  If authors can do this in a manner that
does not degrade the playing ability much, then human players might be
able to see even more clearly what life & death errors a program
makes, and at what point it is able to discover the correct solution.
Clearly the programs would be weaker like this, but I think it could
expose some systematic problems very quickly, so that they could be
focused on.
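The "slight preference" could be bolted on without touching the win/loss objective.  A sketch (entirely my own construction; the statistics, threshold, and names are made up for illustration):

```python
def choose_move(move_stats, tolerance=0.02):
    """move_stats: {move: (win_rate, mean_winning_margin)}.

    Pick the smallest-margin move among those whose win rate is within
    `tolerance` of the best, nudging the program toward 0.5-point wins
    while leaving its primary goal of winning essentially intact.
    """
    best = max(rate for rate, _ in move_stats.values())
    near_best = {m: s for m, s in move_stats.items() if s[0] >= best - tolerance}
    return min(near_best, key=lambda m: near_best[m][1])
```

With tolerance set to zero this reduces to ordinary best-win-rate selection, so the deliberate degradation in strength can be dialed in gradually.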

Weston


Re: [computer-go] Is Rémi correct?

2008-02-06 Thread Weston Markham
I know that other people have mentioned this sort of thing already,
but the result of level 8 being better than level 10 matches my own
experience with slightly older versions of gnugo.  As I recall, 8 was
the best, 9 a little worse, and 10 worse again.  Increasing the level
seems to improve play after that, but it dramatically increases the
time.

Weston

On Feb 6, 2008 12:48 PM, Don Dailey [EMAIL PROTECTED] wrote:
 Here is an update from the new 1000 game test using gnugo at level 8
 instead of 10.

  Rank Name           Elo    +    -  games  score  oppo.  draws
     1 Gnugo-3.7.11  1800   34   30   2186    97%   1137     0%
     2 Mogo_03       1507   48   56    186    16%   1800     0%
     3 Mogo_02       1202   43   51   1000     3%   1800     0%
     4 Mogo_01       1003   70   96   1000     1%   1800     0%

 The test, at this point, seems to indicate that gnugo at level 8 is
 stronger than at level 10 because mogo is not doing as well as in the
 previous test.It will be more meaningful when we get to levels close
 to gnugo's strength.

 - Don


  As promised,  to answer Rémi, I did a study with mogo vs Gnu at various
  levels.   There is NO self play involved, Gnugo-3.7.11 is the only
  opponent for progressively higher rated version of Mogo.
 
  Here are the raw results so far:
 
   Rank Name           Elo    +    -  games  score  oppo.  draws
      1 Mogo_10       2319   72   60    500    95%   1800     0%
      2 Mogo_11       2284   94   74    259    94%   1800     0%
      3 Mogo_09       2234   57   49    500    92%   1800     0%
      4 Mogo_08       2124   43   39    500    87%   1800     0%
      5 Mogo_07       2016   35   33    500    78%   1800     0%
      6 Mogo_06       1961   32   30    500    72%   1800     0%
      7 Mogo_05       1814   28   28    500    52%   1800     0%
      8 Gnugo-3.7.11  1800   13   13   5259    44%   1823     0%
      9 Mogo_04       1711   29   29    500    37%   1800     0%
     10 Mogo_03       1534   35   38    500    18%   1800     0%
     11 Mogo_02       1281   60   72    500     5%   1800     0%
     12 Mogo_01       1004  115  178    500     1%   1800     0%
 
 
  The issue is whether self-play results distort the rating of programs.
  In this case, we are only testing whether it distorts the ratings of
  Mogo since no other programs were tested.
 
  In the following table,  I played up to 500 games between Gnugo and Mogo
  at various levels.   The levels are the exact levels that correspond to
  the big scalability study.  In the middle column I listed the
  ratings as computed by bayeselo in games against  ONLY Gnugo and set the
  default rating of Gnugo to 1800, just as in the study.
 
  Unfortunately,  I used level 10 in the gnugo only games but in the big
  study we use level 8.   It's my understanding there is little difference
  between these 2 but we can probably assume Mogo might be a little better
  than indicated relative to the big scalability study.
 
  It looks like there indeed is a lot of distortion at the low end of the
  scale.  Mogo seems much stronger at low levels than the larger
  scalability study indicated.
 
  At the higher levels,  we also get a mismatch,  where Mogo's rating
  doesn't seem as high when playing only Gnugo.   This is as Rémi
  claims.
 
  One thing to note is that at higher levels it's more difficult to get an
  accurate rating.  Mogo_10 is winning 95% of its games against Gnugo,
  and an extra win or loss every few games can make a lot of difference.
  However I am inclined to believe this is real since it seems to hold for
  several upper levels.   At level 7 it's only 42 ELO, but at levels
  beyond this it's over 100 ELO.
 
  I've never doubted that there is some intransivity between programs, but
  I am a little surprised that it is this much.  Even if the comparison is
  slightly unfair due to Mogo playing a stronger version of Gnugo in this
  study,  it still seems like it must be at least 100 ELO.
 
 
   vers  vs Gnu  Study
   ----  ------  -----
     01    1004    688
     02    1281   1093
     03    1534   1331
     04    1711   1554
     05    1814   1751
     06    1961   1971
     07    2016   2058
     08    2124   2270
     09    2234   2347
     10    2319   2470
 
 
  My suggestion to improve this situation is to play a few thousands games
  against a well rated Gnugo and set up mogo as a second anchor.
 
  - Don
 
 
 
  ___
  computer-go mailing list
  computer-go@computer-go.org
  http://www.computer-go.org/mailman/listinfo/computer-go/
 
 


Re: Re : [computer-go] Is MC-UCT really scalable ... is a troll

2008-01-22 Thread Weston Markham
(I typed the following up earlier today, before other people cast some
doubts about what infinite scalability means.  Perhaps it is
helpful.)

I think that Alain was specifically referring to a property of UCT,
whereby given that a winning line of play exists, the probability that
the algorithm has discovered one such line and settled upon it
approaches 1 as time approaches infinity.  I believe that this has
been proven.  And no, this theoretical result does not represent any
intrinsic advantage over a calculation of the minimax value.  (Someone
should feel free to correct me if there is a stronger theoretical
result that I am not aware of, though.)

I think it is important to understand that unless you restrict UCT
from exploring some correct lines of play, this infinite scalability
will still be true.  And, it is true regardless of what moves have
been preferred/pruned/whatever within the MC playouts.  (Or whatever
one does at the leaf nodes; it need not be MC at all, as long as the
UCT tree eventually gets expanded as the node is revisited.)

This is not to say that this particular property of UCT is sufficient
in order for us to claim that UCT is scalable in practice.  My point
is rather that no one is claiming infinite scalability of MC, but
rather scalability of UCT.  When you combine that with the fact that
UCT has been very successful at 9x9 go, (and scalable across a wide
range of time limits) it seems to be reasonable to expect more of the
same in 19x19.  (Which is what concerned the original poster.)  Also,
one should expect that the UCT portion of MC-UCT will tend to
eventually fix up any systematic errors that are made by the MC
playouts.
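The "fix up" behavior comes from the selection rule: in plain UCB1, an unvisited child scores infinity, so no move is permanently starved no matter how the playouts are biased.  A minimal sketch (the per-child statistics layout is illustrative, not taken from any particular program):

```python
import math

def ucb1_select(children, parent_visits, c=1.4):
    """Return the child maximizing the UCB1 score.  Unvisited children
    score +infinity, so every legal move is eventually explored - the
    property behind the convergence argument above."""
    def score(child):
        if child["visits"] == 0:
            return float("inf")
        exploit = child["wins"] / child["visits"]
        explore = c * math.sqrt(math.log(parent_visits) / child["visits"])
        return exploit + explore
    return max(children, key=score)
```

With any finite exploration constant c > 0, the log term guarantees every child is revisited infinitely often, which is what makes the playout policy's systematic errors correctable in the limit.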

I have one other point I'd like to make, in regard to light versus
heavy playouts:  Even light playouts do not actually use a uniform
distribution, due to the quasi-eye rule that is generally used.  I
think that there is every reason to expect that other nonuniform
playout policies further outperform light playouts in every
practical way.  Granted, it is possible to introduce severe
misconceptions when one incorporates knowledge into one's playouts.
But even in that case, one can still fall back on the fact that UCT is
cleaning up after one's mistakes:  the eventual behavior, given enough
time, is still perfect play.  (And of course it is not as if people
blindly adjust the monte carlo policies without checking the revised
versions for severe misconceptions.)

Weston

On 1/22/08, ivan dubois [EMAIL PROTECTED] wrote:
 I think there is some misconception about this infinite scalability
 of MC stuff.
 Theory is only useful when the model it is based on shares important
 aspects with reality.
 In theory, 19*19 Go is a finite game, which can be analysed in a finite
 amount of time. So ANY algorithm that solves the game at some point is,
 in theory, infinitely scalable. For example, the following algorithm is
 infinitely scalable:
 Analyse the complete minimax tree of the game. If there is enough time
 to finish, return the correct move; if not, return a random move.

 Now, is this algorithm really scalable, in practice? Of course not.

 I support the idea that MC is infinitely scalable (in reality) ONLY if
 the playout part does not have severe misconceptions. So I think that
 currently, only MC based on uniform playouts is infinitely scalable.
 It is well known that even Mogo has trouble with big eyes (it thinks a
 big eye gives life, which is false). This problem cannot be solved with
 more computing power (except absurdly much; of course you can always
 minimax to the end).
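Ivan's "infinitely scalable in theory, useless in practice" example can be written down directly (the solver interface here is hypothetical):

```python
import random

def theoretically_scalable_move(position, legal_moves, minimax_solve):
    """Full minimax if the time budget suffices, otherwise a random move.
    'minimax_solve' is a hypothetical solver that returns the correct
    move, or None when it runs out of time."""
    best = minimax_solve(position)
    if best is not None:
        return best
    return random.choice(legal_moves)
```

This policy converges to perfect play as time goes to infinity, yet its strength is flat (random) at every time setting reachable in practice, which is the distinction being drawn.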

 - Original Message 
 From: Alain Baeckeroot [EMAIL PROTECTED]
 To: computer-go computer-go@computer-go.org
 Sent: Tuesday, 22 January 2008, 22:13:26
 Subject: Re: [computer-go] Is MC-UCT really scalable ... is a troll

 On Tuesday, 22 January 2008, David Fotland wrote:
 
  The UCT-MC programs do particularly well against traditional programs
  because they expose the brittleness inherent in the pattern databases they
  use.  Strong humans are not so easily beaten by playing unconventional and
  somewhat inferior moves.
 
 I think Remi posted a game of CrazyStone on 19x19 commented by one pro
 who said this move is 2 dan level.
 So far no other go program has received such an analysis, and the huge
 novelty provided by MC/UCT programs is that they have a _real_ strength
 with their own style, like humans:
 play solid when winning, and play hamete (tricky moves) when losing.

 On kgs MC programs have played hundreds (if not thousands) of games
 against humans, and no doubt they fully merit their rank, and no doubt
 they are improving.

 Infinite scalability is a theoretically proven property, so please
 don't feed the troll :-)

 Alain

 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/


   

Re: [computer-go] Simplified MC evaluator ¿explained?

2007-04-09 Thread Weston Markham

The second explanation was no clearer to me.  I'll try to criticize in
more detail:

1.  Uniform playouts, as used in practice, are not really uniform over
all legal go moves.  Generally, pass moves are excluded until
necessary, and moves that fill eyelike points are excluded.  So, I
assume that when you use the word legal, you mean admissible within
this sort of playout.

2.  "That variance depends on the length of the playout."  It is
difficult for me to make sense of this statement, simply because not
all playouts from a given position have the same length.  My best
guess is that you are claiming that the longer a playout is, the more
likely it is that its result differs from the result under correct
play.  However, I strongly doubt that this is true for all starting
positions.  (Imagine that the first player needs to prevent the second
player from forming two eyes in a large group.  After doing this, that
group will eventually be captured, allowing playouts to continue
longer by filling the intersections that it once occupied.  Failing to
kill this group may allow the playouts to complete much more quickly,
but gives inaccurate results.)

2.5.  "The variance of the stochastic process is not to be mixed up
with the distribution of the error of a repeated Bernoulli
experiment!"  Perhaps I have mixed
them up.  Can you explain more clearly or precisely what the variance
of the stochastic process is?  Do you perhaps mean some measurement
of variation across different starting points, rather than across
different Bernoulli trials from the same starting position?  Or, do
you mean to distinguish the probability that a playout's outcome
differs from the outcome under correct play, from the probability that
a playout results in a win?  (Although those are just two different
Bernoulli experiments, right?)  Or is there some subtlety that I have
missed?

3.  'p is a biased towards 1/2 estimator of W'.  Consider the game:

    o
   / \
  o   o
 / \  |
1   0 0

(1 is a win by the first player, and 0 is a loss.)  There is a move
that could allow the first player to win, if the second player does
not respond to it correctly.  This sounds like a realistic scenario
for go.

W = 1/3
p = 1/4

p is further from 1/2 than W.  Does this game violate the condition
that the number of legal moves for each side is balanced?  (It is
still not clear to me what this condition is that you are attempting
to impose.)  Or, was I supposed to calculate a statistic across
multiple game trees where W=1/3, in order to interpret p as an
estimator of W?
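For the record, the two quantities for the tree above can be checked mechanically (leaves are 1 for a first-player win, 0 for a loss; internal nodes are tuples of children):

```python
from fractions import Fraction

tree = ((1, 0), (0,))   # the game tree drawn in point 3

def leaf_stats(node):
    """Return (wins, leaves) over all terminal positions."""
    if isinstance(node, int):
        return node, 1
    wins = leaves = 0
    for child in node:
        w, n = leaf_stats(child)
        wins += w
        leaves += n
    return wins, leaves

def playout_win_prob(node):
    """p: probability that a uniformly random playout ends in a win."""
    if isinstance(node, int):
        return Fraction(node)
    return sum(playout_win_prob(c) for c in node) / len(node)

w, n = leaf_stats(tree)
print(Fraction(w, n))          # W = 1/3
print(playout_win_prob(tree))  # p = 1/4
```

The difference arises because uniform playouts weight each *path*, not each leaf: the lone loss under the right child carries as much probability as the two leaves on the left combined.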

4.  Even if we can compute W exactly, do we have any reason to think
that its value is a good estimate of the minimax value of the game?
Is it even a better estimate than p, which we can already estimate
accurately?  (Note that in the game tree above, it is not.)  My
offhand guess is that it would not be as good.

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] The physics of Go playing strength.

2007-04-09 Thread Weston Markham

On 4/8/07, Don Dailey [EMAIL PROTECTED] wrote:

These programs, in theory, will play perfect GO given
enough time.


... and space.  I doubt that your current programs would be capable of
storing a large enough game tree to actually converge to the
alpha-beta value.  So in practice, it really would level out somewhere
below optimal play, unless you also increase the memory usage, right?
I think that it is still valid to present your chart as representing
the lower portions of curves that do not do this, because you would
presumably scale up the amount of memory used as well as the time, if
you could.

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Simplified MC evaluator ¿explained?

2007-04-07 Thread Weston Markham

On 4/7/07, Jacques Basaldúa [EMAIL PROTECTED] wrote:


Assuming two simplifying hypotheses:

1. The playouts are uniformly random.
2. Both players have the same number of legal moves (or any
unbalanced numbers compensate in the long term).



I did not understand your post either.  Is #2 the same as neither player
passes until the end of the game?  Or do you intend to assume that at any
given level of the game tree, all nodes have the same number of children?
The latter assumption seems to be much stronger than what you intend, since
it would imply that p=W exactly.  Plus, it is obviously (and dramatically)
false under any normal go rules where two passes end the game.

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Re: pseudoliberties

2007-03-29 Thread Weston Markham

It appears to me that at least 91 is possible:

.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xxx.xxx.

Weston


On 3/29/07, Jason House [EMAIL PROTECTED] wrote:

After some trial and error, I got 90

 * * * *
*
 * * * *
* * * ***
* * *
*** * * *
 * * * *
*
 * * * *


On 3/29/07, John Tromp [EMAIL PROTECTED] wrote:
 On 3/29/07, John Tromp [EMAIL PROTECTED] wrote:

  Is 88 the maximum number of pseudoliberties a string can have on 9x9?

 Make that 89 :-)

 -John
 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/





Re: [computer-go] Re: pseudoliberties

2007-03-29 Thread Weston Markham

A pseudo-liberty is a pairing of a stone in the group and an adjacent,
empty intersection.

On 3/29/07, Jim O'Flaherty, Jr. [EMAIL PROTECTED] wrote:


What's a pseudo-liberty?  And how can there be more of them than there are
empty intersections (81) on the board?


- Original Message 
From: Jason House [EMAIL PROTECTED]
To: computer-go computer-go@computer-go.org
Sent: Thursday, March 29, 2007 1:02:01 PM
Subject: Re: [computer-go] Re: pseudoliberties

After some trial and error, I got 90

 * * * *
*
 * * * *
* * * ***
* * *
*** * * *
 * * * *
*
 * * * *


On 3/29/07, John Tromp [EMAIL PROTECTED] wrote:
 On 3/29/07, John Tromp [EMAIL PROTECTED] wrote:

  Is 88 the maximum number of pseudoliberties a string can have on 9x9?

 Make that 89 :-)

 -John
 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/





Re: [computer-go] Re: pseudoliberties

2007-03-29 Thread Weston Markham

On 3/29/07, John Tromp [EMAIL PROTECTED] wrote:

On 3/29/07, Weston Markham [EMAIL PROTECTED] wrote:
 It appears to me that at least 91 is possible:
Nice! If you use O's instead like

.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OOO.OOO.

it looks pretty artistic, like a vase with a flower inside :-)


:)

It appears that you can get 92 as well, by replacing the two empty
intersections along the upper edge of my diagram with a single one in
the center of that edge, and placing an empty intersection on one of
the other intersections along the central line of symmetry.

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Re: pseudoliberties

2007-03-29 Thread Weston Markham

On 3/29/07, John Tromp [EMAIL PROTECTED] wrote:

On 3/29/07, Weston Markham [EMAIL PROTECTED] wrote:
 On 3/29/07, John Tromp [EMAIL PROTECTED] wrote:
  On 3/29/07, Weston Markham [EMAIL PROTECTED] wrote:
   It appears to me that at least 91 is possible:
  Nice! If you use O's instead like
 
  .OO.O.OO.
  OO.OOO.OO
  .OO.O.OO.
  OO.OOO.OO
  .OO.O.OO.
  OO.OOO.OO
  .OO.O.OO.
  OO.OOO.OO
  .OOO.OOO.
 
  it looks pretty artistic, like a vase with a flower inside:-)

 :)

 It appears that you can get 92 as well, by replacing the two empty
 intersections along the upper edge of my diagram with a single one in
 the center of that edge, and placing an empty intersection on one of
 the other intersections along the central line of symmetry.

If it weren't for the fact that that central line is needed to keep
everything connected:-)



Yes, I realized that later (as soon as I tried to draw it) but didn't
bother to post anything, since Gunnar had already posted that 91 is
the best possible.  But thanks for pointing it out.  :)

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] GTPv3

2007-03-08 Thread Weston Markham

On 3/7/07, Paul Pogonyshev [EMAIL PROTECTED] wrote:

Don Dailey wrote:
 On Wed, 2007-03-07 at 22:53 +0100, Gunnar Farneback wrote:
  * To abort the asynchronous genmove, the controller should send the
    (synchronous) command abort_async_genmove. If the engine has not
    returned the asynchronous genmove response before responding to the
    abort command, it is no longer allowed to return a move. I'm open
    for discussion on the exact name and whether it should return an
    error if the move had already been sent (I don't think it should).

 But what about race conditions here?   The engine may be responding to
 the async_genmove command an instant before it realizes an abort
 command just arrived.   In this case it would be violating your rule
 but it wouldn't be anyone's fault.

There is no race condition because commands _are_ read synchronously.
But responses _may be_ sent asynchronously.

Paul


I'm not sure that I understand Paul's explanation, so I'll try out my
own:  Any race condition that occurs here is entirely the fault of the
engine.  The engine is already responsible for serializing all of the
responses that it makes.  Any failure to do so would allow interleaved
responses, which could not possibly be understood by the controller.
So, at the time that the engine is permitted to write its asynchronous
response (it may have to acquire a lock in order for this to be the
case), it is unambiguous whether or not the response to
abort_async_genmove has been sent.
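A sketch of that serialization discipline (the class and method names are illustrative, not from any engine):

```python
import threading

class GtpResponder:
    """Sketch: the engine serializes all responses with one lock, so at
    the moment an asynchronous genmove result is about to be written it
    can check, atomically, whether the abort has already been answered."""
    def __init__(self, out):
        self.out = out            # transcript of responses, in order
        self.lock = threading.Lock()
        self.aborted = False

    def send_async_move(self, move):
        with self.lock:
            if self.aborted:
                return False      # abort already answered: drop the move
            self.out.append("= %s" % move)
            return True

    def answer_abort(self):
        with self.lock:
            self.aborted = True
            self.out.append("=")  # response to abort_async_genmove
```

Because both paths hold the same lock, "move sent before abort answered" and "abort answered before move sent" are the only two possible orderings; the ambiguous interleaving cannot occur.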

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [gtp] Re: [computer-go] GTPv3

2007-03-08 Thread Weston Markham

On 3/7/07, Paul Pogonyshev [EMAIL PROTECTED] wrote:

Gunnar Farneback wrote:
 Example 3: Like example 2, but abort command comes too late.

...

Maybe it should then read

?2 not in progress

It also makes sense to forbid an async_genmove (or simple genmove for
that matter) until the previous genmove/async_genmove has finished.
The engine should just fail with a predefined error string; I think
it is really cumbersome for the engine to try synchronization here and
serves little purpose.

Paul


Although Gunnar specifies async_genmove to return zero or one
responses, I believe that it may be more useful to allow it to return
any number.  This allows the possibility for the controller to update
a current best display while it has still not accepted a single
definitive response.  From the controller's point of view, the command
could still be in effect until a response to the abort_async_genmove
has been received.  (Although perhaps end_async_genmove would be a more
appropriate name in this case.)  I don't feel strongly either way on
this, but I thought that I should mention the possibility.

Does it also make sense to forbid commands that modify the engine's
state?  If the board state changes before async_genmove generates a
response, to which board does the response apply?

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] GTPv3

2007-03-08 Thread Weston Markham

On 3/8/07, Don Dailey [EMAIL PROTECTED] wrote:

On Thu, 2007-03-08 at 12:27 -0500, Weston Markham wrote:
I think I am confused.


I may be confused as well.  I see that Paul has responded as well, but
just in case this doesn't clear up on its own:



So what you might get is this:

  1. controller sends async_genmove
  2. controller (after some period of time) sends abort.
  3. engine responds to aysnc_genmove
  4. engine responds to the abort search


...


Isn't this a race condition?   I believe this should cause no problems
because the controller can simply ignore the non-relevant response to
genmove (it knows it sent abort).   But I don't think that is what you
are talking about.



I didn't think that this is what you were talking about, either, so
now it is not clear to me what race condition you are worried about.
What behavior in this example depends on the timing?  (I see no
constraint on the actual ordering of #2, and #3, but no matter what
order they occur in, the engine's responses would be the same,
wouldn't they?  I assume, by the way, that #3 is the asynchronous
response to #1, and not the initial synchronous one.)  You are
claiming that there is some case where this scenario violates some
rule of Gunnar's proposed protocol, correct?  (The proposal did not
constrain the responses based on what additional commands had already
been sent, but rather what other responses had already been sent.)
What exactly is that violation?

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Big board

2007-02-21 Thread Weston Markham

That board needs to have the inside edge be connected to its outside
edge, in order to represent a torus.

Weston

On 2/20/07, Antonin Lucas [EMAIL PROTECTED] wrote:

No need for those difficulties,  you can play along this board :

http://www.youdzone.com/go.html


On 2/21/07, Weston Markham [EMAIL PROTECTED] wrote:
 Somewhere online, I played a game on a torus, against someone's Java
 applet that has this option.  I seem to recall playing a normal game
 at either 9x9 or 13x13, and then a game on the same-sized torus.  I
 recall the first game as being somewhat challenging to me, (a
 beginner) and the second game to be somewhat hard to visualize, but
 quite easy to beat the computer opponent.  Of course, it is possible
 that my experience in the first game helped me significantly in the
 second one, so it doesn't mean much.  However, I was puzzled at the
 time because I had expected my inability to visualize the interactions
 across the edge of the board to be a huge handicap, relative to a
 program that presumably has no such difficulty.

 Weston

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Big board

2007-02-21 Thread Weston Markham

(oops.  Other people have already pointed this out, in an
appropriately re-named thread.)

On 2/21/07, Weston Markham [EMAIL PROTECTED] wrote:

That board needs to have the inside edge be connected to its outside
edge, in order to represent a torus.

Weston

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Big board

2007-02-20 Thread Weston Markham

Somewhere online, I played a game on a torus, against someone's Java
applet that has this option.  I seem to recall playing a normal game
at either 9x9 or 13x13, and then a game on the same-sized torus.  I
recall the first game as being somewhat challenging to me, (a
beginner) and the second game to be somewhat hard to visualize, but
quite easy to beat the computer opponent.  Of course, it is possible
that my experience in the first game helped me significantly in the
second one, so it doesn't mean much.  However, I was puzzled at the
time because I had expected my inability to visualize the interactions
across the edge of the board to be a huge handicap, relative to a
program that presumably has no such difficulty.

Weston

On 2/20/07, steve uurtamo [EMAIL PROTECTED] wrote:

here's my first guess at don's question about how this
would affect the game.  my intuition is weak here, but
i'll take a stab at it just for fun.

no edges, no corners and no center mean that
you're effectively playing in the middle at all times.
this should mean that life would be harder to make
and that scores would be much lower.  certainly
komi would have to change.

my gut is that it'd be a harder game and that there
aren't any cheap clever tricks that would reduce it
to a simplistic game.

someone with a very powerful 9x9 MC player could
run a few thousand games to see what would happen.

oh, and keeping track of symmetry would really be
annoying, since translations would have to be taken
into account as well.  oops, this changes my intuition --
this means that many, many more board positions
are identical, which should vastly reduce the complexity
of the game.  :)

s.






___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/




Re: [computer-go] MC approach

2007-02-13 Thread Weston Markham

On 2/9/07, Weston Markham [EMAIL PROTECTED] wrote:

I don't seem to have any numbers on this anymore, but I should be able
to try some experiments this weekend.  I do have some code that does
what I describe below.  It is also using an all moves as first
heuristic.  According to my notes, I made this change in an attempt to
avoid severely conservative (in my non-expert opinion) moves near the
beginning of the game, which seem to be preferred when using
all-moves-as-first.  It specifically aims for a 30-point lead at the
beginning of the game, and reduces this by one point for each turn
into the game.

I should point out that I am not averaging scores, but simply changing
which games I count as wins for the evaluation of a move.  This is
perhaps not quite what Steve Uurtamo had in mind when he was
originally musing about being greedy at the beginning of the game.
Nevertheless, it is a very similar sort of idea to what he described,
so I thought that I would mention it.

Weston

On 2/8/07, Chris Fant [EMAIL PROTECTED] wrote:
 On 2/8/07, Weston Markham [EMAIL PROTECTED] wrote:
  I believe that I have had some success with an approach like this,
  actually.  I believe that I initially only tally games that are won by
  a certain margin, and reduce that margin to zero as the game
  progresses.  I am pretty sure that this improves upon straight Monte
  Carlo.  I think that I can get some numbers on it, if anyone is
  interested.
 
  Weston

 Yes, please.



I did run some tests over the weekend, but they did not complete as
quickly as I had expected.  Anyway, I have enough data at this point
to say that although there is nothing spectacular to show one way or
the other, I was probably wrong that the code was an improvement.

Playing against gnugo 3.7.10, komi 7.5, playing black in 50% of the games:

86/732 games won baseline (11.7%)
106/1000 games won after enabling logic described above. (10.6%, a
slight degradation)

I also tried a version that dynamically adjusts an expected score,
and tallies the number of games where the score exceeds this.  It
appears to give an improvement:  12.8% of the games were won, out of
1000.

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] MC approach

2007-02-12 Thread Weston Markham

I think that you are essentially correct.  However, this is only going
to affect a small number of games where two different moves are
exactly tied for the best winning percentage, after many playouts.
Even if the underlying probabilities are exactly the same, you can't
really expect this to happen much.

Weston

On 2/11/07, Heikki Levanto [EMAIL PROTECTED] wrote:

But there is a way. If we do N play-outs, the effect of any single of
them is 1/N. If we make sure to scale the score to be less than half of
this, it can not disturb anything in cases where the number of wins is
different. Only in cases with exactly the same number of wins in the
play-outs, would the score break the tie.

In other words my large constant of 1000 was far too small. It would
have to be something like 2NM, where M is the maximum score (say 361).
Round it up to 1000N, and we should be safe.

I still believe it would make endgames look more reasonable, and
possibly even better, in case the winning program has overlooked a
detail somewhere, having a large margin of points on the board should
act as an insurance against small blunders.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] MC approach

2007-02-09 Thread Weston Markham

I don't seem to have any numbers on this anymore, but I should be able
to try some experiments this weekend.  I do have some code that does
what I describe below.  It is also using an all moves as first
heuristic.  According to my notes, I made this change in an attempt to
avoid severely conservative (in my non-expert opinion) moves near the
beginning of the game, which seem to be preferred when using
all-moves-as-first.  It specifically aims for a 30-point lead at the
beginning of the game, and reduces this by one point for each turn
into the game.
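The shrinking-margin tally can be sketched as follows (the exact bookkeeping in my code differs; this is just the idea):

```python
def counts_as_win(score_margin, turn, initial_margin=30):
    """Tally a playout as a win only if the margin meets a target lead
    that starts at 30 points and shrinks by one point per turn played.
    'score_margin' is the final score from the player's perspective."""
    required = max(0, initial_margin - turn)
    return score_margin > 0 and score_margin >= required
```

Early in the game only comfortable wins count, biasing move selection away from overly conservative play; by move 30 the rule degenerates to the ordinary win/loss tally.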

I should point out that I am not averaging scores, but simply changing
which games I count as wins for the evaluation of a move.  This is
perhaps not quite what Steve Uurtamo had in mind when he was
originally musing about being greedy at the beginning of the game.
Nevertheless, it is a very similar sort of idea to what he described,
so I thought that I would mention it.

Weston

On 2/8/07, Chris Fant [EMAIL PROTECTED] wrote:

On 2/8/07, Weston Markham [EMAIL PROTECTED] wrote:
 I believe that I have had some success with an approach like this,
 actually.  I believe that I initially only tally games that are won by
 a certain margin, and reduce that margin to zero as the game
 progresses.  I am pretty sure that this improves upon straight Monte
 Carlo.  I think that I can get some numbers on it, if anyone is
 interested.

 Weston

Yes, please.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] MC approach

2007-02-08 Thread Weston Markham

On 2/8/07, steve uurtamo [EMAIL PROTECTED] wrote:

i wonder if this kind of greediness might, however, be useful for selecting,
say, the first move or two in a 9x9 game.  the thinking here is that since the
endgame is essentially noise at this point, you might as well be greedy
before tactics become an issue.  that's probably faulty intuition, though.


I believe that I have had some success with an approach like this,
actually.  I believe that I initially only tally games that are won by
a certain margin, and reduce that margin to zero as the game
progresses.  I am pretty sure that this improves upon straight Monte
Carlo.  I think that I can get some numbers on it, if anyone is
interested.

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Monte-Carlo Go Misnomer?

2007-02-07 Thread Weston Markham

Unfortunately, it's just not that simple, because it depends far more
on _how_ the playout is improved, rather than, say, the ELO numbers
that measure that improvement.  For example, programs exist that are
good (in part) because they entirely disregard some lines of play.
They may be correct to disregard these lines in almost every case,
which generally makes the playout program stronger.  However, for the
few cases where this heuristic pruning is not correct, the calling
program will suffer greatly, because these lines of play become
completely invisible to the random playouts, no matter how many
playouts are performed.  As an extreme example, consider programs that
play completely deterministic strategies.  These are obviously useless
as random players, yet it is in principle possible to construct ones
that play arbitrarily well.

Weston

On 2/7/07, Ephrim Khong [EMAIL PROTECTED] wrote:

Is there any known (by theory or tests) function of how much an increase
in the strength of the simulation policy increases the strength of the
MC/UCT Program as a whole?

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] MC Go Effectiveness

2007-02-07 Thread Weston Markham

On 2/7/07, Unknown [EMAIL PROTECTED] wrote:

 to binary search the file on the parent position key,
 collect all of these records together (making sure there is a
 legal move which leads to the canonical response key) and then

You are not clear here.
Is there only a one-move-step between key and resp ?
Why not store the move in the first place ? (instead of the resulting
hash)


128 bits of hash vs 64 bits.  From his first post:  This makes
collision very unlikely.
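The "very unlikely" claim is essentially a birthday-bound argument; an approximate sketch:

```python
import math

def collision_prob(n, bits):
    """Birthday-bound approximation: probability of at least one
    collision among n uniformly random 'bits'-bit hash keys."""
    m = 2.0 ** bits
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * m))

# Around a million stored positions, 64-bit keys are still safe,
# while 32-bit keys collide almost surely.
print(collision_prob(2**20, 64))   # ~3e-8
print(collision_prob(2**20, 32))   # ~1.0
```

So storing the 64-bit response key alongside the 64-bit parent key gives an effective 128-bit check, at the cost of the extra lookup described above.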

Weston
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] MC approach

2007-02-07 Thread Weston Markham

But of course, it's not the size of the win that counts, it is rather
the confidence that it really is a win.  In random playouts that
continue from a position from a close game, the ones that result in a
large victory are generally only ones where the opponent made a severe
blunder.  (Put another way, the score of the game is affected more by
how bad the bad moves are, rather than how good the good ones are, or
even how good most of the moves are.  Others have commented on this
effect in this list, in other contexts.)  Since you can't count on
that happening in the real game, these simulations have a lower value
in the context of ensuring a win.

Even when the program is losing (say by a little bit) it is more
important to play moves that it thinks are the most likely to convert
the game to a win by confusing the opponent, rather than to play moves
that will make the losing outcome more apparent.  These tend to be
different moves, and Monte Carlo methods are good at uncovering these
differences.  I agree that this is a bit surprising, but I find it
much less so when I think about it in these terms.

Given that people have reported such a strong effect, I am actually
wondering if these simulations (those that result in a large score
difference) should be _penalized_, for not being properly
representative of the likely outcome of the game.  In other words:

value = 1000 * win - score
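One way to read that proposal as code (the weight of 1000 is the value from the formula above; the function interface itself is an assumption):

```python
def playout_value(win, score_margin):
    """Value of one simulated game under the proposed weighting.

    win          -- 1 if the playout ended in a win for us, else 0
    score_margin -- our score minus the opponent's in that playout

    Winning dominates (weight 1000), but lopsided playouts -- which
    usually mean the simulated opponent blundered badly -- are
    penalized by subtracting the margin, since they are not
    representative of the likely outcome of the real game.
    """
    return 1000 * win - score_margin

# A narrow win is valued slightly above a crushing one:
assert playout_value(1, 0.5) > playout_value(1, 100.5)
```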

Weston

On 2/7/07, Don Dailey [EMAIL PROTECTED] wrote:

My intuition is also the same as yours - one would think a 100-point
win is better.  It's not that it's better, or that it's a better move
objectively, but it should be better against a fallible opponent
who could easily get distracted by the little battles and forget
the real point of the game.  At least it's a way to fight when
you are losing.



Re: [computer-go] February KGS computer Go tournament: results

2007-02-05 Thread Weston Markham

The version in the Open division was the standard development
version. The one in the Open division, MonteGNU

Should one of those Opens be Formal?

Weston

On 2/5/07, Nick Wedd [EMAIL PROTECTED] wrote:


I have put a short report on the event at
http://www.weddslist.com/kgs/past/23/index.html



Re: [computer-go] Re: Scaling monte carlo up to 19x19

2007-01-30 Thread Weston Markham

I have an idea in the back of my mind that is an extreme version of
this:  Divide the board into 361 separate local searches, then use
information from these to guide a global search.  The local searches
would be done on the full board, but would only search for strategies
that will capture or defend individual intersections.  I suspect that
this first phase could potentially benefit from parallelization for a
significant portion of the game.  Eventually this parallelism will
break down because there will only be a limited number of local
battles, and the eventual status of points that are in the same chain
will almost always be the same.  Any practical program would need to
deal with this gracefully, of course, rather than duplicate its effort
many times.  Also, I only have a vague idea of how to take advantage
of the information gained from the local searches, when performing the
global search.
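The per-intersection part of this idea can be sketched as an ownership map: estimate, for each point, how often simulated play leaves it in Black's hands, and flag the contested points for the global search. Everything below is a simplifying stand-in (tiny board, a dummy playout that assigns random owners) rather than a real local search.

```python
import random

SIZE = 5  # toy board; 19x19 would work identically

def random_final_ownership():
    # Stand-in for a real local search or playout: assign each point
    # a random final owner (+1 = Black, -1 = White).
    return [[random.choice((+1, -1)) for _ in range(SIZE)]
            for _ in range(SIZE)]

def ownership_map(n_playouts=200):
    """Fraction of simulations in which Black owns each intersection."""
    counts = [[0] * SIZE for _ in range(SIZE)]
    for _ in range(n_playouts):
        final = random_final_ownership()
        for y in range(SIZE):
            for x in range(SIZE):
                if final[y][x] > 0:
                    counts[y][x] += 1
    return [[c / n_playouts for c in row] for row in counts]

# Points whose ownership is genuinely uncertain are the live local
# battles; these are natural candidates to feed the global search.
owners = ownership_map()
contested = [(x, y) for y in range(SIZE) for x in range(SIZE)
             if 0.25 < owners[y][x] < 0.75]
```

As the paragraph above notes, points in the same chain will share a fate, so a real implementation would cluster these statistics rather than treat all 361 searches as independent.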

Weston

On 1/30/07, Dave Dyer [EMAIL PROTECTED] wrote:

The idea isn't more than lightly toasted (less than half baked), but
the kernel is: turn the full-board search into a set of searches on
much smaller boards, using the overlapping strips as boundary
conditions, then do some unifying final step to pick the move.



Re: [computer-go] Can a computer beat a human?

2007-01-24 Thread Weston Markham

On a slightly (but not much) more serious note:

The proposal that elicited (for better or for worse) Alain's
size-of-the-universe comment was not for a complete table of all
possible board states, but rather a table of winning moves.  I expect
that most positions will have multiple winning moves, but only a
single one would need to be stored.  The table does not need to store
anything for a position that cannot be reached by playing those
selected moves.

I believe that this greatly reduces the size needed.  I certainly
don't know how much smaller, although someone may be able to come up
with actual numbers for 5x5 and 7x7 games, and use an educated guess
on how these will scale up to 19x19.  My offhand guess is that this
size is still _far_ greater than the meagre memory resources that were
suggested in the original post.  For what it is worth, I expect that
it is also larger than the number of quantum states of one of Alain's
eyelashes.  I would guess that it is smaller, however, than 10^100
bits, which I believe is a common back-of-the-envelope number for the
total information in the visible universe.

One also has the freedom to pick and choose which winning moves to
store.  This can be used to maximize the overlap between the different
stored variations.  If you also take advantage of patterns within the
table, you will certainly be able to compress it.  (As has been
mentioned already.)  At the far extreme, of course, you only need one
entry in the table for each of 282 possible moves, and a hash function
that simply ... er, figures out which move to play.  Or, one can
strike a balance between these two extremes that gives an appropriate
tradeoff between computation time and memory.
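The reachability argument can be made concrete on a toy game where the winning strategy is known. The sketch below uses a subtraction game (take 1-3 objects from a heap; taking the last one wins) purely as an analogy for Go: only positions reachable while *following* the stored moves need table entries, which is what shrinks the table relative to all legal positions.

```python
def winning_move(n):
    """In the take-1-to-3 game, the winning move from heap n removes
    n % 4 objects (leaving a multiple of 4); n % 4 == 0 is lost."""
    return n % 4 or None

def build_table(start):
    """Store one winning move per position reachable under our own
    stored strategy, rather than for every legal position."""
    table, frontier = {}, {start}
    while frontier:
        pos = frontier.pop()
        m = winning_move(pos)
        if m is None or pos in table:
            continue
        table[pos] = m
        after = pos - m  # opponent to move, at a multiple of 4
        # Every legal opponent reply becomes a position we must cover.
        frontier.update(after - k for k in (1, 2, 3) if after - k > 0)
    return table

table = build_table(21)
# 16 entries suffice to win from 21, versus 22 positions (0..21) overall.
```

The saving is modest in this toy, but in a game with Go's branching the set of positions reachable under a single fixed strategy is a vanishing fraction of all legal positions.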

I would guess that kami no itte would still be impossible on 32 GB at
300 MHz.  However, beating a professional dan player seems reasonable
to me.  Of course, Alain will still have a large enough lookup table
that he will be able to beat it ... in the blink of an eye.

Weston

On 1/24/07, Don Dailey [EMAIL PROTECTED] wrote:

On Wed, 2007-01-24 at 21:11 +0100, alain Baeckeroot wrote:
 With 10^170 legal positions for 19x19, what would be the size of this
 table?
 I'm afraid we cannot build it with all the matter in the visible
 universe.

I think the computer science greats should have consulted you before
writing their textbooks - I just looked at this crazy thing called a
Turing machine in one of my textbooks.

A universal Turing machine supposedly has an infinite tape attached to
it.  Maybe they are smart about computers, but they don't know
anything about physics.  I think all these textbooks need to be
thrown out because they are obviously of no practical value.

- Don



Re: [computer-go] Can a computer beat a human?

2007-01-24 Thread Weston Markham

On 1/24/07, Weston Markham [EMAIL PROTECTED] wrote:

282 possible moves


Um.  Dunno where I got that number from.  (I meant 362, I think.)

Weston


Re: [computer-go] an idea for a new measure of a computer go program's rank.

2007-01-23 Thread Weston Markham

Personally, I use the terminology in much the same way as Heikki.  I
use the word "mistake" to describe (for example) a move that loses a
large group, but does not change the game from a win to a loss.  It
makes sense to me to generally apply "mistake" to any move that loses
points relative to the best move on the board.  The term "blunder"
means, essentially, a move that lost the game.  It can be quite
difficult, of course, to determine unambiguously whether or not a
particular move is a blunder.  In an otherwise close match, a large
mistake (i.e., one that loses many points) is probably a blunder.
Toward the end of a close game, it may be possible to find unambiguous
blunders, and some of these could be single-point mistakes.

Weston

On 1/23/07, Heikki Levanto [EMAIL PROTECTED] wrote:

On Sun, Jan 21, 2007 at 08:16:07PM -0800, Ray Tayek wrote:
 I don't know the percentage of blunders. It also depends on what you
 call a blunder. Is a 1 point mistake a blunder?

 no, maybe 10 or more points

My gut feeling is that a real blunder is enough to lose the game.
Between equally strong players, a one point mistake can be a blunder, if
it was late in the yose, and the game was won by half a point. On the
other hand, throwing away a 20-stone group may not be a blunder if you
were already going to lose by 100 points. It could even be a
(mis?)calculated risk, ignoring a threatening move in order to get an
attack on an even larger group, even if that attack later turns out not
to work...

Just my uninformed gut feeling, of course.

-H

--
Heikki Levanto   In Murphy We Trust heikki (at) lsd (dot) dk



Re: [computer-go] Fw: Compensation for handicap plays?

2006-12-29 Thread Weston Markham

Okay.  Don's later post does indicate that he intends to compensate
for the stones.  So, in the interest of being 100% clear:  is this
compensation included in the komi value that is sent to the client?

Weston

On 12/29/06, Weston Markham [EMAIL PROTECTED] wrote:

Am I correct in inferring that this is also what
Don Dailey has in mind, but with no compensation for the handicap
stones?



Re: [computer-go] professional game libraries for pattern harvesting

2006-12-13 Thread Weston Markham

Also, there was a recent thread on the mailing list, "50,576 pro/dan
games without repetitions nor easily detectable problems", started by
Jacques Basaldúa, who has put together a collection of games:

http://www.dybot.com/masters/masters.zip

If I recall correctly, the format of this file is only documented in
Jacques' message, so you may need to refer to that for details.

Weston