On Tue, 2008-10-14 at 16:22 +0100, Claus Reinke wrote:
> > .. I think people (me included) feel that replacing a whole swath
> > of relevant information by a single number points to potentially some
> > serious inefficiency and loss of information. The fact that nobody
> > has found how to make use of the excess of information is no proof of
> > course that it can't be done. I think it's a very valid question at
> > the very least and a bit premature to call it a wrong premise. MCTS
> > programs are still in their early days of development and it's quite
> > possible some good improvements can be made by using more information
> > than the simple win/loss ratio.
> 
> First, let me state that I agree: there should be ways to get more. In
> fact, my impression is that there have been successes in making better
> use of simulation results (ownership maps, territory heuristic). Just that
> they use even more detail (final board vs final score), which makes it
> easier to extract information that is safe only over a large number of
> runs. Just looking at the sum score can hide the details that lead to
> statistically relevant information.

I resisted responding to those original remarks on this, but I will
now.  

Mark Boon went off on a tangent here when he talked about a "swath of
information available" and gave his imaginative discourse on how it
might be used.  He really launched into a different discussion, and I
don't disagree with him.  He was just talking about something else.

I have no idea whatsoever whether it's possible to build a better
scoring algorithm by taking advantage of a "whole swath of
relevant information", but we were not talking about that.  We were
talking about using "margin of victory."  Perhaps margin of victory
can play a minor role, but a minor role is all it can have if you also
consider a "whole swath of other information", which in the context of
the original discussion amounts to admitting that it's really not as
important as you thought it was.

This type of discussion happens too often here. I don't know what the
technical term is, but I'll call it distraction or deflection.  Here is
an example of it:

  person A:  I don't like dogs.

  person B:  What?  You don't like creatures with legs?  You must be
anti-social if you don't like people.

  person A:  I wasn't talking about people, just dogs.

  person B:  Why did you put down people then?


- Don
     



> 
> One might look at the distribution of sum scores, though: for instance,
> if scores are either drastic or close, with little in-between, one might
> assume that there is a pivot move (combination) to be searched for
> on which the game outcome hinges in the current position (eg, a large
> group in danger).
> 
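As a rough illustration of the "drastic or close, with little in-between"
idea, here is a minimal sketch that sorts playout margins into three bands
and flags a position whose middle band is nearly empty. The function names,
the band thresholds (5 and 20 points) and the cut-offs in `looks_pivotal`
are all assumptions made up for this example, not values from any engine.

```python
from collections import Counter

def score_profile(margins, close=5.0, drastic=20.0):
    """Fraction of playouts whose absolute margin is close, middle or drastic.

    `margins` holds final scores (Black minus White), one per playout.
    The thresholds are illustrative assumptions.
    """
    n = len(margins)
    if n == 0:
        raise ValueError("need at least one playout margin")
    bands = Counter()
    for m in margins:
        size = abs(m)
        if size <= close:
            bands["close"] += 1
        elif size >= drastic:
            bands["drastic"] += 1
        else:
            bands["middle"] += 1
    return {name: bands[name] / n for name in ("close", "middle", "drastic")}

def looks_pivotal(margins, max_middle=0.1, min_each=0.2):
    """True if margins cluster at both extremes with little in between."""
    p = score_profile(margins)
    return (p["middle"] <= max_middle
            and p["close"] >= min_each
            and p["drastic"] >= min_each)
```

A True result only says the score distribution has two clusters. Whether
that really points at a single pivot move would still have to be checked
by search.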
> Second, having now looked at some more random "light playouts"
> (just instrument your engine to output sgf before starting the next run),
> I feel that the name is highly misleading. These simulation runs have
> very little in common with actual play, eg, in a 19x19 run from an
> empty board, one might see an opening period, with 200 moves of
> randomly placed stones, followed by a middle period, with 100 or
> so moves occasionally making a firm connection or a true single-space
> eye, followed by an endgame with 100-200 moves exploring possible
> consequences of those little bits of random mess that cannot be ruined
> by random play.
> 
> You can perhaps predict which side has the more-likely-connected
> strings and the more-likely-unkillable strings at 250 or 300 moves,
> and hence, which side is likely to win the "playout" in the end, but
> again those properties have very little in common with good shape
> in actual play.
> 
> Of course, the outcome is supposed to be nearly evenly random from
> an empty board, but if you look at move 300 and try to compare the
> strings and eyes that almost make it vs those that do, it drives home
> the message that the individual run is nearly meaningless, and the
> simulation is blind to many "obvious" features of a game position.
> Upping the number of simulations escalates some of those "nearly"s
> to significance, but care is needed to decide between significant result
> and significant error, and how to interpret either.
> 
> So I now prefer to call these simulation runs "legal fill-ins" and I think
> it would be worthwhile trying to improve our knowledge on what kind
> of information can be safely extracted from (sets of) them.
> 
> Even the one-bit win/loss information appears to depend on (a) undecided
> areas being filled in such a way that they are allocated evenly to both sides
> and (b) decided areas and influence on both sides having an equal share
> of ruinable and non-ruinable aspects. This way, a Murphy-style (whatever
> can go wrong, will) random fill-in will neither see unrealistic advantages
> emerge in undecided areas nor reduce one side's advantages more strongly
> than the other one's, so even if the random scores bear little relation to the
> realistic scores, the win/loss bit would still be useable, at least over large
> enough samples.
> 
> Similarly, if very nearly all random fill-ins in a large enough sample agree
> on the territorial status of an intersection, that might seem another safe bit
> of information to extract.
> 
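The "very nearly all fill-ins agree" test for an intersection can be
sketched as follows. This assumes each finished fill-in is available as a
dict from point name to owner ("B" or "W"); that representation, the
function name and the 95% threshold are assumptions for illustration only.

```python
from collections import Counter

def stable_points(final_boards, threshold=0.95):
    """Return points whose owner agrees in at least `threshold` of the runs.

    `final_boards` is a list of dicts mapping a point (e.g. "aa") to the
    side that owns it at the end of one fill-in. The result maps each
    stable point to (owner, fraction of runs agreeing).
    """
    if not final_boards:
        raise ValueError("need at least one finished fill-in")
    n = len(final_boards)
    counts = {}
    for board in final_boards:
        for point, owner in board.items():
            counts.setdefault(point, Counter())[owner] += 1
    stable = {}
    for point, tally in counts.items():
        owner, hits = tally.most_common(1)[0]
        if hits / n >= threshold:
            stable[point] = (owner, hits / n)
    return stable
```

High agreement among random fill-ins is still only agreement among random
fill-ins, so this says nothing by itself about how the point fares under
good play.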
> But that may not be entirely accurate, and "it works most of the time"
> is quite different from "statistical analysis shows an error rate of E% for
> information X after N simulation runs" (*). Even the latter leaves enough
> room both for interpretation and for significant surprises in actual play.
> 
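One standard way to attach an error statement to a win rate after N runs is
a binomial confidence interval. The sketch below uses the Wilson score
interval. It assumes the runs are independent and identically distributed,
which playouts inside a growing search tree generally are not, so treat it
as a rough guide only. The function name is made up for this example.

```python
import math

def wilson_interval(wins, n, z=1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95%."""
    if n <= 0:
        raise ValueError("need at least one simulation run")
    p = wins / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half
```

For example, 50 wins in 100 runs gives an interval of roughly 0.40 to 0.60,
so a 50% win rate from 100 runs does not separate a slightly better move
from a slightly worse one. The width shrinks only with the square root of
the number of runs.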
> Below is another of those odd examples that you might want to run through
> your Monte-Carlo evaluation engine (mentally or computer-based:-). I'd
> be interested to see the results, and how they vary with number of 
> simulations.
> 
> Assuming no komi, Chinese count is 40/40, so whoever fills the center
> wins. In a simulation-based evaluation, random invasions are impossible
> in the black side, but only highly unlikely in the white side. Perhaps someone
> else can construct an example where the difference is actually significant?
> If anything, white has been more efficient in building (four moves fewer, as
> apparent in Japanese count), so one could say that simulation-based
> scoring based on naive play is slightly biased toward inefficient play,
> because it sees most clearly those features that cannot be ruined by
> Murphy-style play.
> 
> Simulation-based evaluation does not see the same board position we
> see, and if the evaluations have anything in common, that is because of
> assumptions like (a)/(b) above. If those assumptions are violated, all
> bets are off, so it would be good to have as complete a catalogue of
> such assumptions as possible. Has there been any work in this direction?
> 
> Claus
> 
> (*) Since statistics seem to be the only thing that saves simulation runs
>     and experiment-based reasoning from irrelevance: could anyone here
>     please suggest a good book or online tutorial for those who, like myself,
>     would like a refresher on the relevant basic aspects of statistics?
> 
> (
> ;
> FF[4]
> GM[1]
> SZ[9]
> AP[Jago:Version 5.0]
> AB[ab][ba][ad][af][ah][bi][bg][be][bc][cb][cd][cf][ch][da][dc][de][dg][di][ef][eh][eg][ei][db][dd][df][dh]
> TB[aa][ac][ae][ag][ai][bb][bd][bf][bh][ca][cc][ce][cg][ci]
> AW[ga][gb][gc][gd][ge][gf][gg][gh][gi][ff][fg][fh][fi][fe][fd][fc][fb][fa][ea][eb][ec][ed]
> TW[ha][hb][hc][hd][he][hf][hg][hh][hi][ia][ib][ic][id][ie][if][ig][ih][ii]
> GN[fill-in-score]
> C[
> Chinese count:
> Black: 40, White: 40
> Japanese count:
> Black: 14, White: 18]
> )
> 
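To check the stated Chinese count without any playouts, here is a small
area-scoring sketch. The stone lists are copied from the AB and AW
properties of the SGF record above. The function names are made up for
this example, and the scoring assumes all stones on the board are alive,
which holds for this position but not in general.

```python
AB = ("ab ba ad af ah bi bg be bc cb cd cf ch da dc de dg di "
      "ef eh eg ei db dd df dh").split()
AW = ("ga gb gc gd ge gf gg gh gi ff fg fh fi fe fd fc fb fa "
      "ea eb ec ed").split()

def to_xy(point):
    """Convert an SGF point such as 'ee' to zero-based (column, row)."""
    return ord(point[0]) - ord("a"), ord(point[1]) - ord("a")

def area_score(black, white, size=9):
    """Chinese-style area count, assuming every stone is alive.

    Returns (black_area, white_area, neutral_points).
    """
    board = {to_xy(p): "B" for p in black}
    board.update({to_xy(p): "W" for p in white})
    score = {"B": 0, "W": 0}
    for colour in board.values():
        score[colour] += 1
    seen, neutral = set(), []
    for x in range(size):
        for y in range(size):
            if (x, y) in board or (x, y) in seen:
                continue
            # Flood-fill one empty region and record which colours border it.
            region, borders, stack = [], set(), [(x, y)]
            seen.add((x, y))
            while stack:
                cx, cy = stack.pop()
                region.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if not (0 <= nx < size and 0 <= ny < size):
                        continue
                    if (nx, ny) in board:
                        borders.add(board[(nx, ny)])
                    elif (nx, ny) not in seen:
                        seen.add((nx, ny))
                        stack.append((nx, ny))
            if len(borders) == 1:
                score[borders.pop()] += len(region)
            else:
                neutral.extend(region)
    return score["B"], score["W"], neutral
```

Calling `area_score(AB, AW)` gives 40 for each side and leaves the single
centre point (the SGF point `ee`) neutral, which matches the 40/40 count in
the message.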
> 
> 
> 
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/

