Hi!

On Thu, Dec 08, 2016 at 11:23:50PM -0800, Freeman Ng wrote:
> No, it's because the bots' mc based algorithms currently don't care how
> much they win by. (At least I'm assuming that's what Ingo meant.) They just
> try to maximize their odds of winning.
> 
> I've often wondered about this, though, and maybe the bot developers here
> can give me an answer. There's no reason why an mc-based go program
> couldn't also factor winning margin into its decisions, is there? I assume
> that at some point, what the mc analysis yields is a winning probability
> for each candidate move, but at that point, you could still combine that
> number with other factors, right? Some combination of winning probability
> and probable winning margin, so that, for example, an 87% chance of winning
> by 5 points could be rated lower than an 85% chance of winning by 20. I
> don't know what the ideal formula would be, and you'd probably want to
> prevent the winning probability from ever getting too low, while also
> ignoring potentially large winning margins beyond a certain point, but the
> idea would be to generally make the bots play more like humans.

  I think many programs use this hack.  (In Pachi, we eventually
disabled it because it compromised strength noticeably.)

  The basic explanation for why this is not straightforward is that you
never want your program to consider moves in the direction of
low-probability wins, no matter how large a margin they might have; the
MC measurement function is very noisy with regard to individual samples.
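
  To make the trade-off concrete, here is a minimal sketch of such a
margin bonus.  It is an illustration only, not Pachi's actual code: the
name playout_reward and the parameters bonus_weight and margin_cap are
assumptions made up for this example.  The win/loss outcome dominates
the reward, and the clipped score margin only adds a small bonus, so
that no loss can ever outscore a win.

```python
def playout_reward(margin, bonus_weight=0.05, margin_cap=20.0):
    """Reward in [0, 1] for one playout, from the mover's point of view.

    margin       -- final score difference (positive means we won)
    bonus_weight -- hypothetical share of the reward given to the margin
    margin_cap   -- hypothetical cap; margins beyond it earn nothing extra
    """
    win = 1.0 if margin > 0 else 0.0
    # Clip the margin so that huge but unlikely wins cannot dominate.
    clipped = max(-margin_cap, min(margin_cap, margin))
    # Map the clipped margin from [-cap, +cap] onto [0, 1].
    margin_term = (clipped / margin_cap + 1.0) / 2.0
    return (1.0 - bonus_weight) * win + bonus_weight * margin_term


def mean_reward(margins, **kwargs):
    """Average reward over the playout margins sampled for one move."""
    return sum(playout_reward(m, **kwargs) for m in margins) / len(margins)
```

  With a small bonus_weight the bonus mostly acts as a tie-breaker
between moves of similar winning probability.  The larger it gets, the
more the noise of individual playout scores leaks into move selection.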

> Do any commercial Go programs work this way? If not, I'd like to request it
> from the commercial developers here. It could be an option that you'd only
> have in the commercial product, for your users to turn on if they prefer
> it. You could still operate in pure mc mode for bot vs. bot play.

  A more robust version of this strategy, commonly used in many
programs, is dynamic komi.  But it can also exhibit unstable behavior
in some cases, especially in the late endgame.  One thing I didn't
quite understand: why is this important for commercial programs in
particular?
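
  As a rough sketch of the dynamic komi idea mentioned above: the
program shifts the komi used when scoring playouts so that its own
estimated winrate stays in a middle band, which forces it to keep
gaining points when it is far ahead.  Everything below (the names
adjust_komi and playout_win, the band, the step and the limit) is an
assumption chosen for illustration and not the scheme of any particular
program.

```python
def adjust_komi(extra_komi, winrate, low=0.45, high=0.55,
                step=1.0, limit=30.0):
    """Return the updated handicap that we impose on ourselves.

    extra_komi -- points currently subtracted from our playout scores
    winrate    -- our winrate as estimated by the last search
    """
    if winrate > high:
        extra_komi += step   # winning too easily: demand a larger margin
    elif winrate < low:
        extra_komi -= step   # losing: relax the demand
    # Keep the adjustment bounded to limit runaway behavior.
    return max(-limit, min(limit, extra_komi))


def playout_win(margin, extra_komi):
    """A playout counts as a win only if it beats the shifted komi."""
    return margin - extra_komi > 0
```

  The search itself still maximizes winning probability; only the
definition of a "win" moves.  A fixed step like this one can oscillate
when few points remain on the board, which is one way the instability
in the late endgame can show up.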

  (For this purpose in particular, Pachi for example has a special mode
where it enables a variety of such features (but is weaker in general)
when given the "maximize_score" parameter.  I wonder if other programs
also have multiple modes like this.)

                                        Petr Baudis
_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go
