I wonder if this behavior could be avoided by giving a small incentive to
win by the most points (or the most material in chess), similar to the
technique David Wu described for KataGo a few days ago. The problem right
now is that the AI has literally no reason to think that winning by more
points is better than winning by 0.5 points, whereas human players slightly
prefer to win by more points. David, have you noticed whether KataGo avoids
these sorts of point-losing moves at the end of the game?
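
To make the idea concrete, here is a rough sketch of what I mean. It is my
own illustration, not KataGo's actual formula: the function names, the tanh
shape, and the constants bonus_weight and scale are all assumptions chosen
for the example. The margin is the final score from the bot's point of view.

```python
import math


def win_only_utility(margin: float) -> float:
    """Pure win/loss utility: +1 for any win, -1 for any loss."""
    return 1.0 if margin > 0 else -1.0


def margin_aware_utility(margin: float,
                         bonus_weight: float = 0.1,
                         scale: float = 20.0) -> float:
    """Win/loss utility plus a small, bounded bonus for the margin.

    bonus_weight and scale are hypothetical values for illustration.
    Because |tanh| <= 1 and bonus_weight < 1, the bonus can never make
    a loss look better than a win; it only breaks ties among wins
    (and among losses).
    """
    bonus = bonus_weight * math.tanh(margin / scale)
    return win_only_utility(margin) + bonus


if __name__ == "__main__":
    for m in (-30.5, -0.5, 0.5, 30.5):
        print(m, win_only_utility(m), round(margin_aware_utility(m), 4))
```

With the pure win/loss utility a 0.5-point win and a 30.5-point win are
worth exactly the same, so the search has no reason to keep the extra
points. With the bonus term the larger win scores slightly higher, while
any win still beats any loss.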

(I feel the same reasoning applies to automatic cars: they could be (and
probably are) trained to prefer a smoother ride in addition to avoiding

On Tue, Mar 5, 2019 at 5:11 PM "Ingo Althöfer" <3-hirn-ver...@gmx.de> wrote:

> Hi,
> recently, Leela-Chess-Zero has become very strong, playing
> on the same level with Stockfish-10. Many of the test players
> are puzzled, however, by the "phenomenon" that Lc0 tends to
> need many many moves to transform an overwhelming advantage
> into a mate.
> Just today a new German tester reported a case and described
> it by the sentence "da wird der Hund in der Pfanne verrückt"
> ("now the dog is going crazy in the pan", to translate it word
> by word). He had seen an endgame: Stockfish with naked king,
> and LeelaZero with king, queen and two rooks. Leela first
> sacrificed the queen, then one of the rooks, and only then
> started to go for a "normal" mate with the last remaining rook
> (+ king). The guy (Florian Wieting) asked for an explanation.
> http://forum.computerschach.de/cgi-bin/mwf/topic_show.pl?tid=10262
> I think there is a very straightforward one: What Leela-Chess-Zero
> (with its MCTS-based search) performs is comparable to the
> path all MCTS Go bots took for many years when playing winning
> positions against human opponents: the advantage was reduced
> step by step, and in the end the bot gained a win by 0.5 points.
> Later, in the tournament table, that was not a problem, because
> a win is a win :-)
> Similarly in chess: overwhelming advantage is reduced by lazy play
> to some small margin advantage (against a straightforward alpha-beta
> opponent), and then the MCTS chess bot (= Leela Zero in this case)
> starts playing concentratedly.
> Another guy asked how DeepMind had worked around this problem
> with their AlphaZero. I am rather convinced: They also had this
> problem. Likely, they kept the most serious examples undisclosed,
> and furthermore set the margins for resignation rather narrow (for
> instance something like evaluation +-6 by Stockfish for three move
> pairs) to avoid nearly endless endgames.
> Ingo.
> PS: thinking of a future with automatic cars in public traffic. The
> 0.5-point wins or the related behaviour in MCTS-based chess would mean
> that an automatic car would brake only in the very last moment
> knowing that it will be sufficient to stop 20 centimeters next to the
> back-bumpers of the car ahead. Of course, a human passenger would
> not like to experience such situations too often.
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
