Hmmm, the screenshot doesn't seem to be getting through to the list, so
here's a textual graphic instead.

       A B C D E F G H J K L M N O P Q R S T
     +---------------------------------------+
  19 | . . . . . . . . . . . . . . . . . . . |
  18 | . . . . . . . . . . . . . . . . . . . |
  17 | . . . . . . . . . . . . . X . . . . . |
  16 | . . . . X . . . . . . . . . . O . . . |
  15 | . . . . . . . . . . . . . X . . . . . |
  14 | . . . . . . . . . . . . . . . . O . . |
  13 | . . . . . . . . . . . . . . . X O . . |
  12 | . . . . . . . . . . . . . . . X O . . |
  11 | . . . . . . . . . . . . . . . a . . . |
  10 | . . . . . . . . . . . . . . . . . . . |
   9 | . . . . . . . . . . . . . . . . . . . |
   8 | . . . . . . . . . . . . . . . . . . . |
   7 | . . . . . . . . . . . . . . . . . . . |
   6 | . . . . . . O . . . . . . . . . . . . |
   5 | . . . . . O X b . . . . . . . . . . . |
   4 | . . . X X X O O . . . . . . . . . . . |
   3 | . . . . X O . . . . . . . . . O . . . |
   2 | . . . . . . . . . . . . . . . . . . . |
   1 | . . . . . . . . . . . . . . . . . . . |
     +---------------------------------------+

Black (X) to move. The screenshot showed that Leela's policy net put about
96% probability on a and only 1.03% on b, and that even after nearly 1
million simulations it had basically not searched b at all.
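
For anyone wondering how a ~1% prior can translate into essentially zero
search: below is a minimal sketch of AlphaGo-style PUCT selection. This is
illustrative only - Leela's 2017 search differs in its details, and the
leaf values are invented stand-ins for what shallow search reports before
it can read the ladder out.

    import math

    C_PUCT = 1.5
    priors     = {"a": 0.96, "b": 0.0103}  # policy-net numbers from above
    leaf_value = {"a": 0.48, "b": 0.30}    # hypothetical shallow-search values
    visits     = {"a": 0, "b": 0}
    value_sum  = {"a": 0.0, "b": 0.0}

    def puct_score(move, parent_n):
        # Mean value so far plus the prior-weighted exploration bonus.
        q = value_sum[move] / visits[move] if visits[move] else 0.0
        u = C_PUCT * priors[move] * math.sqrt(parent_n) / (1 + visits[move])
        return q + u

    for n in range(1, 1_000_000 + 1):
        best = max(priors, key=lambda m: puct_score(m, n))
        visits[best] += 1
        value_sum[best] += leaf_value[best]

    print(visits)  # "b" ends up with on the order of 100 visits out of 1M

The point: as long as b's early evaluations look mediocre, a 1% prior caps
its exploration bonus so hard that the search never accumulates enough
depth on b to discover that the escape actually works.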


On Mon, Apr 17, 2017 at 9:04 AM, David Wu <lightvec...@gmail.com> wrote:

> I highly doubt that learning refutations to pertinent bad-shape patterns
> is a much more combinatorially explosive task than learning good shape
> already is, if you could somehow be appropriately selective about it. There
> are enough commonalities that it should "compress" very well into what the
> neural net learns.
>
> To draw an analogy with humans: people seem to have absolutely no issue
> learning refutations to lots of bad shapes right alongside good shapes,
> and don't seem to find it orders of magnitude harder to apply their
> pattern-recognition abilities to that task. Indeed, people learn bad-shape
> refutations all the time by performing "training updates" on what works
> and what doesn't whenever they read out sequences in a game (almost
> certainly something the human brain does to maximize the use of data), as
> well as from the results of their own mistakes, and they also learn
> refutations of non-working tactics whenever they do things like tsumego.
> But the distribution of bad moves that people devote their attention to
> learning to refute is extremely nonuniform - e.g., nobody studies what to
> do if the opponent plays the 1-2 point against your 4-4 point - and I bet
> that's important too.
>
> I'm also highly curious if there's anyone who has experimented with or
> found any ways to mitigate this issue!
>
> ----------
>
> If you want an example of this actually mattering, here's one where Leela
> makes a big mistake in a game, which I think is due to this kind of issue.
> Leela is white.
>
> The last several moves to reach the screenshotted position were:
> 16. wG6 (capture stone in ladder)
> 17. bQ13 (shoulder hit + ladder breaker)
> 18. wR13 (push + restore ladder)
> 19. bQ12 (extend + break ladder again)
> 20. wR12 (Leela's mistake, does not restore ladder)
>
> You can see why Leela makes the mistake at move 20. On move 21 it pretty
> much doesn't consider escaping at H5. The policy net puts it at 1.03%,
> which I guess is just a little too low for it to get any simulations here,
> even with a lot of thinking time. So Leela thinks black is behind,
> considering only moves with 44% and 48% overall win rates. However, if you
> play H5 on the board, Leela almost instantly changes its mind to 54% for
> black. So not seeing the H5 escape on move 21 leads it to blunder on move
> 20. If you go back to move 20 and force Leela to capture the ladder, it's
> happy with the position for white, so it's not as if Leela dislikes the
> ladder capture; it's just that it wrongly thinks R12 is slightly better,
> due to not seeing the escape.
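>
> To make those numbers concrete: white's evaluation of R12 is, roughly,
> one minus black's best reply value among the replies the search actually
> visits, so leaving H5 out of the candidate set flips the comparison with
> the ladder capture. A back-of-the-envelope version (the capture's exact
> value isn't stated above, so 0.50 is an assumption):
>
>     # Black reply values as white's search sees them after R12:
>     replies_seen = {"reply_1": 0.44, "reply_2": 0.48}       # H5 unvisited
>     eval_r12_blind = 1 - max(replies_seen.values())         # 0.52 for white
>     eval_r12_true  = 1 - max(0.54, *replies_seen.values())  # 0.46 with H5
>     eval_capture   = 0.50  # assumed; the post only says Leela is "happy"
>     # 0.52 > 0.50 makes R12 look better; 0.46 < 0.50 reveals the blunder.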
>
> To some degree this maybe means Leela is insufficiently explorative in
> cases like this, but still, why does the policy net not put H5 above
> 1.03%? After all, it's vastly more likely than 1% that a good player will
> see that the ladder works here and try escaping in this position. I would
> speculate that this is because of biases in training. If Leela is trained
> on high-amateur and pro games, then one would expect that in nearly all
> training games where white plays a move like R12 and ignores the ladder
> escape, the ladder is not too important, and therefore black should
> usually not escape. By contrast, when the ladder escape is important, in
> such games white will capture at H5 instead of playing R12 and will not
> allow the escape in the first place. So there is rarely a training example
> that allows the policy net to learn that the ladder escape is a likely
> move for black here.
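>
> To make the selection bias concrete, here's a back-of-the-envelope
> calculation (all numbers invented): among training positions where white
> has just ignored a working ladder escape, suppose the ladder almost
> always didn't matter, precisely because strong players only ignore it
> when it doesn't:
>
>     n_unimportant, n_important = 990, 10   # strong players rarely blunder
>     p_escape_if_unimportant = 0.01         # black usually plays elsewhere
>     p_escape_if_important   = 0.95         # black punishes immediately
>     p_escape = (n_unimportant * p_escape_if_unimportant
>                 + n_important * p_escape_if_important) / 1000
>     print(p_escape)  # ~0.019, close to the ~1% prior observed above
>
> So the net can be nearly optimal on its training distribution while being
> badly wrong in exactly the rare positions where the escape matters.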
>
>
> On Mon, Apr 17, 2017 at 7:04 AM, Jim O'Flaherty <
> jim.oflaherty...@gmail.com> wrote:
>
>> It seems chasing down good moves against bad shapes would produce an
>> explosion of "exception cases" - something combinatorially huge. So,
>> while you would save some branching in the search space, you would
>> balloon the number of patterns to scan for by orders of magnitude.
>>
>> Wouldn't it be preferable to just have the AI continue to find the
>> better move emergently and generally, from probabilities over winning
>> placements, as opposed to what appears to be a focus on one type of
>> local optimum?
>>
>>
>> On Mon, Apr 17, 2017 at 5:07 AM, lemonsqueeze <lemonsque...@free.fr>
>> wrote:
>>
>>> Hi,
>>>
>>> I'm sure the topic must have come up before, but I can't seem to find it
>>> right now; I'd appreciate it if someone could point me in the right
>>> direction.
>>>
>>> I'm looking into MM, LFR, and similar CPU-based pattern approaches for
>>> generating priors, and was wondering about basic bad shape:
>>>
>>> Let's say we use 6d games for training. The system becomes pretty good
>>> at predicting 6d moves by learning patterns associated with the kind of
>>> moves 6d players make.
>>>
>>> However, it seems it doesn't learn to punish basic mistakes effectively
>>> (say, double ataris, or shapes with obvious defects...) because they
>>> almost never show up in 6d games =) They do show up in the search tree,
>>> though, and without good answers the search might take a long time to
>>> realize these moves don't work.
>>>
>>> Maybe I missed some later paper or development, but basically: wouldn't
>>> it make sense to also train on good answers to bad moves? (Maybe
>>> harvesting them from the search tree, or something like that - see the
>>> sketch below.)
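>>>
>>> As a rough illustration of the harvesting idea, here's a minimal sketch
>>> with an invented Node structure (the fields prior, visits, q, children,
>>> position and move are hypothetical; a real engine would name them
>>> differently):
>>>
>>>     def harvest_refutations(root, min_visits=500, prior_cap=0.05,
>>>                             q_margin=0.05):
>>>         """Yield (position, move) pairs where search found a strong
>>>         reply that the prior model considered unlikely."""
>>>         stack = [root]
>>>         while stack:
>>>             node = stack.pop()
>>>             if node.visits < min_visits:
>>>                 continue
>>>             best_q = max(c.q for c in node.children) if node.children else 0.0
>>>             for child in node.children:
>>>                 # Keep moves the prior dismissed but deep search rates
>>>                 # at (or near) the best reply.
>>>                 if (child.prior < prior_cap
>>>                         and child.visits >= min_visits
>>>                         and child.q >= best_q - q_margin):
>>>                     yield (node.position, child.move)
>>>                 stack.append(child)
>>>
>>> The yielded pairs could then be added to the MM/LFR training set
>>> alongside the 6d game moves, giving the patterns exactly the "answers
>>> to bad moves" they never see in strong games.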
>>>
>>> I'm thinking about basic situations like the following, which patterns
>>> should be able to recognize:
>>>
>>>        A B C D E F G H J K L M N O P Q R S T
>>>      +---------------------------------------+
>>>   19 | . . . . . . . . . . . . . . . . . . . |
>>>   18 | . . . O O . O . . . . . . . . . . . . |
>>>   17 | . . X . X O . X O . . . . . . . . . . |
>>>   16 | . . X . X O . O O . . . . . . X . . . |
>>>   15 | . . . . . X X X X . . . . . . . . . . |
>>>   14 | . . X . . . . . . . . . . . . . . . . |
>>>   13 | . O . . . . . . . . . . . . X . O . . |
>>>   12 | . . O . . . . . . . . . . . . . O . . |
>>>   11 | . . O X . . . . . . . . . . . X . . . |
>>>   10 | . . O X . . . . . . . . . . . X O . . |
>>>    9 | . . O X . . . X . . . X O . . X O . . |
>>>    8 | . . O X . . . O X X X O X)X X O O . . |
>>>    7 | . O X . . . . . O O X O . X O O . . . |
>>>    6 | O X X . X X X O . . O . . . X O X . . |
>>>    5 | . O O . . O X . . . . . O . . . . . . |
>>>    4 | . X O O O O X . . . O . . O . O . . . |
>>>    3 | . X X X X O . . X . X X . . . . . . . |
>>>    2 | . . . . X O . . . . . . O . . . . . . |
>>>    1 | . . . . . . . . . . . . . . . . . . . |
>>>      +---------------------------------------+
>>>
>>> Patterns probably never see this during training, and miss W L9. For
>>> example:
>>>
>>> In Remi's CrazyPatterns.exe, L9 comes in 4th position:
>>>    [ M10 N10 O6 L9 ...
>>>
>>> With Pachi's large patterns it's 8th:
>>>    [ G8  M10 G9  O17 N10 O6  J4  L9  ...
>>>
>>> Cheers,
>>> Matt
>>>
>>> ----
>>>
>>> MM: Computing Elo Ratings of Move Patterns in the Game of Go
>>>    https://www.remi-coulom.fr/Amsterdam2007/
>>>
>>> LFR: Move Prediction in Go – Modelling Feature Interactions Using
>>>       Latent Factors
>>>    https://www.ismll.uni-hildesheim.de/pub/pdfs/wistuba_et_al_KI_2013.pdf
>>>
>>>
>>
>>
>
>