I highly doubt that learning refutations to pertinent bad shape patterns is
much more exponential or combinatorially huge than learning good shape
already is, if you could somehow be appropriately selective about it. There
are enough commonalities that it should "compress" very well into what the n

a's policy net put about 96% probability on a and only 1.03% on b. And even
after nearly 1 million simulations, it had basically not searched b at all.
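To make it concrete why such a lopsided prior starves the other move, here is
a rough sketch of a PUCT-style selection rule (constants and exact form are
illustrative, not any particular engine's formula):

import math

# PUCT-like score: exploit observed value q, explore in proportion to prior.
def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

With priors 0.96 vs 0.0103, move a's exploration bonus is roughly 93 times
move b's at equal visit counts, so b gets essentially no simulations unless
its observed value turns out to be dramatically higher.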
On Mon, Apr 17, 2017 at 9:04 AM, David Wu wrote:
> I highly doubt that learning refutations to pertinent bad shape patterns
> is muc

misread a ladder this simple unless it's outright buggy.
On Mon, Apr 17, 2017 at 11:25 AM, Stefan Kaitschick
wrote:
>
> On Mon, Apr 17, 2017 at 3:04 PM, David Wu wrote:
>
>> To some degree this maybe means Leela is insufficiently explorative in
>> cases like this, bu
Leela playouts are definitely extremely bad compared to competitors like
Crazystone. The deep-learning version of Crazystone has no value net as far
as I know, only a policy net, which means it's going on MC playouts alone
to produce its evaluations. Nonetheless, its playouts often have noticeable

flaw in the value net's ability to produce
good probabilities of winning in games that hinge on the life
and death chances of a single large dragon, and where the expected score
could be wildly uncorrelated with the probability of winning.
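A toy calculation makes the dragon point vivid (the numbers are made up):

# The whole game hinges on whether one dragon lives.
def dragon_game(p_live, score_if_live=20.0, score_if_die=-20.0):
    expected_score = p_live * score_if_live + (1 - p_live) * score_if_die
    win_prob = p_live  # winning is essentially "the dragon lives"
    return expected_score, win_prob

print(dragon_game(0.5))  # expected score 0.0, but win probability 0.5
print(dragon_game(0.9))  # expected score +16, win probability 0.9

An evaluator that predicts score well can still be nearly useless for
predicting the winner here.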
On Mon, May 22, 2017 at 9:39 PM, David Wu
I've been using Leela 0.10.0 for analysis quite often, and I've noticed
something that might lead to an improvement for the search, and maybe also
for other MCTS programs.
Sometimes, after putting hundreds of thousands of simulations into a few
possible moves, Leela will appear to favor one, disli
Thanks for the replies!
On Mon, Jul 24, 2017 at 9:30 AM, Gian-Carlo Pascutto wrote:
> On 23-07-17 18:24, David Wu wrote:
> > Has anyone tried this sort of idea before?
>
> I haven't tried it, but (with the computer chess hat on) these kinds of
> proposals behave pretty
Cool, thanks.
On Mon, Jul 24, 2017 at 10:30 AM, Gian-Carlo Pascutto wrote:
> On 24-07-17 16:07, David Wu wrote:
> > Hmm. Why would discounting make things worse? Do you mean that you
> > want the top move to drop off slower (i.e. for the bot to take longer
> > to achieve
* A little but not really.
* No, and as far as we can tell, never. Even 7x7 is not rigorously solved.
* Unknown.
* Against Go-God (plays the move that maximizes score margin, breaking ties
by some measure of the entropy needed to build the proof tree relative to a
human-pro-level policy net), I guess u

es in that tree). Once ahead, of course it can do whatever it wants that
preserves the win.
On Sun, Aug 6, 2017 at 10:31 AM, David Wu wrote:
> * A little but not really.
> * No, and as far as we can tell, never. Even 7x7 is not rigorously solved.
> * Unknown.
> * Against Go-God
Saying in an unqualified way that AlphaGo is brute force is wrong in the
spirit of the question. Assuming AlphaGo uses a typical variant of MCTS, it
is technically correct. The reason it's technically correct but uninteresting
is that the bias introduced by a policy net is so extreme that it might
a
While browsing online, I found an interesting idea, "chain pooling",
presented here:
https://github.com/jmgilmer/GoCNN
The idea is to have some early layers that perform a max-pool across
solidly-connected stones. I could also imagine it being useful to perform a
sum. So the input would be a 19x
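In case the idea is unclear from the repo, here is a minimal NumPy/SciPy
sketch of what I understand chain pooling to be (function names are mine,
not the repo's):

import numpy as np
from scipy import ndimage

def label_chains(stones):
    # Label 4-connected chains of stones. stones: (19, 19) array of 0/1.
    structure = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
    labels, _ = ndimage.label(stones, structure=structure)
    return labels

def chain_pool(features, chain_ids, mode="max"):
    # features: (C, 19, 19); chain_ids: (19, 19), 0 = empty point.
    # Every point in a chain ends up holding the chain-wide max (or sum).
    out = features.copy()
    for c in range(features.shape[0]):
        for chain in range(1, int(chain_ids.max()) + 1):
            mask = chain_ids == chain
            if not mask.any():
                continue
            vals = features[c][mask]
            out[c][mask] = vals.max() if mode == "max" else vals.sum()
    return out

In a real network this would be a differentiable layer (max-pooling
gradients flow to the argmax); the forward pass is just this.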
I suspect the reason they were able to reasonably train a value net with
multiple komi at the same time was that the training games they used in
that paper were generated by a pure policy net rather than by an MCTS
player, where the policy net was trained from human games.
Although humans give
Hex:
https://arxiv.org/pdf/1705.08439.pdf
This is not on a 19x19 board, and it was not tested against the current
state of the art (Mohex 1.0 was the state of the art at its time, but is at
least several years old now, I think), but they do get several hundred Elo
points stronger than this old ver
I wouldn't find it so surprising if eventually the 20 or 40 block networks
develop a set of convolutional channels that traces possible ladders
diagonally across the board. If it had enough examples of ladders of
different lengths, including selfplay games where game-critical ladders
"failed to be
As far as a purely convolutional approach goes, I think you *can* do better by
adding some global connectivity.
Generally speaking, there should be some value in global connectivity for
things like upweighting the probability of playing ko threats anywhere on
the board when there is an active ko anywhe
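A minimal sketch of one cheap way to get that global connectivity: pool some
channels over the whole board and broadcast the result back, so a fact like
"a ko is active somewhere" can influence every point. (This is the general
idea only, not any specific engine's architecture.)

import numpy as np

def global_pool_and_broadcast(features):
    # features: (C, H, W). Output: (2*C, H, W) - the original channels plus
    # each channel's board-wide mean copied to every point.
    c, h, w = features.shape
    global_mean = features.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)
    broadcast = np.broadcast_to(global_mean, (c, h, w))
    return np.concatenate([features, broadcast], axis=0)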
Seems to me like you could fix that in the policy too by providing an input
feature plane that indicates the value of a draw, whether 0 as normal, or
-1 for must-win, or -1/3 for 3/1/0, or 1 for only-need-not-lose, etc.
Then just play games with a variety of values for this parameter in your
self-

Actually this pretty much solves the whole issue, right? Of course the proof
would be to actually test it out, but it seems to me a pretty
straightforward solution, nothing nontrivial about it.
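Concretely, the feature could be as simple as a constant input plane (the
values are the ones from my message above; the function name is just for
illustration):

import numpy as np

def draw_value_plane(draw_utility, board_size=19):
    # draw_utility on a scale where win = +1 and loss = -1:
    #    0   : normal (a draw is worth half a win)
    #   -1   : must-win (a draw is as bad as a loss)
    #   -1/3 : 3/1/0 match scoring (draw = 1/3 of a win -> 2*(1/3) - 1)
    #   +1   : only need not to lose (a draw is as good as a win)
    return np.full((board_size, board_size), draw_utility, dtype=np.float32)

Then self-play just samples a value for this parameter per game, as
described above.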
On Feb 13, 2018 10:52 AM, "David Wu" wrote:
Seems to me like you could fix that in the pol

of both players could be
> independent.
>
> In real life you usually have a good sense of how your opponent's
> "must-win" parameter is set, but that doesn't really apply here.
>
>
> On Tue, Feb 13, 2018 at 10:58 AM, David Wu wrote:
>
>> Actually t
It's not even just liberties and semeai, it's also eyes. Consider for
example a large dragon that has miai for 2 eyes in distant locations, and
the opponent then takes one of them - you'd like the policy net to now
suggest the other eye-making move far away. And you'd also like the value
net to dis

That was very interesting, thanks for sharing!
On Fri, Mar 2, 2018 at 9:50 AM, wrote:
> Hi,
> somebody asked me to post my views on 9x9 go on the list based on my
> experience with correspondence go on OGS and Little Golem.
>
> I have been playing online correspondence tournaments since 2011 wit
Awesome!
I was going to complain that the color scheme had not enough contrast
between the white stones and the background (at least on my desktop, it was
hard to see the white stones clearly), and then I discovered the settings
menu already lets you change the background color. Cool! :)
On Wed,
For any interested people on this list who don't follow Leela Zero
discussion or reddit threads:
I recently released a paper on ways to improve the efficiency of
AlphaZero-like learning in Go. A variety of the ideas tried deviate a
little from "pure zero" (e.g. ladder detection, predicting board
o

er if this behavior could be avoided by giving a small incentive to
> win by the most points (or most material in chess), similar to the
> technique mentioned by David Wu in KataGo a few days ago. The problem right
> now is that the AI has literally no reason to think that winning wit
On Fri, Mar 8, 2019 at 8:19 AM Darren Cook wrote:
> > Blog post:
> > https://blog.janestreet.com/accelerating-self-play-learning-in-go/
> > Paper: https://arxiv.org/abs/1902.10565
>
> I read the paper, and really enjoyed it: lots of different ideas being
> tried. I was especially satisfied to see
Thanks for sharing the link. Taking a brief look at this paper, I'm quite
confused about their methodology and their interpretation of their data.
For example in figure 2 (b), if I understand correctly, they plot Elo
ratings for three independent runs where they run the entire AlphaZero
process fo
Yep, global pooling is also how KataGo already handles multiple board sizes
with a single model. Convolution weights don't care about board size at all
since the filter dimensions have no dependence on it, so the only issue is
if you're using things like fully-connected layers, and those are often
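For concreteness, a tiny sketch of the trick: replace flatten + fully
connected (whose weight shape bakes in H*W) with a global average pool, so
no weight shape depends on the board size. (Shapes here are illustrative,
not KataGo's actual head.)

import numpy as np

def value_head(features, w_hidden, w_out):
    # features: (C, H, W) for any H, W.  w_hidden: (C, K).  w_out: (K,).
    pooled = features.mean(axis=(1, 2))   # (C,) - board size pooled away
    hidden = np.tanh(pooled @ w_hidden)   # (K,)
    return np.tanh(hidden @ w_out)        # scalar value in (-1, 1)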
Yes, it's fun to see suddenly a little cluster of people running strong 9x9
bots. :)
katab40s37-awsp3 is running on one AWS p3.2xlarge instance. So it's a
single V100, with some settings tuned appropriately for good performance on
that hardware and board size and time control (mainly, 96 threads),
Having it matter which of the stones you capture there is fascinating.
Thanks for the analysis - and thanks for "organizing" this 9x9 testing
party. :)
On Thu, May 7, 2020 at 12:06 PM Rémi Coulom wrote:
> If White recaptures the Ko, then Black can play at White's 56, capture the
> stone, and win

umber
of games now due to overplaying, and there's a good chance it does much
worse overall. Even so, I'm curious what will happen, and what the draw
rate will be. Suddenly having some 5-5 openings should certainly add some
variety to the games.
On Thu, May 7, 2020 at 12:41 PM David Wu w

of Deep Scholar
>
> On Fri, May 8, 2020 at 3:28 PM David Wu wrote:
>
>> I'm running a new account of KataGo that is set to bias towards
>> aggressive or difficult moves now (the same way it does in 19x19 handicap
>> games), to see what the effect is. Although it see
On Fri, May 8, 2020 at 8:46 PM uurtamo . wrote:
> And this has no book, right? So it should be badly abused by a very good
> book?
>
>
Maybe!
But the version that was running before, which went something like 48-52-1
(last time I counted it up) against the other top 3 bots that were rated
3300+ a
Very nice. :)
And thanks for the note about batch sizing. Specifically tuning parameters
for this level of strength on 9x9 seems like it could be quite valuable;
Kata definitely hasn't done that either.
But it feels like bots are very very close to optimal on 9x9. With some
dedicated work, more mo
@Hiroshi Yamashita - about rn - cool! Really neat to see people picking up
some of KataGo's methods and running them. Hopefully people will find ways
to improve them further.
--
Also, in case people weren't aware since I hadn't advertised it on this
list, KataGo has also been available for a
I imagine that at low visits at least, "ACT" behaves similarly to Leela
Zero's "LCB" move selection, which also has the effect of sometimes
selecting a move that is not the max-visits move, if its value estimate has
recently been found to be sufficiently higher to balance the fact that it
is lower
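For reference, a rough sketch of LCB-style selection as I understand Leela
Zero's scheme - the exact bound LZ uses differs; the Bernoulli standard
error below is my simplification:

import math

def select_move_lcb(moves, z=2.0):
    # moves: list of (move, visits, mean_value) with values in [0, 1].
    # Pick the move maximizing a lower confidence bound on its value, so a
    # less-visited move wins only if its value offsets its uncertainty.
    best, best_lcb = None, -math.inf
    for move, visits, mean in moves:
        if visits < 2:
            continue  # too little data for a meaningful bound
        stderr = math.sqrt(mean * (1.0 - mean) / visits)
        lcb = mean - z * stderr
        if lcb > best_lcb:
            best, best_lcb = move, lcb
    return best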
One tricky thing is that there are some major nonlinearities between
different bots early in the opening that break Elo model assumptions quite
blatantly at these higher levels.
The most noticeable case of this is with Mi Yuting's flying dagger joseki.
I've noticed for example that in particular m
On Fri, Jan 22, 2021 at 8:08 AM Rémi Coulom wrote:
> You are right that non-determinism and bot blind spots are a source of
> problems with Elo ratings. I add randomness to the openings, but it is
> still difficult to avoid repeating some patterns. I have just noticed that
> the two wins of Crazy
On Fri, Jan 22, 2021 at 3:45 AM Hiroshi Yamashita wrote:
> This kind of joseki is not good for Zero-type bots. Ladders and capturing
> races are intricately combined. In AlphaGo's published self-play games
> (both the AlphaGo Zero and Master versions), this joseki is rare.
> -
Hi Claude - no, generally feeding liberty counts to neural networks doesn't
help as much as one would hope with ladders and sekis and large capturing
races.
The thing that is hard about ladders has nothing to do with liberties - a
trained net is perfectly capable of recognizing the atari, this is

do not do this, because of attempting to remain "zero", i.e. free as
much as possible from expert human knowledge or specialized feature
crafting.
On Fri, Jan 22, 2021 at 9:26 AM David Wu wrote:
> Hi Claude - no, generally feeding liberty counts to neural networks
> doesn't
On Sat, Jan 23, 2021 at 5:34 AM Darren Cook wrote:
> Each convolutional layer should spread the information across the board.
> I think AlphaZero used 20 layers? So even 3x3 filters would tell you
> about the whole board - though the signal from the opposite corner of
> the board might end up a
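The receptive-field arithmetic there does check out: each 3x3 convolution
extends the receptive field by one point in every direction, so n stacked
layers see a (2n+1) x (2n+1) neighborhood.

def receptive_field(n_layers, kernel=3):
    return 1 + n_layers * (kernel - 1)

print(receptive_field(20))  # 41 - spans a 19x19 board with room to spare

In theory that covers the whole board several times over; in practice the
effective signal from the far corner is much weaker than the theoretical
reach, which is consistent with nets being unreliable on long ladders.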