Re: [Computer-go] Patterns and bad shape

2017-04-17 Thread David Wu
net put about 96% probability on a and only 1.03% on b, and even after nearly 1 million simulations it had basically not searched b at all. On Mon, Apr 17, 2017 at 9:04 AM, David Wu <lightvec...@gmail.com> wrote: > I highly doubt that learning refutations to pertinent bad shape

Re: [Computer-go] Patterns and bad shape

2017-04-17 Thread David Wu
...unless it's outright buggy. On Mon, Apr 17, 2017 at 11:25 AM, Stefan Kaitschick <skaitsch...@gmail.com> wrote: > > On Mon, Apr 17, 2017 at 3:04 PM, David Wu <lightvec...@gmail.com> wrote: > >> To some degree this may mean Leela is insufficiently explorative in

[Computer-go] Possible idea - decay old simulations?

2017-07-23 Thread David Wu
I've been using Leela 0.10.0 for analysis quite often, and I've noticed something that might lead to an improvement for the search, and maybe also for other MCTS programs. Sometimes, after putting hundreds of thousands of simulations into a few possible moves, Leela will appear to favor one,
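
For concreteness, a minimal sketch of the kind of decay I mean (the node structure, names, and decay constant here are illustrative assumptions, not anything from Leela's actual code):

    class Node:
        """Toy UCT-style node that decays old simulation statistics."""
        def __init__(self):
            self.visits = 0.0     # decayed visit count
            self.value_sum = 0.0  # decayed sum of simulation outcomes

        def record(self, outcome, decay=0.9999):
            # Shrink the old statistics slightly before adding the new
            # result, so simulations made early in the search (under a
            # less-informed tree) gradually lose weight.
            self.visits = self.visits * decay + 1.0
            self.value_sum = self.value_sum * decay + outcome

        def mean_value(self):
            return self.value_sum / self.visits if self.visits > 0 else 0.0

With decay just below 1, the effective memory is roughly 1/(1 - decay) simulations, so evaluations made before the search discovered a refutation would gradually wash out.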

Re: [Computer-go] Possible idea - decay old simulations?

2017-07-24 Thread David Wu
Thanks for the replies! On Mon, Jul 24, 2017 at 9:30 AM, Gian-Carlo Pascutto <g...@sjeng.org> wrote: > On 23-07-17 18:24, David Wu wrote: > > Has anyone tried this sort of idea before? > > I haven't tried it, but (with the computer chess hat on) these kind of > proposals

Re: [Computer-go] Possible idea - decay old simulations?

2017-07-24 Thread David Wu
Cool, thanks. On Mon, Jul 24, 2017 at 10:30 AM, Gian-Carlo Pascutto <g...@sjeng.org> wrote: > On 24-07-17 16:07, David Wu wrote: > > Hmm. Why would discounting make things worse? Do you mean that you > > want the top move to drop off slower (i.e. for the bot to take

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread David Wu
in that tree). Once ahead, of course it can do whatever it wants that preserves the win. On Sun, Aug 6, 2017 at 10:31 AM, David Wu <lightvec...@gmail.com> wrote: > * A little but not really. > * No, and as far as we can tell, never. Even 7x7 is not rigorously solved. > * Unknown. >

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread David Wu
Saying in an unqualified way that AlphaGo is brute force is wrong in the spirit of the question. Assuming AlphaGo uses a typical variant of MCTS, it is technically correct. The reason it's technically correct but uninteresting is that the bias introduced by a policy net is so extreme that it might

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread David Wu
* A little but not really.
* No, and as far as we can tell, never. Even 7x7 is not rigorously solved.
* Unknown.
* Against Go-God (plays the move that maximizes score margin, breaking ties by some measure of the entropy needed to build the proof tree relative to a human-pro-level policy net), I guess

[Computer-go] Neural nets for Go - chain pooling?

2017-08-18 Thread David Wu
While browsing online, I found an interesting idea, "chain pooling", presented here: https://github.com/jmgilmer/GoCNN. The idea is to have some early layers that perform a max-pool across solidly-connected stones. I could also imagine it being useful to perform a sum. So the input would be a
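
As a sketch of the mechanism (dense numpy arrays assumed; the linked repo's actual implementation may differ):

    import numpy as np

    def chain_max_pool(features, board):
        # features: (C, H, W) activations; board: (H, W) with
        # 0 = empty, 1 = black, 2 = white.
        C, H, W = features.shape
        out = features.copy()
        seen = np.zeros((H, W), dtype=bool)
        for y in range(H):
            for x in range(W):
                if board[y, x] == 0 or seen[y, x]:
                    continue
                # Flood-fill the solidly-connected chain at (y, x).
                color, chain, stack = board[y, x], [], [(y, x)]
                seen[y, x] = True
                while stack:
                    cy, cx = stack.pop()
                    chain.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < H and 0 <= nx < W \
                                and not seen[ny, nx] and board[ny, nx] == color:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                ys, xs = zip(*chain)
                # Every point of the chain gets the chain-wide max;
                # replace max with sum for the sum-pooling variant.
                out[:, ys, xs] = features[:, ys, xs].max(axis=1, keepdims=True)
        return out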

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread David Wu
good probabilities of winning in games where the game hinges on the life and death chances of a single large dragon, and where the expected score could be wildly uncorrelated with the probability of winning. On Mon, May 22, 2017 at 9:39 PM, David Wu <lightvec...@gmail.com> wrote: > Leela

Re: [Computer-go] mcts and tactics

2017-12-19 Thread David Wu
I wouldn't find it so surprising if eventually the 20 or 40 block networks develop a set of convolutional channels that traces possible ladders diagonally across the board. If it had enough examples of ladders of different lengths, including selfplay games where game-critical ladders "failed to be
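
As a toy illustration of why depth would be the limiting factor (my own sketch, not taken from any real net): a single 3x3 filter can move a signal one step diagonally per layer, so tracing a ladder across the board needs roughly one layer per step.

    import numpy as np
    from scipy.signal import convolve2d

    # A 3x3 filter whose output at (i, j) reads the input at (i-1, j-1),
    # i.e. each application shifts a signal one step down-right.
    shift_diag = np.zeros((3, 3))
    shift_diag[2, 2] = 1.0

    signal = np.zeros((19, 19))
    signal[0, 0] = 1.0
    for _ in range(18):  # one layer per diagonal step
        signal = convolve2d(signal, shift_diag, mode="same")
    print(np.argwhere(signal > 0))  # the signal has walked to (18, 18)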

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-06 Thread David Wu
Hex: https://arxiv.org/pdf/1705.08439.pdf This is not on a 19x19 board, and it was not tested against the current state of the art (MoHex 1.0 was the state of the art in its time, but is at least several years old now, I think), but they do get several hundred Elo points stronger than this old

Re: [Computer-go] CGOS Real-time game viewer

2018-05-30 Thread David Wu
Awesome! I was going to complain that the color scheme didn't have enough contrast between the white stones and the background (at least on my desktop, it was hard to see the white stones clearly), and then I discovered the settings menu already lets you change the background color. Cool! :) On

Re: [Computer-go] Source code (Was: Reducing network size? (Was: AlphaGo Zero))

2017-10-27 Thread David Wu
I suspect the reason they were able to reasonably train a value net with multiple komi at the same time was that the training games they used in that paper were generated by a pure policy net rather than by an MCTS player, where the policy net was trained from human games. Although humans give

Re: [Computer-go] AGZ Policy Head

2017-12-29 Thread David Wu
As far as a purely convolutional approach, I think you *can* do better by adding some global connectivity. Generally speaking, there should be some value in global connectivity for things like upweighting the probability of playing ko threats anywhere on the board when there is an active ko
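
For concreteness, one way such global connectivity could be grafted onto a convolutional policy head (shapes and names are my own made-up sketch, not AGZ's actual architecture):

    import numpy as np

    def policy_with_global_bias(features, w_bias, w_out):
        # features: (C, H, W) trunk activations.
        # w_bias: (C, C) maps the pooled whole-board summary to
        # per-channel biases; w_out: (C,) final 1x1-conv weights.
        pooled = features.mean(axis=(1, 2))        # (C,) board summary
        bias = (w_bias @ pooled)[:, None, None]    # (C, 1, 1)
        # The ReLU matters: without a nonlinearity, the global bias
        # would shift every point's logit equally and cancel in the
        # softmax. With it, global context (e.g. "a ko is active")
        # can gate which local features fire, upweighting ko threats
        # far from the ko itself.
        mixed = np.maximum(features + bias, 0.0)   # broadcast over board
        return np.tensordot(w_out, mixed, axes=1)  # (H, W) policy logits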

Re: [Computer-go] MCTS with win-draw-loss scores

2018-02-13 Thread David Wu
Seems to me like you could fix that in the policy too by providing an input feature plane that indicates the value of a draw, whether 0 as normal, or -1 for must-win, or -1/3 for 3/1/0, or 1 for only-need-not-lose, etc. Then just play games with a variety of values for this parameter in your
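
Concretely, something like this (a sketch; the function name and plane layout are assumptions):

    import numpy as np

    def add_draw_value_plane(input_planes, draw_value):
        # draw_value: utility of a draw on a [-1, 1] win/loss scale:
        #   0.0  -> normal (a draw is worth half a win)
        #  -1.0  -> must-win (a draw is as bad as a loss)
        #  -1/3  -> 3/1/0 scoring (mapping 3/1/0 linearly onto
        #           [-1, 1] sends a draw to 2 * (1/3) - 1 = -1/3)
        #   1.0  -> only need not to lose
        _, H, W = input_planes.shape
        plane = np.full((1, H, W), draw_value, dtype=input_planes.dtype)
        return np.concatenate([input_planes, plane], axis=0)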

Re: [Computer-go] MCTS with win-draw-loss scores

2018-02-13 Thread David Wu
Actually, this pretty much solves the whole issue, right? Of course the proof would be to actually test it out, but it seems to me a pretty straightforward solution, not nontrivial at all. On Feb 13, 2018 10:52 AM, "David Wu" <lightvec...@gmail.com> wrote: Seems to me like

Re: [Computer-go] MCTS with win-draw-loss scores

2018-02-13 Thread David Wu
> independent. > > In real life you usually have a good sense of how your opponent's > "must-win" parameter is set, but that doesn't really apply here. > > > On Tue, Feb 13, 2018 at 10:58 AM, David Wu <lightvec...@gmail.com> wrote: > >> Actually

Re: [Computer-go] Crazy Stone is back

2018-02-28 Thread David Wu
It's not even just liberties and semeai, it's also eyes. Consider for example a large dragon that has miai for 2 eyes in distant locations, and the opponent then takes one of them - you'd like the policy net to now suggest the other eye-making move far away. And you'd also like the value net to

Re: [Computer-go] Some personal thoughts on 9x9 computer go

2018-03-02 Thread David Wu
That was very interesting, thanks for sharing! On Fri, Mar 2, 2018 at 9:50 AM, wrote: > Hi, > somebody asked me to post my views on 9x9 go on the list, based on my > experience with correspondence go on OGS and Little Golem. > > I have been playing online correspondence

Re: [Computer-go] Hyper-Parameter Sweep on AlphaZero General

2019-03-24 Thread David Wu
Thanks for sharing the link. Taking a brief look at this paper, I'm quite confused about their methodology and their interpretation of their data. For example in figure 2 (b), if I understand correctly, they plot Elo ratings for three independent runs where they run the entire AlphaZero process

[Computer-go] Accelerating Self-Play Learning in Go

2019-03-03 Thread David Wu
For any interested people on this list who don't follow Leela Zero discussion or reddit threads: I recently released a paper on ways to improve the efficiency of AlphaZero-like learning in Go. A variety of the ideas tried deviate a little from "pure zero" (e.g. ladder detection, predicting board

Re: [Computer-go] 0.5-point wins in Go vs Extremely slow LeelaChessZero wins

2019-03-05 Thread David Wu
if this behavior could be avoided by giving a small incentive to > win by the most points (or most material in chess), similar to the > technique mentioned by David Wu in KataGo a few days ago. The problem right > now is that the AI has literally no reason to think that winning with more

Re: [Computer-go] Accelerating Self-Play Learning in Go

2019-03-08 Thread David Wu
On Fri, Mar 8, 2019 at 8:19 AM Darren Cook wrote: > > Blog post: > > https://blog.janestreet.com/accelerating-self-play-learning-in-go/ > > Paper: https://arxiv.org/abs/1902.10565 > > I read the paper, and really enjoyed it: lots of different ideas being > tried. I was especially satisfied to

Re: [Computer-go] Polygames: Improved Zero Learning

2020-02-02 Thread David Wu
Yep, global pooling is also how KataGo already handles multiple board sizes with a single model. Convolution weights don't care about board size at all since the filter dimensions have no dependence on it, so the only issue is if you're using things like fully-connected layers, and those are often
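
A sketch of the contrast (illustrative shapes and names, not KataGo's actual head):

    import numpy as np

    def value_head_pooled(features, w_hidden, w_out):
        # A flatten-then-fully-connected head bakes H*W into its weight
        # matrix, so it only works at one board size. Pooling first
        # reduces (C, H, W) to a fixed-size (C,) vector, so the same
        # weights apply on 9x9, 13x13, and 19x19 alike.
        # w_hidden: (K, C); w_out: (K,).
        pooled = features.mean(axis=(1, 2))          # (C,) for any H, W
        hidden = np.maximum(w_hidden @ pooled, 0.0)  # ReLU
        return np.tanh(w_out @ hidden)               # value in [-1, 1]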

Re: [Computer-go] 30% faster with a batch size of 63 instead of 64!

2020-05-09 Thread David Wu
Very nice. :) And thanks for the note about batch sizing. Specifically tuning parameters for this level of strength on 9x9 seems like it could be quite valuable; Kata definitely hasn't done that either. But it feels like bots are very, very close to optimal on 9x9. With some dedicated work, more

[Computer-go] katago 1.4.2 (top open-source 9x9 and 19x19 bot?)

2020-05-16 Thread David Wu
@Hiroshi Yamashita - about rn - cool! Really neat to see people picking up some of KataGo's methods and running them. Hopefully people will find ways to improve them further. -- Also, in case people weren't aware since I hadn't advertised it on this list, KataGo has also been available for a

Re: [Computer-go] Crazy Stone is playing on CGOS 9x9

2020-05-07 Thread David Wu
Having it matter which of the stones you capture there is fascinating. Thanks for the analysis - and thanks for "organizing" this 9x9 testing party. :) On Thu, May 7, 2020 at 12:06 PM Rémi Coulom wrote: > If White recaptures the Ko, then Black can play at White's 56, capture the > stone, and

Re: [Computer-go] Crazy Stone is playing on CGOS 9x9

2020-05-07 Thread David Wu
Yes, it's fun to see suddenly a little cluster of people running strong 9x9 bots. :) katab40s37-awsp3 is running on one AWS p3 2xlarge instance. So it's a single V100, with some settings tuned appropriately for good performance on that hardware and board size and time control (mainly, 96

Re: [Computer-go] Crazy Stone is playing on CGOS 9x9

2020-05-08 Thread David Wu
of games now due to overplaying, and there's a good chance it does much worse overall. Even so, I'm curious what will happen, and what the draw rate will be. Suddenly having some 5-5 openings should certainly add some variety to the games. On Thu, May 7, 2020 at 12:41 PM David Wu wrote: > Having

Re: [Computer-go] Crazy Stone is playing on CGOS 9x9

2020-05-08 Thread David Wu
On Fri, May 8, 2020 at 8:46 PM uurtamo . wrote: > And this has no book, right? So it should be badly abused by a very good > book? > > Maybe! But the version that was running before which went something like 48-52-1 (last time I counted it up) against the other top 3 bots that were rated 3300+

Re: [Computer-go] Crazy Stone is playing on CGOS 9x9

2020-05-08 Thread David Wu
...Scholar > > On Fri, May 8, 2020 at 3:28 PM David Wu wrote: > >> I'm running a new account of KataGo that is set to bias towards >> aggressive or difficult moves now (the same way it does in 19x19 handicap >> games), to see what the effect is. Although, it seems like

Re: [Computer-go] Monte-Carlo Tree Search as Regularized Policy Optimization

2020-07-19 Thread David Wu
I imagine that at low visits at least, "ACT" behaves similarly to Leela Zero's "LCB" move selection, which also has the effect of sometimes selecting a move that is not the max-visits move, if its value estimate has recently been found to be sufficiently larger to balance the fact that it is lower
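
For reference, a minimal version of that style of selection rule (my own sketch; Leela Zero's actual LCB code differs in its details):

    import math

    def select_move_lcb(children, z=1.96):
        # children: list of (move, visits, mean_value, value_stddev).
        # Pick the child maximizing a lower confidence bound on its
        # value, so a low-visit move beats the max-visits move only
        # when its value lead outweighs its extra uncertainty.
        def lcb(child):
            _, visits, mean, stddev = child
            if visits < 2:
                return -math.inf  # too few samples to trust
            return mean - z * stddev / math.sqrt(visits)
        return max(children, key=lcb)[0]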

[Computer-go] Ladder and zeroness (was: Re: CGOS source on github)

2021-01-23 Thread David Wu
On Sat, Jan 23, 2021 at 5:34 AM Darren Cook wrote: > Each convolutional layer should spread the information across the board. > I think alpha zero used 20 layers? So even 3x3 filters would tell you > about the whole board - though the signal from the opposite corner of > the board might end up a

Re: [Computer-go] CGOS source on github

2021-01-21 Thread David Wu
One tricky thing is that there are some major nonlinearities between different bots early in the opening that break Elo model assumptions quite blatantly at these higher levels. The most noticeable case of this is with Mi Yuting's flying dagger joseki. I've noticed for example that in particular

Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
do not do this, because of attempting to remain "zero", i.e. as free as possible from expert human knowledge or specialized feature crafting. On Fri, Jan 22, 2021 at 9:26 AM David Wu wrote: > Hi Claude - no, generally feeding liberty counts to neural networks > doesn't help

Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
On Fri, Jan 22, 2021 at 3:45 AM Hiroshi Yamashita wrote: > This kind of joseki is not good for Zero-type bots. Ladders and capturing > races are intricately combined. In AlphaGo's published self-play games (both > AlphaGo Zero and Master), this joseki is rare. >

Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
Hi Claude - no, generally feeding liberty counts to neural networks doesn't help as much as one would hope with ladders and sekis and large capturing races. The thing that is hard about ladders has nothing to do with liberties - a trained net is perfectly capable of recognizing the atari, this is

Re: [Computer-go] CGOS source on github

2021-01-22 Thread David Wu
On Fri, Jan 22, 2021 at 8:08 AM Rémi Coulom wrote: > You are right that non-determinism and bot blind spots are a source of > problems with Elo ratings. I add randomness to the openings, but it is > still difficult to avoid repeating some patterns. I have just noticed that > the two wins of