[Computer-go] Why do Facebook use DNN next-move predictors? Shouldn't they use next-move generators?

2015-11-24 Thread Harald Korneliussen
When I read about Facebook's DCNN-using go program, I remembered another
paper that I'd come across on arxiv, namely "How (not) to train your
generative model: scheduled sampling, likelihood, adversary?" by Ferenc
Huszar (http://arxiv.org/pdf/1511.05101.pdf).

A lot of that paper went over my head (I am a "half-studied scoundrel" as
we say in Norway), but his speculation in the end, I think I sort of got,
and it made a lot of sense to me.

He argues that which side you approach the K-L divergence from so to say
matters for what kind of errors you get when the model, and that when
you're generating as opposed to predicting, the goal should be to minimize
the K-L divergence from the "other" way.

When you're using a DCNN in a go program, you are really doing generating,
not prediction, right? You want to generate a good move. A model that
generates "flashy" moves that LOOK really strong, but could potentially be
very bad, would be a good predictor, but a bad generator.

The ideal probability distribution is the distribution of moves a pro would
make. But to the degree your model falls short, you want to minimize the
chance of making a wildly "un-pro" move, rather than maximizing the chance
of making a "pro" move. Since these are probability distributions, those
two things are not the same unless your model is perfect (right?).

If my understanding is correct (and it's quite possible I'm way off course,
I'm an amateur! sorry for wasting your time if so!), then rather than
training a move predictor, they should use the adversarial methods which
are also in the wind now to train a generative model.

-- Harald Korneliussen
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

[computer-go] Selling programs on Linux

2008-04-13 Thread Harald Korneliussen
Gian-Carlo Pascutto

Unfortunately I don't know how to sell support for my Go program, but I
am open to ideas.

Well, since you ask, there is one neat way you can sell your program
on Linux in an open source manner: Offer a so-called assurance
contract, such as the ones on fundable.org. The basic idea is that you
tell the world what you want for your program - not from each user,
but in total - and each of your potential customers tells how much
they're willing to pay. If the sum of offers is over what you want,
you take the money and release the program. If not, nobody pays
anything.

The advantage of this approach is that you can actually sell the
source (if you want), the objectively most valuable piece of your
program. In the windows world you almost have to keep that for
yourself to sell the binary. Linux users would also probably want this
- after all, if it's just a game that can defeat them they're after,
they can just use GnuGo or the Mogo binaries. It's the source which is
really valuable.

The disadvantage is that those users who aren't into comparison
shopping, and just randomly  walk into your side one day and thinks
hey, cool! - you probably won't get much from them. With a
fundable-style contract, you have to sell to all interested parties
simultaneously, and this requires a little marketing. Still, assurance
contracts are new enough that it should be possible to catch a little
attention.

I'd love to see a source code release of Leela, if even just a
snapshot, and I would pay for it. Thanks for Sjeng, by the way.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] The odd 13x13 graph

2008-02-22 Thread Harald Korneliussen
Don Dailey wrote:

 It's also interesting how the graph up to level 11 seems to form 2 very
 straight lines, almost as if they were connected at an angle.

 This must be a by-product of how we started the test. We played only
 the first 4 levels as we were testing the system and that is where the
 bend point is. Then I added levels gradually. I cannot figure
 out why this would cause such strange behavior.


As we're closing in on 2 games, that trend seems just as apparent:
It's not so much a curve as two lines, one below mogo4 and one
above... but are we looking at the right axis?

I wonder if it may have something to do with the anchor.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] Bent four in the corner was:Scalability problem of play-out policies

2008-01-23 Thread Harald Korneliussen
Ivan Dubois mentioned  the bent four in the corner shape as a
scalability killer, a situation where more playouts doesn't help
(much), because playouts systematically misevaluate it. As I
understand it, it could be corrected in the tree, but this is very
unlikely to happen until the very end of a game, by which time it may
be too late (Mogo having worked the entire game for that solid 0.5
win, which turns out to be a solid loss instead because of the
life-and-death misevaluation)

I recalled a KGS game of Mogo I'd looked at, where something very
similar happened, and with a little digging I found it again:

http://files.gokgs.com/games/2007/12/1/Ardalan-MoGoBot3.sgf

It turns out it's not the bent four shape, but I suspect it's
another such shape, where more playouts only confirm that these moves
aren't worth including into the tree, so that UCT catches them very
late, if ever.

If these situations can be reliably created by a human, then indeed
they put an upper limit on the real world scalability of a program.

If I should propose a hackish heuristic to deal with such situations,
this is it: At one point, when the problematic shape appeared, the
human must have done a move that to the computer seemed horribly bad.
Why did he do that? Doesn't he see that my shape is alive?. When
such situations occur, there are two possibilities:

1. The bot is playing a weaker human player, and the move is indeed bad.
2. The bot is playing a stronger human, and the move is actually good.

I think it may be a good idea to do something with the weighting in
these situations, so that the relevant moves are added to the tree. In
worst case, a lot of effort is wasted in proving a bad move bad - but
this should not be so serious, as the bad move will likely mean the
opponent has poor chances of winning anyway. In the best case, the
program's blunder is revealed after the fact. This may still leave
little chance of winning, (if the ld error was severe) but at least
the program's counting won't be off for the rest of the game. Since
today's programs don't care for winning margins, counting errors by
even a single point will spell disaster.

I believe this heurtistic would be cheap in terms of computational
cost, but hard to evaluate/tune. Self-play would not be very
effective...
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] The intelligence can of worms reopened

2008-01-16 Thread Harald Korneliussen
In the thread On average how many board updates/sec can top Go
programs do these days? mingwu said of the way MC/UCT programs work
that he'd hardly call it intelligent.
I've thought (and argued elsewhere) that the MC/UCT approach is
fundamentally more intelligent, in the sense of working more like
human intelligence apperas to do, than traditional game tree search.
How did humans learn to play Go? Well, they tried things, figured out
what worked and what didn't. And that is what a UCT seach does as
well... the main differences is that it's much worse at
generalization, and (for that reason) throws away most of what it's
learned from game to game. Still, it plays well with vastly less a
priori assumptions/knowledge than a traditional minimax-type search.
There is some, of course, but humans also rely on some a priori
knowledge according to the philosophers :-)

To illustrate it a bit, I know of a game in which UCT's advantages are
even more striking. The game designer Nicholas Bentley recently won an
award for a game he calls Mind Ninja (silly name, I know, but bear
with me: it's a very interesting game). It's a generalized
connection/shape-building game, which begins with a negotiation phase,
where one player proposes the pattern to be built, and the other
decides whether he will be the builder or the blocker.
There are extremely many possible patterns. On the regular board,
there are 127 points, which can be empty or coloured - the number of
colours are also part of the pattern rule. By a pattern rule, each
possible position must be either a win for the builder or not (the
blocker wins if there are no more legal moves and the builder hasn't
won). Each partitioning of positions into wins for the builder and
undecided positions/wins for the blocker can constitute a pattern
rule.

Now, I believe MC/UCT can play this game, and play it well. (I'm
working on an implementation). It will adapt to pattern rules it has
never seen before, just like a human. It doesn't need any hints in
the form of evaluation functions, like chess programs use - in fact,
the nature of the game makes applying that approach pretty much
impossible. Like a human, it can consider a pattern rule, and try to
figure out which side has the advantage. Humans can certainly choose
rules which would give them an advantage (I imagine nim-like pattern
rules could work), but with trial and error, so could the program -
and the game has a way of balancing that advantage, too.

Would a MN-playing program be intelligent? I'm pretty sure they would
have said so ten years ago, at least :-) But regardless, I think it
could illustrate the potential of this new algorithm.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] Re: language efficiency

2007-12-18 Thread Harald Korneliussen
Some thinking out loud here on the topic of languages and efficiency:

I'd like to know how well MoGo would have played if you let it think
for a week for every move. Only it seems to me that is not possible,
because I don't think MoGo will run for a week without crashing.
Crazystone also crashes quite a lot, if I understand the comments in
KGS logs correctly.

You got to wonder. One thing is move quality reduced by traditional
bugs, but when the program periodically is forced to restart and throw
away its calculations because of a crash, it seems to me there's a lot
to gain from using a more reliable programming language.

Also, do we really know all we need to know about algorithmic
approaches? If yes, then it's about implementing them as efficiently
as possible, to maximise the number of playouts per second... C or C++
is probably a good choice. But if not, it probably matters a lot more
about how flexible it is, how easy it is to try out different
approaches. If it runs ten times as slow as the C version, why not
just let it think for ten times as long? That is affordable when
experimenting, right?

But I have to admit, I don't know exactly how I'd go about
implementing a transposition table in Haskell :-/ Perhaps I'll try for
a libEGO binding, then at least if there are stability problems, I can
blame someone else ;-)
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Lisp time

2007-12-14 Thread Harald Korneliussen
Don Dailey wrote:

By the way,   I am no fan of C.   I don't like C and have tried some of
the languages on your list of languages that are supposedly faster than
C.

I think you must be getting your information from the web pages for
those languages.   As a general rule any reasonably fast language is
going to claim to be faster than C but shame on you if you believe it.

In theory all languages are equal.   They can all be transformed to
optimized machine code.   I am not talking about the theoretical,  I'm
talking about the reality.And the reality is that right now C is the
choice, whether I like it or not I accept it and hope something better
will come along.

You are right, but there are some cases where alternatives may make sense.
For instance, look at the paper Ian Osgood linked to: They used Haskell to
spit out highly optimized code for Monte Carlo-simulations. There is also
FFTW, which makes very fast FFT code with the help of OCaml.
Code generation is one area where using another language might be
considered. It doesn't have to be a functional one, or even C: I know of a
cryptographic algorithm (Serpent) that has an implementation based on
Perl-generated Ada code.

Prototyping may also be an idea. Especially if one is experimenting
with novel approaches, wouldn't it make sense to make a prototype
in Haskell, for instance? Then you may reimplement it in C, very carefully,
comparing outputs regularly to check for the playing strength-killing bugs
Chess programmers always talk about...

Occasionally there may be some language feature that makes up for the
performance cost. Like the distributed nature of Erlang, or the
software transactional memory libraries that GHC Haskell come with.
If you believe the hype, STM scales much better than locking-based
shared memory.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] low-hanging fruit - yose

2007-12-13 Thread Harald Korneliussen
Mark Boon wrote:
Let me therefore change the discussion a bit to see if this will make
things more clear. Consider a chess-playing program with an
unorthodox search method. When playing a human after while it
announces check-mate in thirty-four moves. Yet the human can clearly
see it's check-mate in four. Ah, one could say, but the computer is
100% confident winning either way so it doesn't care which one it
chooses. It doesn't matter whether a human thinks mate in four is
more beautiful.

Now it so happens that with chess we're pretty confident that when a
chess-program announces check-mate that it is in fact correct. But
what if there could be a sliver of doubt? Maybe the program has no
doubt, but the programmer might. Bugs happen. Wouldn't you say it's
better to choose the shorter mate sequence? Regardless of whether
humans may find it more beautiful?

Any probabilistic algorithm should of course prefer a quick win to a
tedious one, since there is less assumptions it has to make, the less
likely it is that one of them is disastrously wrong. But wouldn't they
actually take that into account in their win estimates?

Anyway, turns to win is a completely different measure than score at end.

But I found something that may interest you. Three researchers at a
Paris university, Julien Kloetzer Hiroyuki Iida and Bruno Bouzy, implemented
a UCT program for playing Amazons. They found that for that game,
performance increased when the algorithm took into account winning
margin in addition to win/loss.

I wonder what that means. Perhaps greedy
strategies work better in Amazons than in Go?

The program lost big in the olympiad, though, against a more traditional
program, so it might be premature to draw conclusions...

Here's the research paper:
www.math-info.univ-paris5.fr/~bouzy/publications/KIB-MCAmazons-CGW07.pdf
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] How does MC do with ladders?

2007-12-12 Thread Harald Korneliussen
Raymond Wold wrote:

I can code an algorithm that evaluates simple ladders correctly.

I'll repeat that. I can code a program that reads ladders better than a
pure MC program without knowledge of ladders. I can beat it. Human
knowledge programmed into a computer that does that one thing, that
basic go skill, better than the MC program.

Are you saying that there is absolutely no way to combine such with an
MC program to make it better? Not just that no one has done it (I don't
know if anyone has) but that it is impossible? Are you saying that
attempts to do so are wasted? If you are, I'd appreciate it if you did
so clearly.

Complaining that MC programs don't read ladders well is a bit like
complaining that Forrest Gump can't tie his shoelaces, it seems to me.
There will be many things an MC program will be good at that we won't
be, and vice versa. It's not unreasonable to believe that progress
with MC/UCT programs will be through making most of its strengths,
rather than try to patch its weaknesses in a way not compatible with
how their brains work.
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] How does MC do with ladders?

2007-12-12 Thread Harald Korneliussen
Wed, 12 Dec 2007 07:14:48 -0800 (PST) terry mcintyre wrote:

Heading back to the central idea, of tuning the predicted winning
rates and evaluations: it might be useful to examine lost games, look
for divergence between expectations and reality, repair the predictor,
and test the new predictor against a large database of such blunders.

Sounds a little like Temporal Difference Learning to me. I understand
both MoGo and Crazystone use patterns, do anyone know whether they use
such machine learning techniques to assign weights to them?
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/