> It's a new area and the systems are very complicated. What kind of
> theory are you after, and what would you like it to tell you?

Currently, what seems to happen is (no offense intended, and please
correct me if my incomplete view is wildly off!-):

- have an idea for a great improvement (one thinks..)
- implement & test
- evaluate result statistics
- be surprised by what happens (and happy or sad with the results)
- commit to or abandon the improvement
- try something else

What I'd like to see more of is what seems to work well in science
everywhere:

(a) Understanding emerging from that loop. In particular, one should
    be able to see not just whether the change was successful, but why,
    and where to look for the next improvement.

(b) Understanding feeding into that loop. In particular, one should be
    able to predict not just that the change is clever (and hence should
    be useful by rights;-), but how it will interact with what is there and
    why that interaction should result in an improvement.

Just the usual model vs reality/theory vs practice thing: having either is
nice, but having both and having them interact is what really gets the
ball rolling. Experiments result in hypotheses, hypotheses result in
experiments, after a while a coherent model emerges that guides
experimentation and is developed or changed to reflect experiments.

>> Currently, it seems as if even many of the tricks that seem to work
>> wonderfully well don't really have a solid basis, so any tournament
>> could throw up a game that puts the whole thing into doubt, by
>> showing a hole in the test coverage big enough to drive a truck
>> through (once pros and others know what to look for). Or not.
>
> Is testing based on a large number of games not solid enough? I don't
> see any alternative with such complicated systems.

The mere experiment-driven approach is likely to accumulate features
until their interaction makes further progress too slow or too fragile to
be practical (just as a merely theory-driven approach is likely to
accumulate pretty concepts that deviate further and further from
useful reality). Combining theory and practice helps to overcome
the limitations of both.

Having theoretic models of what is going on should make it easier
to design a consistent engine (hypotheses are more malleable than
fragile features accumulated over expensive experiments) or to guide
new developments (tests tell you "yes, it works" or "no, try again"; a
theory might tell you "given these known results, you could try looking
for something else about here").

Abstraction also helps to manage complexity: to take the standard
example for Monte-Carlo, a circle may be described by a large
number of experiments or by a simple formula (and once you suspect
that you're looking at a circle, you have a framework in which to
interpret the experiments; just as a few more directed experiments
will tell you whether trying to interpret the results as a square makes
any sense;-). Without a theory, you can add an arbitrary number of
tests, hoping for something useful to emerge, but how can you tell
whether your tests are even in a useful area?
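To make that standard example concrete, here is a minimal sketch (the
function name and parameters are mine, purely for illustration): the
"experimental" description of the circle as a large number of random
samples, next to the one-line formula that explains them.

```python
import math
import random

def estimate_circle_area(radius, n_samples, seed=0):
    """Monte-Carlo estimate: sample the bounding square uniformly and
    count how many points land inside the circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        hits += (x * x + y * y <= radius * radius)
    # area of square * fraction of hits
    return (2 * radius) ** 2 * hits / n_samples

# Thousands of experiments approximate what the formula states exactly:
print(estimate_circle_area(1.0, 100_000))  # close to math.pi
print(math.pi * 1.0 ** 2)
```

Once you suspect a circle, pi*r^2 both summarizes all past samples and
predicts all future ones; the samples, in turn, can confirm or refute
the suspicion.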

> Another point:
>
> We deliberately restrict the complexity of the generative model (the
> playout function) by keeping it simple, and show that it works on a
> large, representative number of positions. Because the generative
> model is so simple, we can expect the performance we see in private
> testing to be realized in real games.
>
> We need not live in constant fear, at least to the degree that I think
> you are implying.

Of course I was exaggerating!-) But even in the few weeks I've been
looking into this topic so far, I've seen bots lose by playing against
superko in tournaments. And in simple random playouts, after eliminating
ko candidates, I see about 20 superkos in every 10k 9x9 runs, ranging
from under 10 to over 30. I've heard about bots crashing occasionally
until their authors accounted for unexpectedly large game lengths.
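For illustration, here is a minimal sketch of positional-superko
detection, with suicide and other legality checks omitted and all names
hypothetical: keep the set of all whole-board positions seen so far, and
reject any move that would recreate one of them.

```python
def neighbors(p, size):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < size and 0 <= y + dy < size:
            yield (x + dx, y + dy)

def group_and_libs(board, p, size):
    """Flood-fill the group containing p; return (stones, liberties)."""
    color, group, libs, stack = board[p], {p}, set(), [p]
    while stack:
        for n in neighbors(stack.pop(), size):
            if n not in board:
                libs.add(n)
            elif board[n] == color and n not in group:
                group.add(n)
                stack.append(n)
    return group, libs

def play(board, p, color, size):
    """Position after `color` plays at `p`, with captures removed
    (suicide and other legality checks omitted for brevity)."""
    board = dict(board)
    board[p] = color
    enemy = 'W' if color == 'B' else 'B'
    for n in neighbors(p, size):
        if board.get(n) == enemy:
            group, libs = group_and_libs(board, n, size)
            if not libs:
                for q in group:
                    del board[q]
    return board

def violates_superko(history, board, p, color, size):
    """Positional superko: a move may not recreate an earlier position."""
    return frozenset(play(board, p, color, size).items()) in history

# A basic ko in a 5x5 corner: Black captures at (2,1), and White's
# immediate recapture at (1,1) would recreate the starting position.
board0 = {(1, 0): 'B', (0, 1): 'B', (1, 2): 'B',
          (1, 1): 'W', (2, 0): 'W', (2, 2): 'W', (3, 1): 'W'}
history = {frozenset(board0.items())}
board1 = play(board0, (2, 1), 'B', 5)
history.add(frozenset(board1.items()))
print(violates_superko(history, board1, (1, 1), 'W', 5))  # True
```

Real engines keep a Zobrist hash per position rather than whole boards,
but the check is the same membership test; skipping it is exactly how a
bot loses by playing into superko.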

Then there is the bias in light playouts, apparently already well known
here, but shrugged off because it seems to work well enough after
AMAF and especially tree search have been added. Or perhaps one
needs to move to heavy playouts? Or just add more computing power?
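For readers who haven't met it: AMAF (all-moves-as-first) credits every
move of a color played anywhere in a playout as if it had been played
first, a cheap but biased statistic. A minimal sketch of the
bookkeeping, assuming playouts are recorded as (color, point) pairs
(names are mine, not from any particular engine):

```python
from collections import defaultdict

def amaf_update(stats, playout_moves, winner):
    """AMAF: every (color, point) occurring in the playout is credited
    as if played first; repeats within a playout count only once."""
    for color, point in dict.fromkeys(playout_moves):  # first occurrences, in order
        wins, visits = stats[(color, point)]
        stats[(color, point)] = (wins + (color == winner), visits + 1)

stats = defaultdict(lambda: (0, 0))  # (color, point) -> (wins, visits)
playout = [('B', (2, 2)), ('W', (3, 3)), ('B', (4, 4)), ('B', (2, 2))]
amaf_update(stats, playout, 'B')
print(stats[('B', (2, 2))])  # (1, 1): one win, one visit despite two occurrences
print(stats[('W', (3, 3))])  # (0, 1)
```

The bias is visible in the update itself: a move gets credit whether it
was played early or late, in a relevant fight or not.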

There are necessary moves that playouts with certain eye definitions
will never even consider (so they cannot emerge as best moves either).
Again, that seems to be known, but adding tree search takes care of
that, doesn't it? Well, yes, starting with a tree that considers every
move, then using playouts only to evaluate those moves, will move
the playout blindness one move away, but it is still there, at the tree
horizon. What is the effect of that?
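As one concrete instance of such an eye definition (a deliberately naive
rule, used here only for illustration): never fill a point whose
on-board orthogonal neighbors are all friendly stones. A false eye
passes this test too, so a connecting move there can never be generated
by the playout, no matter how necessary it is.

```python
def neighbors(p, size):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < size and 0 <= y + dy < size:
            yield (x + dx, y + dy)

def is_eyelike(board, p, color, size):
    """Naive eye rule: treat `p` as an eye for `color` (and so never
    fill it) if every on-board orthogonal neighbor is friendly."""
    return all(board.get(n) == color for n in neighbors(p, size))

# (2,2) passes the test, but with two enemy stones on the diagonals it
# is a false eye: the connection at (2,2) may be a necessary move, yet
# a playout using this rule will never play it.
board = {(1, 2): 'B', (3, 2): 'B', (2, 1): 'B', (2, 3): 'B',
         (1, 1): 'W', (3, 3): 'W'}
print(is_eyelike(board, (2, 2), 'B', 5))  # True
```

Refined rules also count enemy diagonals, but any fixed rule draws the
same kind of line somewhere; the tree only pushes that line to its
horizon.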

Perhaps it is a double disadvantage that the standard techniques
work so well at the moment, and scale up even better with more
resources to add depth and breadth. Think of ManyFaces: 15
years have passed between its first and its most recent wins against
Monte-Carlo programs, but while the first wins were easy, this
year's wins were not - they involved change and catching up. If
ManyFaces hadn't been so far ahead of Gobble, it might have
incorporated those ideas earlier.

Claus

PS When did ManyFaces first play?



_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/