Don Dailey wrote:
I was looking at many of the posts on the threads about how things
scale with humans and computers and I'm trying to reconcile many of
the various opinions and intuitions.  I think there were many
legitimate points brought up that I appeared to be brushing off.

In computations done by computer, there can usually be a trade-off
between time and memory.  In the discussions, we rarely talked about
memory and how it figures into the picture.  A lot was said about
just knowing something (where a strong player looks at a position and
instantly knows a weaker player made a mistake, for instance), and the
feeling expressed by many was that this was a barrier that could not
be penetrated by thinking about the position, no matter how much time
was allowed.

Although I consider the evidence for the rating curve pretty strong in
both humans and computers, the model of a fixed strength increase per
doubling is a simplification; in the real world it is more complicated
than that.

It's important to realize that the Elo formula is based on
assumptions about human playing strength that are only
approximations.  One of those assumptions is that playing strength is
transitive and can be expressed as a single value, a number that we
call a person's "rating."
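
For concreteness, that single-number assumption is what lets the Elo
model predict results at all.  A minimal sketch of the standard
expected-score formula (the usual logistic form with the conventional
400-point scale; constants vary across implementations):

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A vs. B under the Elo model, which assumes
    each player's strength is a single scalar rating."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# A 200-point edge corresponds to roughly a 76% expected score.
print(round(elo_expected_score(1800, 1600), 2))  # 0.76
```

Note that nothing in this formula can represent style or knowledge
gaps; a single scalar per player forces transitivity by construction.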

Nevertheless, intransitivity is a real thing.  The way we sometimes
erroneously think about Go is that you have some fixed strength,
expressed as a kyu or dan number, and that every move is a
reflection of this level of play.

A better model, which is still a simplification, is that a move is
either right or wrong, and the stronger you are, the more likely you
are to choose the better move.  Some moves are easy to find, and even
the weaker players find them, but on average you are faced with moves
of every level of difficulty, and the difference between stronger and
weaker players is how many of these positions they solve.  It's like a
big test with a mixture of easy and hard problems, where whoever gets
the most answers right wins.
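
That "big test" model is easy to simulate.  The sketch below is a toy
illustration, not a real Go model: every number and the solve-rate
rule are made up, but it shows how a higher per-position solve
probability turns into a higher total score over a mix of easy and
hard positions.

```python
import random

def solve_count(skill, difficulties, rng):
    # Chance of finding the right move falls as difficulty rises
    # relative to skill (an arbitrary toy rule for illustration).
    return sum(1 for d in difficulties if rng.random() < skill / (skill + d))

rng = random.Random(42)
# 200 positions spanning easy (0.1) to hard (5.0) difficulty.
difficulties = [rng.uniform(0.1, 5.0) for _ in range(200)]
strong = solve_count(3.0, difficulties, rng)
weak = solve_count(1.0, difficulties, rng)
```

With these made-up parameters the stronger player solves many more of
the 200 positions; both players get the easy ones, and the gap comes
almost entirely from the hard ones.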

From a purely theoretical point of view, a move really
is either best or not-best but as humans we judge moves on a sliding
scale of "goodness" and refer to some moves as being horrible and
others as being brilliant, good, second best, etc.  On this group we
recently discussed how to define error vs blunder and so on.

The intuition behind judging moves like this is that some moves really
do give you better practical chances in the real world.  So if you are
slightly losing, a move may be referred to as "a good try" because it
complicates things, or at least requires the opponent to find a
refutation that in human terms is difficult to find.

Sometimes a good player, or even a computer, can instantly find the
right move where a weaker player has no clue and is not likely to
discover the correct principle even given several hours of
meditation.  This has been mentioned a number of times recently.
This is an example of a chunk of knowledge having a profound effect on
the quality of a single move.  Even with computers it is possible
that a good life-and-death routine can discover things (more or less)
instantly that might take a very long time to find with a global
brute-force search.

Because knowledge can be imperfectly and unevenly applied, one player
might play some types of positions much better than others.  So even
among players of roughly equal abilities, one player may see at a
glance what another player would have a very difficult time discerning.

What this causes, in my opinion, is intransitivity.  It doesn't cause
a player to stop improving substantially with time, as many
experiments have shown.  But it's a known phenomenon that, because of
intransitivity and these knowledge gaps, you might improve much more
against a particular kind of opponent (opponents just like yourself,
for instance) and much less against other kinds of opponents.

But this is also about memory scalability.  Better players have more
knowledge about the game.  It's very difficult to measure knowledge
quantitatively in humans. How do you have twice as much knowledge in
Go?  How do you test this?  But it's clear that stronger players have
much more knowledge, probably much of it in the form of trained
intuition about go positions in the form of pattern recognition.
Some knowledge is expressed as cute little proverbs of wisdom such as
"the opponent's vital point is my vital point" among others.

Because no two players play alike (and this is especially true of
computers versus humans), bits of knowledge and processing power have
different scaling characteristics.  Even a particular piece of
knowledge could help you more against one opponent than another.

So let me restate my feelings based on the above considerations:

1. Game playing skill is a function of time.

2. Memory (or knowledge) can proxy for time - saving enormous amounts
of time in many cases.

3. "Technique" is a function of knowledge and how it's organized -
which translates to a big time savings indirectly.  This is really
the ability to apply knowledge.

4. Because these various aspects of game-playing ability can be mixed
and matched, you are sure to get very interesting intransitivities.
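
Point 4 can be made concrete with a deliberately contrived example.
Suppose three hypothetical players have skills that vary across
position types (the labels and numbers below are invented), and a
matchup is won by whoever is stronger in a majority of those types:

```python
# Invented skill profiles across three position types.
skills = {
    "A": {"fighting": 9, "endgame": 1, "opening": 5},
    "B": {"fighting": 5, "endgame": 9, "opening": 1},
    "C": {"fighting": 1, "endgame": 5, "opening": 9},
}

def beats(p, q):
    # p beats q if p is stronger in a majority of position types.
    wins = sum(skills[p][t] > skills[q][t] for t in skills[p])
    return wins > len(skills[p]) / 2

print(beats("A", "B"), beats("B", "C"), beats("C", "A"))  # True True True
```

A beats B, B beats C, and C beats A: a rock-paper-scissors cycle that
no single scalar rating can express.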

(snip)


I agree this is quite reasonable and very well said.  The intransitive
nature of ratings for players in complex games is evident. Much of what
you are describing accurately reflects my understanding and explains in
another way what I was trying to express earlier.

Let me take this reasoning a little further.  Rating systems are
one-dimensional, whereas real strength is very much multi-dimensional.
Ratings are not perfect, but they do a pretty good job, especially for
widely disparate ratings, which rarely exhibit intransitivity.  Here
"widely disparate" could be defined as whatever rating difference is
needed to achieve, let's say, a 99% win rate for the higher-rated
player over the lower-rated one.
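
Under the standard Elo model, that definition can be made numeric by
inverting the expected-score formula.  A small sketch (assuming the
usual 400-point logistic scale):

```python
import math

def elo_gap_for_score(expected: float) -> float:
    # Invert E = 1 / (1 + 10^(-d/400))  =>  d = 400 * log10(E / (1 - E)).
    return 400.0 * math.log10(expected / (1.0 - expected))

# "Widely disparate" at a 99% win rate is roughly an 800-point gap.
print(round(elo_gap_for_score(0.99)))  # 798
```

So on an Elo-style scale, the 99% threshold lands near 800 points;
whether kyu/dan gaps translate the same way is a separate question.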

#1 - I completely agree with this; it is only the curve that is
different and in question for each type of player and individual player.

#2 & #3 - These are exactly what I was referring to, and I think they
apply in Go to a significantly greater extent than in many other
games.  This is primarily because Go has no relatively simple and
reliable evaluation method, in conjunction with its huge branching
factor and deep play.  Other games mentioned, such as Chess, Checkers,
Arimaa, and Othello, have simpler evaluations that work.

#4 - I completely agree.

Now, with regard to #2 and #3: if these can substitute for enormous
amounts of time, and they usually must be learned over whole games and
real experience (this is what I understand by "enormous amounts of
time"), doesn't this support my position?  When you say "enormous
amounts of time", are you talking about a few doublings of thinking
time over one move, or more like weeks or months of practice,
experience, study, etc.?

The scalability experiment you are doing is very interesting.  Though,
as I mentioned a couple of times, I was not characterizing any
particular computer-go engines.
  I wrote:
  > In computer-go where there are so many wildly different techniques
  > being used, some scalable to some degree or another and some not, it
  > doesn't make sense to make generalizations.  Whether a specific
  > program's scalability results in any improvements (linear or otherwise)
  > with time-doubling depends entirely on the algorithms and techniques
  > in use.

Of course we all know many computer-go programs don't scale extremely
well.  MC Go (and its variants) scales fairly well, but probably needs
a lot more knowledge, and probably other methods built in, to become
good, as MoGo is showing.  But this doesn't really say anything about
human play.  In fact, since Go did not succumb to standard game-tree
search techniques, most computer-go programs used fairly simplistic
models (pattern matching combined with some reading), roughly
attempting to emulate certain aspects of human play.

So I thought of another way to express some of what I was thinking.
Humans play kind of like GNU Go, only lots better: they can think
beyond what they've learned, and they learn as they play.  GNU Go
can't get dramatically better by thinking longer, only modestly
better.  Instead, GNU Go must learn (i.e., be programmed with
knowledge and techniques, per #2 and #3) to get that much better (no
offense to the GNU Go team).  To some extent humans are similar, and
may require #2 and #3 to improve their playing skills dramatically and
thus challenge a player with a "widely disparate" rating.

So I still question the premise about doubling thinking time for
humans.  But this is just a hypothesis too; I don't know the answer.
As I said before, I simply wanted to introduce some reasonable doubt,
which any good scientist should always have...

However, I don't think this is a pointless discussion.  Something
might be gained by understanding whether Go is by its nature
qualitatively different from other perfect-information games, because
then we could use that understanding to help focus program design on
these challenges.

Matt
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/