Hi Lukasz

In Rémi's paper about Bradley-Terry models he found a way to give a
comparable gamma score to things that are different, for instance
capturing a stone vs. a given 3x3 pattern.
His model is much more general, but it has fewer patterns (nowhere
near the ~200K patterns of my system). Additionally, my mask-pattern-only
system has a reasonable a priori value:

times_played / (times_played + times_postponed).

But there is an additional issue handled by Rémi's system that is
ignored by the naive one: the fact that some competitions are
much harder than others. If the sum of the competing scores in the
average competition is, say, 100, then winning a competition whose
sum is 40 is obviously worth much less than winning one whose sum
is 300. That is not included in the simple formula. Therefore, a
simple intuitive idea is to add more for the hard competition
(300/100) and less for the easy one (40/100). Because there is
feedback (when you change the scores you also change the sums of
competing scores), because there are ~200K variables to adjust, and
because the a priori values are already reasonable, I prefer to
adjust in small steps; this is an offline process, so speed is
irrelevant.
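
In case it helps, one adjustment step looks roughly like the C sketch
below. It is only an illustration: the names (Competition,
NUM_PATTERNS, STEP) are made up for the example and not taken from my
program.

    #include <stdlib.h>

    #define NUM_PATTERNS 200000   /* roughly the number of patterns  */
    #define STEP         0.25     /* advance 1/4 of the distance     */

    typedef struct {
        int  num_candidates;      /* legal moves competing           */
        int *pattern;             /* pattern id of each candidate    */
        int  winner;              /* index of the candidate played   */
    } Competition;

    void adjust_step(double *gamma, const Competition *comps, int num_comps)
    {
        double *w_played    = calloc(NUM_PATTERNS, sizeof(double));
        double *w_postponed = calloc(NUM_PATTERNS, sizeof(double));
        double  avg_sum     = 0.0;

        /* Average sum of competing gammas, used to normalise the weights. */
        for (int c = 0; c < num_comps; c++)
            for (int i = 0; i < comps[c].num_candidates; i++)
                avg_sum += gamma[comps[c].pattern[i]];
        avg_sum /= num_comps;

        /* Weight each competition by how hard it was: sum / average sum. */
        for (int c = 0; c < num_comps; c++) {
            double sum = 0.0;
            for (int i = 0; i < comps[c].num_candidates; i++)
                sum += gamma[comps[c].pattern[i]];
            double weight = sum / avg_sum;

            for (int i = 0; i < comps[c].num_candidates; i++) {
                if (i == comps[c].winner)
                    w_played[comps[c].pattern[i]]    += weight;
                else
                    w_postponed[comps[c].pattern[i]] += weight;
            }
        }

        /* Compute the new point and advance only a fraction towards it. */
        for (int p = 0; p < NUM_PATTERNS; p++) {
            double total = w_played[p] + w_postponed[p];
            if (total > 0.0) {
                double target = w_played[p] / total;
                gamma[p] += STEP * (target - gamma[p]);
            }
        }
        free(w_played);
        free(w_postponed);
    }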

The hits of each move don't change that much.

At the beginning of the adjustment:

Hits of move 1 = 0.382405 acc = 0.382405
Hits of move 2 = 0.113530 acc = 0.495935
Hits of move 3 = 0.067430 acc = 0.563365
Hits of move 4 = 0.048055 acc = 0.611420
Hits of move 5 = 0.036304 acc = 0.647725
Hits of move 6 = 0.028978 acc = 0.676703
Hits of move 7 = 0.023769 acc = 0.700472
Hits of move 8 = 0.019793 acc = 0.720264
Hits of move 9 = 0.016693 acc = 0.736957
Hits of move 10 = 0.014164 acc = 0.751121


At the end of the adjustment (additional steps don't
improve further; some values may even decrease):

Hits of move 1 = 0.409278 acc = 0.409278
Hits of move 2 = 0.115598 acc = 0.524877
Hits of move 3 = 0.062659 acc = 0.587536
Hits of move 4 = 0.044014 acc = 0.631550
Hits of move 5 = 0.033681 acc = 0.665230
Hits of move 6 = 0.027696 acc = 0.692926
Hits of move 7 = 0.022885 acc = 0.715811
Hits of move 8 = 0.019029 acc = 0.734840
Hits of move 9 = 0.015758 acc = 0.750598
Hits of move 10 = 0.012886 acc = 0.763484

What I call the distribution is this: the gamma value of each legal
move defines a probability distribution over the legal moves.
If we had no information (a uniform distribution) we would expect
that with 20% of the legal moves we hit 20% of the time, etc. So
for 0.1, 0.2 .. 0.9 of the legal moves we hit around 0.1, 0.2 .. 0.9.
Now, what happens when we use our pdf over the legal moves? We can
reach 10%, 20%, 30%, etc. of the probability mass with very few move
candidates, but the counting has to be fractional. Say the first
move has 0.17, the first two together have 0.21, and I want to
measure 0.2: if the first move is a match, count 1; if the second is
a match, count 3/4 = (0.2 - 0.17)/(0.21 - 0.17); if any other move
is a match, count 0. Do I really get 0.1, 0.2, .. 0.9 (of course,
with far fewer moves)?

No. For a reason I don't understand, I get something like:

Distribution fit expected 0.1 found 0.153164
Distribution fit expected 0.2 found 0.298602
Distribution fit expected 0.3 found 0.433074
Distribution fit expected 0.4 found 0.551575
Distribution fit expected 0.5 found 0.651135
Distribution fit expected 0.6 found 0.727419
Distribution fit expected 0.7 found 0.776876
Distribution fit expected 0.8 found 0.804008
Distribution fit expected 0.9 found 0.823292

So my distribution is distorted: when I try to get 30% of
the "guessing chance" I get 43.3%, but when I try to get
90% I get 82.3%. I don't know why.
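
To make the counting concrete, for one position I do something like
the sketch below (names simplified for the example): probs[] holds
the normalised gammas of the legal moves sorted in decreasing order,
and 'played' is the rank of the move that was actually played.
Averaging this credit over all test positions gives the "found"
column above.

    /* Fractional credit of one position at a given threshold (0.1 .. 0.9). */
    double distribution_credit(const double *probs, int num_moves,
                               int played, double threshold)
    {
        double acc = 0.0;
        for (int i = 0; i < num_moves; i++) {
            double next = acc + probs[i];
            if (i == played) {
                if (next <= threshold) return 1.0;  /* fully inside the cut  */
                if (acc  >= threshold) return 0.0;  /* fully outside the cut */
                return (threshold - acc) / (next - acc);  /* e.g. the 3/4 case */
            }
            if (next >= threshold) return 0.0;  /* played move is past the cut */
            acc = next;
        }
        return 0.0;
    }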


Jacques.

Łukasz Lew wrote:

On Sat, Mar 29, 2008 at 3:47 PM, Jacques Basaldúa <[EMAIL PROTECTED]> wrote:
Mark Boon wrote:

> I suppose there's no reason why  it has to be assembler,
> you could just as well generate C-code.

I don't think so. But I don't write as much asm as it may appear.
I look at what the compiler generates when I debug, and I write
by hand only the critical parts, plus automatically generated code.


> I don't see yet how you go from there to finding patterns.

Depending on the mode, it computes either the hash or the mask.
That mask has to be translated into urgency by a table search.

(Btw. urgency is not just:

  times_played / (times_played + times_postponed)

but that value is used as an a priori value for the Bradley-Terry
models that Rémi introduced. My adjustment is not as advanced
as what Rémi describes. I just give a weight to each competition
based on the sum of the competing gamma values, and that gives
me another point:

SUM weights_of_played / (SUM weights_of_played + SUM weights_of_postponed)

Can you tell how you came up with such an equation?
Does it improve much?

Thanks
Lukasz

And I advance a step (like 1/4 of the distance) in that direction
and use the new point as the current value. Iterating this way
improves the wins of the best move but distorts the distribution
(I extract lots of statistics at each step), so I stop rather early.)

The macro for searching in an ordered list is an efficient
implementation of the "canonical method".

<Copy/paste from Numerical Recipes in C ISBN 0-521-43108-5
(c) Cambridge University Press 1988-1992>

void locate(float xx[], unsigned long n, float x, unsigned long *j)
{
  unsigned long ju,jm,jl;
  int ascnd;
  jl=0; //Initialize lower
  ju=n+1; //and upper limits.
  ascnd=(xx[n] >= xx[1]);
  while (ju-jl > 1) { // If we are not yet done,
      jm=(ju+jl) >> 1; //compute a midpoint,
      if (x >= xx[jm] == ascnd)
          jl=jm; // and replace either the lower limit
      else
          ju=jm; // or the upper limit, as appropriate.
  } // Repeat until the test condition is satisfied.
  if (x == xx[1]) *j=1; // Then set the output
  else if(x == xx[n]) *j=n-1;
  else *j=jl;
} // and return.

</Copy/paste>

Improvement 1: Instead of if/else, use conditional moves.

Improvement 2: Since, when the value is not found, the while
loop runs a fixed number of iterations for a fixed table size,
unroll the loop into an explicit sequence of iterations. As
mentioned, an iteration is only 8 instructions.

Improvement 3: When the value is found, break.
(That wouldn't come for free in C; it would
require an additional "if (x == xx[jm])".)


      mov    eax, ebx
      add    eax, ecx
      shr    eax, 1
      and    al,  0fch       ;; Align to a valid pointer
      cmp    edx, [eax]
      jz    &name&out
      cmovc   ecx, eax    ;; This is true if [eax] > value
      cmovnc  ebx, eax    ;; This is true if the previous is false

Here ecx is a pointer to xx[jl], ebx is a pointer to xx[ju],
edx is x and eax is a pointer to xx[jm].
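
In plain C, the unrolled search is roughly the sketch below. It is
only an illustration: it assumes a table whose size is a power of
two, and whether the ?: really becomes a conditional move depends on
the compiler.

    #include <stdint.h>

    #define LOG_SIZE 16    /* example: a 65536-entry table */

    /* Index of the matching entry, or of the largest entry below x
       when there is no exact match (assumes xx[0] <= x). */
    unsigned find_entry(const uint32_t xx[], uint32_t x)
    {
        unsigned pos = 0;
        /* The loop count is fixed, so the iterations can be written
           out explicitly (Improvement 2). */
        for (unsigned step = 1u << (LOG_SIZE - 1); step != 0; step >>= 1) {
            unsigned cand = pos | step;
            if (xx[cand] == x)
                return cand;                  /* Improvement 3: stop when found  */
            pos |= (xx[cand] < x) ? step : 0; /* Improvement 1: conditional move */
        }
        return pos;
    }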


> I assume you allow some of the points to be undefined
> (also called 'don't care points') but I don't see how. And
> if you allow undefined points, why would you need masks
> of smaller manhatten distances?

I don't use don't-care points. I code the mask with 2 bits per point:
void, own, opponent, out_of_board. This produces a bigger database,
because the only way to introduce a "don't care" is to repeat the
pattern. Since my patterns are learned automatically and database
size is not a problem at all, I didn't see the need for don't cares.
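
For example, with a 3x3 neighbourhood (8 points around the candidate
move) the mask fits in 16 bits, something like the sketch below,
where the concrete encoding values are only for the example:

    #include <stdint.h>

    enum { VOID = 0, OWN = 1, OPPONENT = 2, OUT_OF_BOARD = 3 };

    /* points[] holds the 8 neighbours in a fixed order; the result is
       a 16-bit key that is looked up in the urgency table. */
    static inline uint16_t mask_3x3(const int points[8])
    {
        uint16_t mask = 0;
        for (int i = 0; i < 8; i++)
            mask |= (uint16_t)(points[i] & 3) << (2 * i);
        return mask;
    }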

The bigger patterns have the disadvantage that around move 150
about 2/3 of the legal moves have unknown patterns. The smaller
distances improve this. I don't know yet whether that is required,
because patterns are mainly intended to break the ice; when complex
shapes emerge, the moves suggested by the search may be much better
candidates.



Jacques.



_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
