Re: Fwd: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-14 Thread Unknown
On Wed, 2007-02-14 at 10:45 +0100, Erik van der Werf wrote:

  From: Nic Schraudolph [EMAIL PROTECTED]
  Date: 12 February 2007 10:45:50 GMT+11:00
  To: computer-go computer-go@computer-go.org
  Subject: Re: [computer-go] Zobrist hashing with easy transformation
  comparison
 
  did you read Antti Huima's paper? The idea looks similar, but
  unfortunately it does not work. I provided a proof of the defect
  on this list (end of 2002 if I remember well). It's not that easy
  to get a working scheme. In fact there is only one combination
  with 8 chunks of data. In 2002 I exchanged a lot of email with
  Nici Schraudolph, and we found the right scheme. We wanted to
  write a paper, but we did not (it takes time, and I did not have that
  much time - mathematics and computer go are just a hobby for me).
 
  The bad news: a colleague here has proven that no standard
  segmented Zobrist hash can work in less than 8 chunks - and that's
  without color symmetry. That makes 16 chunks with color, not very
  attractive!
 
  The good news: I've developed a defect-free scheme including color
  symmetry that works in only 6 chunks, and has a very efficient way
  to compute a canonical hash (that is: without having to compute all
  8/16 candidates). No contradiction to the above proof as it's not a
  standard Zobrist hash. This is provably the minimal scheme.
 
  Do sit on me - I really need to finish writing this up! In the
  meantime, as long as you don't need color, the 8-chunk scheme Erik
  posted works fine. Nick's (if I understood it correctly - I took
  r_x, r_y, r_xy to mean reflection about vertical, horizontal, and
  diagonal axis, respectively) has a problem though: for all
  positions p, r_x(r_xy(r_x(r_xy(p)))) == p. Huima's scheme can't
  work because it's really a 4-chunk scheme doubled up for color
  symmetry.
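The collapse is easy to check numerically: the three swaps Nick describes
(swap the two DWORDs, swap the WORDs inside each DWORD, swap the bytes
inside each WORD) are all involutions and they all commute, so every
product of them has order at most 2, while a reflection about the vertical
axis followed by a reflection about a diagonal is a 90-degree rotation of
the board, which has order 4. A throwaway check (my own sketch, not Nick's
code):

#include <assert.h>
#include <stdint.h>

/* the three hash-level swaps from Nick's post; which board reflection
   each one stands for does not matter for the argument */
static uint64_t r_x(uint64_t h)   /* swap the two 32-bit halves       */
{ return (h << 32) | (h >> 32); }

static uint64_t r_y(uint64_t h)   /* swap 16-bit words within halves  */
{ return ((h & 0x0000ffff0000ffffULL) << 16)
       | ((h & 0xffff0000ffff0000ULL) >> 16); }

static uint64_t r_xy(uint64_t h)  /* swap bytes within 16-bit words   */
{ return ((h & 0x00ff00ff00ff00ffULL) << 8)
       | ((h & 0xff00ff00ff00ff00ULL) >> 8); }

int main(void)
{
    uint64_t p = 0x1234567890ABCDEFULL;       /* Nick's example value */

    assert(r_x(p) == 0x90ABCDEF12345678ULL);  /* matches his example  */
    assert(r_y(r_y(p)) == p);                 /* each swap is an involution */
    /* This holds for every p, although the corresponding board
       transformation is a 180-degree rotation, not the identity: */
    assert(r_x(r_xy(r_x(r_xy(p)))) == p);
    return 0;
}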

Me too:

/*
 * The 32-bit hash is treated as 8 four-bit chunks: chunk j consists of
 * the bits congruent to j modulo 8 (mask 0x01010101 << j).  Each macro
 * permutes the chunks; the permutation is listed above each macro.
 */
/* { 1,0,3,2,5,4,7,6} */
#define T_X(h) \
( ( ((h) & 0x55555555) << 1 ) \
| ( ((h) & 0xaaaaaaaa) >> 1 ) \
)
/* { 2,3,0,1,6,7,4,5} */
#define T_Y(h) \
( ( ((h) & 0x33333333) << 2 ) \
| ( ((h) & 0xcccccccc) >> 2 ) \
)
/* { 3,1,2,0,7,5,6,4} */
#define T_P(h) \
( ( ((h) & 0x66666666) ) \
| ( ((h) & 0x11111111) << 3 ) \
| ( ((h) & 0x88888888) >> 3 ) \
)
/* { 0,2,1,3,4,6,5,7} */
#define T_Q(h) \
( ( ((h) & 0x99999999) ) \
| ( ((h) & 0x22222222) << 1 ) \
| ( ((h) & 0x44444444) >> 1 ) \
)

#include <stdio.h>
#include <stdlib.h>


int main(void)
{
    unsigned long val0, val1, val2, val3;
    unsigned aa, bb, cc;
    unsigned at, bt;
    const char *legend[4] = { "(X)", "(Y)", "(P)", "(Q)" };

    val0 = 0x12345678;

    at = bt = 4;
    for (aa = 0; aa < 4; aa++) {
        switch (aa) {
        case 0: val1 = T_X(val0); break;
        case 1: val1 = T_Y(val0); break;
        case 2: val1 = T_P(val0); break;
        case 3: val1 = T_Q(val0); break;
        }
        for (bb = 0; bb < 4; bb++) {
            if (bb == aa) continue;
            if (bb < bt) at = 4;        /* bb wrapped around to a new block */
            switch (bb) {
            case 0: val2 = T_X(val1); break;
            case 1: val2 = T_Y(val1); break;
            case 2: val2 = T_P(val1); break;
            case 3: val2 = T_Q(val1); break;
            }
            for (cc = 0; cc < 4; cc++) {
                if (cc == bb) continue;
                switch (cc) {
                case 0: val3 = T_X(val2); break;
                case 1: val3 = T_Y(val2); break;
                case 2: val3 = T_P(val2); break;
                case 3: val3 = T_Q(val2); break;
                }
                if (aa != at) fprintf(stdout, "%8lx -%s- %8lx",
                                      val0, legend[aa], val1);
                else          fprintf(stdout, "         -%s-         ",
                                      legend[aa]);

                if (bb != bt) fprintf(stdout, " -%s- %8lx",
                                      legend[bb], val2);
                else          fprintf(stdout, " -%s-         ",
                                      legend[bb]);
                fprintf(stdout, " -%s- %8lx\n", legend[cc], val3);
                at = aa; bt = bb;
            }
        }
    }

    exit(0);
}

/*
 * Legend: T_X() and T_Y() transform the hash value when flipping the X
 * and Y coordinates; T_P() and T_Q() when flipping over the diagonals.
 * The rest is trivial.
 */
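If you want a quick consistency check for the macros, each one must be an
involution, and X followed by Y must give the same value as P followed by
Q, since both compositions are the 180-degree rotation. Something like
this could be added to the same file (assuming <assert.h>):

#include <assert.h>

static void check_macros(unsigned long h)
{
    h &= 0xffffffff;                     /* the macros only use the low 32 bits */
    assert(T_X(T_X(h)) == h);            /* every flip is an involution */
    assert(T_Y(T_Y(h)) == h);
    assert(T_P(T_P(h)) == h);
    assert(T_Q(T_Q(h)) == h);
    assert(T_Y(T_X(h)) == T_Q(T_P(h)));  /* both are the 180-degree rotation */
}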

HTH,
AvK




Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-13 Thread Jacques Basaldúa

Hi Erik


And what distinguishes database look up from things like transposition
table look up? Why wouldn't one use database look up during tree
search?


The interest is in rotation/mirror symmetry. In database search, what is
good for a position is good for its mirror/rotation. In tree search,
rotation of the full board does not happen, and if it does (in a very
simple position) it's pointless to detect, because it is not the same
position.


 4. The analyses based on uniform distributions are
 fundamentally flawed because move sequences in go are not
 random; in particular, the BWBWBW sequence can only be
 avoided by pass moves, which do not occur in real games
 (globally) before the end of the game.



So?


Of course. The most important superko that really happens in strong
games is triple ko. Detecting triple ko with probability = 1 and a 32-bit
hash is not a trivial question. And it requires the B keys to have
some difference from the W keys. Note that to be _sure_ that a move is
legal, you only have to check six 32-bit keys. That would be impossible
if the hashes were purely random. You simply cannot control every
combination of the 19x19x2 = 722 hashes up to length 6, but you can
control the BWBWBW ones even up to length 7. Since the test is faster
than any other and, unlike the others, has error probability 0, it is
simply the best. Of course, it does not consider board repetitions with
longer cycles. I have searched for (the possibility of) such cycles in
my 50K-game database and did not find any. I would love to see a game
where both players are at least shodan in which a superko other than
triple ko limits the legal options. If someone has such a game, please
post it.
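To be concrete, the legality test I have in mind is nothing more than
comparing the would-be hash of the position against the last few entries
of the history; roughly (an illustrative sketch, the names are mine):

#include <stdint.h>

/* With keys built so that no xor of up to 7 alternating B/W keys is zero,
   a 32-bit match within the last 6 positions is a certain repetition
   (triple ko or a shorter cycle), not a collision.  hist[] holds the
   hashes of the previous positions, newest last; nhist is their count. */
static int repeats_recently(uint32_t cand, const uint32_t *hist, int nhist)
{
    int i, back = nhist < 6 ? nhist : 6;

    for (i = 1; i <= back; i++)
        if (hist[nhist - i] == cand)
            return 1;
    return 0;
}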



I'm speculating here because I seem to be missing some context, but
AFAICS almost any Go-specific application will have more legal moves /
hash keys than bits for the hash.


Of course. That's because full global search is intractable. But local
search with 16 empty cells (which can each be either B or W, and therefore
you only have 32 keys) is not. That's what I mean when I say "when it
makes sense". Again, I do use 64-bit hashes.



And my favorite:

if the hash calculation is a major factor in the performance of 
your program, you should seriously consider adding some knowledge.


We have seen this when talking about programming languages. Each time
someone cares about details and builds programs that are optimized
from the day of their birth, they are told that:

 a. The programs are dirty and hard to maintain. (IMO only patched
programs are hard to maintain, and they are patched because important
issues were thought about too late.)

 b. They are caring about stupid issues == ignoring Big Algorithmic
Improvements. (IMO if you don't really care about the details you
don't really control the big picture. And more importantly, you only
have to do it once, if you do it well. That will give you a lot of
time for thinking about algorithms.)



Jacques.




Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-12 Thread Jacques Basaldúa

Erik van der Werf wrote:

And then we get another small question with a
dangerous answer...


1. What makes a question big or small is not a personal
preference, but how many millions of times it happens
during a game.

1a. Database lookup. Happens very few times (Unfortunately,
I must say, because I like the knowledge integration
part and I think it is one of my strong points.)

 1b. Local search + legal move identification (see superko).
This happens billions of times per hour of go computing.

Therefore, finding optimal solutions for 1a is a small question
and for 1b it is not a small question at all. Of course, if your
program is hyper-knowledge driven or does not search, this may
be different.


2. Using an unnecessarily big hash key slows down the program.
On a 32-bit platform, to use a 64-bit key you have to choose
between bad (2x32) and worse (FPU 64-bit integers). Due to this
slowdown, you will fail to complete searches you could have
done with a smaller hash. Therefore, the program may suffer
more from code obesity than from collisions. Don't take this
as if I didn't use 64-bit keys. I do. But only when it is a
good idea.


3. In the cases where a 32-bit key is enough (a local search
with 32 stones, or detecting superko), it works with error = 0
and is therefore better than any hash size: 0 is smaller
than 2^-64.
4. The analyses based on uniform distributions are
fundamentally flawed because move sequences in go are not
random; in particular, the BWBWBW sequence can only be
avoided by pass moves, which do not occur in real games
(globally) before the end of the game.

And finally a question: Why do you admit that having two
identical hashes is a flaw (which is obvious), but do not
admit that having 32 hashes that do not form a basis is one
as well?

When you have 31 independent hashes, the probability
that a new key completes a basis is 1/2. If it does, you can
use this set with error probability = 0; if it doesn't,
you will get collisions because the new key is not
independent, which is the same as having two identical keys.
If the key is dependent, reject it and get another random
key. That does not change any other statistical properties.

The set could still have been a random one: many sets pass
as they are, others get some correction.
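In code, the rejection test is just an incremental xor basis (Gaussian
elimination over GF(2)); a sketch, with names of my own:

#include <stdint.h>
#include <stdlib.h>

/* basis[b] is either 0 or a previously accepted key whose highest set
   bit is b.  try_add_key() accepts a candidate only if it is linearly
   independent (under xor) of all keys accepted so far. */
static uint32_t basis[32];

static int try_add_key(uint32_t key)
{
    uint32_t r = key;
    int b;

    for (b = 31; b >= 0; b--) {
        if (!(r & (1u << b)))
            continue;
        if (basis[b] == 0) {
            basis[b] = r;          /* independent: keep it */
            return 1;
        }
        r ^= basis[b];             /* reduce and continue */
    }
    return 0;                      /* reduced to 0: dependent, reject */
}

/* Draw random keys until one is accepted (crude RNG, just for the idea).
   Of course, at most 32 independent 32-bit keys exist. */
static uint32_t new_zobrist_key(void)
{
    uint32_t k;

    do {
        k = ((uint32_t)rand() << 16) ^ (uint32_t)rand();
    } while (!try_add_key(k));
    return k;
}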

Jacques.



Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-12 Thread Erik van der Werf

On 2/12/07, Jacques Basaldúa [EMAIL PROTECTED] wrote:

Erik van der Werf wrote:

 And then we get another small question with a
 dangerous answer...

1. What makes a question big or small is not a personal
preference, but how many millions of times it happens
during a game.


Can't take a joke? IMO informed hash key generation is interesting
from an academic viewpoint, and a fun exercise. However, in my
experience anything to do with this subject tends to lead only to
relatively small overall improvements (unless you have a crappy
implementation to start with), and when it is not used with great care
and a good understanding of the consequences of the domain-specific
optimizations, it easily weakens the performance. To give an
example, if you optimize your keys for global tree search (using,
e.g., the BCH construction), I would not be surprised if they perform
sub-par on things like opening book construction or shape hashing.



 1a. Database lookup. Happens very few times (Unfortunately,
 I must say, because I like the knowledge integration
 part and I think it is one of my strong points.)


And what distinguishes database look up from things like transposition
table look up? Why wouldn't one use database look up during tree
search?




 1b. Local search + legal move identification (see superko).
 This happens billions of times per hour of go computing.


Forget about using hashes for superko/legal move detection; it's a
non-issue. For this you should only use hashes as a last resort.



 Therefore, finding optimal solutions for 1a is a small question
and for 1b it is not a small question at all. Of course, if your
program is hyper-knowledge driven or does not search, this may
be different.

2. Using an unnecessarily big hash key slows down the program.


Hardly. IMO, if the hash calculation is a major factor in the
performance of your program, you should seriously consider adding some
knowledge.

In Migos I had 96-bit hashes and the update time was never an
important factor in the performance. Moreover, I could recompute all
relevant symmetrical hashes from scratch whenever needed without
severely hurting the performance.


On a 32-bit platform, to use a 64-bit key you have to choose
between bad (2x32) and worse (FPU 64-bit integers). Due to this
slowdown, you will fail to complete searches you could have
done with a smaller hash. Therefore, the program may suffer
more from code obesity than from collisions. Don't take this
as if I didn't use 64-bit keys. I do. But only when it is a
good idea.


And what fraction of your searches is affected by your new
superior-strength hash keys?


snip


0 is smaller than 2^-64.


Thanks for sharing that :-)



4. The analyses based on uniform distributions are
fundamentally flawed because move sequences in go are not
random; in particular, the BWBWBW sequence can only be
avoided by pass moves, which do not occur in real games
(globally) before the end of the game.


So?



And finally a question: Why do you admit that having two
identical hashes is a flaw (which is obvious), but do not
admit that having 32 hashes that do not form a basis is one as well?


I have no idea what you're talking about here.



When you have 31 independent hashes, the probability
that a new key completes a basis is 1/2. If it does, you can
use this set with error probability = 0; if it doesn't,
you will get collisions because the new key is not
independent, which is the same as having two identical keys.
If the key is dependent, reject it and get another random
key. That does not change any other statistical properties.
The set could still have been a random one: many sets pass
as they are, others get some correction.


I'm speculating here because I seem to be missing some context, but
AFAICS almost any Go-specific application will have more legal moves /
hash keys than bits for the hash. Consequently for all practical
purposes you will never be able to set up a complete independent set
of hash keys, and in the rare case where you could do it (e.g., on a
small board) you might just as well store the complete state as a
hash.

E.


Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-11 Thread Antoine de Maricourt



Please do.
I will put it on a web page. But I need some time. My job keeps me very 
busy right now.

But I'm not sure I
will post the statistical analysis (it was almost ten handwritten pages,
and I'm not sure I still have them).

Have you performed an empirical test for collisions?
No, the analysis was analytic. I've used the scheme in different ways,
and since I knew where the defect was I put extra code to protect against
it. This proved to be useful... I was able to catch collisions at a low
rate in practice, but this rate would have been unacceptable if I had not
been able to detect them.


The defect is as follows: if you have 2 different board configurations,
the probability that they have the same hash key can be as high as 1/256
(for a 64-bit key) if the difference between the 2 configurations has
self-symmetries. Antti Huima's scheme had the same defect, except that the
probability was 1. That's why I was able to isolate it: I always had
collisions between the same positions, and it didn't depend on the way
the random bits were generated.


Antoine


Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-11 Thread Antoine de Maricourt



On 2/10/07, Łukasz Lew [EMAIL PROTECTED] wrote:

On 2/10/07, Antoine de Maricourt [EMAIL PROTECTED] wrote:
 If there is strong interest, I can post the scheme.
Please do.


Since Antoine claims there is only one solution I might as well post
mine ;-)


mirroring: [abcdefgh] -> [hgfedcba]
rotation: [abcdefgh] -> [cdefghab]

This scheme follows trivially from dividing the square in 8 triangular
regions, and assigning each a letter. If you want to include color
symmetry you need to change the operators (xor doesn't work any more)
or increase the number of segments.

Erik


This is one of the 3 possibilities that were left once we eliminated the
obvious defects (the ones the original proposal by Antti Huima suffered
from). However, if my analysis was right, this scheme is the one that
introduces the biggest weakness in the key. That's why I didn't keep it.


Antoine.



Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-10 Thread Erik van der Werf

On 2/10/07, Łukasz Lew [EMAIL PROTECTED] wrote:

On 2/10/07, Antoine de Maricourt [EMAIL PROTECTED] wrote:
 If there is strong interest, I can post the scheme.
Please do.


Since Antoine claims there is only one solution I might as well post mine ;-)

mirroring: [abcdefgh] -> [hgfedcba]
rotation: [abcdefgh] -> [cdefghab]

This scheme follows trivially from dividing the square in 8 triangular
regions, and assigning each a letter. If you want to include color
symmetry you need to change the operators (xor doesn't work any more)
or increase the number of segments.
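With the eight segments stored as the eight bytes of a 64-bit key
(segment a in the most significant byte), the two operations are just a
byte reversal and a rotation by two byte positions; roughly (untested):

#include <stdint.h>

/* [abcdefgh] -> [hgfedcba]: reverse the eight byte-sized segments */
static uint64_t mirror_key(uint64_t h)
{
    h = ((h & 0x00ff00ff00ff00ffULL) << 8)  | ((h & 0xff00ff00ff00ff00ULL) >> 8);
    h = ((h & 0x0000ffff0000ffffULL) << 16) | ((h & 0xffff0000ffff0000ULL) >> 16);
    return (h << 32) | (h >> 32);
}

/* [abcdefgh] -> [cdefghab]: rotate the segments left by two positions */
static uint64_t rotate_key(uint64_t h)
{
    return (h << 16) | (h >> 48);
}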

Erik

Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-10 Thread Jacques Basaldúa

As Antoine de Maricourt says, this weakens the key.
I think it is a serious problem and it is a dangerous answer to a small
question. I compute hashes and patterns for database lookup eight
at a time, which is faster (much more optimizable) than one
after the other. Then I also use the smallest as the canonical hash
and simply discard the others. Since database lookup is so fast,
that's not a problem at all. Other situations where hashes are needed
do not require verifying rotations, so the advantage of Nick's idea
is very small (IMO).

But the question is: Does someone do the opposite, i.e. play
with the hash values to make them *stronger*? Even with 32-bit
hashes, triple ko can be detected with probability = 1 for any
alternating BWBW or WBWB sequence of length less than 8.
(Triple ko is a sequence of length 6.) I don't know if that is published.
In fact, unless you have a computer that can store a 2^64-bit
table, strong collision analysis can only be done for 32-bit hashes.
(With my hardware I can manage a 4-gigabit table decently;
checking that all 4x4x2(colors) squares in a 19x19 array form a
basis, and replacing the invalid hashes as needed, takes about 15
hours on a P4D at 3.4 GHz.) Of course, once you have a strong
32-bit hash set you can fill the higher 32 bits with random bits
to have a 64-bit set.
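For completeness, once the eight transformed hashes have been computed
side by side, the canonical key is nothing more than the smallest of
them (a trivial sketch):

#include <stdint.h>

/* h[0..7] are the hashes of the eight rotations/reflections of the same
   pattern, computed together; the canonical key is simply the smallest. */
static uint32_t canonical_hash(const uint32_t h[8])
{
    uint32_t best = h[0];
    int i;

    for (i = 1; i < 8; i++)
        if (h[i] < best)
            best = h[i];
    return best;
}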

Jacques.



Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-10 Thread Erik van der Werf

On 2/10/07, Jacques Basaldúa [EMAIL PROTECTED] wrote:
snip

But the question is: Does someone do the opposite, i.e. play
with the hash values to make them *stronger*?


And then we get another small question with a dangerous answer...

Just search the archive for BCH construction.

E.


Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-09 Thread Łukasz Lew

On 2/10/07, Antoine de Maricourt [EMAIL PROTECTED] wrote:

Hi,

did you read Antti Huima's paper? The idea looks similar, but
unfortunately it does not work. I provided a proof of the defect on this
list (end of 2002 if I remember well). It's not that easy to get a
working scheme. In fact there is only one combination with 8 chunks of
data. In 2002 I exchanged a lot of email with Nici Schraudolph, and we
found the right scheme. We wanted to write a paper, but we did not (it
takes time, and I did not have that much time - mathematics and computer
go are just a hobby for me).

After having the right scheme, the tricky part is to perform a
statistical analysis: unfortunately, introducing constraints to deal with
symmetries weakens the hash key. The probability of collision becomes
non-uniform and depends on the board configuration. In short: if you take
two different random board configurations, the probability that
they have the same key becomes higher if one of the configurations has
self-symmetries.

If there is strong interest, I can post the scheme.

Please do.


But I'm not sure I
will post the statistical analysis (it was almost ten handwritten pages,
and I'm not sure I still have them).

Have you performed an empirical test for collisions?

Best,
Łukasz Lew



Antoine.


Re: [computer-go] Zobrist hashing with easy transformation comparison

2007-02-09 Thread Nick Apperson

Sorry for posting, then; I didn't realize that it had been tried.  I may
work through the problem and try to get it to work in order to fully
understand why it in fact does not work.  If by some miracle I manage to get
something working with a collision rate of 1/(2^61) I'll certainly post it.
Thanks for the info.

- Nick

On 2/9/07, Erik van der Werf [EMAIL PROTECTED] wrote:


Nick,

The basic idea of what you're describing is well known. It was first
published by Antti Huima several years ago. Unfortunately though, his
implementation was flawed. I didn't check your code but likely it
suffers from a similar defect. It is possible to fix the defects in
Huima's scheme. If you ignore colour symmetry it's relatively easy to
find a defect-free xor-based scheme that works with 8 segments. The
more interesting part is in finding the minimal number of segments,
and then proving correctness.

Nic Schraudolph has done some interesting work on this, but as far as
I know he still hasn't published it. Maybe if we all sit on him we can
finally get him to finish it :-)

Erik


On 2/9/07, Nick Apperson [EMAIL PROTECTED] wrote:
 Hey all,

 I was thinking about our conversation and I decided to design a zobrist
 class that allows for easy comparison to check and see if 2 different
 zobrist values are equivalent after a rotation etc.  It updates the
 zobrist in such a way that it can transform them and, after trying 8
 possibilities, it can determine if they are equal.  There are actually
 quite a few optimizations that could be performed on it, but I just
 wanted to post the basic idea.  See the following source code for the
 basic idea:

  http://www.nicholasapperson.com/computergo/zobrist.h

  Essentially it works by hashing the played stone as each of the
 rotated forms and storing that in each of the 8 bytes used.  The bytes
 are arranged in an order such that flipping the 2 DWORDs results in the
 zobrist value for a reflection across the x-axis, flipping WORDs results
 in a reflection across the y-axis, and swapping bytes results in a
 reflection across the y=x line.  Order obviously matters in the case of
 xy, so that transformation is assumed to take place after the other
 reflections.  For example:

  if we have 0x1234567890ABCDEF as a zobrist value,

  0x90ABCDEF12345678 would be the zobrist of the position with a
reflection
 across the x-axis


  Anyway, I'm sure there is a bug or two, but I wanted to get all of your
 thoughts.

  - Nick