Re: Analysis different on different CPU architectures

Eran Lambooij Tue, 28 Jan 2025 22:35:54 -0800

Hi, thanks for the swift reply,

> Could you explain what a and b you compare ? The final "Error total EMG" 
> values only ? Or more of them, up to the equity of every best move for 
> instance and declare the analysis corrupt if any of them differ too much ?


I compare every probability in the analysis, and if any of them differs too
much I declare the analysis corrupt. This was under the (maybe naive)
assumption that the analysis of a match is stable (which it is under the same
CPU architecture).
I opted for a rather strict equivalence so that I can verify that clients
analyse properly and weed out corrupt clients as fast as possible.
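
As a rough sketch of the check described above (not my actual client code):
the analysis is assumed to be flattened into a plain list of floats, and the
name first_mismatch and its parameters are made up for illustration. Only the
0.020 threshold comes from this thread.

```python
import math
from typing import Optional, Sequence, Tuple


def first_mismatch(
    reference: Sequence[float],
    candidate: Sequence[float],
    tol: float = 0.020,  # per-float threshold mentioned in this thread
) -> Optional[Tuple[int, float, float]]:
    """Return (index, reference value, candidate value) for the first pair
    that differs by more than tol, or None if every pair agrees."""
    if len(reference) != len(candidate):
        raise ValueError("analyses have a different number of values")
    for i, (r, c) in enumerate(zip(reference, candidate)):
        # A NaN never compares as "close", so treat it as a mismatch.
        if math.isnan(r) or math.isnan(c) or abs(r - c) > tol:
            return (i, r, c)
    return None
```

Returning the first offending index rather than a bare True/False makes it
easier to see which value drifts between architectures.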

> The factors you mention below, and some others, will indeed lead to 
> different results. In the case of the final error total I would guess 
> that a discrepancy of more than 0.001 is very common but 0.020 quite 
> rare.

The 0.020 is for any float in the move analysis (for the 'summary' I used a
higher threshold). I can do some tests if needed with different values, but
currently I don't see the value in that.

>> As you can imagine this is surprising behaviour. I would love to hear 
>> your thoughts, and if you need more information I am more than happy 
>> to help with debugging. I have quite some knowledge on HPC, working 
>> with floating point computations as well as SIMD, etc. If you would 
>> ask me I would first look into SIMD, as that is the most likely place 
>> for slight differences between the different implementations. It might 
>> also just be a configuration issue, but I am starting to doubt that.
> 
> Scalar vs. 4-way SIMD like SSE2 or NEON vs. 8-way AVX will cause small 
> differences.
> 
> Another factor is that by default gnubg is built with the -ffast-math 
> option. You could try to build it without it and see how much it helps 
> for accuracy. It would be slower but not by too much.
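
To illustrate why lane width (and any reordering the compiler is allowed to do
under -ffast-math) can change a result: floating-point addition is not
associative, so summing the same numbers in a different order can round
differently. The sketch below is a pure-Python model of that effect, not gnubg
code. The function names are made up, and the input is deliberately extreme so
the difference is visible; real discrepancies are far smaller.

```python
from typing import Sequence


def scalar_sum(values: Sequence[float]) -> float:
    """Add the values one after another, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def lane_sum(values: Sequence[float], lanes: int) -> float:
    """Model an N-way SIMD accumulation: element i goes to lane i % lanes,
    and the per-lane partial sums are added together at the end."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


values = [1e16, 1.0, -1e16, 1.0]
print(scalar_sum(values))   # 1.0: the first 1.0 is lost when added to 1e16
print(lane_sum(values, 2))  # 2.0: the large terms cancel within lane 0 first
```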

I currently use the Ubuntu gnubg package in the Docker image. Is there
any guide (or maybe a Docker image) on how to set up a proper gnubg
environment? If I remember correctly I also need to create the bear-off
databases, and I probably also need to set some extra environment
variables.

> Another source of differences is that for sorting moves the libc qsort() 
> function is used. It may differ slightly from OS to OS and is unstable 
> anyway.
> 
> This may lead to different choices of moves when there are exactly equal 
> equities (in late bearoff or earlier if moves are followed by a 
> double/pass).
> 
> It might be useful to use another, stable, algorithm that may even be 
> faster if it is more suited to the kind of short arrays of moves it 
> would be applied to.

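I would use a sorting network. There are some really fast implementations for
those, and because the sequence of comparisons is fixed, the result is the same
on every platform. A sorting network is not stable by itself, but if you map
moves + equities to integers the keys are unique, so equal equities always end
up in the same order and the comparisons are fast.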
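
A minimal sketch of what I mean, in Python for brevity (gnubg itself is C).
Everything here is an assumption made for illustration: the key layout, the
equity range and scale, the 32-bit move id, and the fixed size of four moves.
The comparator list is the standard 5-comparator network for 4 inputs.

```python
from typing import List, Tuple

# Standard 5-comparator sorting network for exactly 4 inputs.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

EQUITY_OFFSET = 4.0       # assumption: equities lie strictly inside (-4, 4)
EQUITY_SCALE = 1_000_000  # assumption: 6 decimal places are enough
MOVE_BITS = 32            # assumption: a move can be encoded in 32 bits


def sort_key(equity: float, move_id: int) -> int:
    """Pack equity and move into one integer. A smaller key means a better
    move; equal equities are ordered by move_id, so keys are unique as long
    as move ids are."""
    q = round((equity + EQUITY_OFFSET) * EQUITY_SCALE)
    q_max = round(2 * EQUITY_OFFSET * EQUITY_SCALE)
    return ((q_max - q) << MOVE_BITS) | move_id


def sort4(moves: List[Tuple[float, int]]) -> List[Tuple[float, int]]:
    """Sort exactly four (equity, move_id) pairs, best equity first."""
    if len(moves) != 4:
        raise ValueError("this network handles exactly 4 moves")
    items = [(sort_key(eq, mid), (eq, mid)) for eq, mid in moves]
    for i, j in NETWORK_4:
        # Compare-exchange on the integer key only.
        if items[i][0] > items[j][0]:
            items[i], items[j] = items[j], items[i]
    return [move for _, move in items]


moves = [(0.125, 7), (0.5, 3), (0.125, 2), (-0.25, 9)]
print(sort4(moves))  # [(0.5, 3), (0.125, 2), (0.125, 7), (-0.25, 9)]
```

The two moves with equity 0.125 come out in move_id order no matter how the
input was arranged, which is the property qsort() does not guarantee.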

Eran
