Timothee Mathieu <timothee.math...@inria.fr> writes:

> Hello,
>
> Sorry I did not believe that the precise values were relevant but here they 
> are. 
> The average cumulative reward (score) of the agent for exactly the same 
> script and using Guix for the environment is 1658.3733235021457 on Arch 
> Laptop and 1820.325441905902 on the Ubuntu one. But I think due to the 
> feedback loop of such simulator (if there is a small difference in the action 
> at time t, it can imply a large difference at the end of the process) this 
> could be due to a small difference in the computations.

Hi Timothee,

this is not really what I had in mind. Reproducibility in Guix simply
means that the outputs written to the GNU store are the same given the
same inputs. That an arbitrary script produces a different value when
run is not something that maps onto that guarantee easily. Hence I would
be interested to see a diff between the store closures of both shells,
starting with differences in the hashes of the output paths and ending
with a diff of their contents. One common issue is a different locale:
some of the outputs will depend on the locale. Especially if Guix was
installed once with the guix install script and once with the distro
package manager, there could be a discrepancy. I don't know how the
library you're using decides which algorithm to use for randomness, so I
cannot say whether differences in the outputs that are caused only by
the locale can matter.
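
A minimal sketch of the first step (comparing hashes of store paths),
assuming the closure of each shell has already been saved as a plain list
of store paths, one per line. How the lists are obtained is up to you
(`guix gc --requisites` on the shell's profile is one possibility). The
function names and the paths in the usage example are hypothetical.

```python
from collections import defaultdict


def index_by_name(paths):
    """Map each item name to the set of store hashes seen for it.

    Assumes paths of the form /gnu/store/<hash>-<name>.
    """
    index = defaultdict(set)
    for line in paths:
        line = line.strip()
        if not line:
            continue
        base = line.rsplit("/", 1)[-1]          # "<hash>-<name>"
        hash_part, _, name = base.partition("-")
        index[name].add(hash_part)
    return index


def diff_closures(paths_a, paths_b):
    """Return {name: (hashes_a, hashes_b)} for every item that differs.

    An empty list on one side means the item is missing from that closure.
    """
    a, b = index_by_name(paths_a), index_by_name(paths_b)
    return {
        name: (sorted(a.get(name, ())), sorted(b.get(name, ())))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Hypothetical closures, shortened hashes for readability.
    arch = ["/gnu/store/aaaa-foo-1.0", "/gnu/store/bbbb-bar-2.0"]
    ubuntu = ["/gnu/store/aaaa-foo-1.0", "/gnu/store/cccc-bar-2.0"]
    for name, (ha, hb) in diff_closures(arch, ubuntu).items():
        print(f"{name}: arch={ha} ubuntu={hb}")
```

An empty result only says the same store items are referenced on both
machines; comparing their contents is still the follow-up step.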

If there is no difference in the store closure of the shell (meaning all
paths referenced by the shell), you're likely not debugging an issue with
Guix, but with something else. I.e. the library you're using could be
using different random implementations based on something that is
available on one distro, but not on the other.
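
If the closures do match, one way to look for that "something else" is
to print a few facts about the runtime on both machines and compare them
by eye. This is only a sketch: the choice of fields is my assumption
(locale settings because they were mentioned above, plus the machine type
as one example of something that lives outside the store), and
`runtime_fingerprint` is a hypothetical helper name.

```python
import locale
import os
import platform
import sys


def runtime_fingerprint():
    """Collect a few runtime facts that Guix does not control."""
    return {
        "python": sys.version.split()[0],
        "machine": platform.machine(),
        # Passing None queries the current locale without changing it.
        "locale": locale.setlocale(locale.LC_ALL, None),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }


if __name__ == "__main__":
    # Run inside the same guix shell on each machine and diff the output.
    for key, value in sorted(runtime_fingerprint().items()):
        print(f"{key}={value}")
```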

Regards
Rutherther
