Timothee Mathieu <timothee.math...@inria.fr> writes:

> Hello,
>
> Sorry I did not believe that the precise values were relevant but here they
> are.
> The average cumulative reward (score) of the agent for exactly the same
> script and using Guix for the environment is 1658.3733235021457 on Arch
> Laptop and 1820.325441905902 on the Ubuntu one. But I think due to the
> feedback loop of such simulator (if there is a small difference in the action
> at time t, it can imply a large difference at the end of the process) this
> could be due to a small difference in the computations.
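To make the quoted feedback-loop argument concrete, here is a minimal sketch. It does not use the actual simulator: the logistic map stands in for it, and the function name, the parameter r=3.9 and the starting values are all assumptions chosen for illustration. It only shows that a perturbation of 1e-12 in the initial state can grow to a macroscopic difference after enough steps.

```python
def logistic_gaps(x0, eps, steps, r=3.9):
    """Run two copies of the loop x <- r*x*(1-x) and record their gap.

    The logistic map is only a stand-in for the simulator (assumption);
    r=3.9 is a value commonly used for its chaotic regime.
    """
    a, b = x0, x0 + eps          # two runs differing by a tiny amount
    gaps = []
    for _ in range(steps):
        a = r * a * (1.0 - a)    # each state feeds back into the next step
        b = r * b * (1.0 - b)
        gaps.append(abs(a - b))
    return gaps


if __name__ == "__main__":
    gaps = logistic_gaps(x0=0.2, eps=1e-12, steps=200)
    for t in (0, 50, 100, 199):
        print(f"step {t:3d}: gap = {gaps[t]:.3e}")
```

With eps=0.0 both runs stay bit-for-bit identical, so any divergence in such a loop has to come from some difference in the inputs or in the computations themselves.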
Hi Timothee,

this is not really what I had in mind. Reproducibility in Guix simply means that the outputs written to the GNU store are the same given the same inputs. An arbitrary script producing different values when it is run is not something that can be mapped onto that guarantee easily.

Hence I would be interested to see a diff between the store closures of both shells, starting with the differences in the hashes of the output paths and ending with a diff of their contents.

One common issue is a different locale: some of the outputs will depend on the locale. Especially if Guix was installed once with the Guix install script and once with the distro package manager, there could be a discrepancy. I don't know how the library you're using decides which algorithm to use for randomness, so I cannot say whether differences in the outputs that are caused only by the locale can matter.

If there is no difference in the store closure of the shell (meaning all paths referenced by the shell), you're likely not debugging an issue with Guix, but with something else. For example, the library you're using could be choosing a different random implementation based on something that is available on one distro but not on the other.

Regards,
Rutherther
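The first step of the closure comparison suggested above could be sketched roughly as follows. This assumes each machine has dumped the store paths of its shell's closure into a text file, one path per line (for instance from the output of `guix gc --requisites` on the profile; how you obtain the list is an assumption here). The function names are hypothetical, and the sketch only compares path names, not file contents.

```python
def read_closure(lines):
    """Turn an iterable of text lines into a set of store paths."""
    return {line.strip() for line in lines if line.strip()}


def split_item(path):
    """Split '/gnu/store/<hash>-<name>' into (hash, name)."""
    base = path.rstrip("/").split("/")[-1]
    digest, _, name = base.partition("-")
    return digest, name


def compare_closures(closure_a, closure_b):
    """Return (only_in_a, only_in_b, same_name_different_hash)."""
    only_a = closure_a - closure_b
    only_b = closure_b - closure_a
    # Items with the same name on both sides but different hashes
    # are the most interesting ones to inspect further.
    names_a = {split_item(p)[1] for p in only_a}
    names_b = {split_item(p)[1] for p in only_b}
    return sorted(only_a), sorted(only_b), sorted(names_a & names_b)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python compare.py arch.txt ubuntu.txt
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        only_a, only_b, differing = compare_closures(
            read_closure(fa), read_closure(fb)
        )
    print("only in first: ", *only_a, sep="\n  ")
    print("only in second:", *only_b, sep="\n  ")
    print("same name, different hash:", *differing, sep="\n  ")
```

If all three lists come back empty, both shells reference exactly the same store paths, and the next thing to look at is whatever lies outside the store. Otherwise, the items listed as having the same name but a different hash are the ones whose contents are worth diffing.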