Hi, I'm kind of shocked, but I think I found a fix.
I looked a little bit, and it seems like the random number generator behaves differently between s390x and x86.

Eventually the test calls dask.utils.random_state_data(1, 42).

On x86 it starts with this on two different machines:

    In [9]: random_state_data(1, 0)
    Out[9]:
    [array([2357136044, 2546248239, 3071714933, 3626093760, 2588848963,
            3684848379, 2340255427, 3638918503, 1819583497, 2678185683,
            2774094101, 1650906866, 1879422756, 1277901399, 3830135878,
             243580376, 4138900056, 1171049868, 1646868794, 2051556033,
            3400433126, 3488238119, 2271586391, 2061486254, 2439732824,
            1686997841, 3975407269, 3590930969,  305097549, 1449105480,
             374217481, 2783877012,   86837363, 1581585360, 3576074995,

and on s390x it starts with:

    In [16]: random_state_data(1, 42)
    Out[16]:
    [array([1725751647, 3007179467, 1543725811,  244708654, 1794401211,
            1205115335, 3165536665,  338480024, 1725362215, 2031624818,
            3527536423, 3606353689, 1251073550, 3394605429, 1471725021,
            1961717077, 1672274585, 1743491620, 2537440437, 2191564966,
            2500216069,  889024526,   16993528, 1474942136, 3958708949,
            2650621168,  635132726, 2164863744, 3205794862, 3146973694,
             345633582, 2688816030, 3419529805,  961385884,  359683718,

Digging further, there's a place where they convert some random bytes into unsigned 32-bit integers using numpy.frombuffer. With a plain np.uint32 dtype, frombuffer reads the bytes in the host's native byte order, which is big-endian on s390x and little-endian on x86.

I added some code to force little-endian byte order on the buffer, and after that random_state_data returns the same list of values on s390x. So hopefully the test will then generate the same random values as on x86_64.

Roughly, the likely change is:

===================================================================
--- dask-2023.8.0+dfsg.orig/dask/utils.py
+++ dask-2023.8.0+dfsg/dask/utils.py
@@ -426,7 +426,9 @@ def random_state_data(n: int, random_sta
         random_state = np.random.RandomState(random_state)
     random_data = random_state.bytes(624 * n * 4)  # `n * 624` 32-bit integers
-    l = list(np.frombuffer(random_data, dtype=np.uint32).reshape((n, -1)))
+    dt = np.dtype(np.uint32)
+    dt = dt.newbyteorder("<")
+    l = list(np.frombuffer(random_data, dtype=dt).reshape((n, -1)))
     assert len(l) == n
     return l
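
To show the byte-order effect in isolation, here is a minimal sketch. It is not dask code: the four sample bytes and the variable names are made up for illustration, and it only assumes NumPy is installed. It reads the same buffer once with the native uint32 dtype and once with explicit little- and big-endian dtypes.

```python
import sys

import numpy as np

# Four fixed bytes standing in for a slice of RandomState.bytes() output
# (hypothetical sample data, chosen so the byte order is easy to see).
raw = bytes([0x01, 0x02, 0x03, 0x04])

# Native byte order: the value depends on the machine running this.
native = int(np.frombuffer(raw, dtype=np.uint32)[0])

# Explicit byte orders: the value is the same on every machine.
little = int(np.frombuffer(raw, dtype="<u4")[0])
big = int(np.frombuffer(raw, dtype=">u4")[0])

# The dtype built in the patch above is equivalent to "<u4".
dt = np.dtype(np.uint32).newbyteorder("<")
patched = int(np.frombuffer(raw, dtype=dt)[0])

print(sys.byteorder, hex(native), hex(little), hex(big), hex(patched))
```

On a little-endian host such as x86_64 the native value should match the little-endian one, and on a big-endian host such as s390x it should match the big-endian one, while the value read with the patched dtype should not depend on the host.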