Hi, I'm kind of shocked, but I think I found a fix.
I looked into it a little, and it seems like the random number generator
behaves differently between s390x and x86.
Eventually the test calls dask.utils.random_state_data(1, 42).
On x86 it starts with this, on two different machines:
In [9]: random_state_data(1, 0)
Out[9]:
[array([2357136044, 2546248239, 3071714933, 3626093760, 2588848963,
3684848379, 2340255427, 3638918503, 1819583497, 2678185683,
2774094101, 1650906866, 1879422756, 1277901399, 3830135878,
243580376, 4138900056, 1171049868, 1646868794, 2051556033,
3400433126, 3488238119, 2271586391, 2061486254, 2439732824,
1686997841, 3975407269, 3590930969, 305097549, 1449105480,
374217481, 2783877012, 86837363, 1581585360, 3576074995,
and on s390x it starts with
In [16]: random_state_data(1, 42)
Out[16]:
[array([1725751647, 3007179467, 1543725811, 244708654, 1794401211,
1205115335, 3165536665, 338480024, 1725362215, 2031624818,
3527536423, 3606353689, 1251073550, 3394605429, 1471725021,
1961717077, 1672274585, 1743491620, 2537440437, 2191564966,
2500216069, 889024526, 16993528, 1474942136, 3958708949,
2650621168, 635132726, 2164863744, 3205794862, 3146973694,
345633582, 2688816030, 3419529805, 961385884, 359683718,
Digging further, there's a place where some random bytes are converted
into 32-bit unsigned integers using numpy.frombuffer.
I added some code to force little-endian byte order on the buffer, and
after that random_state_data returns the same list of values on s390x
as on x86. So hopefully s390x will then generate the same random values
as x86_64.
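As a minimal sketch of the effect (not taken from dask itself): numpy.frombuffer with a plain np.uint32 dtype uses the host's native byte order, so the same four bytes read as different integers on little-endian and big-endian machines. The variable names below are illustrative only; the two integers passed to byteswap are the first two values from the s390x output above.

```python
import numpy as np

# Four raw bytes, least significant byte first.
raw = bytes([0x66, 0xDC, 0xE1, 0x5F])

# Explicit little-endian read: same result on every host.
little = np.frombuffer(raw, dtype="<u4")

# Explicit big-endian read: what a plain np.uint32 read yields on s390x.
big = np.frombuffer(raw, dtype=">u4")

# Byte-swapping the first two s390x values shows what a little-endian
# read of the same underlying bytes would give.
swapped = np.array([1725751647, 3007179467], dtype=np.uint32).byteswap()

print(int(little[0]), int(big[0]), swapped.tolist())
```

The big-endian read reproduces the first s390x value (1725751647), which is consistent with the byte stream itself being identical on both architectures and only its interpretation differing.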
Roughly, the likely change:
===================================================================
--- dask-2023.8.0+dfsg.orig/dask/utils.py
+++ dask-2023.8.0+dfsg/dask/utils.py
@@ -426,7 +426,9 @@ def random_state_data(n: int, random_sta
     random_state = np.random.RandomState(random_state)
     random_data = random_state.bytes(624 * n * 4)  # `n * 624` 32-bit integers
-    l = list(np.frombuffer(random_data, dtype=np.uint32).reshape((n, -1)))
+    dt = np.dtype(np.uint32)
+    dt = dt.newbyteorder("<")
+    l = list(np.frombuffer(random_data, dtype=dt).reshape((n, -1)))
     assert len(l) == n
     return l
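For anyone who wants to try the idea without patching dask, here is a standalone sketch of the patched logic. The function name random_state_data_le is hypothetical, and it skips the handling of an already-constructed RandomState that the real function may do; it only assumes an integer seed.

```python
import numpy as np


def random_state_data_le(n, seed):
    """Hypothetical standalone variant: always decode as little-endian."""
    rs = np.random.RandomState(seed)
    raw = rs.bytes(624 * n * 4)  # n * 624 32-bit integers
    # Force little-endian so the result does not depend on the host.
    dt = np.dtype(np.uint32).newbyteorder("<")
    states = list(np.frombuffer(raw, dtype=dt).reshape((n, -1)))
    assert len(states) == n
    return states


states = random_state_data_le(1, 0)
first = int(states[0][0])
print(first)
```

With seed 0 the first value should agree with the x86 output quoted earlier (2357136044), whichever architecture it runs on.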

