Hi,

I'm kind of shocked, but I think I found a fix.

I looked into it a little, and it seems like the random number generator
behaves differently on s390x and x86.

Eventually the test calls dask.utils.random_state_data(1, 42).

On x86 it starts with this (the same on two different machines):

In [9]: random_state_data(1, 0)
Out[9]: 
[array([2357136044, 2546248239, 3071714933, 3626093760, 2588848963,
        3684848379, 2340255427, 3638918503, 1819583497, 2678185683,
        2774094101, 1650906866, 1879422756, 1277901399, 3830135878,
         243580376, 4138900056, 1171049868, 1646868794, 2051556033,
        3400433126, 3488238119, 2271586391, 2061486254, 2439732824,
        1686997841, 3975407269, 3590930969,  305097549, 1449105480,
         374217481, 2783877012,   86837363, 1581585360, 3576074995,


and on s390x it starts with:

In [16]: random_state_data(1, 42)
Out[16]: 
[array([1725751647, 3007179467, 1543725811,  244708654, 1794401211,
        1205115335, 3165536665,  338480024, 1725362215, 2031624818,
        3527536423, 3606353689, 1251073550, 3394605429, 1471725021,
        1961717077, 1672274585, 1743491620, 2537440437, 2191564966,
        2500216069,  889024526,   16993528, 1474942136, 3958708949,
        2650621168,  635132726, 2164863744, 3205794862, 3146973694,
         345633582, 2688816030, 3419529805,  961385884,  359683718,


Digging further, there's a place where they convert some random bytes
into unsigned 32-bit integers using numpy.frombuffer.
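
To illustrate why that conversion is sensitive to the host, here is a
minimal sketch (not taken from dask; the four bytes are hand-picked for
the example) that reads the same buffer with an explicit little-endian
and an explicit big-endian dtype. A plain np.uint32 dtype uses the
host's native order, so it behaves like the first on x86_64 and like
the second on s390x.

```python
import numpy as np

# Four raw bytes, standing in for the start of a random byte string.
raw = bytes([0x66, 0xDC, 0xE1, 0x5F])

# Same bytes, two explicit byte orders.
little = np.frombuffer(raw, dtype="<u4")[0]  # least significant byte first
big = np.frombuffer(raw, dtype=">u4")[0]     # most significant byte first

print(int(little))  # 1608637542
print(int(big))     # 1725751647, the first value in the s390x output above
```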

I added some code to force little-endian byte order on the buffer, and
after that random_state_data returns the same list of values on s390x
as on x86_64. So hopefully the tests will then generate the same random
values on both architectures.
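
As a standalone sketch of the idea (random_state_data_le is a
hypothetical name for this example, not dask's function, and it only
handles an integer seed), the conversion with a fixed byte order could
look like this. It assumes that "<u4" is an acceptable spelling of a
little-endian uint32 dtype in place of newbyteorder("<").

```python
import numpy as np


def random_state_data_le(n, seed):
    """Return n arrays of 624 uint32 values, decoded as little-endian.

    Hypothetical stand-in for illustration; only integer seeds are handled.
    """
    random_state = np.random.RandomState(seed)
    # n * 624 32-bit integers, as raw bytes.
    random_data = random_state.bytes(624 * n * 4)
    # Fix the byte order explicitly instead of relying on the host's.
    dt = np.dtype("<u4")
    return list(np.frombuffer(random_data, dtype=dt).reshape((n, -1)))


states = random_state_data_le(1, 42)
print(len(states), states[0].shape)
print(states[0][:3])
```

Because the dtype no longer depends on the machine, the printed values
should match between architectures for the same seed.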

This is roughly the likely change:

===================================================================
--- dask-2023.8.0+dfsg.orig/dask/utils.py
+++ dask-2023.8.0+dfsg/dask/utils.py
@@ -426,7 +426,9 @@ def random_state_data(n: int, random_sta
         random_state = np.random.RandomState(random_state)
 
     random_data = random_state.bytes(624 * n * 4)  # `n * 624` 32-bit integers
-    l = list(np.frombuffer(random_data, dtype=np.uint32).reshape((n, -1)))
+    dt = np.dtype(np.uint32)
+    dt = dt.newbyteorder("<")
+    l = list(np.frombuffer(random_data, dtype=dt).reshape((n, -1)))
     assert len(l) == n
     return l
