mosche commented on PR #25187:
URL: https://github.com/apache/beam/pull/25187#issuecomment-1413542619
🤯 wth, below is the output of the following statements. Somehow `rdd.persist()`
corrupts the data 💥: in the persisted rdd, `k1` has the value `[66, -33]` of `k2`
instead of `[3, 4]`.
```Java
dataset.persist(storageLevel);
System.out.println("\nContent of persisted dataset (1st eval)");
dataset.foreach(printValue);
System.out.println("\nContent of persisted dataset (2nd eval)");
dataset.foreach(printValue);
System.out.println("\nContent of rdd (1st eval)");
dataset.rdd().foreach(printValue);
System.out.println("\nContent of rdd (2nd eval)");
dataset.rdd().foreach(printValue);
System.out.println("\nContent of persisted rdd (1st eval)");
dataset.rdd().persist().foreach(printValue);
System.out.println("\nContent of persisted rdd (2nd eval)");
dataset.rdd().persist().foreach(printValue);
```
```
Content of persisted dataset (1st eval)
key=k3 {@1797166500}, value=[0] {@621459392}
key=k5 {@522613115}, value=[2147483647, -2147483648] {@50229760}
key=k1 {@935655811}, value=[3, 4] {@1059339408}
key=k2 {@1698407107}, value=[66, -33] {@1059339408}
Content of persisted dataset (2nd eval)
key=k3 {@491772294}, value=[0] {@1519705721}
key=k1 {@1801297116}, value=[3, 4] {@929237815}
key=k2 {@1278247606}, value=[66, -33] {@929237815}
key=k5 {@1642777670}, value=[2147483647, -2147483648] {@822592932}
Content of rdd (1st eval)
key=k5 {@1706950017}, value=[2147483647, -2147483648] {@935984436}
key=k1 {@830413362}, value=[3, 4] {@1431942185}
key=k3 {@1807383422}, value=[0] {@1220036903}
key=k2 {@1334173112}, value=[66, -33] {@1431942185}
Content of rdd (2nd eval)
key=k1 {@1977617218}, value=[3, 4] {@1966593054}
key=k2 {@131385719}, value=[66, -33] {@1966593054}
key=k3 {@966124643}, value=[0] {@4762028}
key=k5 {@1477114665}, value=[2147483647, -2147483648] {@942837752}
Content of persisted rdd (1st eval)
key=k3 {@2050847337}, value=[0] {@83038769}
key=k5 {@906982325}, value=[2147483647, -2147483648] {@714957170}
key=k1 {@654493203}, value=[66, -33] {@1747607045}
key=k2 {@287079803}, value=[66, -33] {@1747607045}
Content of persisted rdd (2nd eval)
key=k1 {@654493203}, value=[66, -33] {@1747607045}
key=k2 {@287079803}, value=[66, -33] {@1747607045}
key=k5 {@906982325}, value=[2147483647, -2147483648] {@714957170}
key=k3 {@2050847337}, value=[0] {@83038769}
```
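One possible reading of the output above (an assumption, nothing here confirms it): in every evaluation the values of `k1` and `k2` print the same identity hash, which would be consistent with one mutable object being reused for both records. If a cache then keeps references instead of copies, every cached entry ends up showing the last value written. The sketch below is plain Java without Spark, and all names in it (`ReusedBufferDemo`, `cacheWithoutCopy`, `cacheWithCopy`) are hypothetical; it only illustrates that general aliasing effect, not what Spark or Beam actually do internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not Spark or Beam code.
class ReusedBufferDemo {

  // Caches a reference to the same reused array for every input record.
  static List<int[]> cacheWithoutCopy(int[][] inputs) {
    int[] reused = new int[2]; // one mutable buffer shared by all records
    List<int[]> cache = new ArrayList<>();
    for (int[] in : inputs) {
      System.arraycopy(in, 0, reused, 0, reused.length);
      cache.add(reused); // stores a reference, not a snapshot
    }
    return cache;
  }

  // Caches a defensive copy of the reused array for every input record.
  static List<int[]> cacheWithCopy(int[][] inputs) {
    int[] reused = new int[2];
    List<int[]> cache = new ArrayList<>();
    for (int[] in : inputs) {
      System.arraycopy(in, 0, reused, 0, reused.length);
      cache.add(reused.clone()); // snapshot of the current content
    }
    return cache;
  }

  static void print(String title, List<int[]> cache) {
    System.out.println(title);
    for (int[] v : cache) {
      System.out.println(
          "value=" + Arrays.toString(v) + " {@" + System.identityHashCode(v) + "}");
    }
  }

  public static void main(String[] args) {
    int[][] inputs = {{3, 4}, {66, -33}}; // the values of k1 and k2 above
    print("Cached without copy", cacheWithoutCopy(inputs));
    print("Cached with copy", cacheWithCopy(inputs));
  }
}
```

Without the copy, both cached entries print `[66, -33]` with the same identity hash, which matches the symptom in the persisted rdd. With the copy, the entries stay `[3, 4]` and `[66, -33]`.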