Hi Aljoschaļ¼
The java.lang.Array hashCode depends on the reference instead of the
content. If the keyBy field contains an array, Two records are hash-partitioned
to different stream although their keys are equal.
int a1[] = new int[]{1, 2}; // hashcode is : 5592464
int a2[] = new int[]{1, 2}; // hashcode is 1830712962
streaming job example:
Tuple2<byte[], Integer>[] soures = new Tuple2[]{new Tuple2("a".getBytes(), 2),
new Tuple2("a".getBytes(), 5)};
final StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
env.fromElements(soures)
.keyBy(0)
.sum(1)
.map(new MapFunction<Tuple2<byte[], Integer>, Tuple2<String, Integer>>() {
@Override
public Tuple2<String, Integer> map(Tuple2<byte[], Integer> value) throws
Exception {
return new Tuple2<>(new String(value.f0), value.f1);
}
}).print();
env.execute();
Expected result is: (a, 7), not the actual result. What do you think
about this case?
Best Regards!
Xu Pingyong