I think the idea is that SerializableCoder should be updated to expect that all values it encodes do implement equals() since this seems to be the much more common case then classes that don't implement a useful equals. It would be possible to add a useful check to DirectRunner that any value that says its consistent with equals actually obeys its contract.
On Mon, Nov 27, 2017 at 4:03 PM, Eugene Kirpichov <[email protected]> wrote: > Not sure where you see the contradiction? consistentWithEquals says > "Whenever the encoded bytes of two values are equal, then the original > values are equal according to {@code Objects.equals()}." - which is clearly > false for Serializable's in general: it's possible that serialized form of > "a" and "b" is the same bytes, but !a.equals(b), e.g. if this class does > not implement equals() or if it uses reference equality. > > On Mon, Nov 27, 2017 at 3:55 PM Mairbek Khadikov <[email protected]> > wrote: > >> Hi all, >> >> Currently SerializableCoder#consistentWithEquals returns false, which >> contradicts it's own documentation "{@link SerializableCoder} does not >> guarantee a deterministic encoding, as Java serialization may produce >> different binary encodings for two equivalent objects". >> >> In practice, it leads to failures in org.apache.beam.sdk.util. >> MutationDetectors$CodedValueMutationDetector. For example, some >> libraries cache hash values https://github.com/google/protobuf/pull/3939/ >> files#diff-24c442a23a43e041e4f50d324638fedbR142 >> >> I'm thinking about changing SerializableCoder#consistentWithEquals to >> return true, so that the structural value would be the object itself >> instead of it the serialized bytes. Thoughts? >> >> WIP branch https://github.com/mairbek/beam/tree/ser >> >> Sincerely, >> Mairbek >> >> -- >> >> Mairbek Khadikov | Software Engineer | [email protected] >> >>
