[
https://issues.apache.org/jira/browse/BEAM-5866?focusedWorklogId=159980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-159980
]
ASF GitHub Bot logged work on BEAM-5866:
----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Oct/18 13:08
Start Date: 29/Oct/18 13:08
Worklog Time Spent: 10m
Work Description: kanterov commented on issue #6845: [BEAM-5866] Fix
`Row#equals`
URL: https://github.com/apache/beam/pull/6845#issuecomment-433903088
@kennknowles I was thinking about this problem over the weekend. Everything
becomes more simple with `OurBytesTypeWithGoodEquals`. I pushed the
implementation in
https://github.com/kanterov/beam/commit/08f300e800ef056238360dcea33a47f95ddb5d3d.
I think overall it's the best approach.
**API** It preserves the ability to change internal representation without
any surprises or breaking APIs.
**Correctness** It's clear from the beginning what is
`Map<OurBytesTypeWithGoodEquals, T>`, no matter how it's implemented.
`Map<byte[], ?>` is tricky, and trivial implementation is buggy.
**Performance** I did JMH benchmarks, it's similar to `deepEquals`:
```
RowBenchmark.rowWithStorageAndDeepEquals thrpt 10 0.014 ± 0.001
ops/ns
RowBenchmark.rowWithStorageAndByteArrayWrapper thrpt 10 0.015 ± 0.001
ops/ns
RowBenchmark.rowWithGetterAndPojoEquals thrpt 10 0.022 ± 0.001
ops/ns
```
Interesting learning is that the most performant approach would be
generating POJOs based on a schema, this could be done without breaking
proposed API.
**Simplicity** `Row` implementation is more simple because it doesn't need
custom `deepEquals`, or specialized collection implementations.
**Map<BYTES, ?>** It's naturally supported without handling it as a special
case
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 159980)
Time Spent: 5h 40m (was: 5.5h)
> RowCoder doesn't implement structuralValue
> ------------------------------------------
>
> Key: BEAM-5866
> URL: https://issues.apache.org/jira/browse/BEAM-5866
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Gleb Kanterov
> Assignee: Gleb Kanterov
> Priority: Major
> Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> These two properties fail for RowCoder with `BYTES` field, or `Map<BYTES, ?>`
> field.
> {code}
> public static <T> void testConsistentWithEquals(Coder<T> coder, T example) {
> assumeTrue(coder.consistentWithEquals());
> byte[] bytes = encodeBytes(coder, example);
> // even if the coder is non-deterministic, if the encoded bytes match,
> // coder is consistent with equals, decoded values must be equal
> T out0 = decodeBytes(coder, bytes);
> T out1 = decodeBytes(coder, bytes);
> assertEquals("If the encoded bytes match, decoded values must be equal",
> out0, out1);
> assertEquals(
> "If two values are equal, their hash codes must be equal",
> out0.hashCode(),
> out1.hashCode());
> }
> public static <T> void testStructuralValueConsistentWithEquals(Coder<T>
> coder, T example) {
> byte[] bytes = encodeBytes(coder, example);
> // even if coder is non-deterministic, if the encoded bytes match,
> // structural values must be equal
> Object out0 = coder.structuralValue(decodeBytes(coder, bytes));
> Object out1 = coder.structuralValue(decodeBytes(coder, bytes));
> assertEquals("If the encoded bytes match, structural values must be
> equal", out0, out1);
> assertEquals(
> "If two values are equal, their hash codes must be equal",
> out0.hashCode(),
> out1.hashCode());
> }
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)