[
https://issues.apache.org/jira/browse/BEAM-5866?focusedWorklogId=159625&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-159625
]
ASF GitHub Bot logged work on BEAM-5866:
----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Oct/18 13:48
Start Date: 27/Oct/18 13:48
Worklog Time Spent: 10m
Work Description: kanterov edited a comment on issue #6845: [BEAM-5866]
Override structuralValue in Row and Map coders
URL: https://github.com/apache/beam/pull/6845#issuecomment-433621507
@reuvenlax @kennknowles Thanks for the review. I tried to fix `Row#equals`
with a custom `deepEquals`, but it works poorly for nested types such as
`List<byte[]>`, `List<List<byte[]>>`, `Map<?, byte[]>`, and `Map<byte[], ?>`,
because it requires either traversing the row schema or using a lot of
reflection due to type erasure.
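To illustrate the underlying problem, here is a minimal sketch using only the
JDK. It is not Beam code, and the class name `ByteArrayEqualityDemo` is made up
for this example. It shows that `Arrays.deepEquals` handles a directly nested
`byte[]`, but stops helping as soon as the `byte[]` sits inside a `List`,
because `List#equals` falls back to identity comparison for arrays.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of Beam.
class ByteArrayEqualityDemo {

  // List#equals calls equals() on each element, and byte[] inherits
  // identity-based equals() from Object.
  static boolean listsOfEqualArraysAreEqual() {
    List<byte[]> a = Collections.singletonList(new byte[] {1, 2, 3});
    List<byte[]> b = Collections.singletonList(new byte[] {1, 2, 3});
    return a.equals(b);
  }

  // Arrays.deepEquals compares primitive arrays nested directly in Object[]
  // by content.
  static boolean deepEqualsOnNestedArrays() {
    Object[] a = {new byte[] {1, 2, 3}};
    Object[] b = {new byte[] {1, 2, 3}};
    return Arrays.deepEquals(a, b);
  }

  // Once the byte[] is wrapped in a List, deepEquals delegates to List#equals
  // and the comparison is identity-based again.
  static boolean deepEqualsOnListsOfArrays() {
    Object[] a = {Collections.singletonList(new byte[] {1, 2, 3})};
    Object[] b = {Collections.singletonList(new byte[] {1, 2, 3})};
    return Arrays.deepEquals(a, b);
  }
}
```

Only `deepEqualsOnNestedArrays()` returns `true`; the two list-based checks
return `false`, which is why a generic fix has to know the element types.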
I took a different approach: I store `byte[]` as `StructuralByteArray`, so
`deepEquals` isn't needed anymore. One drawback is that `Row#getValue` now
returns a different type for `BYTES` fields.
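The idea can be sketched with a small content-comparing wrapper. The class
`BytesWrapper` below is a hypothetical stand-in written for this example; it is
not Beam's `StructuralByteArray` and makes no claim about how that class is
implemented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper that gives byte[] content-based equals and hashCode.
final class BytesWrapper {
  private final byte[] value;

  BytesWrapper(byte[] value) {
    this.value = value;
  }

  byte[] getValue() {
    return value;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof BytesWrapper && Arrays.equals(value, ((BytesWrapper) o).value);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(value);
  }
}

// Hypothetical demo class showing the wrapper inside standard collections.
class BytesWrapperDemo {

  // Lists of wrappers compare by content without any custom deepEquals.
  static boolean listsAreEqual() {
    List<BytesWrapper> a = Collections.singletonList(new BytesWrapper(new byte[] {1, 2}));
    List<BytesWrapper> b = Collections.singletonList(new BytesWrapper(new byte[] {1, 2}));
    return a.equals(b);
  }

  // A wrapper works as a HashMap key because equals and hashCode agree.
  static String lookupByContent() {
    Map<BytesWrapper, String> map = new HashMap<>();
    map.put(new BytesWrapper(new byte[] {1, 2}), "found");
    return map.get(new BytesWrapper(new byte[] {1, 2}));
  }
}
```

With such a wrapper, plain `List#equals` and `Map#equals` are enough, at the
cost of one extra object per `byte[]`.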
Pros are:
- simple and fast `Row#equals` implementation
- lawful `Map<>` and `List<>` returned by `Row#getValue`
Cons:
- breaking `Row#getValue` for `BYTES`
- a little bit of inconsistency with `Row#getBytes`
- storing extra object wrapping `byte[]`
I have a few suggestions I want to discuss:
- use `ByteBuffer` instead of `StructuralByteArray` to reduce the amount of
"custom" types
- store `BYTES` as `byte[]` instead of `StructuralByteArray` in the
POJO-backed implementation of `Row` to reduce extra overhead
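For the `ByteBuffer` suggestion, a short JDK-only sketch of its equality
semantics may help the discussion. The class name `ByteBufferEqualityDemo` is
made up for this example, and nothing here reflects how Beam would actually
use `ByteBuffer`.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of Beam.
class ByteBufferEqualityDemo {

  // ByteBuffer#equals compares the remaining bytes, so freshly wrapped
  // buffers with the same content are equal, also inside a List.
  static boolean wrappedListsAreEqual() {
    List<ByteBuffer> a = Collections.singletonList(ByteBuffer.wrap(new byte[] {1, 2, 3}));
    List<ByteBuffer> b = Collections.singletonList(ByteBuffer.wrap(new byte[] {1, 2, 3}));
    return a.equals(b);
  }

  // Caveat: equality depends on the buffer position. After reading one byte
  // from b, only two bytes remain, so the buffers no longer compare equal.
  static boolean equalAfterAdvancingOne() {
    ByteBuffer a = ByteBuffer.wrap(new byte[] {1, 2, 3});
    ByteBuffer b = ByteBuffer.wrap(new byte[] {1, 2, 3});
    b.get();
    return a.equals(b);
  }
}
```

`wrappedListsAreEqual()` returns `true` and `equalAfterAdvancingOne()` returns
`false`. The second case is a trade-off to keep in mind: a `ByteBuffer` carries
mutable position state that a dedicated wrapper type does not.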
Issue Time Tracking
-------------------
Worklog Id: (was: 159625)
Time Spent: 3h 20m (was: 3h 10m)
> RowCoder doesn't implement structuralValue
> ------------------------------------------
>
> Key: BEAM-5866
> URL: https://issues.apache.org/jira/browse/BEAM-5866
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Gleb Kanterov
> Assignee: Gleb Kanterov
> Priority: Major
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> These two properties fail for RowCoder when the schema contains a `BYTES`
> field or a `Map<BYTES, ?>` field.
> {code}
> public static <T> void testConsistentWithEquals(Coder<T> coder, T example) {
>   assumeTrue(coder.consistentWithEquals());
>   byte[] bytes = encodeBytes(coder, example);
>   // even if the coder is non-deterministic, if the encoded bytes match and
>   // the coder is consistent with equals, decoded values must be equal
>   T out0 = decodeBytes(coder, bytes);
>   T out1 = decodeBytes(coder, bytes);
>   assertEquals("If the encoded bytes match, decoded values must be equal", out0, out1);
>   assertEquals(
>       "If two values are equal, their hash codes must be equal",
>       out0.hashCode(),
>       out1.hashCode());
> }
>
> public static <T> void testStructuralValueConsistentWithEquals(Coder<T> coder, T example) {
>   byte[] bytes = encodeBytes(coder, example);
>   // even if the coder is non-deterministic, if the encoded bytes match,
>   // structural values must be equal
>   Object out0 = coder.structuralValue(decodeBytes(coder, bytes));
>   Object out1 = coder.structuralValue(decodeBytes(coder, bytes));
>   assertEquals("If the encoded bytes match, structural values must be equal", out0, out1);
>   assertEquals(
>       "If two values are equal, their hash codes must be equal",
>       out0.hashCode(),
>       out1.hashCode());
> }
> {code}