[ 
https://issues.apache.org/jira/browse/BEAM-5866?focusedWorklogId=159980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-159980
 ]

ASF GitHub Bot logged work on BEAM-5866:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Oct/18 13:08
            Start Date: 29/Oct/18 13:08
    Worklog Time Spent: 10m 
      Work Description: kanterov commented on issue #6845: [BEAM-5866] Fix 
`Row#equals`
URL: https://github.com/apache/beam/pull/6845#issuecomment-433903088
 
 
   @kennknowles I was thinking about this problem over the weekend. Everything 
becomes more simple with `OurBytesTypeWithGoodEquals`. I pushed the 
implementation in 
https://github.com/kanterov/beam/commit/08f300e800ef056238360dcea33a47f95ddb5d3d.
 I think overall it's the best approach.
   
   **API** It preserves the ability to change internal representation without 
any surprises or breaking APIs.
   
   **Correctness** It's clear from the beginning what is 
`Map<OurBytesTypeWithGoodEquals, T>`, no matter how it's implemented. 
`Map<byte[], ?>` is tricky, and trivial implementation is buggy.
   
   **Performance** I did JMH benchmarks, it's similar to `deepEquals`:
   ```
   RowBenchmark.rowWithStorageAndDeepEquals        thrpt   10  0.014 ± 0.001  
ops/ns
   RowBenchmark.rowWithStorageAndByteArrayWrapper  thrpt   10  0.015 ± 0.001  
ops/ns
   RowBenchmark.rowWithGetterAndPojoEquals         thrpt   10  0.022 ± 0.001  
ops/ns
   ```
   
   Interesting learning is that the most performant approach would be 
generating POJOs based on a schema, this could be done without breaking 
proposed API.
   
   **Simplicity** `Row` implementation is more simple because it doesn't need 
custom `deepEquals`, or specialized collection implementations.
   
   **Map<BYTES, ?>** It's naturally supported without handling it as a special 
case

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 159980)
    Time Spent: 5h 40m  (was: 5.5h)

> RowCoder doesn't implement structuralValue
> ------------------------------------------
>
>                 Key: BEAM-5866
>                 URL: https://issues.apache.org/jira/browse/BEAM-5866
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Gleb Kanterov
>            Assignee: Gleb Kanterov
>            Priority: Major
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> These two properties fail for RowCoder with `BYTES` field, or `Map<BYTES, ?>` 
> field. 
> {code}
>   public static <T> void testConsistentWithEquals(Coder<T> coder, T example) {
>     assumeTrue(coder.consistentWithEquals());
>     byte[] bytes = encodeBytes(coder, example);
>     // even if the coder is non-deterministic, if the encoded bytes match,
>     // coder is consistent with equals, decoded values must be equal
>     T out0 = decodeBytes(coder, bytes);
>     T out1 = decodeBytes(coder, bytes);
>     assertEquals("If the encoded bytes match, decoded values must be equal", 
> out0, out1);
>     assertEquals(
>         "If two values are equal, their hash codes must be equal",
>         out0.hashCode(),
>         out1.hashCode());
>   }
>   public static <T> void testStructuralValueConsistentWithEquals(Coder<T> 
> coder, T example) {
>     byte[] bytes = encodeBytes(coder, example);
>     // even if coder is non-deterministic, if the encoded bytes match,
>     // structural values must be equal
>     Object out0 = coder.structuralValue(decodeBytes(coder, bytes));
>     Object out1 = coder.structuralValue(decodeBytes(coder, bytes));
>     assertEquals("If the encoded bytes match, structural values must be 
> equal", out0, out1);
>     assertEquals(
>         "If two values are equal, their hash codes must be equal",
>         out0.hashCode(),
>         out1.hashCode());
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to