[ 
https://issues.apache.org/jira/browse/BEAM-5184?focusedWorklogId=136991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136991
 ]

ASF GitHub Bot logged work on BEAM-5184:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Aug/18 15:30
            Start Date: 22/Aug/18 15:30
    Worklog Time Spent: 10m 
      Work Description: lukecwik edited a comment on issue #6257: [BEAM-5184] 
Multimap side inputs with duplicate keys and values are being lost
URL: https://github.com/apache/beam/pull/6257#issuecomment-415074170
 
 
   This seems to break Dataflow; its side input handling is different.
   
   The Jenkins logs show 
`org.apache.beam.sdk.transforms.ViewTest.testMultimapSideInputWithNonDeterministicKeyCoder`
 failing with:
   
   ```
   Expected: iterable over [<KV{apple, 1}>, <KV{apple, 1}>, <KV{apple, 2}>, 
<KV{banana, 3}>, <KV{blackberry, 3}>] in any order
        but: No item matches: <KV{apple, 1}> in [<KV{apple, 2}>, <KV{apple, 
1}>, <KV{blackberry, 3}>, <KV{banana, 3}>]
   ```
   
   I'll take a look at why this is failing. The error message implies a 
comparison issue, since all the values do exist in the actual output.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 136991)
    Time Spent: 1h 40m  (was: 1.5h)

> Multimap side inputs with duplicate keys and values are being lost
> ------------------------------------------------------------------
>
>                 Key: BEAM-5184
>                 URL: https://issues.apache.org/jira/browse/BEAM-5184
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Luke Cwik
>            Assignee: Vaclav Plajt
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Side inputs with duplicate values are being lost due to the use of a 
> set-based multimap.
> [https://github.com/apache/beam/blob/05fb694f265dda0254d7256e938e508fec9ba098/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionViews.java#L293]
>  
> Originating thread: 
> [https://lists.apache.org/thread.html/48bae7cf71bf6851622cdee0e8bc8619c79c4c2273ed63f288202169@%3Cdev.beam.apache.org%3E]
>  
> Please update the existing tests to exercise this scenario as well: 
> https://github.com/apache/beam/blob/9f23ffc97535e7255245f3852b9d2f0939df5a0a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ViewTest.java#L507
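
As a rough illustration of the behaviour described above, the sketch below contrasts a set-valued multimap with a list-valued one, using the same five key/value pairs that appear in the failing test output. It uses only plain `java.util` collections and is not the code in `PCollectionViews.java`; the class and method names (`MultimapDuplicates`, `sizeWithSetValues`, `sizeWithListValues`) are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MultimapDuplicates {

  // Hypothetical helper: stores values per key in a Set, so a repeated
  // (key, value) pair is collapsed into a single entry.
  static int sizeWithSetValues(String[] keys, int[] values) {
    Map<String, Set<Integer>> multimap = new HashMap<>();
    for (int i = 0; i < keys.length; i++) {
      multimap.computeIfAbsent(keys[i], k -> new HashSet<>()).add(values[i]);
    }
    return multimap.values().stream().mapToInt(Set::size).sum();
  }

  // Hypothetical helper: stores values per key in a List, so every
  // (key, value) pair is kept, including exact duplicates.
  static int sizeWithListValues(String[] keys, int[] values) {
    Map<String, List<Integer>> multimap = new HashMap<>();
    for (int i = 0; i < keys.length; i++) {
      multimap.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
    }
    return multimap.values().stream().mapToInt(List::size).sum();
  }

  public static void main(String[] args) {
    // Same pairs as the expected output in the failing test:
    // (apple, 1) appears twice.
    String[] keys = {"apple", "apple", "apple", "banana", "blackberry"};
    int[] values = {1, 1, 2, 3, 3};

    System.out.println("set-based entries:  " + sizeWithSetValues(keys, values));
    System.out.println("list-based entries: " + sizeWithListValues(keys, values));
  }
}
```

With these inputs the set-valued map retains four entries and the list-valued map retains five, which matches the four-versus-five element mismatch in the test failure quoted in the comment.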



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
