Hi Beam devs,

I'm working on the Euphoria DSL, where we implemented `BroadcastHashJoin` using 
side inputs, but our tests show some missing data. We use `View.asMultimap()` 
to turn the small side of the join into a `PCollectionView<Map<K, 
Iterable<T>>>`. Duplicate key-value pairs (elements with the same key and value 
as some other element) then get lost, which is unfortunate behavior when doing 
joins. I believe it all boils down to:

https://github.com/apache/beam/blob/05fb694f265dda0254d7256e938e508fec9ba098/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionViews.java#L293


There, a `HashMultimap` is used to gather all the elements into a 
`Multimap<K, V>`, and `HashMultimap` does not allow duplicate key-value pairs. 
Do you also feel this is a bug? If so, we would like to fix it by replacing 
`HashMultimap` with `ArrayListMultimap`, which allows duplicate key-value pairs.
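
To illustrate the difference, here is a minimal sketch that uses only 
`java.util` collections, not Guava or Beam. The class and method names are made 
up for illustration: a set-backed grouping stands in for the `HashMultimap` 
behavior and a list-backed one for the `ArrayListMultimap` behavior.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

public class MultimapDuplicates {

  // Hypothetical helper: builds one key-value element of the small join side.
  public static <K, V> Map.Entry<K, V> pair(K key, V value) {
    return new AbstractMap.SimpleImmutableEntry<>(key, value);
  }

  // Set-backed grouping: identical (key, value) pairs collapse into one.
  public static <K, V> Map<K, Collection<V>> groupIntoSets(List<Map.Entry<K, V>> pairs) {
    Map<K, Collection<V>> result = new HashMap<>();
    for (Map.Entry<K, V> p : pairs) {
      result.computeIfAbsent(p.getKey(), k -> new HashSet<>()).add(p.getValue());
    }
    return result;
  }

  // List-backed grouping: every element is kept, duplicates included.
  public static <K, V> Map<K, Collection<V>> groupIntoLists(List<Map.Entry<K, V>> pairs) {
    Map<K, Collection<V>> result = new HashMap<>();
    for (Map.Entry<K, V> p : pairs) {
      result.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
    }
    return result;
  }

  public static void main(String[] args) {
    // Small side of the join with one duplicated element ("a", 1).
    List<Map.Entry<String, Integer>> smallSide =
        Arrays.asList(pair("a", 1), pair("a", 1), pair("b", 2));

    System.out.println(groupIntoSets(smallSide).get("a").size());  // prints 1
    System.out.println(groupIntoLists(smallSide).get("a").size()); // prints 2
  }
}
```

With the set-backed variant, a join against key "a" would produce one output 
row from the small side instead of two, which matches the missing data we see.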


We can think of some workarounds, but we would prefer to fix it, if possible.


So what are your opinions? And how should we proceed?


Thank you.

Vaclav Plajt


