Think this should solve my problem.

Thanks Evan and Luke!

On Thu, 11 Aug 2022 at 1:49 AM, Luke Cwik via user <[email protected]>
wrote:

> Use CoGroupByKey to join the two PCollections and emit only the first
> value of each iterable with the key.
>
> Duplicates will appear as iterables with more than one value while keys
> without duplicates will have iterables containing exactly one value.
>
> On Wed, Aug 10, 2022 at 12:25 PM Shivam Singhal <
> [email protected]> wrote:
>
>> I have two PCollections, CollectionA & CollectionB of type KV<String,
>> Byte[]>.
>>
>>
>> I would like to merge them into one PCollection but CollectionA &
>> CollectionB might have some elements with the same key. In those repeated
>> cases, I would like to keep the element from CollectionA & drop the
>> repeated element from CollectionB.
>>
>> Does anyone know a simple method to do this?
>>
>> Thanks,
>> Shivam Singhal
>>
>
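
For reference, the per-key rule behind the CoGroupByKey suggestion can be sketched in plain Java, without any Beam dependency. This is only a model of the logic, not pipeline code: the class and method names (PreferAMerge, pick, group, merge) are made up for illustration, and String values stand in for the byte arrays. In an actual Beam pipeline the grouping would be done by CoGroupByKey, and the two iterables passed to pick would presumably come from CoGbkResult.getAll(tag), called once with each input's TupleTag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; a plain-Java model of the merge rule, not Beam code.
class PreferAMerge {

    // Per-key choice: the first value from A if A has any, otherwise the first from B.
    static <V> V pick(Iterable<V> fromA, Iterable<V> fromB) {
        for (V v : fromA) {
            return v;
        }
        for (V v : fromB) {
            return v;
        }
        throw new IllegalArgumentException("no values for key");
    }

    // Stand-in for the grouping step: collect all values seen for each key.
    static <V> Map<String, List<V>> group(List<Map.Entry<String, V>> input) {
        Map<String, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : input) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return grouped;
    }

    // Merge A and B so that each key appears once, preferring A's value.
    static <V> Map<String, V> merge(List<Map.Entry<String, V>> a,
                                    List<Map.Entry<String, V>> b) {
        Map<String, List<V>> groupedA = group(a);
        Map<String, List<V>> groupedB = group(b);
        Set<String> keys = new LinkedHashSet<>(groupedA.keySet());
        keys.addAll(groupedB.keySet());

        Map<String, V> out = new LinkedHashMap<>();
        for (String key : keys) {
            List<V> fromA = groupedA.getOrDefault(key, List.of());
            List<V> fromB = groupedB.getOrDefault(key, List.of());
            out.put(key, pick(fromA, fromB));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> a =
            List.of(Map.entry("k1", "a1"), Map.entry("k2", "a2"));
        List<Map.Entry<String, String>> b =
            List.of(Map.entry("k2", "b2"), Map.entry("k3", "b3"));
        // k2 exists in both inputs, so A's value wins.
        System.out.println(merge(a, b)); // prints {k1=a1, k2=a2, k3=b3}
    }
}
```

Note that if CollectionA itself contains the same key more than once, this rule keeps only the first of those values, which may or may not be what is wanted.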
