Think this should solve my problem. Thanks, Evan and Luke!
On Thu, 11 Aug 2022 at 1:49 AM, Luke Cwik via user <[email protected]> wrote:

> Use CoGroupByKey to join the two PCollections and emit only the first
> value of each iterable with the key.
>
> Duplicates will appear as iterables with more than one value while keys
> without duplicates will have iterables containing exactly one value.
>
> On Wed, Aug 10, 2022 at 12:25 PM Shivam Singhal <
> [email protected]> wrote:
>
>> I have two PCollections, CollectionA & CollectionB of type KV<String,
>> Byte[]>.
>>
>> I would like to merge them into one PCollection but CollectionA &
>> CollectionB might have some elements with the same key. In those repeated
>> cases, I would like to keep the element from CollectionA & drop the
>> repeated element from CollectionB.
>>
>> Does anyone know a simple method to do this?
>>
>> Thanks,
>> Shivam Singhal
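For anyone finding this thread later, here is a rough sketch of the "group by key, prefer CollectionA" logic described above. It is plain Java with no Beam dependency, so it only models the idea: the class `MergePreferA`, the holder `Grouped`, and the methods `coGroup` and `merge` are hypothetical names made up for illustration, and `String` values stand in for `Byte[]`. In an actual pipeline the grouping step would presumably be Beam's `CoGroupByKey` applied to a `KeyedPCollectionTuple` with one `TupleTag` per input, and the per-key choice would read `CoGbkResult.getAll(tag)` inside a `DoFn`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Beam-free model of the approach discussed in this thread.
class MergePreferA {

    // Values seen for one key, kept separate per source collection.
    // This mirrors the per-tag iterables a co-grouped result exposes.
    static final class Grouped<V> {
        final List<V> fromA = new ArrayList<>();
        final List<V> fromB = new ArrayList<>();
    }

    // Stand-in for the co-group step: collect values from both inputs by key.
    static <V> Map<String, Grouped<V>> coGroup(
            List<Map.Entry<String, V>> a, List<Map.Entry<String, V>> b) {
        Map<String, Grouped<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : a) {
            grouped.computeIfAbsent(e.getKey(), k -> new Grouped<>()).fromA.add(e.getValue());
        }
        for (Map.Entry<String, V> e : b) {
            grouped.computeIfAbsent(e.getKey(), k -> new Grouped<>()).fromB.add(e.getValue());
        }
        return grouped;
    }

    // Emit one value per key: the first value from A if A has the key,
    // otherwise the first value from B.
    static <V> Map<String, V> merge(
            List<Map.Entry<String, V>> a, List<Map.Entry<String, V>> b) {
        Map<String, V> out = new LinkedHashMap<>();
        for (Map.Entry<String, Grouped<V>> e : coGroup(a, b).entrySet()) {
            Grouped<V> g = e.getValue();
            V chosen = !g.fromA.isEmpty() ? g.fromA.get(0) : g.fromB.get(0);
            out.put(e.getKey(), chosen);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> a =
                List.of(Map.entry("k1", "a1"), Map.entry("k2", "a2"));
        List<Map.Entry<String, String>> b =
                List.of(Map.entry("k2", "b2"), Map.entry("k3", "b3"));
        System.out.println(merge(a, b)); // prints {k1=a1, k2=a2, k3=b3}
    }
}
```

Keeping the two sides in separate lists is what makes "prefer A" unambiguous. If both inputs were flattened into a single iterable per key, "take the first value" would depend on the order of elements, which as far as I know Beam does not guarantee for grouped values.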
