I was able to get it working with ArrayList by declaring:
List<List<Integer>> result = new ArrayList<List<Integer>>();
I'm then storing my keys in a separate array that I'll pass in as a side input to serve as the keys for the list of lists.

Thanks for the help. Let me know more in the future about how coders work and are instantiated; I'd love to contribute by adding some new coders.

- Shannon

On Thu, Jul 11, 2019 at 4:59 PM Shannon Duncan <[email protected]> wrote:

> Will do. Thanks. A new coder for deterministic Maps would be great in the
> future. Thank you!
>
> On Thu, Jul 11, 2019 at 4:58 PM Rui Wang <[email protected]> wrote:
>
>> I think Mike refers to ListCoder
>> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/ListCoder.java>,
>> which is deterministic if its element coder is deterministic. Maybe you
>> can search the repo for examples of ListCoder?
>>
>> -Rui
>>
>> On Thu, Jul 11, 2019 at 2:55 PM Shannon Duncan <[email protected]> wrote:
>>
>>> So ArrayList doesn't work either, so just a standard List?
>>>
>>> On Thu, Jul 11, 2019 at 4:53 PM Rui Wang <[email protected]> wrote:
>>>
>>>> Shannon, I agree with Mike that List is a good workaround if the elements
>>>> within your list are deterministic and you are eager to get your new
>>>> pipeline working.
>>>>
>>>> Let me send back some pointers on adding a new coder later.
>>>>
>>>> -Rui
>>>>
>>>> On Thu, Jul 11, 2019 at 2:45 PM Shannon Duncan <[email protected]> wrote:
>>>>
>>>>> I just started learning Java today to attempt to convert our Python
>>>>> pipelines to Java to take advantage of key features that Java has. I have
>>>>> no idea how I would create a new coder and include it for Beam to
>>>>> recognize.
>>>>>
>>>>> If you can point me in the right direction of where it hooks together
>>>>> I might be able to figure that out. I can duplicate MapCoder and try to
>>>>> make changes, but how will Beam know to pick up that coder for a
>>>>> GroupByKey?
>>>>>
>>>>> Thanks!
>>>>> Shannon
>>>>>
>>>>> On Thu, Jul 11, 2019 at 4:42 PM Rui Wang <[email protected]> wrote:
>>>>>
>>>>>> It could be straightforward to create a SortedMapCoder for
>>>>>> TreeMap. Just add checks on map instances and then change
>>>>>> verifyDeterministic.
>>>>>>
>>>>>> If this is a common need we could just submit it to the Beam repo.
>>>>>>
>>>>>> [1]:
>>>>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java#L146
>>>>>>
>>>>>> On Thu, Jul 11, 2019 at 2:26 PM Mike Pedersen <[email protected]> wrote:
>>>>>>
>>>>>>> There isn't a coder for deterministic maps in Beam, so even if your
>>>>>>> data structure is deterministic, Beam will assume the serialized bytes
>>>>>>> aren't deterministic.
>>>>>>>
>>>>>>> You could make one using the MapCoder as a guide:
>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java
>>>>>>> Just change it such that the exception in verifyDeterministic is
>>>>>>> removed and when decoding it instantiates a TreeMap or such instead of a
>>>>>>> HashMap.
>>>>>>>
>>>>>>> Alternatively, you could just represent your key as a sorted list of
>>>>>>> KV pairs. Lookups could be done using binary search if necessary.
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>>> On Thu, Jul 11, 2019 at 10:41 PM Shannon Duncan <[email protected]> wrote:
>>>>>>>
>>>>>>>> So I'm working on essentially doing a word count on a complex data
>>>>>>>> structure.
>>>>>>>>
>>>>>>>> I tried just using a HashMap as the structure, but that didn't work
>>>>>>>> because it is non-deterministic.
>>>>>>>>
>>>>>>>> However, when given a LinkedHashMap or TreeMap, which is
>>>>>>>> deterministic, the SDK complains that it's non-deterministic when
>>>>>>>> trying to use it as a key for GroupByKey.
>>>>>>>>
>>>>>>>> What would be an appropriate Map-style data structure that would be
>>>>>>>> deterministic enough for Apache Beam to accept it as a key?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shannon
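
For reference, Mike's suggestion in the thread (represent the key as a sorted list of KV pairs and look values up with binary search) could be sketched roughly as below. This is plain Java with no Beam dependency, and it is only an illustration: the class name SortedPairsKey, the helpers toSortedPairs and lookup, and the String/Integer types are all hypothetical and do not come from the thread or from Beam. How the resulting list would be wired to a Beam coder is not shown.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Beam.
final class SortedPairsKey {

    // Flatten a map into a list of entries sorted by key, so that two equal
    // maps always yield the same sequence regardless of insertion order.
    public static List<Map.Entry<String, Integer>> toSortedPairs(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            pairs.add(new SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        pairs.sort(Map.Entry.comparingByKey());
        return pairs;
    }

    // Binary search over the sorted pairs; returns null when the key is absent.
    public static Integer lookup(List<Map.Entry<String, Integer>> pairs, String key) {
        int lo = 0;
        int hi = pairs.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = pairs.get(mid).getKey().compareTo(key);
            if (cmp == 0) {
                return pairs.get(mid).getValue();
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Same contents, different insertion order.
        Map<String, Integer> a = new HashMap<>();
        a.put("pear", 3);
        a.put("apple", 1);
        a.put("fig", 2);
        Map<String, Integer> b = new HashMap<>();
        b.put("fig", 2);
        b.put("pear", 3);
        b.put("apple", 1);

        System.out.println(toSortedPairs(a).equals(toSortedPairs(b))); // true
        System.out.println(lookup(toSortedPairs(a), "fig"));           // 2
        System.out.println(lookup(toSortedPairs(a), "kiwi"));          // null
    }
}
```

The point of sorting is that the element order, and therefore any element-by-element encoding of the list, depends only on the contents and not on how the original map happened to be built.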
