that sounds good, I'll try that. thanks!

On Fri, Nov 15, 2013 at 10:45 AM, Josh Wills <[email protected]> wrote:

> One way, of course, is to do a group by key and force all of the records
> to a single reducer.
>
> Post-sort, I believe it's a safe assumption that the records will be
> processed by a DoFn in sorted order, although it's not necessarily the case
> that records with the same value of the key (if that ever happens in your
> data) will be processed in the same shard/DoFn.
>
> J
>
>
> On Fri, Nov 15, 2013 at 8:38 AM, Hrishikesh P
> <[email protected]> wrote:
>
>> Hello -
>>
>> In the parallelDo-DoFn processing, is it possible to ensure that the
>> records in the PTable will be processed in the given order? I have a PTable
>> of long and bytes (PTable<Long, ByteBuffer>) which is sorted by the long
>> value and I want to make sure that when the DoFn#process is called, the
>> records will be processed in the sorted order, as there may be a dependency
>> between the records.
>>
>> I thought of a few options, like storing the sorted results to a text
>> file and using the file to process the records in the DoFn or using a table
>> to track the records being processed but wasn't sure if they would give
>> correct results and was wondering if there is a better approach.
>>
>> Thanks.
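
For anyone reading this thread later, here is a rough sketch of why the single-reducer suggestion gives a global order. It is a plain-Java simulation and does not use the Crunch API: the class name `SingleReducerSketch`, the `shuffle` helper, and the hash-based partitioning are all hypothetical stand-ins for what a shuffle does. In Crunch itself, the equivalent is presumably something like calling `groupByKey(1)` on the PTable, but that should be checked against the Crunch documentation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SingleReducerSketch {

    // Hypothetical stand-in for a shuffle: route each record to a partition
    // by key hash, then list each partition's keys in ascending order, which
    // is the order a reducer would see them in.
    static List<List<Long>> shuffle(Map<Long, String> records, int numPartitions) {
        List<TreeMap<Long, String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new TreeMap<>());
        }
        for (Map.Entry<Long, String> e : records.entrySet()) {
            int p = Math.floorMod(Long.hashCode(e.getKey()), numPartitions);
            partitions.get(p).put(e.getKey(), e.getValue());
        }
        List<List<Long>> order = new ArrayList<>();
        for (TreeMap<Long, String> part : partitions) {
            order.add(new ArrayList<>(part.keySet()));
        }
        return order;
    }

    public static void main(String[] args) {
        Map<Long, String> records = new HashMap<>();
        records.put(3L, "c");
        records.put(1L, "a");
        records.put(2L, "b");

        // One partition: every record is seen by the same "reducer", in key order.
        System.out.println(shuffle(records, 1)); // prints [[1, 2, 3]]

        // Two partitions: each is sorted, but no single consumer sees all keys.
        System.out.println(shuffle(records, 2)); // prints [[2], [1, 3]]
    }
}
```

The trade-off is that a single partition removes parallelism for that stage, so this only makes sense when the dependency between records really does require one consumer to see them all in order.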
