I think I'm confused about what you're going for. A parallelDo over the PGroupedTable should do exactly what you described: the DoFn receives a key plus an Iterable<DataRecord> holding all the values for that single key, at which point you can do whatever you want with them. That's exactly what I had to do on a flow at work: I do a groupByKey on a PTable, then in the ensuing parallelDo I build a List out of the Iterable<Record> and run some aggregate functions over it.
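For what it's worth, here is a rough sketch of the per-key logic in plain Java, with no Crunch dependency so it can be run on its own. The `DataRecord` fields (`timestamp`, `amount`) and the `runningTotals` helper are made up for illustration, since I don't know what your record actually holds. In a real pipeline the body of `runningTotals` would sit inside the `process(Pair<String, Iterable<DataRecord>> input, Emitter<...> emitter)` method of the DoFn you hand to `parallelDo` on the grouped table, and you would emit the results instead of returning them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class GroupedSortSketch {

    // Hypothetical stand-in for the DataRecord bean from the question.
    static class DataRecord {
        private final long timestamp;
        private final double amount;

        DataRecord(long timestamp, double amount) {
            this.timestamp = timestamp;
            this.amount = amount;
        }

        long getTimestamp() { return timestamp; }
        double getAmount() { return amount; }
    }

    // Handles all values belonging to ONE key: copy, sort, then compute
    // something that depends on the order (here, a running total).
    static List<Double> runningTotals(Iterable<DataRecord> values) {
        // The grouped Iterable can only be walked once, so copy it first.
        List<DataRecord> records = new ArrayList<>();
        for (DataRecord r : values) {
            records.add(r);
        }

        // Sort the records for this key by the (assumed) timestamp field.
        records.sort(Comparator.comparingLong(DataRecord::getTimestamp));

        // Order-dependent calculation over the sorted records.
        List<Double> totals = new ArrayList<>();
        double sum = 0.0;
        for (DataRecord r : records) {
            sum += r.getAmount();
            totals.add(sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<DataRecord> group = Arrays.asList(
                new DataRecord(3L, 5.0),
                new DataRecord(1L, 2.0),
                new DataRecord(2L, 4.0));
        System.out.println(runningTotals(group)); // prints [2.0, 6.0, 11.0]
    }
}
```

Two caveats. First, this holds every record for a key in memory at once, so it only works if no single key is enormous. Second, as far as I recall, the objects handed out by the grouped Iterable may be reused by the framework, so when copying them into a List inside a real DoFn you may need a detached copy of each value (I believe `PType.getDetachedValue` is the hook for that, but please check the Crunch docs).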

On Thu, Apr 28, 2016 at 2:59 PM Robinson, Landon - Landon <[email protected]> wrote:

> Crunch Gurus,
>
> We need to process some data in order, so parallelDo shouldn’t work for
> this approach. We’ve looked at SequentialDo, but not sure how exactly to
> make it work… (Not much documentation on it).
> *DataRecord is a java object with getters and setters.*
>
> Right now, we have a PGroupedTable<String, DataRecord> where the String
> keys in the PGT are linked to multiple DataRecord objects (standard PGT
> behavior).
> What we need to do now is loop through all records for a particular key,
> sort them, and do some simple calculations.
>
> *What is the best way/standard way to process a PGroupedTable so that
> records corresponding to the same key are all kept together and processed?*
>
> Right now we know how to crack open a PGT in the local code and flip
> through it (the SingleUseIterable), but we need to make a new dataset out
> of it, not just play with it.
>
> Any direction or guidance would be appreciated!
>
> Landon Robinson
> Big Data & Hadoop Engineer
> IT Business Intelligence, Lowe’s Companies Inc.
