How do you need to use the data in the DoFn? One easy way of doing this might be a MapSideJoin[1], but that would probably require both data sets to be keyed the same way, so it might not fit if you only want supplementary data available inside the DoFn, as you are intending.
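If a join does not fit, the HashMap approach you mention is reasonable for a very small data set. Here is a minimal sketch of that pattern in plain Java, with no Crunch dependency. `SideDataSketch`, `LookupFn`, `process`, and `roundTrip` are hypothetical names used only for illustration, not Crunch API. It assumes the function object is shipped to tasks via Java serialization, which, as far as I know, is how Crunch handles DoFn instances.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class SideDataSketch {

    // Hypothetical stand-in for a DoFn: a serializable function object
    // that carries the small lookup table as an ordinary field.
    static class LookupFn implements Serializable {
        private static final long serialVersionUID = 1L;
        private final HashMap<String, String> lookup;

        LookupFn(Map<String, String> lookup) {
            // Copy so the function owns a concrete, serializable HashMap.
            this.lookup = new HashMap<>(lookup);
        }

        // Compare one input key against the side data.
        String process(String key) {
            String match = lookup.get(key);
            return match == null ? key + " -> (no match)" : key + " -> " + match;
        }
    }

    // Simulates shipping the function to a task: serialize it, then read it back.
    static LookupFn roundTrip(LookupFn fn) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(fn);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LookupFn) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> small = new HashMap<>();
        small.put("a", "apple");
        small.put("b", "banana");

        LookupFn shipped = roundTrip(new LookupFn(small));
        System.out.println(shipped.process("a")); // prints "a -> apple"
        System.out.println(shipped.process("z")); // prints "z -> (no match)"
    }
}
```

The point of the round trip is that the map travels with the function object, so it has to stay small and every value in it has to be serializable.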
[1] - http://crunch.apache.org/user-guide.html#mapjoin

On Wed, Feb 24, 2016 at 2:04 PM, Robinson, Landon - Landon <[email protected]> wrote:

> Crunch Gurus,
>
> Say I have a small data set of key/value pairs I’m reading into a
> Pcollection. I want to give that small set as a supplementary data set to
> DoFns for comparisons.
>
> I’ve done this before with hardcoded String arrays and such, but wanted to
> know what best practice is for taking the contents of a very small
> Pcollection, and handing it as an object to a DoFn.
>
> I know I can turn it into a Hashmap and pass it as an argument/param, but
> is there a recommended way in Crunch? Thanks!
>
> Landon Robinson
> Big Data & Hadoop Engineer
> IT Business Intelligence, Lowe’s Companies Inc.
