Turn the PCollection into a ReadableData object. ReadableData is serializable, so it can be passed into a DoFn and read during initialization for use during processing. That's how MapsideJoin is implemented.

On Wed, Feb 24, 2016 at 12:19 PM Micah Whitacre <[email protected]> wrote:
> How do you need the data in the DoFn? One easy way of doing this might be
> a MapSideJoin[1], but that would probably require similar keys for what you
> are doing with the data, and might not fit with adding supplementary data in
> the DoFn like you are intending.
>
> [1] - http://crunch.apache.org/user-guide.html#mapjoin
>
> On Wed, Feb 24, 2016 at 2:04 PM, Robinson, Landon - Landon <
> [email protected]> wrote:
>
>> Crunch Gurus,
>>
>> Say I have a small data set of key/value pairs I’m reading into a
>> PCollection. I want to give that small set as a supplementary data set to
>> DoFns for comparisons.
>> I’ve done this before with hardcoded String arrays and such, but wanted
>> to know what the best practice is for taking the contents of a very small
>> PCollection and handing it as an object to a DoFn.
>>
>> I know I can turn it into a HashMap and pass it as an argument/param, but
>> is there a recommended way in Crunch? Thanks!
>>
>> ---------------------------------------------------------------------------
>> Landon Robinson
>> Big Data & Hadoop Engineer
>> IT Business Intelligence, Lowe’s Companies Inc.
>>
>> ---------------------------------------------------------------------------
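For anyone who wants to see the shape of the ReadableData suggestion from the top of this thread, here is a rough sketch in plain Java with no dependencies. To be clear, it does not use the Crunch API: `SideData`, `InMemorySideData`, `EnrichFn`, and `SideDataSketch` are hypothetical stand-ins I made up for illustration. `SideData` plays the role of the ReadableData handle, and `EnrichFn` plays the role of the DoFn, which is built on the client, serialized, and then loads the small data set into a map once during initialization. In Crunch itself you would get the handle from the PCollection and read it in the DoFn's initialization, as described above; check the API docs for the exact calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a serializable handle to a small data set. */
interface SideData extends Serializable {
    Iterable<Map.Entry<String, String>> read();
}

/** In-memory handle, for this sketch only; a real handle would point at stored data. */
final class InMemorySideData implements SideData {
    private static final long serialVersionUID = 1L;
    private final ArrayList<Map.Entry<String, String>> entries = new ArrayList<>();

    InMemorySideData(Map<String, String> source) {
        for (Map.Entry<String, String> e : source.entrySet()) {
            // SimpleImmutableEntry is Serializable, unlike HashMap's internal entries.
            entries.add(new AbstractMap.SimpleImmutableEntry<>(e));
        }
    }

    @Override
    public Iterable<Map.Entry<String, String>> read() {
        return entries;
    }
}

/** Hypothetical function object shaped like a DoFn. */
final class EnrichFn implements Serializable {
    private static final long serialVersionUID = 1L;
    private final SideData sideData;               // travels with the serialized function
    private transient Map<String, String> lookup;  // rebuilt in initialize(), never serialized

    EnrichFn(SideData sideData) {
        this.sideData = sideData;
    }

    /** Called once before any records are processed. */
    void initialize() {
        lookup = new HashMap<>();
        for (Map.Entry<String, String> e : sideData.read()) {
            lookup.put(e.getKey(), e.getValue());
        }
    }

    /** Called once per record; compares the record against the side data. */
    String process(String key) {
        String value = lookup.get(key);
        return key + "=" + (value == null ? "<no match>" : value);
    }
}

final class SideDataSketch {
    /** Serializes and deserializes an object, as a framework would when shipping it to a task. */
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    /** Initializes the function once, then applies it to every record. */
    static List<String> run(EnrichFn fn, List<String> records) {
        fn.initialize();
        List<String> out = new ArrayList<>();
        for (String record : records) {
            out.add(fn.process(record));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> small = new HashMap<>();
        small.put("a", "apple");
        small.put("b", "banana");
        EnrichFn shipped = roundTrip(new EnrichFn(new InMemorySideData(small)));
        System.out.println(run(shipped, Arrays.asList("a", "c"))); // prints [a=apple, c=<no match>]
    }
}
```

The point of the `transient` map is that only the handle is serialized with the function. The lookup table is built on the task side, once, rather than per record.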
