No, just the number of tasks involved in each job. The structure should remain the same.
J

On Mon, Oct 12, 2015 at 3:44 PM, Ravi Kolluri <[email protected]> wrote:

> Thanks Josh!
>
> My question was more about how the planner organizes the map-reduce
> computation. Would the crunch job composition change based on input size?
>
> thanks,
> Ravi
>
> On Mon, Oct 12, 2015 at 3:38 PM, Josh Wills <[email protected]> wrote:
>
>> Hey Ravi,
>>
>> The number of reducers used in the various stages of the MR job can
>> change if you don't hard-code them using groupByKey(int numReducers) or
>> groupByKey(GroupingOptions) (or the equivalent settings via the
>> JoinStrategy classes for joins). The planner will try to estimate the
>> number of bytes to be processed and aims to process 1GB of data per
>> reducer. If you do hard-code the number of reduce tasks, the planner will
>> respect your wishes no matter what the input size is.
>>
>> Josh
>>
>> On Mon, Oct 12, 2015 at 2:31 PM, Ravi Kolluri <[email protected]> wrote:
>>
>>> Hello Crunch users,
>>>
>>> I have a question about what parameters go into the Crunch planner.
>>>
>>> Let's say I have a crunch job with a set of input tables, and a fixed set
>>> of calls to parallelDo and groupBy operations. Does the crunch execution
>>> plan stay fixed independent of the size distribution of the inputs?
>>>
>>> thanks,
>>> Ravi
>>>
>>> *DISCLAIMER:* The contents of this email, including any attachments,
>>> may contain information that is confidential, proprietary in nature,
>>> protected health information (PHI), or otherwise protected by law from
>>> disclosure, and is solely for the use of the intended recipient(s). If you
>>> are not the intended recipient, you are hereby notified that any use,
>>> disclosure or copying of this email, including any attachments, is
>>> unauthorized and strictly prohibited. If you have received this email in
>>> error, please notify the sender of this email. Please delete this and all
>>> copies of this email from your system. Any opinions either expressed or
>>> implied in this email and all attachments, are those of its author only,
>>> and do not necessarily reflect those of Nuna Health, Inc.
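
For anyone reading this thread later, the sizing rule described above (about 1GB of data per reducer, unless the count is hard-coded) can be sketched roughly as follows. This is a standalone illustration only, not Crunch's planner code: the class name ReducerEstimate, the method estimateReducers, and the floor of one reducer are assumptions made for the example. Only the 1GB target and the hard-coded override come from the thread.

```java
import java.util.OptionalInt;

// Hypothetical sketch; none of these names exist in Crunch.
class ReducerEstimate {
    // Target mentioned in the thread: about 1GB of data per reducer.
    static final long BYTES_PER_REDUCER = 1024L * 1024L * 1024L;

    // Returns the hard-coded count when one is given; otherwise rounds the
    // estimated input size up to whole 1GB units, with an assumed floor of 1.
    static int estimateReducers(long estimatedBytes, OptionalInt hardCoded) {
        if (hardCoded.isPresent()) {
            // A hard-coded value wins regardless of input size.
            return hardCoded.getAsInt();
        }
        long whole = estimatedBytes / BYTES_PER_REDUCER;
        long extra = (estimatedBytes % BYTES_PER_REDUCER == 0) ? 0 : 1;
        long n = whole + extra;
        return (int) Math.max(1L, Math.min(n, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        long gb = BYTES_PER_REDUCER;
        // Size-driven estimates grow with the input.
        System.out.println(estimateReducers(5 * gb, OptionalInt.empty()));
        System.out.println(estimateReducers(5 * gb + 1, OptionalInt.empty()));
        // A hard-coded count is returned unchanged.
        System.out.println(estimateReducers(500 * gb, OptionalInt.of(20)));
    }
}
```

In this sketch only the task count varies with input size; nothing about the sequence of stages depends on it, which matches the point that the structure of the jobs stays the same.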
