You can do it both ways: at the DoFn level or at the pipeline level.
For global settings, use the pipeline level. For settings that apply to
individual jobs/tasks, use the DoFn level.
Pipeline Level:
// Set job-wide options on the Configuration before constructing the pipeline.
Configuration crunchConf = getConf();
crunchConf.set("mapred.job.queue.name", "batch");
Pipeline pipeline = new MRPipeline(TransformKronosMR.class, "My Pipeline", crunchConf);
DoFn Level (as mentioned):
@Override
public void configure(Configuration conf) {
    conf.set("mapreduce.map.java.opts", "-Xmx3900m");
    conf.set("mapreduce.reduce.java.opts", "-Xmx3900m");
    conf.set("mapreduce.map.memory.mb", "4096");
    conf.set("mapreduce.reduce.memory.mb", "4096");
}
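In the four settings above, the JVM heap (-Xmx3900m) sits just under the
container size (4096 MB). If it helps, here is a rough sketch of a helper
that derives all four values from one container size so they stay in step.
The class name, method name, and the "headroom" parameter are made up for
illustration and are not part of Crunch or Hadoop; the 196 MB headroom is
simply what the numbers above imply, not a recommendation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TaskMemorySettings {

    // Hypothetical helper (not a Crunch/Hadoop API): builds the four
    // properties from a container size and a headroom, where headroom is
    // the container size in MB minus the JVM heap size in MB.
    static Map<String, String> forContainer(int containerMb, int headroomMb) {
        if (headroomMb < 0 || headroomMb >= containerMb) {
            throw new IllegalArgumentException("headroomMb must be in [0, containerMb)");
        }
        String heapOpt = "-Xmx" + (containerMb - headroomMb) + "m";
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapreduce.map.java.opts", heapOpt);
        props.put("mapreduce.reduce.java.opts", heapOpt);
        props.put("mapreduce.map.memory.mb", Integer.toString(containerMb));
        props.put("mapreduce.reduce.memory.mb", Integer.toString(containerMb));
        return props;
    }

    public static void main(String[] args) {
        // Print the settings as key=value pairs, one per line.
        forContainer(4096, 196).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Each entry can then be applied with conf.set(key, value) on whichever
Configuration object you end up using.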
---------------------------------------------------------------------------
Landon Robinson
Big Data/Hadoop Engineer
Lowe’s Companies Inc. | IT Business Intelligence
---------------------------------------------------------------------------
From: Micah Whitacre <[email protected]>
Reply-To: [email protected]
Date: Tuesday, October 13, 2015 at 3:55 PM
To: [email protected]
Subject: Re: Hadoop Configuration from DoFn
Luke,
Generally, that configuration should be set on the Configuration object passed
to the Pipeline rather than on the individual DoFns. The configure(...) method
is called when re-instantiating the DoFn on the Map/Reduce task, and at that
point the task JVM is already running, so those memory settings wouldn't be
honored.
On Tue, Oct 13, 2015 at 2:52 PM, Luke Hansen <[email protected]> wrote:
Does anyone know if this is the right way to configure Hadoop from a Crunch
DoFn? This didn't seem to affect anything.
Thanks!
@Override
public void configure(Configuration conf) {
    conf.set("mapreduce.map.java.opts", "-Xmx3900m");
    conf.set("mapreduce.reduce.java.opts", "-Xmx3900m");
    conf.set("mapreduce.map.memory.mb", "4096");
    conf.set("mapreduce.reduce.memory.mb", "4096");
}