The "GettingStartedYARN" page has been changed by thomasjungblut:
http://wiki.apache.org/hama/GettingStartedYARN?action=diff&rev1=1&rev2=2

  <<TableOfContents(5)>>
+
+
+ If you are completely new to Hama, please go directly to the Full Walkthrough section.
  == Requirements ==
@@ -11, +14 @@
  For additional information consult our CompatibilityTable.
- This tutorial requires Hadoop 0.23.0 already correctly setuped.
+ This tutorial requires Hadoop 0.23.0 already correctly installed.
  If you haven't done this yet, please follow the official documentation http://hadoop.apache.org/common/docs/r0.23.0/
- == How to submit a Hama-YARN job ==
+ == How to write a Hama-YARN job ==
+ The [[BSPModel]] hasn't changed, but the way to submit a job has.
+
+ You only need the following code to submit a Hama-YARN job:
+
+ {{{
+ HamaConfiguration conf = new HamaConfiguration();
+ // Address (hostname and port) of the YARN ResourceManager
+ conf.set("yarn.resourcemanager.address", "0.0.0.0:8040");
+
+ YARNBSPJob job = new YARNBSPJob(conf);
+ job.setBspClass(HelloBSP.class);
+ job.setJarByClass(HelloBSP.class);
+ job.setJobName("Serialize Printing");
+ // Memory used per BSP task, in MB
+ job.setMemoryUsedPerTaskInMb(50);
+ job.setNumBspTask(2);
+ job.waitForCompletion(false);
+ }}}
+
+ As you can see, instead of a {{{BSPJob}}} you are starting a {{{YARNBSPJob}}}.
+
+ The {{{YARNBSPJob}}} offers an extended API for running on YARN. For example, you can set the amount of memory (in MB) used by a task with:
+
+ {{{
+ job.setMemoryUsedPerTaskInMb(50);
+ }}}
+
+ == How to configure a job ==
+
+ The job needs some configuration values in order to be submitted successfully to the YARN infrastructure.
+
+ The most important configuration value is {{{yarn.resourcemanager.address}}}. It should point to the address (hostname and port) where your ResourceManager runs, for example {{{localhost:8040}}}.
+
+ Another important configuration value is the amount of memory used by the BSPApplicationMaster.
+ You can configure a base amount of memory for the application master with this configuration key:
+ {{{
+ hama.appmaster.memory.mb
+ }}}
+
+ By default, this is set to 100 MB.
+
+ The total amount of memory used by the ApplicationMaster is calculated as follows:
+
+ {{{
+ int memoryInMb = 3 * this.getNumBspTask() + conf.getInt("hama.appmaster.memory.mb", 100);
+ }}}
+
+ This is because the application master spawns one to three threads per launched task, each of which should take 1 MB, plus a base memory usage of at least 100 MB.
+ If you face memory issues, you can set {{{hama.appmaster.memory.mb}}} to a higher value.
+
+ == How to submit a job ==
+
+ === General ===
+
+ You have two ways to submit a job: you can either submit it via the shell with a packed jar, or submit it from a Java application.
+ In both cases you need the hama-yarn jar in the classpath or inside the jar to run correctly.
+
+ === Via Shell ===
+
+ {{{
+ bin/yarn jar /path_to_jar org.apache.hama.bsp.YarnSerializePrinting
+ }}}
+
+ In this case, the jar at {{{/path_to_jar}}} must either contain the hama-yarn jar, or the hama-yarn jar must already be in the classpath of your Hadoop application.
+ Replace {{{org.apache.hama.bsp.YarnSerializePrinting}}} with the class whose main method runs the Hama job.
+
+ === Via Java Application ===
+
+ Just like in the section above, you have to configure the address of the ResourceManager.
+ Then you can run this from a Java application; just put it into a main method.
+
+ {{{
+ HamaConfiguration conf = new HamaConfiguration();
+ // Address (hostname and port) of the YARN ResourceManager
+ conf.set("yarn.resourcemanager.address", "0.0.0.0:8040");
+
+ YARNBSPJob job = new YARNBSPJob(conf);
+ job.setBspClass(HelloBSP.class);
+ job.setJarByClass(HelloBSP.class);
+ job.setJobName("Serialize Printing");
+ // Memory used per BSP task, in MB
+ job.setMemoryUsedPerTaskInMb(50);
+ job.setNumBspTask(2);
+ job.waitForCompletion(false);
+ }}}
+
+ == How to change existing Hama Jobs to run on YARN ==
+
+ If you have the following code to submit a Hama job
+
+ {{{
+ // BSP job configuration
+ HamaConfiguration conf = new HamaConfiguration();
+ BSPJob bsp = new BSPJob(conf);
+ bsp.waitForCompletion(true);
+ }}}
+
+ you can just change the {{{BSPJob}}} to {{{YARNBSPJob}}}.
+
+
+ = Full Walkthrough =
+
+ This walkthrough guides you step by step to a working Hama BSP application on YARN.
+ You must have Hadoop 0.23.x correctly installed on your machine.
+
+ <TODO>
+ Make some fancy pictures from Eclipse showing how to get a jar out of it and submit it.
+
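As a quick sanity check before submitting, the ApplicationMaster memory formula from the "How to configure a job" section can be evaluated by hand. The following is a minimal standalone sketch of that arithmetic only; it does not use the Hama API, and the class name {{{AppMasterMemoryEstimate}}} and the method {{{appMasterMemoryInMb}}} are hypothetical names introduced for illustration.

```java
public class AppMasterMemoryEstimate {

    // Default base memory in MB, matching the documented default of
    // the "hama.appmaster.memory.mb" configuration key.
    public static final int DEFAULT_BASE_MEMORY_MB = 100;

    // Hypothetical helper mirroring the documented formula:
    // 3 MB per BSP task plus the configured base amount.
    public static int appMasterMemoryInMb(int numBspTasks, int baseMemoryMb) {
        return 3 * numBspTasks + baseMemoryMb;
    }

    public static void main(String[] args) {
        // Two tasks with the default base, as in the job example: 3 * 2 + 100
        System.out.println(appMasterMemoryInMb(2, DEFAULT_BASE_MEMORY_MB)); // prints 106
        // Fifty tasks with the base raised to 256 MB: 3 * 50 + 256
        System.out.println(appMasterMemoryInMb(50, 256)); // prints 406
    }
}
```

With the two tasks from the example job and the default base of 100 MB, the ApplicationMaster would therefore ask for 106 MB under this formula. Raising {{{hama.appmaster.memory.mb}}} only shifts the base term; the per-task part grows with the number of BSP tasks.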
