Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by stack: http://wiki.apache.org/hadoop/Hbase/MapReduce

The comment on the change is: Simplify

------------------------------------------------------------------------------

= Hbase, MapReduce and the CLASSPATH =

!MapReduce jobs deployed to a mapreduce cluster do not by default have access to the hbase configuration under ''$HBASE_CONF_DIR'' nor to hbase classes.
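Whichever files you end up amending, they must be identical on every node of the mapreduce cluster. As a rough sketch of one way to push them out (hostnames, paths, and passwordless ssh are assumptions about your particular environment), you could loop over the hosts listed in ''$HADOOP_HOME/conf/slaves'':

{{{
# Push the amended hadoop-env.sh (and hbase-site.xml, if you added one)
# to every slave listed in $HADOOP_HOME/conf/slaves.
# Assumes passwordless ssh and an identical $HADOOP_HOME layout on each node.
for host in $(cat "$HADOOP_HOME/conf/slaves"); do
  scp "$HADOOP_HOME/conf/hadoop-env.sh" "$host:$HADOOP_HOME/conf/"
done
}}}

Remember that the tasktrackers only pick up the new configuration after a restart.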
You could add ''hbase-site.xml'' to $HADOOP_HOME/conf and add the hbase jar to $HADOOP_HOME/lib and then copy these changes across your cluster, but the cleanest means of adding hbase configuration and classes to the cluster CLASSPATH is to uncomment ''HADOOP_CLASSPATH'' in ''$HADOOP_HOME/conf/hadoop-env.sh'' and add the path to the hbase jar and to the ''$HBASE_CONF_DIR'' directory. Then copy the amended configuration across the cluster. You'll need to restart the mapreduce cluster if you want it to notice the new configuration.

For example, here is how you would amend ''hadoop-env.sh'', adding hbase classes and the !PerformanceEvaluation class from the hbase test classes to the hadoop ''CLASSPATH'':

{{{
# export HADOOP_CLASSPATH=
export HADOOP_CLASSPATH=$HBASE_HOME/build/test:$HBASE_HOME/build/hadoop-0.15.0-dev-hbase.jar
}}}

Expand $HBASE_HOME appropriately in accordance with your local environment.

And then, this is how you would run the !PerformanceEvaluation MR job to put up 4 clients:

{{{
> $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 4
}}}

The !PerformanceEvaluation class will be found on the CLASSPATH because you added $HBASE_HOME/build/test to HADOOP_CLASSPATH.

= Hbase as MapReduce job data source and sink =
