[ https://issues.apache.org/jira/browse/CHUKWA-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979203#action_12979203 ]

Eric Yang commented on CHUKWA-575:
----------------------------------

My configuration for running hbase+pig:

Single node hadoop+hbase+chukwa:

{noformat}
export PIG_PATH=sandbox/pig-0.8.0
export PIG_CLASSPATH=${HBASE_CONF_DIR}:${HADOOP_CONF_DIR}
export HBASE_HOME=sandbox/hbase-0.20.6
export CHUKWA_HOME=sandbox/chukwa-trunk

./pig \
  -Dpig.additional.jars=${PIG_PATH}/pig-0.8.0-core.jar:${HBASE_HOME}/hbase-0.20.6.jar \
  ${CHUKWA_HOME}/script/pig/ClusterSummary.pig
{noformat}

Real cluster + Mapreduce:

Create a jar file containing hbase-site.xml and name it hbase-conf.jar.
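
One way to build that jar, as a rough sketch: it assumes the JDK "jar" tool is on the PATH and that HBASE_CONF_DIR is the directory holding hbase-site.xml. The helper name make_hbase_conf_jar is made up for illustration.

```shell
# Hypothetical helper: package hbase-site.xml into a jar.
#   $1 = directory containing hbase-site.xml (assumed: HBASE_CONF_DIR)
#   $2 = output jar name (defaults to hbase-conf.jar)
make_hbase_conf_jar() {
  conf_dir="$1"
  out_jar="${2:-hbase-conf.jar}"
  # -C changes into conf_dir first, so hbase-site.xml is stored at the
  # root of the jar rather than under a directory prefix.
  jar cf "$out_jar" -C "$conf_dir" hbase-site.xml
}

# Example (not run here): make_hbase_conf_jar "${HBASE_CONF_DIR}"
```

Keeping the file at the root of the jar should let it be picked up as a classpath resource once the jar is listed in pig.additional.jars.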

{noformat}
export PIG_PATH=sandbox/pig-0.8.0
export PIG_CLASSPATH=${HBASE_CONF_DIR}:${HADOOP_CONF_DIR}
export HBASE_HOME=sandbox/hbase-0.20.6
export CHUKWA_HOME=sandbox/chukwa-trunk

./pig \
  -Dpig.additional.jars=${PIG_PATH}/pig-0.8.0-core.jar:${HBASE_HOME}/hbase-0.20.6.jar:hbase-conf.jar \
  ${CHUKWA_HOME}/script/pig/ClusterSummary.pig
{noformat}

See if you can get this to work. Try running the script in grunt mode, line 
by line, and check which STORE statement is causing the problem.
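
Before stepping through the script in grunt, it may help to list where the STORE statements are. A small sketch, assuming CHUKWA_HOME is set as in the blocks above; list_store_statements is a made-up helper name, not part of Pig or Chukwa.

```shell
# Hypothetical helper: print every line of a Pig script that contains a
# STORE statement, prefixed with its line number.
list_store_statements() {
  # -n prints line numbers; -i ignores case, since Pig keywords are
  # case-insensitive. The pattern requires "store" to start a word and
  # be followed by whitespace, so names like "restore" do not match.
  grep -n -i -E '(^|[^[:alnum:]_])store[[:space:]]' "$1"
}

# Example (not run here):
#   list_store_statements "${CHUKWA_HOME}/script/pig/ClusterSummary.pig"
```

The line numbers give a checklist of statements to paste into grunt one at a time.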

> Cluster Summarization script
> ----------------------------
>
>                 Key: CHUKWA-575
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-575
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: scripts
>         Environment: Java 6, Mac OS X 10.6
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>             Fix For: 0.5.0
>
>         Attachments: CHUKWA-575.patch
>
>
>  Chukwa records metrics from the name node, data node, job tracker, task 
> tracker, etc., but the raw metrics do not help determine all aspects of 
> cluster health.  For now, we have the following metrics in HBase:
>  * System
>  *   Disk
>  *   Memory
>  *   Network
>  * HDFS
>  *   Name Node
>  *   Data Node
>  * Map Reduce
>  *   Job Tracker
>  *   Task Tracker
> We can further analyze the data to provide a summary of the cluster in these 
> categories:
>  * System - Performance profile of how busy the nodes are in the cluster
>  * HDFS - Capacity of the disk storage, and health of the data in the file 
> system
>  * MapReduce - Capacity of the processing pipeline, and health of the 
> processing system

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
