[jira] Commented: (CHUKWA-575) Cluster Summarization script

Ari Rabkin (JIRA) Mon, 03 Jan 2011 18:54:10 -0800

    [ https://issues.apache.org/jira/browse/CHUKWA-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977087#action_12977087 ]


Ari Rabkin commented on CHUKWA-575:
-----------------------------------

Tried this, got errors.  

I started with a clean HBase, let it collect metrics from the default adaptors 
for a bit. Ran the script manually. The Pig-spawned tasks all fail. I got the 
following on the Reduce side:

java.io.IOException: java.lang.IllegalArgumentException: Row key is invalid
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:438)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.IllegalArgumentException: Row key is invalid
        at org.apache.hadoop.hbase.client.Put.<init>(Put.java:79)
        at org.apache.hadoop.hbase.client.Put.<init>(Put.java:69)
        at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:355)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:436)
        ... 7 more
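
For reference, the trace shows the exception coming from the Put constructor, called from HBaseStorage.putNext. The sketch below is a hypothetical, standalone helper (RowKeyCheck and whyInvalid are not part of HBase or Pig) that mimics the check that constructor is assumed to perform in HBase releases of this era: reject a null row key or one longer than Short.MAX_VALUE bytes. One plausible cause, not confirmed by the trace, is a null value in the first field of the tuple being stored, since HBaseStorage uses that field as the row key.

```java
public class RowKeyCheck {
    // Assumption: mirrors HConstants.MAX_ROW_LENGTH in HBase of this era.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    /**
     * Returns null if the key would pass the assumed check,
     * otherwise a short description of why it would be rejected.
     */
    static String whyInvalid(byte[] row) {
        if (row == null) {
            return "row key is null";
        }
        if (row.length > MAX_ROW_LENGTH) {
            return "row key is longer than " + MAX_ROW_LENGTH + " bytes";
        }
        return null;
    }

    public static void main(String[] args) {
        // A null key, e.g. from a null first field in the stored tuple.
        System.out.println(whyInvalid(null));
        // An ordinary short key passes the assumed check.
        System.out.println(whyInvalid(new byte[] {'h', 'o', 's', 't'}));
    }
}
```

Filtering out tuples whose first field is null before the STORE step in the Pig script would be one way to test this hypothesis.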


------

Is it possible to run the scripts in local mode for debugging, and have them 
still pull data from HBase? How do I configure that? I tried a bunch of things 
and got nowhere.

> Cluster Summarization script
> ----------------------------
>
>                 Key: CHUKWA-575
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-575
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: scripts
>         Environment: Java 6, Mac OS X 10.6
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>             Fix For: 0.5.0
>
>         Attachments: CHUKWA-575.patch
>
>
>  Chukwa records metrics from the name node, data node, job tracker, task 
> tracker, etc., but the raw metrics do not help determine all aspects of 
> cluster health.  For now, we have the following metrics in HBase:
>  * System
>  *   Disk
>  *   Memory
>  *   Network
>  * HDFS
>  *   Name Node
>  *   Data Node
>  * Map Reduce
>  *   Job Tracker
>  *   Task Tracker
> We can further analyze the data to provide a summary for the cluster in 
> these categories:
>  * System - Performance profile of how busy the nodes are in the cluster
>  * HDFS - Capacity of the disk storage, and health of the data in the file 
> system
>  * MapReduce - Capacity of the processing pipeline, and health of the 
> processing system

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
