[ https://issues.apache.org/jira/browse/CHUKWA-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977087#action_12977087 ]
Ari Rabkin commented on CHUKWA-575:
-----------------------------------
Tried this, got errors.
I started with a clean HBase, let it collect metrics from the default adaptors
for a bit. Ran the script manually. The Pig-spawned tasks all fail. I got the
following on the Reduce side:
java.io.IOException: java.lang.IllegalArgumentException: Row key is invalid
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:438)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.IllegalArgumentException: Row key is invalid
    at org.apache.hadoop.hbase.client.Put.<init>(Put.java:79)
    at org.apache.hadoop.hbase.client.Put.<init>(Put.java:69)
    at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:355)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:436)
    ... 7 more
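
The exception is thrown from the Put constructor, called by HBaseStorage.putNext. One plausible reading (not confirmed here) is that the field used as the row key was null, or too long, for at least one tuple. The sketch below is not HBase code; it is a standalone illustration of the kind of check assumed to sit behind "Row key is invalid". The class RowKeyCheck, its methods, and the assumption that the limit equals Short.MAX_VALUE (what HConstants.MAX_ROW_LENGTH is believed to be) are all hypothetical and should be checked against the HBase version in use.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: a stand-in for the validation assumed to produce
// "Row key is invalid". None of these names come from HBase or Pig.
final class RowKeyCheck {
    // Assumption: mirrors HConstants.MAX_ROW_LENGTH, believed to be Short.MAX_VALUE.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // A key is accepted only if it is non-null and no longer than the limit.
    static boolean isValidRowKey(byte[] row) {
        return row != null && row.length <= MAX_ROW_LENGTH;
    }

    // Converts a tuple field to key bytes; a null field yields a null key.
    static byte[] toRowKey(Object field) {
        return field == null ? null : field.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A made-up key value, followed by the null case suspected above.
        System.out.println(isValidRowKey(toRowKey("1293840000000-host1")));
        System.out.println(isValidRowKey(toRowKey(null)));
    }
}
```

If this reading is right, filtering out tuples whose key field is null before the store step would avoid the exception, though it would not explain why such tuples appear.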
------
Is it possible to run the scripts in local mode for debugging, and have them
still pull data from HBase? How do I configure that? I tried a bunch of things
and got nowhere.
> Cluster Summarization script
> ----------------------------
>
> Key: CHUKWA-575
> URL: https://issues.apache.org/jira/browse/CHUKWA-575
> Project: Chukwa
> Issue Type: New Feature
> Components: scripts
> Environment: Java 6, Mac OS X 10.6
> Reporter: Eric Yang
> Assignee: Eric Yang
> Fix For: 0.5.0
>
> Attachments: CHUKWA-575.patch
>
>
> Chukwa records metrics from the name node, data node, job tracker, task tracker,
> etc., but the raw metrics do not help determine all aspects of cluster
> health. For now, we have the following metrics in HBase:
> * System
> ** Disk
> ** Memory
> ** Network
> * HDFS
> ** Name Node
> ** Data Node
> * Map Reduce
> ** Job Tracker
> ** Task Tracker
> We can further analyze the data to provide a summary for the cluster in these
> categories:
> * System - Performance profile of how busy the nodes are in the cluster
> * HDFS - Capacity of the disk storage, and health of the data in the file
> system
> * MapReduce - Capacity of the processing pipeline, and health of the
> processing system