[jira] [Commented] (CHUKWA-667) Optimize the HBase schema for Ganglia queries

Eric Yang (JIRA) Sun, 12 Apr 2015 11:48:56 -0700

    [ 
https://issues.apache.org/jira/browse/CHUKWA-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491643#comment-14491643
 ]


Eric Yang commented on CHUKWA-667:
----------------------------------

Hi Sreepathi,

Metrics for the whole day update the same row.  However, the row is just a
reference pointer to the actual data block, which reduces the number of
lookups needed to reach the data.  New cells are appended in memory and to the
WAL, then spilled to disk later, during flush and compaction.  This design
relieves the hotspot created by a monotonically increasing index.  Because we
partition by day, the regions will reach an optimal balance after 1 year of
running.  Partitioning by a number is better than partitioning by metric group
prefix: some metric groups contain more metrics than others, so a metric group
prefix can produce regions of uneven size.  For this reason, the design adds
the day as the prefix of the row key.
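
To make the day-prefix idea concrete, here is a minimal sketch of how such a
row key could be built.  The layout below (a 2-byte day of year followed by a
6-byte hash of metric and source) and the names row_key, metric, and source
are assumptions for illustration only; the comment above does not spell out
the actual byte layout used in the patch.

```python
import hashlib
import struct
from datetime import datetime, timezone


def row_key(timestamp_ms: int, metric: str, source: str) -> bytes:
    """Build a hypothetical day-prefixed row key (not Chukwa's actual layout)."""
    ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    day = ts.timetuple().tm_yday  # 1..366, the numeric partition
    # A fixed-width hash keeps the key length constant whatever the name length.
    digest = hashlib.md5(f"{metric}/{source}".encode("utf-8")).digest()[:6]
    return struct.pack(">H", day) + digest
```

With this layout, every sample of one metric taken on the same day maps to the
same row, and keys from different days sort into different key ranges, so the
write load is not pinned to a single ever-growing tail of the table.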

> Optimize the HBase schema for Ganglia queries
> ---------------------------------------------
>
>                 Key: CHUKWA-667
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-667
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Processors
>    Affects Versions: 0.6.0
>            Reporter: Saisai Shao
>             Fix For: 0.7.0
>
>         Attachments: CHUKWA-667.patch
>
>
> The Chukwa HBase table schema is designed for HICC; it cannot be fully
> adapted to the Ganglia web frontend for several reasons:
> (1) It cannot quickly retrieve all the cluster names and related host names.
> (2) System metrics have no attributes, such as type or unit, so it is hard
> to interpret the collected metrics in code.
> (3) It lacks a data consolidation function: choosing a metric over a large
> time range (like 30 days) fetches all the data to draw the graph, which
> greatly degrades performance.
> We will redesign the table schema so that it is better adapted to Ganglia
> web frontend queries.
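
As a rough illustration of the consolidation mentioned in point (3), the
sketch below averages raw samples into fixed-width time buckets so that a long
range can be drawn from far fewer points.  The function name consolidate, the
(timestamp_ms, value) tuple shape, and the choice of averaging are all
assumptions; the issue description does not say how consolidation would be
implemented.

```python
from collections import defaultdict


def consolidate(points, bucket_ms):
    """Average (timestamp_ms, value) points into fixed-width time buckets."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in points:
        bucket = ts - ts % bucket_ms  # start of the bucket this point falls in
        sums[bucket][0] += value
        sums[bucket][1] += 1
    # One (bucket_start, mean) pair per bucket, in time order.
    return [(b, total / n) for b, (total, n) in sorted(sums.items())]
```

For example, calling consolidate(points, 3600000) would reduce per-minute
samples to one averaged point per hour before graphing a 30-day range.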



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
