[jira] Commented: (CHUKWA-22) Need index for chukwa sequence files

Eric Yang (JIRA) Sat, 18 Jul 2009 20:29:36 -0700

    [ 
https://issues.apache.org/jira/browse/CHUKWA-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732961#action_12732961
 ]


Eric Yang commented on CHUKWA-22:
---------------------------------

Building an index file would not be sufficient to serve Chukwa data straight 
from HDFS over the long term.  The cost of keeping the index in memory will 
eventually require yet another distributed system to manage the index files.  
Instead of reinventing the wheel, Chukwa should adopt a Bigtable-like solution 
such as HBase to manage the data regions.

The mapreduce-to-hbase example (http://wiki.apache.org/hadoop/Hbase/MapReduce) 
looks like exactly what Chukwa needs.  An HBase table schema for Chukwa could 
look like this:

Table: SystemMetrics-[TimeType]
Column Family: cpu
Column Family: memory
Column Family: disk
Column Family: temperature
Column Family: network
Column Family: default
Column Family: log

Each row represents a 1-minute average, a 5-minute average, etc., as 
determined by the table's time type.
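
The row layout above can be sketched as follows.  This is a minimal 
illustration in plain Python, not Chukwa or HBase code: the time type names 
("1min", "5min"), the use of the bucket start time as the row key, and the 
sample values are all assumptions made for the example.

```python
from collections import defaultdict

# Assumed bucket widths, in seconds, for each hypothetical time type.
BUCKET_SECONDS = {"1min": 60, "5min": 300}


def row_key(timestamp, time_type):
    """Round a Unix timestamp down to the start of its time bucket."""
    width = BUCKET_SECONDS[time_type]
    return timestamp - (timestamp % width)


def average_by_row(samples, time_type):
    """Average all (timestamp, value) samples that land in the same row."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[row_key(timestamp, time_type)].append(value)
    return {key: sum(values) / len(values) for key, values in buckets.items()}


# Made-up samples: (seconds since epoch, metric value).
samples = [(0, 10.0), (30, 20.0), (60, 30.0), (299, 40.0), (300, 50.0)]
one_minute = average_by_row(samples, "1min")
five_minute = average_by_row(samples, "5min")
```

The same samples produce more rows in the 1-minute table than in the 5-minute 
table, which is why the time type is part of the table name in this sketch.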

Example columns could be: idle:hostname1, busy:hostname1, idle:hostname2, 
busy:hostname2
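
One possible reading of the column example is that each cell is addressed by 
a column family plus a "metric:host" name.  The sketch below models that with 
a nested dict standing in for the table; it does not use the HBase client 
API, and the helper names (put, get_host), the row key, and the values are 
hypothetical.

```python
def put(table, row, family, metric, host, value):
    """Store one value under row -> family -> 'metric:host'."""
    qualifier = f"{metric}:{host}"
    table.setdefault(row, {}).setdefault(family, {})[qualifier] = value


def get_host(table, row, family, host):
    """Return {metric: value} for one host within a row and column family."""
    cells = table.get(row, {}).get(family, {})
    result = {}
    for qualifier, value in cells.items():
        metric, _, cell_host = qualifier.partition(":")
        if cell_host == host:
            result[metric] = value
    return result


# Illustrative data only; the row key format is an assumption.
table = {}
put(table, "2009-07-18T20:29", "cpu", "idle", "hostname1", 92.5)
put(table, "2009-07-18T20:29", "cpu", "busy", "hostname1", 7.5)
put(table, "2009-07-18T20:29", "cpu", "idle", "hostname2", 60.0)
put(table, "2009-07-18T20:29", "cpu", "busy", "hostname2", 40.0)

host1_cpu = get_host(table, "2009-07-18T20:29", "cpu", "hostname1")
```

With this layout, all hosts for a given time bucket share one row, so a 
single row read returns the cluster-wide view for that interval.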

The log column family keeps the raw log entries for log viewing.


> Need index for chukwa sequence files
> ------------------------------------
>
>                 Key: CHUKWA-22
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-22
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: Data Processors
>         Environment: Redhat EL 5.1 and Java 6
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>
> Chukwa has the ability to collect large volumes of data, but the lack of an 
> index prevents the Chukwa front end from serving data straight from HDFS.  
> This JIRA is the placeholder for designing an indexing service for Chukwa.  
> The plan is to create an indexing service based on available software like 
> Lucene or Katta.

