[jira] [Created] (GORA-413) Support creation of dynamic columns within Gora datastore mapping designs

Lewis John McGibbney (JIRA) Thu, 26 Feb 2015 12:07:33 -0800

Lewis John McGibbney created GORA-413:
-----------------------------------------


             Summary: Support creation of dynamic columns within Gora datastore 
mapping designs
                 Key: GORA-413
                 URL: https://issues.apache.org/jira/browse/GORA-413
             Project: Apache Gora
          Issue Type: New Feature
          Components: gora-hbase
    Affects Versions: 0.6
            Reporter: Lewis John McGibbney
             Fix For: 0.7


The discussion on [dynamically generating HBase 
columns|http://www.mail-archive.com/dev%40gora.apache.org/msg05754.html] has 
shown that new functionality needs to be added to Gora in order to support 
this.
The main driver for this issue is that Chukwa logs need to create many columns 
dynamically over time, with the number of columns depending directly on the 
number of data chunks we get. Each data chunk has a [Sequence ID], and this 
sequence ID should be the column name.

The table design will look like this:

{code}

Row Key: [Invert Date]:[Data Type]:[Primary Key]
Column Family: log
Column Name: [Sequence ID]
Timestamp: [log entry timestamp]

Example:

Row Key: 2132013102:TT:host1.example.com
Column Family: log
Column Name: 1230
Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
Timestamp: 1358942490
{code}
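
As a rough illustration of the layout above, the pieces of one cell could be assembled as follows. This is only a sketch: the class {{LogCellSketch}} and its fields are hypothetical names, not part of Gora or HBase, and the inverted date is taken as an already-computed string because the inversion scheme is not spelled out here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Gora or HBase class): holds the parts of one log cell.
final class LogCellSketch {
    final String rowKey;          // [Invert Date]:[Data Type]:[Primary Key]
    final String family = "log";  // fixed column family from the design above
    final String qualifier;       // column name, derived from the sequence ID
    final String value;           // the log line itself
    final long timestamp;         // log entry timestamp

    LogCellSketch(String invertedDate, String dataType, String primaryKey,
                  long sequenceId, String value, long timestamp) {
        // The inverted date is passed in as-is; how it is computed is left open.
        this.rowKey = invertedDate + ":" + dataType + ":" + primaryKey;
        this.qualifier = Long.toString(sequenceId);
        this.value = value;
        this.timestamp = timestamp;
    }

    // Column names are stored as bytes, so expose the qualifier in that form too.
    byte[] qualifierBytes() {
        return qualifier.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        LogCellSketch cell = new LogCellSketch(
                "2132013102", "TT", "host1.example.com", 1230L,
                "2013-01-23 12:01:30 INFO This is a log entry.", 1358942490L);
        System.out.println(cell.rowKey + " " + cell.family + ":" + cell.qualifier);
    }
}
```

The point of the sketch is that the column name is only known once a data chunk arrives, so it cannot be listed ahead of time in a static field-to-column mapping.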

The inverted date allows the table to be partitioned more easily by hour, by 
day of the month, or by month.
Using the column name for consecutive sequence IDs allows fast retrieval in a 
linear scan. This format is typically good for quickly retrieving an hour's 
worth of logs for a node. Hence, if we batch-scan the table in a rolling 
window via a MapReduce job at every hourly interval, we get an even spread of 
the workload across multiple MapReduce tasks.
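
The linear scan over consecutive sequence IDs can be modelled in memory as below. This is a sketch under assumptions, not HBase client code: {{SequenceScanSketch}} is a hypothetical class, and a sorted map stands in for the cells of one row. HBase orders column qualifiers byte-wise, so the sketch zero-pads the sequence ID to keep numeric and lexicographic order the same; that encoding is an assumption and not part of the design above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of one row: qualifier -> log line, kept sorted.
final class SequenceScanSketch {
    private final NavigableMap<String, String> cells = new TreeMap<>();

    // Assumed encoding: fixed-width, zero-padded decimal for non-negative IDs,
    // so that "9" sorts before "1230" just as the numbers do.
    static String qualifier(long sequenceId) {
        return String.format(Locale.ROOT, "%019d", sequenceId);
    }

    void put(long sequenceId, String logLine) {
        cells.put(qualifier(sequenceId), logLine);
    }

    // Returns log lines whose sequence ID lies in [from, to], in sequence order.
    List<String> scan(long from, long to) {
        return new ArrayList<>(
                cells.subMap(qualifier(from), true, qualifier(to), true).values());
    }

    public static void main(String[] args) {
        SequenceScanSketch row = new SequenceScanSketch();
        row.put(1230, "entry a");
        row.put(9, "entry b");
        row.put(1231, "entry c");
        System.out.println(row.scan(1000, 2000));
    }
}
```

Because the entries are already stored in sequence order, a range read is a single contiguous walk rather than a series of lookups.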



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
