[
https://issues.apache.org/jira/browse/GORA-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339073#comment-14339073
]
Lewis John McGibbney commented on GORA-413:
-------------------------------------------
I suggest that we start with HBase and move on to other datastores once we
have conquered this one.
> Support creation of dynamic columns within Gora datastore mapping designs
> -------------------------------------------------------------------------
>
> Key: GORA-413
> URL: https://issues.apache.org/jira/browse/GORA-413
> Project: Apache Gora
> Issue Type: New Feature
> Components: gora-hbase
> Affects Versions: 0.6
> Reporter: Lewis John McGibbney
> Fix For: 0.7
>
>
> The conversation on [dynamically generating HBase
> columns|http://www.mail-archive.com/dev%40gora.apache.org/msg05754.html] has
> shown that new functionality needs to be added in order to achieve
> this.
> The main driver for this issue is that Chukwa logs need to dynamically
> create many columns over time, in direct proportion to the number of data
> chunks we get. Each data chunk has a [Sequence ID], and this Sequence ID
> should be the column name.
> The table design will look like this:
> {code}
> Row Key: [Invert Date]:[Data Type]:[Primary Key]
> Column Family: log
> Column Name: [Sequence ID]
> Timestamp: [log entry timestamp]
> Example:
> Row Key: 2132013102:TT:host1.example.com
> Column Family: log
> Column Name: 1230
> Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
> Timestamp: 1358942490
> {code}
> The inverted date allows the table to be partitioned more easily by hour,
> day of the month, or month.
> Using the column name for consecutive sequence IDs allows fast retrieval in
> a linear scan. This format is typically good for quickly retrieving an
> hour's worth of logs for a node. Hence, if we batch scan the table in a
> rolling window via a MapReduce job at every hourly interval, the workload is
> spread evenly across multiple MapReduce tasks.
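
For illustration only, here is a minimal Python sketch of how one log entry could be laid out under the table design quoted above. It is not Gora or HBase API code: `make_row_key` and `make_cell` are hypothetical helper names, the inverted date is passed in as an already computed string because the issue does not spell out how it is derived, and the cell timestamp is assumed to be epoch seconds in UTC (which matches the example values in the description).

```python
from datetime import datetime, timezone

# Column family name taken from the table design in the issue description.
COLUMN_FAMILY = "log"


def make_row_key(inverted_date: str, data_type: str, primary_key: str) -> str:
    """Join the three row key parts as [Invert Date]:[Data Type]:[Primary Key].

    The inverted date is treated as an opaque string here (assumption: the
    caller computes it; the issue does not define the exact inversion).
    """
    return ":".join((inverted_date, data_type, primary_key))


def make_cell(sequence_id: int, entry_time: datetime, line: str) -> dict:
    """Describe one dynamic column: the qualifier is the chunk's Sequence ID.

    entry_time must be timezone-aware; the timestamp is stored as epoch
    seconds (assumption based on the example in the issue).
    """
    return {
        "family": COLUMN_FAMILY,
        "qualifier": str(sequence_id),
        "timestamp": int(entry_time.timestamp()),
        "value": line,
    }


# Values copied from the example in the issue description.
row_key = make_row_key("2132013102", "TT", "host1.example.com")
cell = make_cell(
    1230,
    datetime(2013, 1, 23, 12, 1, 30, tzinfo=timezone.utc),
    "2013-01-23 12:01:30 INFO This is a log entry.",
)
```

Since each new Sequence ID becomes a new qualifier under the `log` family, the set of columns cannot be listed up front in a static mapping file, which is the gap this issue is about.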