[ 
https://issues.apache.org/jira/browse/CHUKWA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321852#comment-14321852
 ] 

Eric Yang commented on CHUKWA-734:
----------------------------------

This would be really great to have.  My recommendation is to write a 
Gora writer class which extends PipelineWriter.  A timestamp or time 
partition is the primary element of a log file; however, it is not a good 
idea to use a monotonically increasing sequence as the row key in HBase or 
any other Bigtable-style database, because consecutive writes all land on 
the same region.  What would you recommend as the design for the primary 
key, and how could it ensure that load is spread evenly across the HBase 
region servers?  We have another JIRA, CHUKWA-667, which discusses the 
design of the row key.  I am not satisfied with the row key design that I 
outlined there.  Having Gora in the mix may enable some interesting 
optimizations.
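
To make the row key question concrete, here is a minimal sketch of one 
common way to avoid a monotonically increasing key: prefix the key with a 
small bucket number derived from the log source.  This is not the design 
from CHUKWA-667 and not an existing Chukwa or Gora API; the class name 
SaltedRowKey, the bucket count of 16, and the key layout are all 
assumptions made for illustration.

```java
public class SaltedRowKey {
    // Hypothetical bucket count; a real deployment would tune this
    // to the number of regions it wants writes spread across.
    static final int BUCKETS = 16;

    /**
     * Builds a key of the form "<bucket>-<source>-<timestamp>".
     * The bucket is derived from the source name, so rows from one
     * source share a prefix and remain sorted by time within it.
     */
    static String rowKey(String source, long timestampMillis) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(source.hashCode(), BUCKETS);
        // Zero-padding keeps lexicographic order equal to numeric order.
        return String.format("%02d-%s-%013d", bucket, source, timestampMillis);
    }

    public static void main(String[] args) {
        System.out.println(rowKey("host1.example.com", 1424000000000L));
        System.out.println(rowKey("host2.example.com", 1424000000000L));
    }
}
```

The trade-off in a layout like this is that a time-range query over all 
sources can no longer be a single contiguous scan; it has to be issued per 
source or per bucket prefix.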

> Gora Storage System for Chukwa Logs
> -----------------------------------
>
>                 Key: CHUKWA-734
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-734
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>    Affects Versions: 0.6.0
>            Reporter: Lewis John McGibbney
>             Fix For: 0.6.0
>
>
> I would like to build a Gora-backed log-to-datastore module for Chukwa. I am 
> going to work on this today.
> Gora is an in-memory data modeling and storage abstraction: 
> http://gora.apache.org
> Gora powers the Apache Nutch 2.X software, which generates a bunch of log 
> data. Having a Chukwa monitoring tool for Nutch would be grand.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
