Any reason not to directly write to HBase instead of a log file that will end up in HBase anyway?

When I say log I don't necessarily mean a raw log file, but rather some metadata about the request: URI, IP, cookie, session, user, country, browser...
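For illustration, a per-request record like the one described above might be assembled as follows. This is only a sketch; the field names come from the message, but the function and structure are assumptions, not anything the thread specifies:

```python
import json

# Hypothetical helper: bundle the request metadata mentioned above
# (uri, ip, cookie, session, user, country, browser) into one record.
def make_request_record(uri, ip, cookie, session, user, country, browser):
    """Assemble per-request metadata into a dict that could be
    serialized into a log line or written as an HBase put."""
    return {
        "uri": uri,
        "ip": ip,
        "cookie": cookie,
        "session": session,
        "user": user,
        "country": country,
        "browser": browser,
    }

record = make_request_record("/index.html", "10.0.0.1", "c123", "s456",
                             "mark", "US", "Firefox")
print(json.dumps(record, sort_keys=True))
```

Whether such a record goes straight into HBase or into an intermediate log first is exactly the question being asked.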

On 7/26/11 12:19 PM, Stack wrote:
On Tue, Jul 26, 2011 at 7:39 AM, Mark <[email protected]> wrote:
So my first question is, would HBase fit our use case? If not
can anyone offer some advice on what would/should be used?

You mean HBase as the sink for your log emitters?

The pattern I usually see is that there is an intermediary, a Flume or
Scribe, pushing the logs up into HDFS; the log events are then
hoisted up into HBase to field queries [1, 2].

Assuming HBase does fit our use case does anyone have any suggestions on
what type of tables/columns would be needed?

All the metadata in one column family and all of the data in another,
though it sounds like yours is all small data, so one CF should do.
Figure out how you are going to query it.  That'll define your schema.
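To make the "your queries define your schema" point concrete, here is a hypothetical row-key sketch. Assuming the main query were "all requests for a country over a time range" (an assumption for illustration; the thread does not name a query), a composite key leading with country and a zero-padded timestamp keeps those rows adjacent so a single bounded scan answers it:

```python
# Hypothetical row-key layout: country#timestamp#uri.
def row_key(country, epoch_seconds, uri):
    """Compose a row key that sorts by country, then time, so a scan
    bounded by country + time range answers the main query directly."""
    # Zero-pad the timestamp so lexicographic order matches numeric order.
    return "%s#%010d#%s" % (country, epoch_seconds, uri)

k1 = row_key("US", 1311700000, "/index.html")
k2 = row_key("US", 1311700060, "/about.html")
assert k1 < k2  # later timestamp sorts after the earlier one
print(k1)
```

A different dominant query (say, per-user history) would call for a different leading component, which is the sense in which the query shapes the schema.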

St.Ack

1. This link describes the reporting UI for Facebook's analytics platform
2. Description of how the backend works:
http://nosql.mypopescu.com/post/3657671463/facebook-builds-hbase-based-real-time-analytics
