Hello, list!

I'm happy to announce Zohmg, a data store for aggregation of multi-dimensional
time series data built on top of Hadoop, Dumbo and HBase. Data is imported
with a MapReduce job and exported through an HTTP API.

A typical use-case for Zohmg is the analysis of Apache log files. The analyst
would be interested in breaking down pageviews by path, user agent, country of
origin, etc. In-house at Last.fm, we have successfully demo'd an installation
that served access data in real time for millions of paths broken down by
several dimensions.
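
To make the use case concrete, here is a minimal sketch in plain Python of
the kind of breakdown described above: counting pageviews per day for a chosen
set of dimensions. It does not use Zohmg's API or Hadoop at all; the function
name aggregate_pageviews and the event fields (time, path, agent, country) are
hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import datetime


def aggregate_pageviews(events, dimensions):
    """Count pageviews per (day, dimension values...) tuple.

    `events` is an iterable of dicts; `dimensions` names the fields to
    break the counts down by. Both are illustrative assumptions, not
    part of Zohmg.
    """
    counts = Counter()
    for event in events:
        day = event["time"].strftime("%Y-%m-%d")
        key = (day,) + tuple(event[d] for d in dimensions)
        counts[key] += 1
    return counts


# Hypothetical parsed log entries.
events = [
    {"time": datetime(2009, 7, 1, 12, 0), "path": "/music",
     "agent": "firefox", "country": "SE"},
    {"time": datetime(2009, 7, 1, 12, 5), "path": "/music",
     "agent": "safari", "country": "SE"},
    {"time": datetime(2009, 7, 2, 9, 30), "path": "/music",
     "agent": "firefox", "country": "US"},
]

by_path_country = aggregate_pageviews(events, ["path", "country"])
by_agent = aggregate_pageviews(events, ["agent"])
```

The same events can be broken down by any subset of dimensions simply by
changing the list passed in, which is the sort of query an analyst would
typically want to run against the aggregated data.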

The README at http://github.com/zohmg/zohmg/blob/master/README contains an
example project to get you started.

You can browse the source at http://github.com/zohmg/zohmg/tree/master and get
the release tarball from http://github.com/zohmg/zohmg/tarball/release-0.2.0

Zohmg works best on Hadoop 0.20.0 and HBase 0.20.0-alpha/trunk.

Help is available on IRC -- #zohmg on Freenode -- and at the user mailing
list: http://groups.google.com/group/zohmg-user

We're happy to accept contributions! Fork the code, use the issue tracker at
http://github.com/zohmg/zohmg/issues and join the dev mailing list:
http://groups.google.com/group/zohmg-dev

Cheers,
Fredrik
