Hello, list!

I'm happy to announce Zohmg, a data store for aggregation of multi-dimensional time series data built on top of Hadoop, Dumbo and HBase. Data is imported with a MapReduce job and exported through an HTTP API.
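
As a rough illustration of what aggregating multi-dimensional time series data involves, here is a minimal sketch in plain Python. It is not Zohmg's API and does not use Hadoop, Dumbo or HBase; the `aggregate` function, the event fields and the sample data are all hypothetical, chosen only to show the idea of counting events per time bucket and per dimension value.

```python
from collections import Counter


def aggregate(events, dimensions):
    """Count events per (day, dimension name, dimension value).

    Hypothetical helper for illustration only; not part of Zohmg.
    """
    counts = Counter()
    for event in events:
        # Use the "YYYY-MM-DD" prefix of an ISO 8601 timestamp as the time bucket.
        day = event["timestamp"][:10]
        for dim in dimensions:
            counts[(day, dim, event[dim])] += 1
    return counts


# Made-up pageview events with two dimensions: path and country.
events = [
    {"timestamp": "2009-07-01T10:00:00", "path": "/music", "country": "SE"},
    {"timestamp": "2009-07-01T11:30:00", "path": "/music", "country": "UK"},
    {"timestamp": "2009-07-02T09:15:00", "path": "/home", "country": "SE"},
]

counts = aggregate(events, ["path", "country"])
for key in sorted(counts):
    print(key, counts[key])
```

In a real deployment the counting would presumably happen in the import job and the totals would be read back through the HTTP API; the sketch only shows the shape of the aggregated data.
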

A typical use case for Zohmg is the analysis of Apache log files. An analyst would be interested in breaking down pageviews by path, user agent, country of origin, etc. In-house at Last.fm, we have successfully demo'd an installation that served access data in realtime for millions of paths broken down by several dimensions.

The README at http://github.com/zohmg/zohmg/blob/master/README contains an example project to get you started. You can browse the source at http://github.com/zohmg/zohmg/tree/master and get the release tarball from http://github.com/zohmg/zohmg/tarball/release-0.2.0

Zohmg works best on Hadoop 0.20.0 and HBase 0.20.0-alpha/trunk.

Help is available on IRC -- #zohmg on Freenode -- and at the user mailing list: http://groups.google.com/group/zohmg-user

We're happy to accept contributions! Fork the code, use the issue tracker at http://github.com/zohmg/zohmg/issues and join the dev mailing list: http://groups.google.com/group/zohmg-dev

Cheers,
Fredrik