Unless you need low-latency access to all of this time series data, it might be a more cost-efficient path to store large archives of the data in plain HDFS.
In a lot of cases the scanning can be done more efficiently with MapReduce + HDFS.

Some links:

OSCON-data presentation (good TVA story here):
http://www.slideshare.net/jpatanooga/oscon-data-2011-lumberyard
http://www.slideshare.net/cloudera/hadoop-as-the-platform-for-the-smartgrid-at-tva

Engineering literature:
http://openpdc.codeplex.com/

Josh

On Thu, May 17, 2012 at 7:23 PM, Rita <[email protected]> wrote:
> Hello,
>
> Currently, I am using HBase to store sensor data -- basically large time
> series data hitting close to 2 billion rows for one type of sensor. I was
> wondering how HBase differs from the HDF (http://www.hdfgroup.org/HDF5/)
> file format. Most of my operations are scanning a range and getting its
> values, but it seems I can achieve this using HDF. Does anyone have
> experience with this file container format who can shed some light?
>
> --
> --- Get your facts first, then you can distort them as you please. ---

--
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
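The range-scan workload described above boils down to a lexicographic key-range query over sorted row keys, which is exactly what HBase regions provide. Below is a minimal sketch of that idea in plain Python; the `sensorID + big-endian timestamp` row-key layout is an assumption for illustration, not the schema from the thread, and the sorted list merely stands in for HBase's sorted storage.

```python
import struct

def row_key(sensor_id: str, ts: int) -> bytes:
    # Big-endian encoding makes byte-wise key order match chronological
    # order, so a timestamp range becomes a contiguous key range.
    return sensor_id.encode() + b":" + struct.pack(">Q", ts)

def scan_range(sensor_id: str, start_ts: int, stop_ts: int):
    # Start key inclusive, stop key exclusive -- the usual scan convention.
    return row_key(sensor_id, start_ts), row_key(sensor_id, stop_ts)

# Toy in-memory "table", sorted by key like an HBase region.
table = sorted(row_key("s1", t) for t in [100, 200, 300, 400])

start, stop = scan_range("s1", 150, 350)
hits = [k for k in table if start <= k < stop]  # keys for ts 200 and 300
```

The same key-range trick is why an append-only archive of time-partitioned files in HDFS can serve these scans cheaply: a MapReduce job only has to read the partitions whose key range overlaps the query.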
