On Thu, Jan 21, 2010 at 1:09 AM, Dan Washusen <d...@reactive.org> wrote:
> Have you read the bigtable paper linked off the front page of HBase? It
> does a good job of explaining the concepts. Basically it's a distributed
> sorted map (think java.util.NavigableMap but split over many machines). If
> you know the key of the row you are looking for, HBase can fetch it very
> quickly. If you don't know the key, you'll have to resort to scanning all
> the rows to find the data you are interested in (just like a SQL query that
> can't take advantage of an index)...
>
> Do the queries need to immediately reflect any writes, or is it sufficient
> for them to become eventually consistent? If you can live with eventual
> consistency, then you could write some map reduce jobs that duplicate a
> master table into reporting tables (like you would for data
> warehousing/reporting on an RDBMS).
>
> I'm sure some of the more experienced users will have more insight, but that
> might get you started...
>
> Cheers,
> Dan
>
> p.s. bold text doesn't seem to come through the mailing list...
>
> 2010/1/21 canucks <anh...@gmail.com>
>
>>
>> Hi,
>>
>> i'm pretty interested in learning hbase. what i want to do is store
>> financial data for analytical/graphing/displaying purposes. there are
>> hundreds of millions of rows and of course, i want fast response when
>> retrieving the data.
>>
>> if i were to do it in a RDBMS it would be
>> REPORT, MARKET, OPERATING_DATE, OPERATING_INTERVAL, HOUR_ENDING
>> VALUE
>> where the bolded column names are the PK. if i were to store this in
>> hbase would it look like this?
>>
>> REPORT.MARKET.OPERATING_DATE.OPERATING_INTERVAL.HOUR_ENDING.TIMESTAMP{
>>    VALUE: 92.29
>> }
>>
>> so that i can do queries like below:
>> - give me all reports with the name of "ABC"
>> - give me all the values where OPERATING_DATE is from jan-01-2010 to
>>   jan-10-2010
>> - give me all the values where OPERATING_DATE is from jan-01-2010 to
>>   jan-10-2010 and HOUR_ENDING is between 5 and 10 (or simply 5 or
>>   variations thereof)
>>
>> in short, is hbase the wrong way to go about it or would it yield better
>> performance? also, do you folks happen to know any good links/articles on
>> hbase table & schema design?
>>
>> thanks
>> --
>> View this message in context:
>> http://old.nabble.com/learning-hbase---schema-design-advice-tp27252203p27252203.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
>

I went looking for a paper along the lines of "how to convert my RDBMS
mindset to a key-value store mindset". Here is something that got me started:
http://s-expressions.com/2009/03/08/hbase-on-designing-schemas-for-column-oriented-data-stores/
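
To make the "sorted map" picture from Dan's reply a bit more concrete, here
is a rough sketch in plain Java. It does not use the HBase client at all: a
java.util.TreeMap stands in for a table, and subMap() stands in for a scan
between a start row and a stop row. The class name RowKeySketch, the helper
names, the market name "MKT1", the yyyyMMdd date format, the zero padding and
every value except 92.29 are my own assumptions for illustration, and the
TIMESTAMP part of the proposed key is left out. Treat it as a way to reason
about which of the three queries map onto a contiguous key range, not as a
recommended schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class RowKeySketch {
    // Assumed field separator; it must never occur inside a field value.
    static final String SEP = ".";

    // Builds REPORT.MARKET.OPERATING_DATE.OPERATING_INTERVAL.HOUR_ENDING.
    // Dates are written as yyyyMMdd and numbers are zero-padded so that
    // string order matches chronological/numeric order.
    static String rowKey(String report, String market, String date,
                         int interval, int hourEnding) {
        return String.join(SEP, report, market, date,
                String.format(Locale.ROOT, "%02d", interval),
                String.format(Locale.ROOT, "%02d", hourEnding));
    }

    // Range read over the sorted keys: start inclusive, stop exclusive.
    static NavigableMap<String, Double> scan(NavigableMap<String, Double> table,
                                             String startRow, String stopRow) {
        return table.subMap(startRow, true, stopRow, false);
    }

    // All rows whose key starts with the given prefix.
    static NavigableMap<String, Double> scanPrefix(NavigableMap<String, Double> table,
                                                   String prefix) {
        return scan(table, prefix, prefix + Character.MAX_VALUE);
    }

    // HOUR_ENDING is the last key field, so it is not a contiguous range;
    // it has to be checked row by row inside whatever range was scanned.
    static List<Double> filterByHourEnding(NavigableMap<String, Double> rows,
                                           int minHour, int maxHour) {
        List<Double> values = new ArrayList<>();
        for (Map.Entry<String, Double> e : rows.entrySet()) {
            String key = e.getKey();
            int hour = Integer.parseInt(key.substring(key.lastIndexOf(SEP) + 1));
            if (hour >= minHour && hour <= maxHour) {
                values.add(e.getValue());
            }
        }
        return values;
    }

    // Made-up sample rows; only 92.29 comes from the original question.
    static NavigableMap<String, Double> sampleTable() {
        NavigableMap<String, Double> table = new TreeMap<>();
        table.put(rowKey("ABC", "MKT1", "20100101", 1, 5), 92.29);
        table.put(rowKey("ABC", "MKT1", "20100105", 1, 12), 10.0);
        table.put(rowKey("ABC", "MKT1", "20100110", 1, 7), 20.0);
        table.put(rowKey("ABC", "MKT1", "20100115", 1, 6), 30.0);
        table.put(rowKey("XYZ", "MKT1", "20100103", 1, 5), 40.0);
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, Double> table = sampleTable();

        // Query 1: everything for report "ABC" is one prefix range.
        System.out.println(scanPrefix(table, "ABC."));

        // Query 2: jan-01-2010 to jan-10-2010, but only within one
        // REPORT.MARKET prefix; the stop row is the day after the range.
        NavigableMap<String, Double> tenDays =
                scan(table, "ABC.MKT1.20100101", "ABC.MKT1.20100111");
        System.out.println(tenDays);

        // Query 3: same range, then filter HOUR_ENDING between 5 and 10.
        System.out.println(filterByHourEnding(tenDays, 5, 10));
    }
}
```

The thing I took away from playing with this: with REPORT first in the key,
"all rows for report ABC" is a cheap range read, and a date range is cheap
only once REPORT and MARKET are fixed. A date range across all reports, or a
filter on HOUR_ENDING, means reading more rows than you return, which is the
"scanning" case Dan describes. If those queries matter most, a second table
with the fields in a different order (the reporting-table idea above) is one
way to go.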