The fastest way to get 500-1000 records is to organize them (via their rowkey) so that they sort together, and then use a single scan operation with scan caching enabled. I'd be surprised if that takes more than 100ms or so, and that time won't change as your data volume grows.
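Something like the following sketch, against the old HTable client API; the table name and key prefix here are made up for illustration, and the point is just the contiguous key range plus setCaching:

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "events");          // hypothetical table

        // All records for one request share a rowkey prefix, so they are
        // stored contiguously and a single scan picks them all up.
        byte[] startRow = Bytes.toBytes("request-12345|");  // hypothetical prefix
        byte[] stopRow = Arrays.copyOf(startRow, startRow.length);
        stopRow[stopRow.length - 1]++;                      // just past the prefix

        Scan scan = new Scan(startRow, stopRow);
        scan.setCaching(1000);  // fetch up to 1000 rows per RPC (default is 1)

        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result r : scanner) {
                // process r ...
            }
        } finally {
            scanner.close();
            table.close();
        }
    }
}

With caching left at the default of 1, a scan of 1000 rows costs 1000 round trips to the region server; with setCaching(1000) it costs one.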
Also, how big was the dataset you tested with? After a year you'll have 29bn records (assuming 80m/day). With 50m records you do not need to bother with HBase.

________________________________
From: Oleg Ruchovets <[email protected]>
To: [email protected]
Sent: Friday, March 2, 2012 11:37 AM
Subject: hbase vs mysql read benchmark

Hi. We are going to write 50-80 million records to hbase on a daily basis. Write performance is actually not critical. We are going to read a relatively small amount of data, 500-1000 records per request. We use the multiGet API (our assumption is that this is the fastest way to get 500-1000 records; is that correct?). Running a benchmark we got these results: reading from hbase takes 2 seconds, while reading the same records from MySQL took 1 second. The difference is 100%. The question is: what is the way to speed up hbase read operations? Thanks in advance. Oleg.
