Also, check how balanced your region servers are across all the nodes.

On Sat, Dec 22, 2012 at 8:50 AM, Varun Sharma <[email protected]> wrote:
> Note that adding nodes will improve throughput and not latency. So, if your
> client application for benchmarking is single threaded, do not expect an
> improvement in number of reads per second by just adding nodes.
>
> On Sat, Dec 22, 2012 at 8:23 AM, Michael Segel <[email protected]> wrote:
>
> > I thought it was Doug Miel who said that HBase doesn't start to shine
> > until you had at least 5 nodes.
> > (Apologies if I misspelled Doug's name.)
> >
> > I happen to concur and if you want to start testing scalability, you will
> > want to build a bigger test rig.
> >
> > Just saying!
> >
> > Oh and you're going to have a hot spot on that row key.
> > Maybe do a hashed UUID?
> >
> > I would suggest that you consider the following:
> >
> > Create N number of rows... where N is a very large number of rows.
> > Then to generate your random access, do a full table scan to get the N
> > row keys into memory.
> > Using a random number generator, generate a random number and pop that
> > row off the stack so that the next iteration is between 1 and (N-1).
> > Do this 200K times.
> >
> > Now time your 200K random fetches.
> >
> > It would be interesting to see how it performs getting an average of a
> > 'couple' of runs... then increase the key space by an order of magnitude.
> > (Start w 1 million rows, 10 million rows, 100 million rows....)
> >
> > In theory... if properly tuned, one should expect near linear results.
> > That is to say the time it takes to get() a row across the data space
> > should be consistent. Although I wonder if you would have to somehow
> > clear the cache?
> >
> > Sorry, just a random thought...
> >
> > -Mike
> >
> > On Dec 22, 2012, at 10:06 AM, Ted Yu <[email protected]> wrote:
> >
> > > By '3 datanodes', did you mean that you also increased the number of
> > > region servers to 3?
> > >
> > > When your test was running, did you look at Web UI to see whether load
> > > was balanced? You can also use Ganglia for such purpose.
> > >
> > > What version of HBase are you using?
> > >
> > > Thanks
> > >
> > > On Sat, Dec 22, 2012 at 7:43 AM, Dalia Sobhy <[email protected]> wrote:
> > >
> > >> Dear all,
> > >>
> > >> I am testing a simple hbase application on a cluster of multiple nodes.
> > >>
> > >> I am especially testing the scalability performance, by measuring the
> > >> time taken for random reads
> > >>
> > >> Data size: 200,000 rows
> > >> Row key : 0,1,2 very simple row key incremental
> > >>
> > >> But I don't know why by increasing the cluster size, I see the same
> > >> time.
> > >>
> > >> For ex:
> > >> 2 Datanodes: 1000 random read: 1.757 sec
> > >> 3 datanodes: 1000 random read: 1.7 sec
> > >>
> > >> So any help plzzz ??
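
For anyone who wants to try the approach Mike outlines (load the row keys into memory, pick keys at random without repeats, time the fetches), here is a rough sketch in Python. It is only an illustration under several assumptions: `fetch` is a hypothetical callable standing in for whatever client get() you use, `random.sample` replaces the pop-off-the-stack loop, and the 8-character MD5 prefix in `hashed_key` is just one possible way to spread sequential keys, not something specified in the thread. The `threads` parameter is there because of Varun's point that a single-threaded client measures latency, not cluster throughput.

```python
import hashlib
import random
import time
from concurrent.futures import ThreadPoolExecutor


def hashed_key(raw_key):
    """Prefix a row key with a short hash so sequential keys spread out.

    Illustrative only: the 8-character MD5 prefix is an assumption,
    not a scheme taken from the thread.
    """
    raw = str(raw_key)
    prefix = hashlib.md5(raw.encode("utf-8")).hexdigest()[:8]
    return prefix + "-" + raw


def pick_keys(all_keys, count, seed=None):
    """Choose `count` distinct keys at random (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(list(all_keys), count)


def time_random_reads(fetch, keys, threads=1):
    """Call fetch(key) once per key and return the elapsed seconds.

    `fetch` is a hypothetical callable wrapping your client's get();
    with threads > 1 the calls are issued concurrently.
    """
    start = time.perf_counter()
    if threads == 1:
        for key in keys:
            fetch(key)
    else:
        with ThreadPoolExecutor(max_workers=threads) as pool:
            # list() forces every fetch to finish before the clock stops.
            list(pool.map(fetch, keys))
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in table: an in-memory dict instead of a real HBase cluster.
    table = {hashed_key(i): "value-%d" % i for i in range(200_000)}
    keys = pick_keys(table.keys(), 1000, seed=42)
    for n_threads in (1, 8):
        elapsed = time_random_reads(table.__getitem__, keys, threads=n_threads)
        print("%d thread(s): %d reads in %.4f sec" % (n_threads, len(keys), elapsed))
```

The dict lookup here says nothing about cluster behaviour; to get meaningful numbers, swap `table.__getitem__` for a function that performs a real get() against your table, then repeat the run with different thread counts and key-space sizes.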
