Sorry, I meant the 8th percentile: 8% of requests complete in 3 ms or less.
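To make the two quantities in this thread concrete, here is a minimal Python sketch. It assumes (per Stack's answer quoted below) that a scan visits regions one after another, so per-region costs add up, and it uses the nearest-rank definition of a percentile, which is only one of several common definitions. The function names and all the numbers are made up for illustration; none of this is HBase API or measured data.

```python
import math

def scan_latency_ms(per_region_ms):
    # Regions are scanned one after another, so the costs add up
    # (it is a sum, not max(region request latency)).
    return sum(per_region_ms)

def fraction_at_or_below(samples_ms, threshold_ms):
    # Share of requests that completed within the threshold.
    return sum(1 for s in samples_ms if s <= threshold_ms) / len(samples_ms)

def percentile(samples_ms, p):
    # Nearest-rank percentile: the smallest sample such that at least
    # p percent of all samples are at or below it.
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical figures, for illustration only.
one_region = scan_latency_ms([2.0])
six_regions = scan_latency_ms([2.0] * 6)
samples = [2.0, 2.5, 3.0, 4.0, 9.0, 12.0, 15.0, 20.0, 40.0, 95.0]

print(one_region, six_regions)
print(fraction_at_or_below(samples, 3.0))
print(percentile(samples, 50), percentile(samples, 90))
```

Saying "8% of requests land at 3 ms or less" is the fraction_at_or_below reading; asking "what is the 98th percentile" is the percentile reading. Both come from the same sorted list of latencies.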
On Wed, Apr 20, 2011 at 12:06 PM, Ted Dunning <[email protected]> wrote:
> What is meant by 8% quartile? 75th %-ile? 98%-ile? Should quartile have
> been quantile?
>
> On Wed, Apr 20, 2011 at 12:00 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> Ok actually we do have 1 region for these exact tables... so back to
>> square one.
>>
>> FWIW i do get 8% quartile under 3ms TTLB. So it is algorithmically
>> sound it seems. question is why outliers spread is so much longer than
>> in tests on one machine. must be network. What else.
>>
>>
>> On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov <[email protected]> wrote:
>> > Got it. This must be the reason. Cause it is a laugh check, and i do
>> > see 6 regions for 40 rows so it can span them, although i can't
>> > confirm it for sure. It may be due to how table was set up or due to
>> > some time running them and rotating some data there. The uniformly
>> > distributed hashes are used for the keys so that it is totally
>> > plausible 40 rows will get into 6 different regions.
>> >
>> > Ok i'll take it for working theory for now.
>> >
>> > Is there a way to set max # of regions per table? I guess the method
>> > in the manual is to set max region size. Which means i probably need
>> > to re-create the table with one region to get back to 1 region? or
>> > maybe there's a way to get it back to one region without recreating
>> > it, such as major compaction?
>> >
>> > thanks.
>> > -d
>> >
>> > On Wed, Apr 20, 2011 at 9:55 AM, Stack <[email protected]> wrote:
>> >> On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>> Ok. Let me ask a question.
>> >>>
>> >>> When scan is performed and it obviously covers several regions, are
>> >>> scan performance calls done in sinchronous succession or they are done
>> >>> in parallel?
>> >>>
>> >>
>> >> The former.
>> >>
>> >>
>> >>> Assuming scan is returning 40 results but for some weird reason it
>> >>> goes to 6 regions and caching is set to 100 (so it can take all of
>> >>> them) are individual region request latencies summed or it would be
>> >>> max(region request latency)?
>> >>>
>> >>
>> >> Summed.
>> >>
>> >> The 40 rows are not contiguous in the same region? If not, the cost
>> >> of client setting up new scanner against next region will be inline w/
>> >> your read timing (at least an rpc per region).
>> >>
>> >> St.Ack
>> >>
>> >>> Thank you very much.
>> >>> -D
>> >>>
>> >>> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <[email protected]> wrote:
>> >>>> For a tiny test like this, everything should be in memory and latency
>> >>>> should be very low.
>> >>>>
>> >>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>>> PS so what should latency be for reads in 0.90, assuming moderate thruput?
>> >>>>>
>> >>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>>>> for this test, there's just no more than 40 rows in every given table.
>> >>>>>> This is just a laugh check.
>> >>>>>>
>> >>>>>> so i think it's safe to assume it all goes to same region server.
>> >>>>>>
>> >>>>>> But latency would not depend on which server call is going to, would
>> >>>>>> it? Only throughput would, assuming we are not overloading.
>> >>>>>>
>> >>>>>> And we clearly are not as my single-node local version runs quite ok
>> >>>>>> response times with the same throughput.
>> >>>>>>
>> >>>>>> It's something with either client connections or network latency or
>> >>>>>> ... i don't know what it is. I did not set up the cluster but i gotta
>> >>>>>> troubleshoot it now :)
>> >>>>>>
>> >>>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <[email protected]> wrote:
>> >>>>>>> How many regions? How are they distributed?
>> >>>>>>>
>> >>>>>>> Typically it is good to fill the table some what and then drive some
>> >>>>>>> splits and balance operations via the shell. One more split to make
>> >>>>>>> the regions be local and you should be good to go. Make sure you have
>> >>>>>>> enough keys in the table to support these splits, of course.
>> >>>>>>>
>> >>>>>>> Under load, you can look at the hbase home page to see how
>> >>>>>>> transactions are spread around your cluster. Without splits and local
>> >>>>>>> region files, you aren't going to see what you want in terms of
>> >>>>>>> performance.
