Related to this would be HBASE-12790, where we would do round-robin scheduling, which also helps the shorter scans get a time slice in the execution cycle.
On Thu, May 14, 2015 at 7:12 AM, Matteo Bertozzi <[email protected]> wrote:

> @nick we have already something like that, which is HBASE-10993, and it is
> basically reordering requests based on how many scan.next calls you did.
> (see the picture)
> http://blog.cloudera.com/wp-content/uploads/2014/11/hbase-multi-f2.png
> The problem is that we can't eject requests already in execution, and we
> are not aggressive enough about removing requests from the queue and
> sending a retry to the client in case someone with more priority comes in.
>
> Matteo
>
> On Wed, May 13, 2015 at 6:38 PM, Nick Dimiduk <[email protected]> wrote:
>
>> I guess what I'm thinking of is more about scheduling than
>> quota/throttling. I don't want my online requests to sit in a queue behind
>> MR requests while the MR work builds up to its quota amount. I want a
>> scheduler to do time-slicing of operations, with preferential treatment
>> given to online work over long-scan ("analytical") work. For example, all
>> scan RPCs "known" to cover "lots" of Cells get de-prioritized vs. gets and
>> short scans. Maybe this is synthesized with an RPC annotation marking it
>> as "long" vs. "short" -- MR scans are marked "long". I'm not sure, and I
>> need to look more closely at recent scan improvements. IIRC, there's a
>> heartbeat now, which maybe is a general mechanism allowing long operations
>> to not stomp on short operations. Heartbeat implies the long-running scan
>> is coming up for air from time to time, allowing itself to be interrupted
>> and defer to higher-priority work. This isn't preemption, but it does
>> allow for an upper bound on how long the next queued task waits.
>>
>> On Wed, May 13, 2015 at 6:11 PM, Matteo Bertozzi <[email protected]> wrote:
>>
>>> @nick what would you like to have? A match on a Job ID or something like
>>> that? Currently only user/table/namespace are supported, but group
>>> support can be easily added. Not sure about a job-id or job-name, since
>>> we don't have that info on the scan.
>>>
>>> On Wed, May 13, 2015 at 6:04 PM, Nick Dimiduk <[email protected]> wrote:
>>>
>>>> Sorry. Yeah, sure, I can ask over there.
>>>>
>>>>> The throttle was set by user in these tests. You cannot directly
>>>>> throttle a specific job, but you do have the option to set the throttle
>>>>> for a table or a namespace. That might be sufficient for you to
>>>>> achieve your objective (unless those jobs are run by one user and
>>>>> access the same table.)
>>>>
>>>> Maybe running as different users is the key, but this seems like a very
>>>> important use-case to support -- folks doing aggregate analysis
>>>> concurrently on an online table.
>>>>
>>>> On Wed, May 13, 2015 at 5:53 PM, Stack <[email protected]> wrote:
>>>>
>>>>> Should we add your comments to the blog, Govind, i.e. the answers to
>>>>> Nick's questions?
>>>>> St.Ack
>>>>>
>>>>> On Wed, May 13, 2015 at 5:48 PM, Govind Kamat <[email protected]> wrote:
>>>>>
>>>>>>> This is a great demonstration of these new features, thanks for
>>>>>>> pointing it out Stack.
>>>>>>>
>>>>>>> I'm curious: at what percentiles are these latencies reported? Does
>>>>>>> the non-throttled user see significant latency improvements in the
>>>>>>> 95, 99pct when the competing, scanning users are throttled? MB/s and
>>>>>>> req/s are managed at the region level? Region server level?
>>>>>>> Aggregate?
>>>>>>
>>>>>> The latencies reported in the post are average latencies.
>>>>>>
>>>>>> Yes, the non-throttled user sees an across-the-board improvement in
>>>>>> the 95th and 99th percentiles, in addition to the improvement in
>>>>>> average latency. The extent of improvement is significant as well,
>>>>>> but varies with the throttle pressure, just as in the case of the
>>>>>> average latencies.
>>>>>>
>>>>>> The total throughput numbers (req/s) are aggregate numbers reported
>>>>>> by the YCSB client.
>>>>>>
>>>>>>> These throttle points are by user? Is there a way for us to say "all
>>>>>>> MR jobs are lower priority than online queries"?
>>>>>>
>>>>>> The throttle was set by user in these tests. You cannot directly
>>>>>> throttle a specific job, but you do have the option to set the
>>>>>> throttle for a table or a namespace. That might be sufficient for you
>>>>>> to achieve your objective (unless those jobs are run by one user and
>>>>>> access the same table.)
>>>>>>
>>>>>> Govind
>>>>>>
>>>>>>> Thanks,
>>>>>>> Nick
>>>>>>>
>>>>>>> On Tue, May 12, 2015 at 1:58 PM, Stack <[email protected]> wrote:
>>>>>>>
>>>>>>>> .. by our Govind.
>>>>>>>>
>>>>>>>> See here:
>>>>>>>> https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature
>>>>>>>>
>>>>>>>> St.Ack
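For reference, the per-user/table/namespace throttles discussed above are configured from the HBase shell with the `set_quota` command introduced by the request-throttling feature (HBASE-11598). A minimal sketch; the user, table, and namespace names and the limit values here are illustrative, not from the thread:

```
# Throttle a single user to 10 MB/sec across the cluster
set_quota TYPE => THROTTLE, USER => 'mr_batch_user', LIMIT => '10M/sec'

# Throttle all access to a table to 100 requests/sec
set_quota TYPE => THROTTLE, TABLE => 'usertable', LIMIT => '100req/sec'

# Throttle an entire namespace
set_quota TYPE => THROTTLE, NAMESPACE => 'analytics', LIMIT => '50M/sec'

# Remove a throttle
set_quota TYPE => THROTTLE, USER => 'mr_batch_user', LIMIT => NONE
```

As Matteo notes, only user/table/namespace matching is supported; there is no way to target a specific MR job ID, so separating batch and online workloads by user (or namespace) is the practical workaround.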
