I guess what I'm thinking of is more about scheduling than
quota/throttling. I don't want my online requests to sit in a queue behind
MR requests while the MR work builds up to its quota. I want a scheduler
to do time-slicing of operations, with preferential treatment given to
online work over long scan ("analytical") work. For example, all scan
RPCs "known" to cover "lots" of Cells get de-prioritized vs. gets and
short scans. Maybe this is expressed with an RPC annotation marking an
operation as "long" vs "short" -- MR scans are marked "long". I'm not
sure, and I need to look more closely at the recent scan improvements.
IIRC, there's a heartbeat now, which may be a general mechanism that
keeps long operations from stomping on short ones. A heartbeat implies
the long-running scan comes up for air from time to time, allowing itself
to be interrupted and to defer to higher-priority work. That isn't
preemption, but it does put an upper bound on how long the next queued
task waits.
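
To make that concrete, here's a minimal sketch of the kind of two-queue
scheduler I have in mind. This is just an illustration with made-up class
names, not HBase's actual RPC scheduler API: "short" work (gets, short
scans) always jumps ahead of "long" work (MR-style scans), and a long task
is expected to run one slice at a time and re-submit itself, which gives
roughly the same bound the heartbeat does.

  import java.util.concurrent.*;

  // Hypothetical two-tier scheduler: short ops always win over long ops.
  public class TwoTierScheduler {
    private final BlockingQueue<Runnable> shortQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<Runnable> longQueue = new LinkedBlockingQueue<>();
    private final ExecutorService workers;

    public TwoTierScheduler(int threads) {
      workers = Executors.newFixedThreadPool(threads);
      for (int i = 0; i < threads; i++) {
        workers.submit(this::workerLoop);
      }
    }

    // RPCs annotated "long" (e.g. MR scans) land on the low-priority queue.
    public void submit(Runnable task, boolean isLong) {
      (isLong ? longQueue : shortQueue).add(task);
    }

    private void workerLoop() {
      try {
        while (!Thread.currentThread().isInterrupted()) {
          // Prefer short work; only take long work when nothing short waits.
          Runnable task = shortQueue.poll();
          if (task == null) {
            task = longQueue.poll(10, TimeUnit.MILLISECONDS);
          }
          if (task != null) {
            // A long task runs one slice and re-submits its continuation,
            // so a newly queued short task waits at most one slice.
            task.run();
          }
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }

    public void shutdown() {
      workers.shutdownNow();
    }
  }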

On Wed, May 13, 2015 at 6:11 PM, Matteo Bertozzi <[email protected]>
wrote:

> @nick what would you like to have? a match on a Job ID or something like
> that?
> currently only user/table/namespace are supported,
> but group support can be easily added.
> not sure about a job-id or job-name since we don't have that info on the
> scan.
>
> On Wed, May 13, 2015 at 6:04 PM, Nick Dimiduk <[email protected]> wrote:
>
> > Sorry. Yeah, sure, I can ask over there.
> >
> > The throttle was set by user in these tests.  You cannot directly
> > > throttle a specific job, but do have the option to set the throttle
> > > for a table or a namespace.  That might be sufficient for you to
> > > achieve your objective (unless those jobs are run by one user and
> > > access the same table.)
> >
> >
> > Maybe running as different users is the key, but this seems like a very
> > important use-case to support -- folks doing aggregate analysis
> > concurrently on an online table.
> >
> > On Wed, May 13, 2015 at 5:53 PM, Stack <[email protected]> wrote:
> >
> > > Should we add your comments to the blog, Govind, i.e. the answers to
> > > Nick's questions?
> > > St.Ack
> > >
> > > On Wed, May 13, 2015 at 5:48 PM, Govind Kamat <[email protected]>
> > > wrote:
> > >
> > > >  > This is a great demonstration of these new features, thanks for
> > > >  > pointing it out, Stack.
> > > >  >
> > > >  > I'm curious: at what percentile are these latencies reported?
> > > >  > Does the non-throttled user see significant latency improvements
> > > >  > in the 95th and 99th percentiles when the competing, scanning
> > > >  > users are throttled? Are MB/s and req/s managed at the region
> > > >  > level? Region server level? Aggregate?
> > > >
> > > > The latencies reported in the post are average latencies.
> > > >
> > > > Yes, the non-throttled user sees an across-the-board improvement in
> > > > the 95th and 99th percentiles, in addition to the improvement in
> > > > average latency.  The extent of improvement is significant as well
> > > > but varies with the throttle pressure, just as in the case of the
> > > > average latencies.
> > > >
> > > > The total throughput numbers (req/s) are aggregate numbers reported
> > > > by the YCSB client.
> > > >
> > > >  > These throttle points are by user? Is there a way for us to say
> > > >  > "all MR jobs are lower priority than online queries"?
> > > >  >
> > > >
> > > > The throttle was set by user in these tests.  You cannot directly
> > > > throttle a specific job, but do have the option to set the throttle
> > > > for a table or a namespace.  That might be sufficient for you to
> > > > achieve your objective (unless those jobs are run by one user and
> > > > access the same table.)
> > > >
> > > > Govind
> > > >
> > > >
> > > >  > Thanks,
> > > >  > Nick
> > > >  >
> > > >  > On Tue, May 12, 2015 at 1:58 PM, Stack <[email protected]> wrote:
> > > >  >
> > > >  > > .. by our Govind.
> > > >  > >
> > > >  > > See here:
> > > >  > >
> > > >  > > https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature
> > > >  > >
> > > >  > > St.Ack
> > > >  > >
> > > >
> > >
> >
>
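
For reference, the per-user / per-table throttle Govind describes maps to
something like the following against the quota admin API. This is a sketch
from memory -- the user and table names are made up, and exact method names
and throttle types may differ by HBase version:

  import java.util.concurrent.TimeUnit;

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.quotas.QuotaSettingsFactory;
  import org.apache.hadoop.hbase.quotas.ThrottleType;

  public class ThrottleSetup {
    public static void main(String[] args) throws Exception {
      try (Connection conn =
               ConnectionFactory.createConnection(HBaseConfiguration.create());
           Admin admin = conn.getAdmin()) {
        // Throttle the (hypothetical) user running the MR scans to 100 req/sec.
        admin.setQuota(QuotaSettingsFactory.throttleUser(
            "mr_user", ThrottleType.REQUEST_NUMBER, 100, TimeUnit.SECONDS));
        // Or cap request volume on the table those jobs hit (10 MB/sec).
        admin.setQuota(QuotaSettingsFactory.throttleTable(
            TableName.valueOf("usertable"), ThrottleType.REQUEST_SIZE,
            10L * 1024 * 1024, TimeUnit.SECONDS));
      }
    }
  }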
