On Wed, Feb 12, 2014 at 12:46 PM, Gary Helmling <ghelml...@gmail.com> wrote:
> > 'Repurpose' might not be the way I would put it.
> >
> > Coprocessors were and are a means for internal server extension by mixin.
> > The original problem we solved was needing to subclass HRegionServer and
> > other classes to extend core HBase functions, but having more than one
> > otherwise orthogonal extension that users want to use. Now we can mix in
> > multiple extensions with a framework that has some simple rules for
> > cooperation between the extensions.
> >
> > We return to the earlier state of affairs with modules. Sure, we can plug
> > in an alternate behavior with a module that subclasses and extends the
> > default, say flush strategy, but we can't then instantiate multiple
> > modules into the same slot, both subclassing the same base but doing
> > different things.
> >
>
> I agree the ability to compose coprocessors in order to extend behavior is
> a key capability that we should not throw out.
>
> I think the current Observer APIs could probably do with a bit of
> reorganization to make them a little more accessible and comprehensible. I
> think there is also an emerging need to see if we can define some subset of
> these APIs that we can stabilize for easier public consumption, while
> keeping the rest of the APIs free to evolve as needed as HBase internals
> change (since these are an extension mechanism for internal behaviors).
> I'm not sure we've really seen enough commonality emerge yet to say what
> those APIs are though. We could try to define the public subset as those
> involved in client requests, but flush and compaction, for example, can
> also be triggered by client requests. And my own use of coprocessor APIs
> lately has been focused on overriding the flush and compaction behaviors,
> not on client requests.
>
> I think the best place to start is by breaking up some of the current APIs,
> grouping them around behaviors or areas of functionality.
> Whether we call some of these "coprocessors" and others "plugins" is a
> question of branding. I do think it's important to figure out which we can
> stabilize and offer longer term contracts for. But whatever we call them, I
> strongly agree that we should maintain the "mixin" / composition approach
> and not return to a simple fixed inheritance scheme.
>

I've always considered coprocessors to be the "kernel modules" of the HBase
world. They give you way more power than user-space programming, but come
with the cautions that if you make a mistake, you'll crash your whole system
or trigger unexpected behavior.

Given that, I don't think we should really be spending too much effort on
coprocessor API stability. If we make this a requirement, it can hamper the
ability of the HBase core developers to make good changes which really
improve the system. I don't think we're at the level of maturity as a
project where this is the right tradeoff, as of yet.

For what it's worth, the Linux kernel module API is also not
stable/compatible between versions. This document is a good read:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/stable_api_nonsense.txt

I do think we should seek to keep the interfaces stable through *patch*
level releases -- a bug fix shouldn't break a coprocessor API. But between
minor releases that add new features, it seems like an unnecessary
restriction.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera