You should be telling those customers that use of coprocessors "voids the warranty". They are a convenience for HBase project developers and advanced users, not a license for random devs to upload code into the server and then expect vendor support. It should be obvious on the face of it that this is not a good idea, and therefore not why coprocessors are in the HBase code in the first place.

On Sat, May 17, 2014 at 8:02 AM, Kevin O'dell <kevin.od...@cloudera.com> wrote:

> Andrew,
>
> HBase-4047 is a great idea(even if it is three years old). I have had numerous customers implement Co-Procs and take down every RS in a spectacular fashion from JVM crashes to performance crawling so slow that jobs fail out. I will raise this internally and see if we can get some extra traction.
>
> On Sat, May 17, 2014 at 9:33 AM, Andrew Purtell <apurt...@apache.org> wrote:
>
> > Great, see HBASE-4047. In the best of the open source tradition, there hasn't been anyone sufficiently motivated to do the work necessary (current use cases are "good enough"), but that someone can always come along. Perhaps that is yourself.
> >
> > On Sat, May 17, 2014 at 5:39 AM, Michael Segel <michael_se...@hotmail.com> wrote:
> >
> > > You have to understand…
> > >
> > > I do see the importance of the hook to allow for a trigger to implement 3rd party code on the server side.
> > > No argument there.
> > >
> > > Its just how the current implementation doesn’t sandbox the code so that it limits the potential for harm to the RS.
> > >
> > > In simple terms you can isolate the code in to a separate jvm and use IPC to connect the sandbox to the RS when a trigger occurs.
> > >
> > > In C/C++ you’d have shared memory segments, something you don’t really have in Java. (You could use C and then put a JNI wrapper around this…)
> > >
> > > Which goes to my point… this is something that is solvable. You just need to think about it…
> > >
> > > You talk about RDBMSs. Triggers themselves are not an equivalent analogy. You can have a trigger that then calls some code written in an SPL and you’re ok. You can control the SPL environment so that you limit the risk of the server crashing.
> > > (SPL == Stored Procedure Language)
> > >
> > > If you’re running third party code from your trigger that is written in C/C++ or Java, then you have other issues.
> > >
> > > Sybase’s Adaptive Server had some serious issues and a poorly written C/C++ code could cause serious performance issues… Informix IDS took a different approach and didn’t have those issues. And I’m aging myself because most here probably never worked with either Sybase or Informix … ;-)
> > >
> > > So using your RDBMS analogy… you have two different approaches. One worked … well enough, but was problematic. The other worked better and had less issues and was more secure.
> > >
> > > One of the reasons why this is important… the longer the current implementation is in the wild, the longer and harder it will take to fix.
> > >
> > > On May 17, 2014, at 11:44 AM, qiang tian <tian...@gmail.com> wrote:
> > >
> > > > My small 2 cents...:-)
> > > >
> > > > Hook/coprocessor is useful mechanism to interacting with a system for things that cannot be done via API. For end user, the tradeoff factors like performance, security, reliability etc can be control by upper layer' policy.
> > > > e.g. In RDBMS, the end user has limited usage case for triggers, which eliminates the security factor at all, and the performance tradeoff is given to end user to decide. so from evolution's perspective, hook/coprocessor for end user could be controlled by query engine layer like Phoenix.
> > > >
> > > > For internal user, hook better not be used widely unless it is a MUST or strong flexibility/plugability is required. e.g. things can be part of the core better not use it.
> > > >
> > > > thanks.
> > > >
> > > > On Sat, May 17, 2014 at 4:04 PM, Michael Segel <michael_se...@hotmail.com> wrote:
> > > >
> > > >> Andrew,
> > > >>
> > > >> Is ‘magical fairy dust’ a reference to some new synthetic drug you take at raves?
> > > >> But lets get back to reality.
> > > >>
> > > >> Lets try this again; simply put… the coprocessor runs on the same JVM as the RS, therefore you have an unacceptable level of risk.
> > > >> That inherent risk means that you cannot run HBase with end-user coprocessors enabled when you want to have a stable and somewhat secure environment.
> > > >>
> > > >> The simple truth is that you need to decouple the end-user code (coprocessor) from the RS.
> > > >> Its not a difficult concept to understand, and while reasonable, it would mean a major rewrite and work done on co-processors.
> > > >>
> > > >> Will de-coupling the user-space from the RS remove all risk? No. And no, I’m not suggesting that.
> > > >> But its a critical piece to the puzzle.
> > > >>
> > > >> Its not just security, but also reliability.
> > > >>
> > > >> On May 17, 2014, at 4:43 AM, Andrew Purtell <apurt...@apache.org> wrote:
> > > >>
> > > >>> Michael,
> > > >>>
> > > >>> As you know, we have implemented security features with coprocessors precisely because they can be interposed on internal actions to make authoritative decisions in-process. Coprocessors are a way to have composable internal extensions. They don't have and probably never will have magic fairy security dust. We do trust the security coprocessor code because it was developed by the project. That is not the same thing as saying you can have 'security' and execute arbitrary user code in-process as a coprocessor. Just want to clear that up for you.
> > > >>>
> > > >>>> will want to allow system coprocessors but then write a coprocessor that reject user coprocessors.
> > > >>>
> > > >>> That's a reasonable point.
> > > >>>
> > > >>> On Sat, May 17, 2014 at 12:13 AM, Michael Segel <michael_se...@hotmail.com> wrote:
> > > >>>
> > > >>>> Until you move the coprocessor out of the RS space and into its own sandbox… saying security and coprocessor in the same sentence is a joke.
> > > >>>> Oh wait… you were serious… :-(
> > > >>>>
> > > >>>> I’d say there’s a significant rethink on coprocessors that’s required.
> > > >>>>
> > > >>>> Anyone running a secure (kerberos) cluster, will want to allow system coprocessors but then write a coprocessor that reject user coprocessors.
> > > >>>>
> > > >>>> Just putting it out there…
> > > >>>>
> > > >>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <apurt...@apache.org> wrote:
> > > >>>>
> > > >>>>> Because coprocessor APIs are so tightly bound with internals, if we apply suggested rules like as mentioned on HBASE-11054:
> > > >>>>>
> > > >>>>>     I'd say policy should be no changes to method apis across minor versions
> > > >>>>>
> > > >>>>> This will lock coprocessor based components to the limitations of the API as we encounter them. Core code does not suffer this limitation, we are otherwise free to refactor and change internal methods. For example, if we apply this policy to the 0.98 branch, then we will have to abandon further security feature development there and move to trunk only. This is because we already are aware that coprocessor APIs as they stand are insufficient still.
> > > >>>>>
> > > >>>>> Coprocessor APIs are a special class of internal method. We have had a tension between allowing freedom of movement for developing them out and providing some measure of stability for implementors for a while.
> > > >>>>>
> > > >>>>> It is my belief that the way forward is something like HBASE-11125. Perhaps we can take this discussion to that JIRA and have this long overdue conversation.
> > > >>>>>
> > > >>>>> Regarding security features specifically, I would also like to call your attention to HBASE-11127. I think security has been an optional feature long enough, it is becoming a core requirement for the project, so should be moved into core. Sure, we can therefore sidestep any issues with coprocessor API sufficiency for hosting security features. However, in my opinion we should pursue both HBASE-11125 and HBASE-11127; the first to provide the relative stability long asked for by coprocessor API users, the latter to cleanly solve emerging issues with concurrency and versioning.
> > > >>>>>
> > > >>>>> --
> > > >>>>> Best regards,
> > > >>>>>
> > > >>>>>    - Andy
> > > >>>>>
> > > >>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
> > > >>>
> > > >>> --
> > > >>> Best regards,
> > > >>>
> > > >>>    - Andy
> > > >>>
> > > >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
> --
> Kevin O'Dell
> Systems Engineer, Cloudera

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)