And you should consult a lawyer before you make a statement like that… These are exposed APIs, and Cloudera, Hortonworks, MapR, Pivotal, even Intel (if they still have licensed customers) all have to support their releases.
BTW, I think I’m the only person who’s given a talk trying to explain the dangers of coprocessors and why they shouldn’t be used. ;-)

On May 18, 2014, at 3:09 AM, Andrew Purtell <apurt...@apache.org> wrote:

> You should be telling those customers that use of coprocessors "voids the
> warranty". They are a convenience for HBase project developers and advanced
> users, not a license for random devs to upload code into the server and
> then expect vendor support. It should be obvious on the face of it that is
> not a good idea, and so therefore not why coprocessors are in the HBase
> code in the first place.
>
>
> On Sat, May 17, 2014 at 8:02 AM, Kevin O'dell <kevin.od...@cloudera.com> wrote:
>
>> Andrew,
>>
>> HBase-4047 is a great idea(even if it is three years old). I have had
>> numerous customers implement Co-Procs and take down every RS in a
>> spectacular fashion from JVM crashes to performance crawling so slow that
>> jobs fail out. I will raise this internally and see if we can get some
>> extra traction.
>>
>>
>> On Sat, May 17, 2014 at 9:33 AM, Andrew Purtell <apurt...@apache.org>
>> wrote:
>>
>>> Great, see HBASE-4047. In the best of the open source tradition, there
>>> hasn't been anyone sufficiently motivated to do the work necessary (current
>>> use cases are "good enough"), but that someone can always come along.
>>> Perhaps that is yourself.
>>>
>>>
>>> On Sat, May 17, 2014 at 5:39 AM, Michael Segel <michael_se...@hotmail.com> wrote:
>>>
>>>> You have to understand…
>>>>
>>>> I do see the importance of the hook to allow for a trigger to implement
>>>> 3rd party code on the server side.
>>>> No argument there.
>>>>
>>>> Its just how the current implementation doesn’t sandbox the code so that
>>>> it limits the potential for harm to the RS.
>>>>
>>>> In simple terms you can isolate the code in to a separate jvm and use IPC
>>>> to connect the sandbox to the RS when a trigger occurs.
>>>>
>>>> In C/C++ you’d have shared memory segments, something you don’t really
>>>> have in Java. (You could use C and then put a JNI wrapper around this…)
>>>>
>>>> Which goes to my point… this is something that is solvable. You just need
>>>> to think about it…
>>>>
>>>> You talk about RDBMSs. Triggers themselves are not an equivalent analogy.
>>>> You can have a trigger that then calls some code written in an SPL and
>>>> you’re ok. You can control the SPL environment so that you limit the risk
>>>> of the server crashing.
>>>> (SPL == Stored Procedure Language)
>>>>
>>>> If you’re running third party code from your trigger that is written in
>>>> C/C++ or Java, then you have other issues.
>>>>
>>>> Sybase’s Adaptive Server had some serious issues and a poorly written
>>>> C/C++ code could cause serious performance issues… Informix IDS took a
>>>> different approach and didn’t have those issues. And I’m aging myself
>>>> because most here probably never worked with either Sybase or Informix … ;-)
>>>>
>>>> So using your RDBMS analogy… you have two different approaches. One worked
>>>> … well enough, but was problematic. The other worked better and had less
>>>> issues and was more secure.
>>>>
>>>> One of the reasons why this is important… the longer the current
>>>> implementation is in the wild, the longer and harder it will take to fix.
>>>>
>>>>
>>>> On May 17, 2014, at 11:44 AM, qiang tian <tian...@gmail.com> wrote:
>>>>
>>>>> My small 2 cents...:-)
>>>>>
>>>>> Hook/coprocessor is useful mechanism to interacting with a system for
>>>>> things that cannot be done via API. For end user, the tradeoff factors
>>>>> like performance, security, reliability etc can be control by upper layer'
>>>>> policy.
>>>>> e.g. In RDBMS, the end user has limited usage case for triggers, which
>>>>> eliminates the security factor at all, and the performance tradeoff is
>>>>> given to end user to decide.
>>>>> so from evolution's perspective,
>>>>> hook/coprocessor for end user could be controlled by query engine layer
>>>>> like Phoenix.
>>>>>
>>>>> For internal user, hook better not be used widely unless it is a MUST or
>>>>> strong flexibility/plugability is required. e.g. things can be part of the
>>>>> core better not use it.
>>>>>
>>>>> thanks.
>>>>>
>>>>>
>>>>> On Sat, May 17, 2014 at 4:04 PM, Michael Segel <michael_se...@hotmail.com> wrote:
>>>>>
>>>>>> Andrew,
>>>>>>
>>>>>> Is ‘magical fairy dust’ a reference to some new synthetic drug you take at
>>>>>> raves?
>>>>>> But lets get back to reality.
>>>>>>
>>>>>>
>>>>>> Lets try this again; simply put… the coprocessor runs on the same JVM as
>>>>>> the RS, therefore you have an unacceptable level of risk.
>>>>>> That inherent risk means that you cannot run HBase with end-user
>>>>>> coprocessors enabled when you want to have a stable and somewhat secure
>>>>>> environment.
>>>>>>
>>>>>> The simple truth is that you need to decouple the end-user code
>>>>>> (coprocessor) from the RS.
>>>>>> Its not a difficult concept to understand, and while reasonable, it would
>>>>>> mean a major rewrite and work done on co-processors.
>>>>>>
>>>>>> Will de-coupling the user-space from the RS remove all risk? No. And no,
>>>>>> I’m not suggesting that.
>>>>>> But its a critical piece to the puzzle.
>>>>>>
>>>>>> Its not just security, but also reliability.
>>>>>>
>>>>>>
>>>>>> On May 17, 2014, at 4:43 AM, Andrew Purtell <apurt...@apache.org> wrote:
>>>>>>
>>>>>>> Michael,
>>>>>>>
>>>>>>> As you know, we have implemented security features with coprocessors
>>>>>>> precisely because they can be interposed on internal actions to make
>>>>>>> authoritative decisions in-process. Coprocessors are a way to have
>>>>>>> composable internal extensions. They don't have and probably never will
>>>>>>> have magic fairy security dust.
>>>>>>> We do trust the security coprocessor code
>>>>>>> because it was developed by the project. That is not the same thing as
>>>>>>> saying you can have 'security' and execute arbitrary user code in-process
>>>>>>> as a coprocessor. Just want to clear that up for you.
>>>>>>>
>>>>>>>> will want to allow system coprocessors but then write a coprocessor that
>>>>>>>> reject user coprocessors.
>>>>>>>
>>>>>>> That's a reasonable point.
>>>>>>>
>>>>>>>
>>>>>>> On Sat, May 17, 2014 at 12:13 AM, Michael Segel
>>>>>>> <michael_se...@hotmail.com> wrote:
>>>>>>>
>>>>>>>> Until you move the coprocessor out of the RS space and into its own
>>>>>>>> sandbox… saying security and coprocessor in the same sentence is a joke.
>>>>>>>> Oh wait… you were serious… :-(
>>>>>>>>
>>>>>>>> I’d say there’s a significant rethink on coprocessors that’s required.
>>>>>>>>
>>>>>>>> Anyone running a secure (kerberos) cluster, will want to allow system
>>>>>>>> coprocessors but then write a coprocessor that reject user coprocessors.
>>>>>>>>
>>>>>>>> Just putting it out there…
>>>>>>>>
>>>>>>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <apurt...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Because coprocessor APIs are so tightly bound with internals, if we apply
>>>>>>>>> suggested rules like as mentioned on HBASE-11054:
>>>>>>>>>
>>>>>>>>>     I'd say policy should be no changes to method apis across minor
>>>>>>>>>     versions
>>>>>>>>>
>>>>>>>>> This will lock coprocessor based components to the limitations of the API
>>>>>>>>> as we encounter them. Core code does not suffer this limitation, we are
>>>>>>>>> otherwise free to refactor and change internal methods. For example, if we
>>>>>>>>> apply this policy to the 0.98 branch, then we will have to abandon further
>>>>>>>>> security feature development there and move to trunk only.
>>>>>>>>> This is because
>>>>>>>>> we already are aware that coprocessor APIs as they stand are insufficient
>>>>>>>>> still.
>>>>>>>>>
>>>>>>>>> Coprocessor APIs are a special class of internal method. We have had a
>>>>>>>>> tension between allowing freedom of movement for developing them out and
>>>>>>>>> providing some measure of stability for implementors for a while.
>>>>>>>>>
>>>>>>>>> It is my belief that the way forward is something like HBASE-11125. Perhaps
>>>>>>>>> we can take this discussion to that JIRA and have this long overdue
>>>>>>>>> conversation.
>>>>>>>>>
>>>>>>>>> Regarding security features specifically, I would also like to call your
>>>>>>>>> attention to HBASE-11127. I think security has been an optional feature
>>>>>>>>> long enough, it is becoming a core requirement for the project, so should
>>>>>>>>> be moved into core. Sure, we can therefore sidestep any issues with
>>>>>>>>> coprocessor API sufficiency for hosting security features. However, in my
>>>>>>>>> opinion we should pursue both HBASE-11125 and HBASE-11127; the first to
>>>>>>>>> provide the relative stability long asked for by coprocessor API users, the
>>>>>>>>> latter to cleanly solve emerging issues with concurrency and versioning.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> - Andy
>>>>>>>>>
>>>>>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>>>>>> (via Tom White)
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>>
>>>>>>> - Andy
>>>>>>>
>>>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>>>> (via Tom White)
>>>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back.
>>> - Piet Hein
>>> (via Tom White)
>>>
>>
>>
>> --
>> Kevin O'Dell
>> Systems Engineer, Cloudera
>>
>
>
> --
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
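
To make the sandboxing idea in this thread a bit more concrete (run the hook outside the server process and talk to it over IPC), here is a rough sketch. It is Python for brevity, not HBase code and not a proposal for the actual coprocessor API. Everything in it is made up for illustration: the name `run_hook_sandboxed`, the three sample hooks, and the one-JSON-message-over-pipes protocol. A real design would presumably keep a long-lived worker (for HBase, a separate JVM) rather than spawn a process per event.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for user-supplied hook code. It runs in its own
# process, reads one JSON event from stdin and writes one JSON reply to stdout.
GOOD_HOOK = """
import json, sys
event = json.load(sys.stdin)
json.dump({"row": event["row"], "allowed": True}, sys.stdout)
"""

# A misbehaving hook that kills its own process outright.
CRASHING_HOOK = "import os; os._exit(1)"

# A hook that never returns in any reasonable time.
HANGING_HOOK = "import time; time.sleep(60)"


def run_hook_sandboxed(hook_source, event, timeout_s=5.0):
    """Run hook_source in a child interpreter; return (ok, reply_or_reason)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", hook_source],
            input=json.dumps(event),   # the event goes to the child over a pipe
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so the host carries on.
        return False, "timeout"
    if proc.returncode != 0:
        # The child died; the host process itself is unaffected.
        return False, "exit code %d" % proc.returncode
    try:
        return True, json.loads(proc.stdout)
    except json.JSONDecodeError:
        return False, "bad reply"


if __name__ == "__main__":
    print(run_hook_sandboxed(GOOD_HOOK, {"row": "r1"}))
    print(run_hook_sandboxed(CRASHING_HOOK, {"row": "r1"}))
    print(run_hook_sandboxed(HANGING_HOOK, {"row": "r1"}, timeout_s=0.5))
```

The point of the sketch is only that a crash or a hang in the hook turns into an error value on the host side instead of taking the host down. A separate process by itself does not restrict what the hook code is allowed to do, so this addresses the reliability half of the argument, not all of the security half.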