Re: On coprocessor API evolution

Andrew Purtell Sat, 17 May 2014 19:56:33 -0700

You should be telling those customers that use of coprocessors "voids the
warranty". They are a convenience for HBase project developers and advanced
users, not a license for random devs to upload code into the server and
then expect vendor support. It should be obvious on its face that this is
not a good idea, and it is therefore not why coprocessors are in the HBase
code in the first place.



On Sat, May 17, 2014 at 8:02 AM, Kevin O'dell <kevin.od...@cloudera.com> wrote:

> Andrew,
>
>    HBASE-4047 is a great idea (even if it is three years old).  I have had
> numerous customers implement Co-Procs and take down every RS in
> spectacular fashion, from JVM crashes to performance crawling so slowly
> that jobs fail out.  I will raise this internally and see if we can get
> some extra traction.
>
>
> On Sat, May 17, 2014 at 9:33 AM, Andrew Purtell <apurt...@apache.org>
> wrote:
>
> > Great, see HBASE-4047. In the best of the open source tradition, there
> > hasn't been anyone sufficiently motivated to do the work necessary
> > (current use cases are "good enough"), but that someone can always come
> > along. Perhaps that is yourself.
> >
> >
> > On Sat, May 17, 2014 at 5:39 AM, Michael Segel <michael_se...@hotmail.com>
> > wrote:
> >
> > > You have to understand…
> > >
> > > I do see the importance of the hook to allow for a trigger to implement
> > > 3rd party code on the server side.
> > > No argument there.
> > >
> > > It’s just that the current implementation doesn’t sandbox the code to
> > > limit the potential for harm to the RS.
> > >
> > > In simple terms, you can isolate the code in a separate JVM and use IPC
> > > to connect the sandbox to the RS when a trigger occurs.
> > >
> > > In C/C++ you’d have shared memory segments, something you don’t really
> > > have in Java.  (You could use C and then put a JNI wrapper around this…)
> > >
> > > Which goes to my point… this is something that is solvable. You just
> > > need to think about it…
> > >
> > > You talk about RDBMSs. Triggers themselves are not an equivalent
> > > analogy. You can have a trigger that then calls some code written in an
> > > SPL and you’re ok. You can control the SPL environment so that you limit
> > > the risk of the server crashing.
> > > (SPL == Stored Procedure Language)
> > >
> > > If you’re running third party code from your trigger that is written in
> > > C/C++ or Java, then you have other issues.
> > >
> > > Sybase’s Adaptive Server had some serious issues, and poorly written
> > > C/C++ code could cause serious performance issues… Informix IDS took a
> > > different approach and didn’t have those issues.  And I’m aging myself
> > > because most here probably never worked with either Sybase or Informix…
> > > ;-)
> > >
> > > So using your RDBMS analogy… you have two different approaches. One
> > > worked… well enough, but was problematic.  The other worked better, had
> > > fewer issues, and was more secure.
> > >
> > > One of the reasons why this is important… the longer the current
> > > implementation is in the wild, the harder it will be, and the longer it
> > > will take, to fix.
> > >
> > >
> > > On May 17, 2014, at 11:44 AM, qiang tian <tian...@gmail.com> wrote:
> > >
> > > > My small 2 cents...:-)
> > > >
> > > > > Hooks/coprocessors are a useful mechanism for interacting with a
> > > > > system for things that cannot be done via the API.  For end users,
> > > > > tradeoff factors like performance, security, reliability, etc. can
> > > > > be controlled by an upper layer's policy.
> > > > > E.g., in an RDBMS, the end user has limited use cases for triggers,
> > > > > which eliminates the security factor altogether, and the
> > > > > performance tradeoff is left to the end user to decide. So from an
> > > > > evolution perspective, hooks/coprocessors for end users could be
> > > > > controlled by a query engine layer like Phoenix.
> > > >
> > > > > For internal users, hooks had better not be used widely unless it
> > > > > is a MUST or strong flexibility/pluggability is required.  E.g.,
> > > > > things that can be part of the core had better not use them.
> > > >
> > > > thanks.
> > > >
> > > >
> > > >
> > > > > On Sat, May 17, 2014 at 4:04 PM, Michael Segel
> > > > > <michael_se...@hotmail.com> wrote:
> > > >
> > > >> Andrew,
> > > >>
> > > > >> Is ‘magical fairy dust’ a reference to some new synthetic drug you
> > > > >> take at raves?
> > > > >> But let’s get back to reality.
> > > >>
> > > >>
> > > > >> Let’s try this again; simply put… the coprocessor runs in the same
> > > > >> JVM as the RS, therefore you have an unacceptable level of risk.
> > > > >> That inherent risk means that you cannot run HBase with end-user
> > > > >> coprocessors enabled when you want to have a stable and somewhat
> > > > >> secure environment.
> > > >>
> > > > >> The simple truth is that you need to decouple the end-user code
> > > > >> (coprocessor) from the RS.
> > > > >> It’s not a difficult concept to understand, and while reasonable,
> > > > >> it would mean a major rewrite and work done on coprocessors.
> > > >>
> > > > >> Will decoupling the user space from the RS remove all risk? No.
> > > > >> And no, I’m not suggesting that.
> > > > >> But it’s a critical piece of the puzzle.
> > > > >>
> > > > >> It’s not just security, but also reliability.
> > > >>
> > > >>
> > > > >> On May 17, 2014, at 4:43 AM, Andrew Purtell <apurt...@apache.org>
> > > > >> wrote:
> > > >>
> > > >>> Michael,
> > > >>>
> > > > >>> As you know, we have implemented security features with
> > > > >>> coprocessors precisely because they can be interposed on internal
> > > > >>> actions to make authoritative decisions in-process. Coprocessors
> > > > >>> are a way to have composable internal extensions. They don't have
> > > > >>> and probably never will have magic fairy security dust. We do
> > > > >>> trust the security coprocessor code because it was developed by
> > > > >>> the project. That is not the same thing as saying you can have
> > > > >>> 'security' and execute arbitrary user code in-process as a
> > > > >>> coprocessor. Just want to clear that up for you.
> > > >>>
> > > > >>>> will want to allow system coprocessors but then write a
> > > > >>>> coprocessor that rejects user coprocessors.
> > > >>>
> > > >>> That's a reasonable point.
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > > >>> On Sat, May 17, 2014 at 12:13 AM, Michael Segel
> > > > >>> <michael_se...@hotmail.com> wrote:
> > > >>>
> > > > >>>> Until you move the coprocessor out of the RS space and into its
> > > > >>>> own sandbox… saying security and coprocessor in the same sentence
> > > > >>>> is a joke.
> > > > >>>> Oh wait… you were serious… :-(
> > > >>>>
> > > > >>>> I’d say there’s a significant rethink on coprocessors that’s
> > > > >>>> required.
> > > > >>>>
> > > > >>>> Anyone running a secure (Kerberos) cluster will want to allow
> > > > >>>> system coprocessors but then write a coprocessor that rejects
> > > > >>>> user coprocessors.
> > > >>>>
> > > >>>> Just putting it out there…
> > > >>>>
> > > > >>>> On May 15, 2014, at 2:13 AM, Andrew Purtell <apurt...@apache.org>
> > > > >>>> wrote:
> > > >>>>
> > > > >>>>> Because coprocessor APIs are so tightly bound with internals, if
> > > > >>>>> we apply suggested rules like the one mentioned on HBASE-11054:
> > > > >>>>>
> > > > >>>>>    I'd say policy should be no changes to method apis across
> > > > >>>>>    minor versions
> > > > >>>>>
> > > > >>>>> this will lock coprocessor based components to the limitations
> > > > >>>>> of the API as we encounter them. Core code does not suffer this
> > > > >>>>> limitation; we are otherwise free to refactor and change
> > > > >>>>> internal methods. For example, if we apply this policy to the
> > > > >>>>> 0.98 branch, then we will have to abandon further security
> > > > >>>>> feature development there and move to trunk only. This is
> > > > >>>>> because we are already aware that coprocessor APIs as they
> > > > >>>>> stand are still insufficient.
> > > >>>>>
> > > > >>>>> Coprocessor APIs are a special class of internal method. We have
> > > > >>>>> had a tension between allowing freedom of movement for
> > > > >>>>> developing them out and providing some measure of stability for
> > > > >>>>> implementors for a while.
> > > >>>>>
> > > > >>>>> It is my belief that the way forward is something like
> > > > >>>>> HBASE-11125. Perhaps we can take this discussion to that JIRA
> > > > >>>>> and have this long overdue conversation.
> > > >>>>>
> > > > >>>>> Regarding security features specifically, I would also like to
> > > > >>>>> call your attention to HBASE-11127. I think security has been an
> > > > >>>>> optional feature long enough; it is becoming a core requirement
> > > > >>>>> for the project, so should be moved into core. Sure, we can
> > > > >>>>> therefore sidestep any issues with coprocessor API sufficiency
> > > > >>>>> for hosting security features. However, in my opinion we should
> > > > >>>>> pursue both HBASE-11125 and HBASE-11127; the first to provide
> > > > >>>>> the relative stability long asked for by coprocessor API users,
> > > > >>>>> the latter to cleanly solve emerging issues with concurrency and
> > > > >>>>> versioning.
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Best regards,
> > > >>>>>
> > > >>>>> - Andy
> > > >>>>>
> > > > >>>>> Problems worthy of attack prove their worth by hitting back.
> > > > >>>>> - Piet Hein (via Tom White)
> > > >>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Best regards,
> > > >>>
> > > >>>  - Andy
> > > >>>
> > > > >>> Problems worthy of attack prove their worth by hitting back.
> > > > >>> - Piet Hein (via Tom White)
> > > >>
> > > >>
> > >
> > >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> Kevin O'Dell
> Systems Engineer, Cloudera
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
