I whipped up a patch for something somewhat like option 3 here: http://gerrit.cloudera.org:8080/4897
Please take a look if you're interested. -Todd On Mon, Oct 31, 2016 at 4:41 PM, Todd Lipcon <[email protected]> wrote: > Hey folks, > > After https://gerrit.cloudera.org/#/c/4765/ finishes code review, the > Kudu daemons will have support for Kerberized RPC. If we start the servers > with --server_require_kerberos, then they'll require that all RPC clients > authenticate themselves via Kerberos. > > Of course authentication isn't that useful on its own. I think the next > step along this direction is to add basic "internal service" authorization: > > - the master should only allow tablet server service principals to > heartbeat/register/etc > - the tserver should only allow master service principals to create/delete > tablets, etc. (with the possible extension to also allow 'root' users to do > so) > > I think there are a couple patterns we could follow here to enforce the > authorization > > *Option 1) Try to enforce ACLs on a per-RPC-service level* > > This is more or less the approach that Hadoop takes. We can make sure that > different RPC use cases are separated into different RPC "services", and > apply an ACL on each service. For example, we can separate the current > 'MasterService' into 'TserverToMasterService' and 'ClientToMasterService', > and then register ACLs on each such service when we start the RPC server. > > Pros: > - when adding a new method, it "automatically" gets the right ACL without > having to remember to add special checks > - enforces some nice separation and grouping of RPCs > > Cons: > - doesn't handle special cases well, like 'DeleteTablet is usually called > by the master, but we also want to allow admins to call the same method for > use in manual recovery tools' > - would require a bunch of RPC methods to be moved around, and likely > break wire compatibility (I believe the RPC header currently specifies the > actual service name for each method call) > > *Option 2) Entirely ad-hoc authorization checks* > > At the top of each RPC method implementation, we can manually check the > authorization. Basically, imagine something like this: > > bool CheckTabletServerRequest(RpcContext* rpc) { > // if 'rpc' is from a valid tablet server, return true > // otherwise, respond to the RPC with an authz error and return false > } > > void MasterServiceImpl::TSHeartbeat(const TSHeartbeatRequestPB* req, > TSHeartbeatResponsePB* resp, > rpc::RpcContext* rpc) { > if (!CheckTabletServerRequest(rpc)) return; > } > > > Pros: > - 100% flexible - easily allows for request-specific special cases > - relies on no "magic", it's obvious when you look at the code how the > authz check is applied. > > Cons: > - easy to forget to check authorization when adding new RPCs. I recall a > CVE or two in the past where we forgot to add the appropriate check in > various internal Hadoop RPCs. > > > *Option 3) Per-method authorization checks, declared in .proto file* > > In the .proto file describing a service, we can add a custom option like: > > rpc TSHeartbeat(TSHeartbeatRequestPB) returns (TSHeartbeatResponsePB) { > option (kudu.authz = "tablet-servers-only") > }; > > When we start the RPC system, we can bind these magic strings to actual > method implementations, and give an error if any RPC method either mentions > an invalid authz policy, or doesn't mention an authz policy at all. > > Pros: > - we can require that all methods specify a policy (even if they specify > 'none'), making it impossible to forget to authorize > - still retains flexibility: we can always create an exotic policy > implementation for an exotic method as necessary (or specify 'none' and do > the custom check from option 2) > - we can allow the same protobuf option to be specified as a service-wide > 'default' for the cases where an entire service has the same authz policy > > Cons: > - maybe a little more confusing to developers who have to go looking at > the .proto file to see which authorization policy is being applied to a > method? > > *Option X? Any other options I missed?* > - I only spent 30m or so thinking about this, so maybe I missed a better > option? Feedback appreciated. > > > I'm leaning towards option 3 here, but would appreciate feedback before I > go too far down that path. > > -Todd > -- > Todd Lipcon > Software Engineer, Cloudera > -- Todd Lipcon Software Engineer, Cloudera
