> On Dec. 30, 2014, 5:46 p.m., Christopher Tubbs wrote: > > shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java, line 210 > > <https://reviews.apache.org/r/29386/diff/4/?file=803175#file803175line210> > > > > Why is username the "short" user name? Is that unique in Kerberos? If > > not, the long version should be used everywhere instead. Otherwise, one > > user can appear to be another in logs, etc. > > > > If "getShortUserName" is not unique, it should be avoided everywhere. > > Josh Elser wrote: > Check out: > http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html > > Kerberos principals are of the form: primary/instance@realm. Kerberos > principals are typically categorized as users and services. A user is not > qualified to a single instance (a host) and represents authentication across > the realm. For example, els...@example.com means that I can "roam". > Conversely, a service is typically "fixed" to a specific host. For example, > accumulo/node1.example....@example.com means that there is a process, logged > in as 'accumulo' on the host 'node1.example.com'. That service can't be run > on any other host. Now, an important note: if someone actually creates a > principal "accum...@example.com", this is unique with respect to any other > "accumulo/`host`@EXAMPLE.COM" principal. I'm not sure if we need to do > anything else other than rely on the convention of Kerberos principals, or if we should > be including the instance in "our" username when present. > > This kind of ties back into the SystemCredentials discussion again. > > Christopher Tubbs wrote: > Okay, so a smart configuration would make shortnames unique. However, > UserGroupInformation returns only the `primary` for the short name. This > means that user names will have to be unique across realms and instances. > Right now, you are storing permissions using the short name. 
So, any user > with the same primary will be able to masquerade as any other user with the > same primary from a different instance and/or realm, and be able to use > their permissions and authorizations. That's the problem with the shortname > here. That's very unexpected. > > Josh Elser wrote: > Bingo. If you look at how HDFS does their configuration, this is the same > convention. The lack of documentation from me leaves something to be desired > here, and I apologize for that. > > To save you looking at HDFS (if you care not to look): an > HDFS process uses a given principal with a special replacement string > `_HOST`. The common convention is to use something like > `dn/_h...@example.com` (the realm is unimportant for this example). This > ensures that the same configuration files can be used across all hosts in the > HDFS instance, and Hadoop dynamically replaces `_HOST` with the FQDN of the > host. Thus, there's an implicit link that all `dn/*@EXAMPLE.COM` can act as > datanodes and this is protected by the fact that access to the KDC is > restricted (you can't make your own user). The circle of trust is two-fold: > having a keytab with the correct principal and that Hadoop requires that > specific configuration (which restricts the principal). > > Christopher Tubbs wrote: > My concerns here are more about the impact on users than about the system > credentials. I don't know what HDFS is doing, but if they aren't (minimally) > checking the realm when checking permissions/access on an authenticated > principal, then they are less secure than I think we should be. Referencing > HDFS also seems to imply that we're not so much doing Kerberos, as we are > implementing HDFS-specific Kerberos conventions (which are less secure, with > respect to data authorizations/permissions within Accumulo, than I'm > comfortable with). > > Josh Elser wrote: > bq. 
if they aren't (minimally) checking the realm when checking > permissions/access on an authenticated principal > > Do you mean the instance instead of the realm? In the case of a single > realm, the KDC is going to verify the correct realm. Assuming you meant the > instance though (the optional "/hostname"), it's typical that a user has the > ability to use their credentials anywhere. Thus, you typically see principals > without instances for actual users. As far as I understand it, that's what > HDFS tends to follow and what I tried to as well. Accumulo doesn't care where > you come from, just what your name is and that you have valid credentials. I > don't think we're being substantially less secure by not including the > instance in the Accumulo principal. > > Christopher Tubbs wrote: > No, I mean the realm, to make it only necessary to guarantee uniqueness > within a realm, vs. across all known realms (more reasonable of a guarantee > to make for a KDC user admin). We could also include the instance (when > specified), if we want to really be careful that users aren't sharing > permissions. > > In my concerns, I'm assuming we authenticate users in any realm. If we > are somehow restricted to a single realm (either by a "permittedRealm" > configuration item or by the nature of Kerberos itself), then realm isn't > that important, but we should discuss more about the instance. My > understanding is that Kerberos authenticates the user by the fully qualified > Kerberos principal (`primary/instance@realm`) in whatever realm they are, but > it doesn't have to be a specific realm (like the same one as the server), and > then we are truncating their identity, essentially binning people from > different realms into the same bucket. It's like authenticating me as > `Christopher Tubbs`, and then assigning me to a bucket called `Christopher` > where I share permissions/authorizations with all other `Christopher`s. > > Josh Elser wrote: > Oh, I apologize, I follow you now. 
Your concern wasn't clicking for me. > > > My understanding is that Kerberos authenticates the user by the fully > qualified Kerberos principal (primary/instance@realm) in whatever realm they > are, but it doesn't have to be a specific realm (like the same one as the > server), and then we are truncating their identity, essentially binning > people from different realms into the same bucket > > Well, the KDC you're communicating with has to be set up for the realm > being requested (and if one isn't provided, it will delegate to another KDC > or drop you into a default realm, depending on krb5.conf). As I understand > it, if you haven't defined a `default_realm` in `libdefaults` in krb5.conf, > and a user comes in with an incorrect hostname (instance) or realm > specification, the KDC won't authenticate them, which keeps them out of > Accumulo completely. I use `default_realm` locally, since I just use a dummy > realm instead of actually matching my laptop. > > In all honesty, though, I haven't thought past single-realm KDC setups. > Is enforcing that clients are members of the same realm the Accumulo server > principals reside in sufficient? I'm worried about scope creep of trying to > do multi-realm configuration correctly before single realm is adequately > polished. > > Christopher Tubbs wrote: > bq. Is enforcing that clients are members of the same realm the Accumulo > server principals reside in sufficient? > > Perhaps. Where would we do this? In the site configuration? > > bq. I'm worried about scope creep of trying to do multi-realm > configuration correctly before single realm is adequately polished. > > Understood, but I'm thinking about it from the other side. I don't want > to make assumptions which are valid in a narrow case, but which leave > security holes in a more general case. I'm also coming at this from the > perspective of dealing with X.509 certificates, and understanding the > differences between a CN and a DN. 
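To make the truncation concern in this part of the thread concrete, here is a minimal sketch of how two distinct principals reduce to the same key once only the primary is kept. `KrbPrincipal` is a hypothetical helper written for this illustration; it is not Hadoop's `UserGroupInformation`, it is not part of the patch under review, and the names `alice`, `EXAMPLE.COM`, and `OTHER.ORG` are made up.

```java
/**
 * Hypothetical helper for illustration only; not part of Accumulo, Hadoop, or the patch.
 * Splits a principal of the form primary[/instance][@realm] on the last '@' and the
 * first '/'. Escaped separators and other edge cases are deliberately ignored.
 */
final class KrbPrincipal {
  final String primary;
  final String instance; // null when the principal has no instance
  final String realm; // null when the principal has no realm

  private KrbPrincipal(String primary, String instance, String realm) {
    this.primary = primary;
    this.instance = instance;
    this.realm = realm;
  }

  static KrbPrincipal parse(String name) {
    String rest = name;
    String realm = null;
    int at = rest.lastIndexOf('@');
    if (at >= 0) {
      realm = rest.substring(at + 1);
      rest = rest.substring(0, at);
    }
    String instance = null;
    int slash = rest.indexOf('/');
    if (slash >= 0) {
      instance = rest.substring(slash + 1);
      rest = rest.substring(0, slash);
    }
    return new KrbPrincipal(rest, instance, realm);
  }

  /** Primary only: the truncated form the thread calls the short name. */
  String shortName() {
    return primary;
  }

  /** Primary plus instance (when present), with the realm omitted. */
  String primaryWithInstance() {
    return instance == null ? primary : primary + "/" + instance;
  }

  /** Full form rebuilt from the parsed parts. */
  String fullName() {
    String base = primaryWithInstance();
    return realm == null ? base : base + "@" + realm;
  }

  public static void main(String[] args) {
    KrbPrincipal roaming = parse("alice@EXAMPLE.COM");
    KrbPrincipal pinned = parse("alice/node1.example.com@OTHER.ORG");
    // Distinct principals collapse to one key when only the primary is kept.
    System.out.println(roaming.shortName().equals(pinned.shortName()));
    // Keeping the instance and realm keeps them apart.
    System.out.println(roaming.fullName().equals(pinned.fullName()));
  }
}
```

Keying permissions on `primaryWithInstance()` or `fullName()` instead of `shortName()` corresponds to the two alternatives being weighed in the discussion; which one is appropriate depends on whether a single realm is enforced.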
> > If we lock things down to a single realm (so we can safely omit it in our > internal structures), we'd still need to address the `instance` portion. For > that, it sounded like you were saying that `myPrimary/myInstance@myRealm` is > distinct from `myPrimary@myRealm` and could both be valid users according to > the KDC. If that's the case, I think it makes sense for the permissions > handler/authorizer to use the `primary/instance` for the principal and not > just the `primary` (which is what shortname does), because it could have > different permissions. If the user administrator wishes to allow > `myPrimary@myRealm`, then they should create such a user in the KDC (I hope > I'm understanding this correctly.), so we would just use `myPrimary` as the > user principal in Accumulo, but we shouldn't strip the instance off if it is > present. > > Josh Elser wrote: > > > Is enforcing that clients are a member of the same realm the Accumulo > server principals reside in sufficient? > > Perhaps. Where would we do this? In the site configuration? > > Yeah. My thought was to just piggy-back on top of the realm provided in > the kerberos principal. That keeps us from having to introduce a new property > for something we know that might not be entirely sufficient. > > > we'd still need to address the instance portion. For that, it sounded > like you were saying that myPrimary/myInstance@myRealm is distinct from > myPrimary@myRealm and could both be valid users according to the KDC > > Yes, principals are valid (and distinct!) both with and without an > instance. In our case, I believe the instance being distinct is undesirable > (and where I was going with the reference to how Hadoop does things). Any > server with a given principal (or matching a certain principal) is considered > the Accumulo "system" user (along with the `instance.*` check we mentioned > earlier). 
A simple way to do this (without getting into complicated regexes > defining who is actually considered the system user) is to just treat any > instance also as that user. It requires a bit of coordination in how > KRB principals are created, but it's the "common" configuration/deployment at > the cost of flexibility. I would envision leveraging something similar to the > `auth_to_local` RULEs > (http://web.mit.edu/kerberos/krb5-devel/doc/admin/conf_files/krb5_conf.html) > like Hadoop does, but I don't *really* want to do that right now (mapping > some set of principal regexes to a "user"). This would let us say things like "accumulo/node1.example.com" is "accumulo" as is "old_server/node2.example.com". > > For normal users, convention is that they aren't attached to an instance > (and are valid within the realm), and this implementation would be a > limitation on us for edge cases in KDC configurations. > > > If the user administrator wishes to allow myPrimary@myRealm, then they > should create such a user in the KDC (I hope I'm understanding this > correctly.), so we would just use myPrimary as the user principal in > Accumulo, but we shouldn't strip the instance off if it is present. > > Yes, you are correct. One thing I'm confused about is if there is ever a > case that a user would have an instance in their principal. Not understanding > why this might actually happen pushes me in the direction that truncating > things is ok. That covers "human" users, but "application" users would > likely still be tied to a specific hostname, in which case perhaps I can't punt on > this for now. I really just want to avoid having N `accumulo/hostname` users > in our "database" which would be the sum of all Accumulo server processes. The > regex matching would be needed to avoid that. > > Maybe this is experimental until I do that as well? Maybe I shouldn't > commit any of this without that? I'm not completely decided yet, but I'm > leaning toward the former presently. 
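The "map some set of principal regexes to a user" idea mentioned above could look roughly like the following sketch. `PrincipalMapper` and its rule syntax are hypothetical and written only for this illustration; this is plain `java.util.regex` matching, not the krb5 `auth_to_local` RULE syntax and not how Hadoop implements its mapping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical first-match-wins mapping from principal patterns to a user name. */
final class PrincipalMapper {

  /** One rule: a compiled pattern and the user it maps to. */
  private static final class Rule {
    final Pattern pattern;
    final String user;

    Rule(Pattern pattern, String user) {
      this.pattern = pattern;
      this.user = user;
    }
  }

  private final List<Rule> rules = new ArrayList<>();

  /** Adds a rule; rules are evaluated in insertion order. */
  PrincipalMapper addRule(String regex, String user) {
    rules.add(new Rule(Pattern.compile(regex), user));
    return this;
  }

  /** Returns the user for the first fully matching rule, or the principal unchanged. */
  String map(String principal) {
    for (Rule rule : rules) {
      if (rule.pattern.matcher(principal).matches()) {
        return rule.user;
      }
    }
    return principal;
  }

  public static void main(String[] args) {
    PrincipalMapper mapper = new PrincipalMapper()
        .addRule("accumulo/[^/@]+", "accumulo")
        .addRule("old_server/node2\\.example\\.com", "accumulo");

    System.out.println(mapper.map("accumulo/node1.example.com"));
    System.out.println(mapper.map("old_server/node2.example.com"));
    System.out.println(mapper.map("alice"));
  }
}
```

With rules like these, N `accumulo/hostname` principals would resolve to a single user entry, while any principal that matches no rule keeps its instance.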
> > Christopher Tubbs wrote: > bq. My thought was to just piggy-back on top of the realm provided in the > kerberos principal. > > You mean the server's own realm? That makes sense to me. We can document > that they should match, but we'd need to make sure we explicitly check that. > > > bq. For normal users, convention is that they aren't attached to an > instance (and are valid within the realm), and this implementation would be a > limitation on us for edge cases in KDC configurations. > > My concerns here are for normal users. The !SYSTEM user doesn't even have > permissions or authorizations stored in ZK (it shouldn't anyway). I had > assumed the !SYSTEM user would be treated specially after authentication at > the transport layer. I don't think it should rely on the Kerberos principal. > This relates to our other discussion about the SystemToken. > > bq. One thing I'm confused about is if there is ever a case that a user > would have an instance in their principal. > > I can imagine use cases where a user has permission to access a table, > but only from a specific, vetted system. This is analogous to OpenStack and > EC2 security group / firewall rules which allow access only from specific > sources. MySQL also has this concept in its permissions model. > > bq. That covers "human" users, but "application" users would still be > likely tied to a specific hostname, in which case perhaps I can't punt on > this for now. > > Agreed. > > bq. I really just want to avoid having N accumulo/hostname users in our > "database" which would the sum of all Accumulo server processes. The regex > matching would be needed to avoid that. > > I don't think that's the case. The system user doesn't (shouldn't) write > to the ZK user database. Its permissions are evaluated separately, and it > should never have any authorizations. Rather than regex matching, our > discussion around the SystemToken might help resolve this. 
If the system > credentials (!SYSTEM, SystemToken) are left as-is, then you can keep using > those internally after the transport layer is finished. I wouldn't use the > server's Kerberos principal for the server components. I'd keep using the > existing !SYSTEM principal, but only after the server component is verified > at the transport layer to actually reflect a server component. > > Josh Elser wrote: > > You mean the server's own realm? That makes sense to me. We can > document that they should match, but we'd need to make sure we explicitly > check that. > > Precisely. I plan to add that check. > > > If the system credentials (!SYSTEM, SystemToken) are left as-is, then > you can keep using those internally after the transport layer is finished. > > That's a good point. I was thinking around this, instead of tackling it > directly. If we address the SYSTEM case specifically, is there anything else > we have to do other than switch the shortUserName to the full name > (primary+instance)? I think that would address it. > > Christopher Tubbs wrote: > If we're okay with the restriction that it must be in the same realm, > then that seems all. Just to be clear, are we really sure we want to have > that restriction? It seems like the only reason to restrict it is to avoid > including the realm internally (like in the ZK storage). And, it'll be > problematic if we decide to permit multi-realm authentication in the future. > > We could serialize the whole thing, to future proof, but keep the > restriction to one realm (for now, until we think through the implications of > multi-realm), and conveniently only display without the realm (for example, > in the shell, in whoami, etc.). That way, if we do add multi-realm support > later (by releasing the restriction), we can keep the shorter names for those > in the same realm, and only include the realm when it is different than the > server. 
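The proposal above (store the full principal, restrict clients to the server's realm for now, and hide the realm in displays when it matches) can be sketched as follows. `RealmPolicy` and its method names are hypothetical, chosen only for this illustration; nothing here is taken from the patch.

```java
/** Hypothetical helpers for the "store full name, display short name" idea. */
final class RealmPolicy {

  private RealmPolicy() {}

  /** Returns the text after the last '@', or null if the principal has no realm. */
  static String realmOf(String principal) {
    int at = principal.lastIndexOf('@');
    return at < 0 ? null : principal.substring(at + 1);
  }

  /** True only when both principals carry a realm and the realms are equal. */
  static boolean sameRealm(String clientPrincipal, String serverPrincipal) {
    String clientRealm = realmOf(clientPrincipal);
    return clientRealm != null && clientRealm.equals(realmOf(serverPrincipal));
  }

  /**
   * Name to show in places like the shell: the realm is dropped when it matches the
   * server's realm and kept otherwise. The stored principal is never modified.
   */
  static String displayName(String storedPrincipal, String serverRealm) {
    int at = storedPrincipal.lastIndexOf('@');
    if (at >= 0 && storedPrincipal.substring(at + 1).equals(serverRealm)) {
      return storedPrincipal.substring(0, at);
    }
    return storedPrincipal;
  }

  public static void main(String[] args) {
    String server = "accumulo/node1.example.com@EXAMPLE.COM";
    System.out.println(sameRealm("alice@EXAMPLE.COM", server));
    System.out.println(sameRealm("alice@OTHER.ORG", server));
    System.out.println(displayName("alice/host1@EXAMPLE.COM", realmOf(server)));
    System.out.println(displayName("alice@OTHER.ORG", realmOf(server)));
  }
}
```

Because only the display form is shortened, lifting the single-realm restriction later would not require rewriting what is already stored.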
> > Josh Elser wrote: > Hiding the realm is a possibility. I believe I need to think on this > and/or look at what other projects have done in this regard. Perhaps I'm just > being obstinate in not wanting to include the realm at all, and we should > always have it. I'm not sure. > > I believe that the above decision is also going to impact what we do > about multiple realms. If we store the full principal, the > multi-realm problem goes away. Perhaps that's my sign?
I don't know about signs, but it does seem to be the most future-proof. I'm just not sure how important multi-realm support is, or whether it would be nice to assume a default realm, or whatever. I just haven't used Kerberos enough to know what makes sense.

- Christopher

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29386/#review66382
-----------------------------------------------------------

On Jan. 6, 2015, 6:14 p.m., Josh Elser wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29386/
> -----------------------------------------------------------
>
> (Updated Jan. 6, 2015, 6:14 p.m.)
>
>
> Review request for accumulo.
>
>
> Bugs: ACCUMULO-2815
> https://issues.apache.org/jira/browse/ACCUMULO-2815
>
>
> Repository: accumulo
>
>
> Description
> -------
>
> ACCUMULO-2815 Initial support for Kerberos client authentication.
>
> Leverage SASL transport provided by Thrift which can speak GSSAPI, which
> Kerberos implements. Introduced...
>
> * An Accumulo KerberosToken which is an AuthenticationToken to validate users.
> * Custom thrift processor and invocation handler to ensure server RPCs have a
> valid KRB identity and Accumulo authentication.
> * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos
> identities seamlessly.
> * New ClientConf variables to use SASL transport and pass Kerberos server
> principal
> * Updated ClientOpts and Shell opts to transparently use a KerberosToken when
> SASL is enabled (no extra client work).
>
> I believe this is the "bare minimum" for Kerberos support. These changes are also
> grossly lacking in unit and integration tests. I believe that I might have
> somehow broken the client address string in the server (I saw log messages
> with client: null, but I'm not sure if it's due to these changes or not).
> A necessary limitation in the Thrift server used is that, like the SSL
> transport, the SASL transport cannot presently be used with the
> TFramedTransport, which means none of the [half]async thrift servers will
> function with this -- we're stuck with the TThreadPoolServer.
>
> Performed some contrived benchmarks on my laptop (while still using it
> myself) to get a big-picture view of the performance impact against "normal"
> operation and Kerberos alone. Each "run" was the duration to ingest 100M
> records using continuous-ingest, timed with `time`, using 'real'.
>
> THsHaServer (our default), 6 runs:
>
> Avg: 10m7.273s (607.273s)
> Min: 9m43.395s
> Max: 10m52.715s
>
> TThreadPoolServer (no SASL), 5 runs:
>
> Avg: 11m16.254s (676.254s)
> Min: 10m30.987s
> Max: 12m24.192s
>
> TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs:
>
> Avg: 13m17.187s (797.187s)
> Min: 10m52.997s
> Max: 16m0.975s
>
> The general takeaway is that there's about 15% performance degradation in its
> initial state which is in the realm of what I expected (~10%).
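As a quick check of the arithmetic behind the averages reported above, the relative slowdown can be computed directly from the three average times. This is only a sketch: `IngestOverhead` is a hypothetical name, and the inputs are the averages quoted in the benchmark, with no new measurements. By this arithmetic the SASL runs come out roughly 18% slower than plain TThreadPoolServer and roughly 31% slower than THsHaServer, so which baseline is meant matters when quoting a single figure.

```java
/** Hypothetical helper: relative slowdown between two average ingest durations. */
final class IngestOverhead {

  private IngestOverhead() {}

  /** Percent by which the candidate is slower than the baseline (both in seconds). */
  static double percentSlower(double baselineSeconds, double candidateSeconds) {
    return (candidateSeconds - baselineSeconds) / baselineSeconds * 100.0;
  }

  public static void main(String[] args) {
    // Averages copied from the benchmark summary in the review description.
    double hsHa = 607.273; // THsHaServer
    double threadPool = 676.254; // TThreadPoolServer, no SASL
    double threadPoolSasl = 797.187; // TThreadPoolServer + SASL/GSSAPI

    System.out.printf("TThreadPoolServer vs THsHaServer: %.1f%%%n",
        percentSlower(hsHa, threadPool));
    System.out.printf("SASL vs TThreadPoolServer: %.1f%%%n",
        percentSlower(threadPool, threadPoolSasl));
    System.out.printf("SASL vs THsHaServer: %.1f%%%n",
        percentSlower(hsHa, threadPoolSasl));
  }
}
```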
> > > Diffs > ----- > > core/src/main/java/org/apache/accumulo/core/cli/ClientOpts.java f6ea934 > core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java > 6fe61a5 > core/src/main/java/org/apache/accumulo/core/client/impl/ClientContext.java > e75bec6 > core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java > f481cc3 > core/src/main/java/org/apache/accumulo/core/client/impl/MasterClient.java > a9ad8a1 > > core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportKey.java > 6dc846f > > core/src/main/java/org/apache/accumulo/core/client/impl/ThriftTransportPool.java > 5da803b > > core/src/main/java/org/apache/accumulo/core/client/security/tokens/AbstractKerberosToken.java > PRE-CREATION > > core/src/main/java/org/apache/accumulo/core/client/security/tokens/KerberosToken.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/conf/Property.java e054a5f > core/src/main/java/org/apache/accumulo/core/rpc/FilterTransport.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/rpc/SaslConnectionParams.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/rpc/TTimeoutTransport.java > 6eace77 > core/src/main/java/org/apache/accumulo/core/rpc/ThriftUtil.java 09bd6c4 > core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransport.java > PRE-CREATION > > core/src/main/java/org/apache/accumulo/core/rpc/UGIAssumingTransportFactory.java > PRE-CREATION > core/src/main/java/org/apache/accumulo/core/security/Credentials.java > 525a958 > core/src/test/java/org/apache/accumulo/core/cli/TestClientOpts.java ff49bc0 > > core/src/test/java/org/apache/accumulo/core/client/ClientConfigurationTest.java > PRE-CREATION > > core/src/test/java/org/apache/accumulo/core/client/impl/ThriftTransportKeyTest.java > PRE-CREATION > > core/src/test/java/org/apache/accumulo/core/conf/ClientConfigurationTest.java > 40be70f > > core/src/test/java/org/apache/accumulo/core/rpc/SaslConnectionParamsTest.java > PRE-CREATION 
> > minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloClusterImpl.java > 27d6b19 > > minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloConfigImpl.java > 26c23ed > pom.xml ae188a0 > proxy/src/main/java/org/apache/accumulo/proxy/Proxy.java 4b048eb > > server/base/src/main/java/org/apache/accumulo/server/AccumuloServerContext.java > 09ae4f4 > server/base/src/main/java/org/apache/accumulo/server/init/Initialize.java > 046cfb5 > > server/base/src/main/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingInvocationHandler.java > PRE-CREATION > > server/base/src/main/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingWrapper.java > PRE-CREATION > server/base/src/main/java/org/apache/accumulo/server/rpc/TServerUtils.java > 641c0bf > > server/base/src/main/java/org/apache/accumulo/server/rpc/ThriftServerType.java > PRE-CREATION > > server/base/src/main/java/org/apache/accumulo/server/security/SecurityOperation.java > 5e81018 > > server/base/src/main/java/org/apache/accumulo/server/security/SecurityUtil.java > 29e4939 > > server/base/src/main/java/org/apache/accumulo/server/security/SystemCredentials.java > a59d57c > > server/base/src/main/java/org/apache/accumulo/server/security/handler/KerberosAuthenticator.java > PRE-CREATION > > server/base/src/main/java/org/apache/accumulo/server/thrift/UGIAssumingProcessor.java > PRE-CREATION > > server/base/src/test/java/org/apache/accumulo/server/AccumuloServerContextTest.java > PRE-CREATION > > server/base/src/test/java/org/apache/accumulo/server/rpc/TCredentialsUpdatingInvocationHandlerTest.java > PRE-CREATION > > server/base/src/test/java/org/apache/accumulo/server/security/SystemCredentialsTest.java > 4202a7e > server/gc/src/main/java/org/apache/accumulo/gc/SimpleGarbageCollector.java > 93a9a49 > > server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectWriteAheadLogsTest.java > f98721f > > server/gc/src/test/java/org/apache/accumulo/gc/SimpleGarbageCollectorTest.java 
> 99558b8 > > server/gc/src/test/java/org/apache/accumulo/gc/replication/CloseWriteAheadLogReferencesTest.java > cad1e01 > server/master/src/main/java/org/apache/accumulo/master/Master.java 12195fa > server/tracer/src/main/java/org/apache/accumulo/tracer/TraceServer.java > 7e33300 > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java > d5c1d2f > shell/src/main/java/org/apache/accumulo/shell/Shell.java 58308ff > shell/src/main/java/org/apache/accumulo/shell/ShellOptionsJC.java 8167ef8 > shell/src/test/java/org/apache/accumulo/shell/ShellConfigTest.java 0e72c8c > shell/src/test/java/org/apache/accumulo/shell/ShellOptionsJCTest.java > PRE-CREATION > test/pom.xml b0a926f > test/src/main/java/org/apache/accumulo/test/functional/ZombieTServer.java > eb84533 > > test/src/main/java/org/apache/accumulo/test/performance/thrift/NullTserver.java > 2ebc2e3 > test/src/test/java/org/apache/accumulo/harness/AccumuloClusterIT.java > 8f7e1b7 > test/src/test/java/org/apache/accumulo/harness/MiniClusterHarness.java > abdb627 > test/src/test/java/org/apache/accumulo/harness/MiniClusterKdc.java > PRE-CREATION > test/src/test/java/org/apache/accumulo/harness/SharedMiniClusterIT.java > 2380f66 > > test/src/test/java/org/apache/accumulo/harness/conf/AccumuloMiniClusterConfiguration.java > 11b7530 > > test/src/test/java/org/apache/accumulo/server/security/SystemCredentialsIT.java > fb71f5f > test/src/test/java/org/apache/accumulo/test/ArbitraryTablePropertiesIT.java > aa5c164 > test/src/test/java/org/apache/accumulo/test/CleanWalIT.java 1fcd5a4 > > test/src/test/java/org/apache/accumulo/test/functional/BatchScanSplitIT.java > 221889b > test/src/test/java/org/apache/accumulo/test/functional/KerberosIT.java > PRE-CREATION > test/src/test/resources/log4j.properties cb35840 > > Diff: https://reviews.apache.org/r/29386/diff/ > > > Testing > ------- > > Ensure existing unit tests still function. 
Accumulo is functional and ran > continuous ingest multiple times using a client with only a Kerberos identity > (no user/password provided). Used MIT Kerberos with Apache Hadoop 2.6.0 and > Apache ZooKeeper 3.4.5. > > > Thanks, > > Josh Elser > >