Hi Francis,

First a word of warning -- Hadoop 0.20.203 does not include the append support that HBase needs to avoid data loss in the case of region server failure. I'd _strongly_ recommend you look at running CDH3 (which contains both append support and security) for the moment. There may be an ASF Hadoop 0.20+security+append version release at some point, but there isn't one yet.

Back to the question, you would not want master and region servers to be identified as separate users on HDFS. This would be bound to cause problems (or at least complications) with normal operations.

You _would_ want to have each server identified by a unique kerberos principal, however. The default kerberos principal name form supported by secure Hadoop consists of 3 parts:

    username/hostname@REALM

You can customize how this is parsed out if you have specific needs, but I haven't run into that myself. Only the "username" portion is used by HDFS during access control checks. This is referred to as the "short user name" in the HDFS code. Including hostname in the full kerberos principal prevents the KDC from seeing a normal cluster startup as a credential replay attack (and thus rejecting valid logins), among other things.

So a configuration for an example cluster might be:

Server1:
 - running Master as hbase/[email protected]
Server2:
 - running Region Server as hbase/[email protected]
Server3:
 - running Region Server as hbase/[email protected]
...

This way all HBase files in HDFS wind up being owned by the "hbase" user, and master can read region server logs, region servers can read version and cluster ID files, etc.

We've been running HBase with this type of configuration on secure Hadoop (though our own internal versions are a bit hacked up, to put it mildly), with good results for many months.

Hope this helps.

Gary

On Mon, Jun 20, 2011 at 10:55 AM, Francis Christopher Liu < [email protected]> wrote:

> Hi,
>
> I’m working with Hbase 0.90.3 and hadoop 0.20.203. And I was wondering what
> the reasons would be to have the master and region server be identified as
> different users on hdfs? Is it recommended?
>
> Thanks,
> Francis
>
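
As a rough illustration of the "short user name" point in the reply above, here is a minimal Python sketch that splits principals of the default username/hostname@REALM form and keeps only the username part. This is not Hadoop's own parsing code (which, as noted, can be customized); the function name, the hostnames, and the EXAMPLE.COM realm are all hypothetical and chosen only for the example.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    short_name: str        # the part used for access control checks
    host: Optional[str]    # None if the principal has no /hostname part
    realm: Optional[str]   # None if the principal has no @REALM part


def parse_principal(principal: str) -> Principal:
    """Split 'username/hostname@REALM' into its parts (illustrative only)."""
    name_and_host, _, realm = principal.partition("@")
    short_name, _, host = name_and_host.partition("/")
    if not short_name:
        raise ValueError(f"principal has no user name: {principal!r}")
    return Principal(short_name, host or None, realm or None)


# Hypothetical principals for a three-server cluster: one per server,
# all sharing the same username portion.
principals = [
    "hbase/server1.example.com@EXAMPLE.COM",  # Master
    "hbase/server2.example.com@EXAMPLE.COM",  # Region Server
    "hbase/server3.example.com@EXAMPLE.COM",  # Region Server
]

# Every principal is distinct, but they all collapse to one short name.
short_names = {parse_principal(p).short_name for p in principals}
print(short_names)
```

Each server authenticates with its own principal, yet all three map to the single short name "hbase", which is why the files they write end up owned by the same HDFS user.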
