Dear All,

I am confused about the usage of Kerberos on Hadoop 1.0.3.
I have had difficulty finding documentation on how to configure the security features of Hadoop 1.0.3. Specifically, how should I configure Hadoop so that I can use Kerberos?

The only document I have found that addresses this question is the CDH4 Security Guide (https://ccp.cloudera.com/display/CDH4DOC/CDH4+Security+Guide), which covers security configuration for Cloudera's distribution of Hadoop. But I am not sure whether this guide can be applied directly to Apache Hadoop 1.0.3. After all, I don't know how much CDH4 differs from Apache Hadoop 1.0.3.

I have read some materials published by the Hadoop development team, including the documentation posted on the Apache website (http://hadoop.apache.org/docs/r1.0.3/index.html) and the "Hadoop Security Design" document proposed by Yahoo! in 2009. Unfortunately, I still do not have a clear picture after reading those documents.

All my questions derive from one basic question: are all of the design features in "Hadoop Security Design" included in release 1.0.3? If not, which of those features were introduced in release 1.0.3? Which features are included in Hadoop 2.0? Which features are still not implemented?

For example, the "Hadoop Security Design" document mentions three types of tokens (Delegation Token, Block Access Token and Job Token). Does release 1.0.3 support all three types of tokens?

The 1.0.3 "HDFS Permissions Guide" (http://hadoop.apache.org/docs/r1.0.3/hdfs_permissions_guide.html) says: "In this release of Hadoop the identity of a client process is just whatever the host operating system says it is. For Unix-like systems, ......In the future there will be other ways of establishing user identity (think Kerberos, LDAP, and others). ......". This suggests that 1.0.3 does not fully support Kerberos. If that is the case, to what degree does release 1.0.3 support Kerberos?

So my questions are:

1. Is there any document comparing the security features in each release of Hadoop with the "Hadoop Security Design" proposed by Yahoo!?
2. In release 1.0.3, which components of Hadoop can use Kerberos for security? And how should I configure Hadoop in order to use Kerberos?

I am not very familiar with Kerberos, so if I have misunderstood anything, please feel free to point it out.

Thanks!

Best regards,
Yongzhi
