Repository: incubator-apex-core
Updated Branches:
  refs/heads/master 6de29c12e -> e39c63142


Update security.md

Corrected some typos and rewrote one sentence.

Project: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-apex-core/commit/2e225274
Tree: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/tree/2e225274
Diff: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/diff/2e225274

Branch: refs/heads/master
Commit: 2e22527463071cc059bdb676bde6241ee7fcfeec
Parents: 6de29c1
Author: trusli <[email protected]>
Authored: Tue Mar 22 16:23:41 2016 -0700
Committer: trusli <[email protected]>
Committed: Tue Mar 22 16:23:41 2016 -0700

----------------------------------------------------------------------
 docs/security.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/2e225274/docs/security.md
----------------------------------------------------------------------
diff --git a/docs/security.md b/docs/security.md
index ebdac97..a2b2103 100644
--- a/docs/security.md
+++ b/docs/security.md
@@ -15,7 +15,7 @@ There is Hadoop configuration and CLI configuration. Hadoop 
configuration may be
 
 ###Hadoop Configuration
 
-An Apex application uses delegation tokens to authenticte with the 
ResourceManager (YARN) and NameNode (HDFS) and these tokens are issued by those 
servers respectively. Since the application is long-running,
+An Apex application uses delegation tokens to authenticate with the 
ResourceManager (YARN) and NameNode (HDFS) and these tokens are issued by those 
servers respectively. Since the application is long-running,
the tokens should be valid for the lifetime of the application. Hadoop has 
configuration settings for the maximum lifetime of the tokens, and they should 
be set to cover the lifetime of the application. There are separate settings 
for ResourceManager and NameNode delegation tokens.
 
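For reference, the relevant settings are the standard Hadoop properties 
`yarn.resourcemanager.delegation.token.max-lifetime` and 
`dfs.namenode.delegation.token.max-lifetime`, both in milliseconds. Below is a 
minimal sketch of reading them from the Hadoop configuration; the 7-day 
defaults shown are assumptions, and the cluster's *-site.xml files are assumed 
to be on the classpath.

    import org.apache.hadoop.conf.Configuration;

    public class TokenLifetimes {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Maximum lifetime of ResourceManager delegation tokens, in ms.
            long rmMax = conf.getLong(
                    "yarn.resourcemanager.delegation.token.max-lifetime",
                    7L * 24 * 60 * 60 * 1000);
            // Maximum lifetime of NameNode delegation tokens, in ms.
            long nnMax = conf.getLong(
                    "dfs.namenode.delegation.token.max-lifetime",
                    7L * 24 * 60 * 60 * 1000);
            System.out.println("RM token max lifetime (ms): " + rmMax);
            System.out.println("NN token max lifetime (ms): " + nnMax);
        }
    }
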
@@ -46,7 +46,7 @@ application are performed as that user.
 
 #### Using kinit
 
-A Keberos ticket granting ticket (TGT) can be obtained by using the Kerberos 
command `kinit`. Detailed documentation for the command can be found online or 
in man pages. An sample usage of this command is
+A Kerberos ticket granting ticket (TGT) can be obtained by using the Kerberos 
command `kinit`. Detailed documentation for the command can be found online or 
in man pages. A sample usage of this command is
 
     kinit -k -t path-to-keytab-file kerberos-principal
 
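The same login can also be performed programmatically inside a Java process 
using Hadoop's `UserGroupInformation` API. A minimal sketch, with the 
principal and keytab path as placeholders:

    import java.io.IOException;

    import org.apache.hadoop.security.UserGroupInformation;

    public class KeytabLogin {
        public static void main(String[] args) throws IOException {
            // Equivalent of `kinit -k -t <keytab> <principal>` for this JVM;
            // subsequent Hadoop RPC calls authenticate as this principal.
            UserGroupInformation.loginUserFromKeytab(
                    "user/host@EXAMPLE.COM",      // kerberos-principal (placeholder)
                    "/etc/security/user.keytab"); // path-to-keytab-file (placeholder)
            System.out.println("Logged in as "
                    + UserGroupInformation.getLoginUser().getUserName());
        }
    }
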
@@ -85,7 +85,7 @@ In this section we will see how security works for 
applications built on Apex. W
 
 To launch applications in Apache Apex the command line client dtcli can be 
used. The application artifacts such as binaries and properties are supplied 
as an application package. The client, during the various steps involved in 
launching the application, needs to communicate with both the Resource Manager 
and the Name Node. The Resource Manager communication involves the client 
asking for new resources to run the application master and start the 
application launch process. The steps along with sample Java code are 
described in Writing YARN Applications; a brief sketch also follows the 
illustration below. The Name Node communication includes the application 
artifacts being copied to HDFS so that they are available across the cluster 
for launching the different application containers.
 
-In secure mode the communications with both Resource Manager and Name Node 
requires authentication and the mechanism is Kerberos. Below is an illustration 
showing this.
+In secure mode, the communication with both Resource Manager and Name Node 
requires authentication, and the mechanism is Kerberos. Below is an 
illustration showing this.
 
 ![](images/security/image02.png)               
 
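As a rough sketch of the Resource Manager side of that flow, a YARN client 
typically starts out as below, following the Writing YARN Applications 
pattern. This is a generic sketch, not dtcli's actual code, and error handling 
is omitted.

    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class LaunchSketch {
        public static void main(String[] args) throws Exception {
            YarnConfiguration conf = new YarnConfiguration();
            // In secure mode this connection is authenticated with Kerberos,
            // using the ticket obtained via kinit or a keytab login.
            YarnClient yarnClient = YarnClient.createYarnClient();
            yarnClient.init(conf);
            yarnClient.start();
            // Ask the Resource Manager for a new application id; the request
            // to start the application master container follows from here.
            YarnClientApplication app = yarnClient.createApplication();
            System.out.println("Got application id: "
                    + app.getNewApplicationResponse().getApplicationId());
            yarnClient.stop();
        }
    }
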
@@ -116,11 +116,11 @@ When the application is completely up and running, there 
are different component
 
 Every Apache Apex application has a master process akin to any YARN 
application. In our case it is called STRAM (Streaming Application Master). It 
is a master process that runs in its own container and manages the different 
distributed components of the application. Among other tasks, it requests new 
resources from the Resource Manager as they are needed and gives back 
resources that are no longer needed. STRAM also needs to communicate with the 
Name Node from time to time to access the persistent HDFS file system. 
 
-In secure mode STRAM has to authenticate with both Resource Manager and Name 
Node before it can send any requests and this is achieved using Delegation 
Tokens. Since STRAM runs as a managed application master it runs in a Hadoop 
container. This container could have been allocated on any node based on what 
resources were available. Since there is no fixed node where STRAM runs it does 
not have Kerberos credentials and hence unlike the launch client dtcli it 
cannot authenticate with Hadoop services Resource Manager and Name Node using 
Kerberos. Instead, Delegation Tokens are used for authentication.
+In secure mode, STRAM has to authenticate with both Resource Manager and Name 
Node before it can send any requests, and this is achieved using Delegation 
Tokens. Since STRAM runs as a managed application master, it runs in a Hadoop 
container. This container could have been allocated on any node based on what 
resources were available. Since there is no fixed node where STRAM runs, it 
does not have Kerberos credentials. Unlike the launch client dtcli, it cannot 
authenticate with the Hadoop services Resource Manager and Name Node using 
Kerberos. Instead, Delegation Tokens are used for authentication.
 
 #####Delegation Tokens
 
-Delegation tokens are tokens that are dynamically issued by the source and 
clients use them to authenticate with the source. The source stores the 
delegation tokens it has issued in a cache and checks the delegation token sent 
by a client against the cache. If a match is found, the authentication is 
successful else it fails. This is the second mode of authentication in secure 
Hadoop after Kerberos. More details can be found in the Hadoop security design 
document. In this case the delegation tokens are issued by Resource Manager and 
Name Node. STRAM useswould use these tokens to authenticate with them. But how 
does it get them in the first place? This is where the launch client dtcli 
comes in. 
+Delegation tokens are tokens that are dynamically issued by the source and 
clients use them to authenticate with the source. The source stores the 
delegation tokens it has issued in a cache and checks the delegation token 
sent by a client against the cache. If a match is found, the authentication is 
successful; otherwise it fails. This is the second mode of authentication in 
secure Hadoop after Kerberos. More details can be found in the Hadoop security 
design document. In this case the delegation tokens are issued by Resource 
Manager and Name Node. STRAM uses these tokens to authenticate with them. But 
how does it get them in the first place? This is where the launch client dtcli 
comes in. 
 
 The client dtcli, since it possesses Kerberos credentials as explained in the 
Application Launch section, is able to authenticate with Resource Manager and 
Name Node using Kerberos. It then requests delegation tokens over the 
Kerberos-authenticated connection. The servers return the delegation tokens in 
the response payload. When requesting that the Resource Manager start the 
application master container for STRAM, the client seeds the container with 
these tokens, so that STRAM has them when it starts. It can then use these 
tokens to authenticate with the Hadoop services.
 
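A condensed sketch of that token hand-off, written the way a generic YARN 
client would do it; the renewer name "yarn" and the bare launch context are 
assumptions, and dtcli's actual implementation may differ.

    import java.nio.ByteBuffer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.io.DataOutputBuffer;
    import org.apache.hadoop.security.Credentials;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.util.Records;

    public class TokenSeeding {
        static ContainerLaunchContext seedAmContainer(Configuration conf)
                throws Exception {
            // Over the Kerberos-authenticated connection, ask the Name Node
            // for delegation tokens; they are collected into `credentials`.
            Credentials credentials = new Credentials();
            FileSystem fs = FileSystem.get(conf);
            fs.addDelegationTokens("yarn", credentials); // renewer is a placeholder

            // Serialize the tokens and attach them to the application
            // master's launch context, so STRAM starts with them in place.
            DataOutputBuffer dob = new DataOutputBuffer();
            credentials.writeTokenStorageToStream(dob);
            ContainerLaunchContext amContainer =
                    Records.newRecord(ContainerLaunchContext.class);
            amContainer.setTokens(
                    ByteBuffer.wrap(dob.getData(), 0, dob.getLength()));
            return amContainer;
        }
    }
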
@@ -145,4 +145,4 @@ Like STRAM, streaming containers also need to communicate 
with NameNode to use H
 Conclusion
 -----------
 
-We looked at the different security requirements for distributed applications 
when they run in a secure Hadoop environment and looked at how Apex solves this.
\ No newline at end of file
+We looked at the different security requirements for distributed applications 
when they run in a secure Hadoop environment and saw how Apex addresses them.
