Author: gates
Date: Mon Aug  6 18:14:57 2012
New Revision: 1369907

URL: http://svn.apache.org/viewvc?rev=1369907&view=rev
Log:
HCATALOG-427 Document storage-based authorization

Added:
    
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/authorization.xml
Modified:
    incubator/hcatalog/branches/branch-0.4/CHANGES.txt
    
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/site.xml
    
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/skinconf.xml

Modified: incubator/hcatalog/branches/branch-0.4/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/incubator/hcatalog/branches/branch-0.4/CHANGES.txt?rev=1369907&r1=1369906&r2=1369907&view=diff
==============================================================================
--- incubator/hcatalog/branches/branch-0.4/CHANGES.txt (original)
+++ incubator/hcatalog/branches/branch-0.4/CHANGES.txt Mon Aug  6 18:14:57 2012
@@ -24,6 +24,8 @@ Trunk (unreleased changes)
 
   NEW FEATURES
   HCAT-328 HCatLoader should report its input size so pig can estimate the 
number of reducers (traviscrawford via gates)
+  
+  HCAT-427 Document storage-based authorization (lefty via gates)
 
   IMPROVEMENTS
   HCAT-389 hcat_ping (script to check if HCatalog server is running/reachable) 
(mithun via khorgath)

Added: 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/authorization.xml
URL: 
http://svn.apache.org/viewvc/incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/authorization.xml?rev=1369907&view=auto
==============================================================================
--- 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/authorization.xml
 (added)
+++ 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/authorization.xml
 Mon Aug  6 18:14:57 2012
@@ -0,0 +1,209 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" 
"http://forrest.apache.org/dtd/document-v20.dtd";>
+
+<document>
+  <header>
+    <title>Storage Based Authorization</title>
+  </header>
+  <body>
+
+  <!-- ==================================================================== 
--> 
+
+  <section>
+  <title>Default Authorization Model of Hive</title>
+  
+<p>The default authorization model of Hive supports a traditional RDBMS style 
of authorization based on users, groups and roles and granting them permissions 
to do operations on database or table. It is descibed in more detail in <a 
href="http://wiki.apache.org/hadoop/Hive/LanguageManual+Authorization";>https://cwiki.apache.org/Hive/languagemanual-auth.html</a>.</p>
+
+<p>This RDBMS style of authorization is not very suitable for the typical use 
cases in Hadoop because of the following differences in implementation:</p>
+
+<ol>
+  <li>Unlike a traditional RDBMS, Hive is not in complete control of all data 
underneath it. The data is stored in a number of files, and the file system has 
an independent authorization system.</li>
+  <li>Also unlike a traditional RDBMS which doesn’t allow other programs to 
access the data directly, people tend to use other applications that read or 
write directly into files or directories that get used with Hive.</li>
+</ol>
+
+<p>This creates problem scenarios like: </p>
+
+<ol>
+  <li>You grant permissions to a user, but the user can’t access the 
database or file system because they don’t have file system permissions.</li>
+  <li>You remove permissions for a user, but the user can still access the 
data directly through the file system, because they have file system 
permissions.</li>
+</ol>
+
+  </section>
+
+<!-- ==================================================================== -->
+
+  <section>
+  <title>Storage-System Based Authorization Model</title>
+
+<p>The Hive community realizes that there might not be a one-size-fits-all 
authorization model, so it has support for alternative authorization models to 
be plugged in.</p>
+
+<p>In the HCatalog package, we have introduced implementation of an 
authorization interface that uses the permissions of the underlying file system 
(or in general, the storage backend) as the basis of permissions on each 
database, table or partition.</p>
+
+<p>In Hive, when a file system is used for storage, there is a directory 
corresponding to a database or a table. With this authorization model, the 
read/write permissions a user or group has for this directory determine the 
permissions a user has on the database or table. In the case of other storage 
systems such as HBase, the authorization of equivalent entities in the system 
will be done using the system’s authorization mechanism to determine the 
permissions in Hive.</p>
+
+<p>For example, an alter table operation would check if the user has 
permissions on the table directory before allowing the operation, even if it 
might not change anything on the file system.</p>
+
+<p>A user would need write access to the corresponding entity on the storage 
system to do any type of action that can modify the state of the database or 
table. The user needs read access to be able to do any non-modifying action on 
the database or table.</p>
+
+<p>When the database or table is backed by a file system that has a 
Unix/POSIX-style permissions model (like HDFS), there are read(r) and write(w) 
permissions you can set for the owner user, group and ‘other’. The file 
system’s logic for determining if a user has permission on the directory or 
file will be used by Hive.</p>
+
+<p>Details of HDFS permissions are given here: 
+<a 
href="http://hadoop.apache.org/common/docs/r1.0.2/hdfs_permissions_guide.html";>HDFS
 Permissions Guide</a>.</p>
+
+<p>The following table shows the <strong>minimum</strong> permissions required 
for Hive operations under this authorization model:</p>
+<p>&nbsp;</p>
+
+<table>
+        <tr>
+            <th><p class="cell"><br/>Operation</p></th>
+            <th><p class="center">Database Read Access</p></th>
+            <th><p class="center">Database Write Access</p></th>
+            <th><p class="center">Table Read Access</p></th>
+            <th><p class="center">Table Write Access</p></th>
+        </tr>
+        <tr>
+            <td><p class="cell">LOAD</p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">EXPORT</p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+            <td><p> </p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">IMPORT</p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">CREATE TABLE</p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">CREATE TABLE AS SELECT</p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+            <td><p class="center">X<br/>source table</p></td>
+            <td><p> </p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">DROP TABLE</p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">SELECT</p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+            <td><p> </p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">ALTER TABLE </p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p class="center">X</p></td>
+        </tr>
+        <tr>
+            <td><p class="cell">SHOW TABLES</p></td>
+            <td><p class="center">X</p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+            <td><p> </p></td>
+        </tr>
+</table>
+
+<p>&nbsp;</p>
+<p><strong>Caution:</strong> Hive's current implementation of this 
authorization model does not prevent malicious users from doing bad things. See 
the <a href="authorization.html#Known+Issues">Known Issues</a> section 
below.</p>
+
+  </section>
+
+  <!-- ==================================================================== -->
+
+  <section>
+  <title>Configuring Storage-System Based Authorization</title>
+
+<p>The implementation of the file-system based authorization model is 
available in the HCatalog package. (Support for this is likely to be added to 
the Hive package in the future.) So using this implementation requires 
installing the HCatalog package along with Hive.</p>
+
+<p>The HCatalog jar needs to be added to the Hive classpath. You can add the 
following to hive-env.sh to ensure that it gets added:</p>
+
+<source>export HIVE_AUX_JARS_PATH=&lt;path to hcatalog jar&gt;</source>
+
+<p>The following entries need to be added to hive-site.xml to enable 
authorization: </p>
+
+<source>  &lt;property&gt;
+    &lt;name&gt;hive.security.authorization.enabled&lt;/name&gt;
+    &lt;value&gt;true&lt;/value&gt;
+    &lt;description&gt;enable or disable the hive client 
authorization&lt;/description&gt;
+  &lt;/property&gt;
+
+  &lt;property&gt;
+    &lt;name&gt;hive.security.authorization.manager&lt;/name&gt;
+    
&lt;value&gt;org.apache.hcatalog.security.HdfsAuthorizationProvider&lt;/value&gt;
+    &lt;description&gt;the hive client authorization manager class name.
+    The user defined authorization class should implement interface
+    org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider.
+    &lt;/description&gt;
+  &lt;/property&gt;
+</source>
+
+<p>To disable authorization, set 
<code>hive.security.authorization.enabled</code> to false. To use the default 
authorization model of Hive, don’t set the 
<code>hive.security.authorization.manager</code> property.</p>
+
+  </section>
+
+<!-- ==================================================================== -->
+
+  <section>
+  <title>Creating New Tables or Databases</title>
+
+<p>To create new tables or databases with appropriate permissions, you can 
either use the Hive command line to create the table/database and then modify 
the permissions using a file system operation, or use the HCatalog command line 
(<code>hcat</code>) to create the database/table.</p>
+
+<p>The HCatalog command line tool uses the same syntax as Hive, and will 
create the table or database with a corresponding directory being owned by the 
user creating it, and a group corresponding to the “-g” argument and 
permissions specified in the “-p” argument.</p>
+
+  </section>
+
+<!-- ==================================================================== -->
+
+  <section>
+  <title>Known Issues</title>
+
+<ol>
+  <li>Some metadata operations (mostly read operations) do not check for 
authorization. See <a 
href="https://issues.apache.org/jira/browse/HIVE-3009";>https://issues.apache.org/jira/browse/HIVE-3009</a>.</li>
+  <li>The current implementation of Hive performs the authorization checks in 
the client. This means that malicious users can circumvent these checks.</li>
+  <li>A different authorization provider 
(StorageDelegationAuthorizationProvider) needs to be used for working with 
HBase tables as well. But that is not well tested.</li>
+  <li>Partition files and directories added by a Hive query don’t inherit 
permissions from the table. This means that even if you grant permissions for a 
group to access a table, new partitions will have read permissions only for the 
owner, if the default umask for the cluster is configured as such. See <a 
href="https://issues.apache.org/jira/browse/HIVE-3094";>https://issues.apache.org/jira/browse/HIVE-3094</a>.
 A separate "<code>hdfs chmod</code>" command will be necessary to modify the 
permissions.</li>
+</ol>
+
+  </section>
+
+  </body>
+</document>

Modified: 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/site.xml
URL: 
http://svn.apache.org/viewvc/incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/site.xml?rev=1369907&r1=1369906&r2=1369907&view=diff
==============================================================================
--- 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/site.xml
 (original)
+++ 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/content/xdocs/site.xml
 Mon Aug  6 18:14:57 2012
@@ -47,6 +47,7 @@ See http://forrest.apache.org/docs/linki
     <index label="Storage Formats" href="supportedformats.html" />
     <index label="Dynamic Partitioning" href="dynpartition.html" />
     <index label="Notification" href="notification.html" />    
+    <index label="Authorization" href="authorization.html" />
 
     <api   label="API Docs" href="api/index.html"/>
   </docs>  

Modified: 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/skinconf.xml
URL: 
http://svn.apache.org/viewvc/incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/skinconf.xml?rev=1369907&r1=1369906&r2=1369907&view=diff
==============================================================================
--- 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/skinconf.xml 
(original)
+++ 
incubator/hcatalog/branches/branch-0.4/src/docs/src/documentation/skinconf.xml 
Mon Aug  6 18:14:57 2012
@@ -92,7 +92,7 @@ which will be used to configure the chos
   <disable-copyright-footer>false</disable-copyright-footer>
 
 <!-- @inception enable automatic generation of a date-range to current date -->
-  <year inception="true">2011</year>
+  <year inception="true">2012</year>
   <vendor>The Apache Software Foundation</vendor>
  
     
@@ -192,6 +192,14 @@ which will be used to configure the chos
       background-color: #f0f0f0;
       font-family: monospace;
     }
+    p.cell {
+      margin-left: .5em;
+      padding: .25em;
+    }
+    p.center {
+      text-align: center;
+      padding: .25em;
+    }
     <!--Example:
         To override the colours of links only in the footer.
     -->


Reply via email to