Alexandre Linte created HIVE-13819:
--------------------------------------
Summary: Read & eXecute permissions on Database allows to ALTER it.
Key: HIVE-13819
URL: https://issues.apache.org/jira/browse/HIVE-13819
Project: Hive
Issue Type: Bug
Components: Authorization
Affects Versions: 1.2.1
Environment: Hadoop 2.7.2, Hive 1.2.1, Kerberos.
Reporter: Alexandre Linte
Hi,
As the owner of a Hive database, I can modify the database metadata
even though I only have read and execute permissions on the database
directory in the warehouse.
I expected not to be able to modify this metadata.
Context:
- Hive database configured with the Storage Based Authorization strategy.
- Hive client authorization is disabled.
- Metastore side security is activated.
Permission configuration:
{noformat}
dr-x--x--- - hive9990 hive9990 0 2016-05-20 17:10 /path/to/hive/warehouse/p09990.db
{noformat}
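To make the listing above concrete, here is a minimal Python sketch that decodes an ls-style mode string and checks whether a given user holds the write bit. The helper names (decode_mode, may_write) are hypothetical, and this is not how Hive or HDFS implements the check: it ignores ACLs, superusers, and any group membership other than the list passed in. It only illustrates why a write to this directory by hive9990 would be expected to be refused.

```python
def decode_mode(mode):
    """Split an ls-style mode string such as 'dr-x--x---' into permission sets."""
    if len(mode) != 10:
        raise ValueError("expected a 10-character mode string")
    bits = mode[1:]  # drop the file-type character ('d' for a directory)
    classes = ("owner", "group", "other")
    return {
        name: {flag for flag in bits[i * 3:(i + 1) * 3] if flag != "-"}
        for i, name in enumerate(classes)
    }


def may_write(mode, user, owner, group, user_groups=()):
    """Simplified POSIX-style check: use the first matching class, test 'w'."""
    perms = decode_mode(mode)
    if user == owner:
        return "w" in perms["owner"]
    if group in user_groups:
        return "w" in perms["group"]
    return "w" in perms["other"]


if __name__ == "__main__":
    perms = decode_mode("dr-x--x---")
    # The owner class holds read and execute only.
    print(sorted(perms["owner"]))
    # hive9990 owns the directory but has no write bit.
    print(may_write("dr-x--x---", "hive9990", "hive9990", "hive9990"))
```

Under this simplified model the owner has read and execute but not write, which is the basis for expecting the ALTER below to be rejected.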
ALTER command run as the hive9990 user:
{noformat}
hive (p09990)> ALTER DATABASE p09990 SET DBPROPERTIES ('comment'='database altered');
OK
Time taken: 0.277 seconds
hive (p09990)> DESCRIBE DATABASE EXTENDED p09990;
OK
p09990 hdfs://path/to/hive/warehouse/p09990.db hdfs USER {comment=database altered}
{noformat}
Configuration of hive-site.xml on the metastore:
{noformat}
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.security.authorization.enabled</name>
<value>false</value>
<description>enable or disable the Hive client authorization</description>
</property>
<property>
<name>hive.security.metastore.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
<description>authorization manager class name to be used in the metastore
for authorization.
The user defined authorization class should implement interface
org.apache.hadoop.hive.ql.security.authorization.HiveMetastoreAuthorizationProvider.
</description>
</property>
<property>
<name>hive.metastore.pre.event.listeners</name>
<value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
<description>This turns on metastore-side security.
</description>
</property>
<property>
<name>hive.security.metastore.authorization.auth.reads</name>
<value>true</value>
<description>If this is true, the metastore authorizer authorizes read
actions on database and table.
</description>
</property>
<property>
<name>hive.security.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
<description>The Hive client authorization manager class name.
The user defined authorization class should implement interface
org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider.
</description>
</property>
<property>
<name>hive.security.authorization.createtable.owner.grants</name>
<value>ALL</value>
<description>the privileges automatically granted to the owner whenever a
table gets created.
An example like "select,drop" will grant select and drop privilege to
the owner of the table</description>
</property>
<property>
<name>hive.users.in.admin.role</name>
<value>hdfs</value>
<description>Comma separated list of users who are in admin role for
bootstrapping.
More users can be added in ADMIN role later.</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/path/to/hive/warehouse/</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive
prompt.</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://hiveserver2http01:9083</value>
<description>Thrift uri for the remote metastore. Used by metastore
client to connect to remote metastore.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>JDBC Driver</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hivedb01/metastore</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>metastore</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>********</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
<description>creates necessary schema on a startup if one doesn't exist.
set this to false, after creating it once</description>
</property>
<property>
<name>hive.metastore.authorization.storage.checks</name>
<value>true</value>
<description>Should the metastore do authorization checks against the
underlying storage
for operations like drop-partition (disallow the drop-partition if the user in
question doesn't have permissions to delete the corresponding directory
on the storage).</description>
</property>
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
<description>If true, the metastore thrift interface will be secured with
SASL. Clients must authenticate with Kerberos.</description>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/path/to/metastore.keytab</value>
<description>The path to the Kerberos Keytab file containing the
metastore thrift server's service principal.</description>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>primary/instance@realm</value>
<description>The service principal for the metastore thrift server. The
special string _HOST will be replaced automatically with the correct host
name.</description>
</property>
<property>
<name>hive.server2.max.start.attempts</name>
<value>30</value>
<description>This number of times HiveServer2 will attempt to start
before exiting, sleeping 60 seconds between retries. The default of 30 will
keep trying for 30 minutes.</description>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
<description>Server transport mode. "binary" or "http".</description>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10001</value>
<description>Port number when in HTTP mode.</description>
</property>
<property>
<name>hive.server2.thrift.http.path</name>
<value>bdcorp</value>
<description>Path component of URL endpoint when in HTTP
mode.</description>
</property>
<property>
<name>hive.server2.use.SSL</name>
<value>false</value>
<description>Set this to true for using SSL encryption in
HiveServer2</description>
</property>
<property>
<name>hive.server2.keystore.path</name>
<value></value>
<description>SSL certificate keystore location</description>
</property>
<property>
<name>hive.server2.keystore.password</name>
<value></value>
<description>SSL certificate keystore password.</description>
</property>
<property>
<name>hive.server2.authentication.pam.services</name>
<value></value>
<description>List of the underlying pam services that should be used when
auth type is PAM.
A file with the same name must exist in /etc/pam.d</description>
</property>
<property>
<name>hive.server2.thrift.min.worker.threads</name>
<value>5</value>
<description>Minimum number of Thrift worker threads</description>
</property>
<property>
<name>hive.server2.thrift.max.worker.threads</name>
<value>500</value>
<description>Maximum number of Thrift worker threads</description>
</property>
<property>
<name>hive.server2.thrift.worker.keepalive.time</name>
<value>60</value>
<description>Keepalive time (in seconds) for an idle worker thread.
When number of workers > min workers, excess threads are killed after this
time interval.
</description>
</property>
<property>
<name>hive.server2.thrift.http.cookie.auth.enabled</name>
<value>true</value>
<description>When true, HiveServer2 in HTTP transport mode will use
cookie based authentication mechanism.
</description>
</property>
<property>
<name>hive.server2.thrift.http.cookie.max.age</name>
<value>86400s</value>
<description>Maximum age in seconds for server side cookie used by
HiveServer2 in HTTP mode.
</description>
</property>
<property>
<name>hive.server2.thrift.http.cookie.path</name>
<value></value>
<description>Path for the HiveServer2 generated cookies.
</description>
</property>
<property>
<name>hive.server2.thrift.http.cookie.domain</name>
<value></value>
<description>Domain for the HiveServer2 generated cookies.
</description>
</property>
<property>
<name>hive.server2.thrift.http.cookie.is.secure</name>
<value>true</value>
<description>Secure attribute of the HiveServer2 generated cookie.
</description>
</property>
<property>
<name>hive.server2.thrift.http.cookie.is.httponly</name>
<value>true</value>
<description>HttpOnly attribute of the HiveServer2 generated cookie.
</description>
</property>
<property>
<name>hive.server2.async.exec.threads</name>
<value>100</value>
<description>Number of threads in the async thread pool for
HiveServer2</description>
</property>
<property>
<name>hive.server2.async.exec.shutdown.timeout</name>
<value>10</value>
<description>Time (in seconds) for which HiveServer2 shutdown will wait
for async
threads to terminate</description>
</property>
<property>
<name>hive.server2.async.exec.keepalive.time</name>
<value>10</value>
<description>Time (in seconds) that an idle HiveServer2 async thread
(from the thread pool) will wait
for a new task to arrive before terminating</description>
</property>
<property>
<name>hive.server2.long.polling.timeout</name>
<value>5000</value>
<description>Time in milliseconds that HiveServer2 will wait, before
responding to asynchronous calls that use long polling</description>
</property>
<property>
<name>hive.server2.async.exec.wait.queue.size</name>
<value>100</value>
<description>Size of the wait queue for async thread pool in HiveServer2.
After hitting this limit, the async thread pool will reject new
requests.</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port number of HiveServer2 Thrift interface.
Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>hiveserver2http01</value>
<description>Bind host on which to run the HiveServer2 Thrift interface.
Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
<description>
Client authentication types.
NONE: no authentication check
LDAP: LDAP/AD based authentication
KERBEROS: Kerberos/GSSAPI authentication
CUSTOM: Custom authentication provider
(Use with property hive.server2.custom.authentication.class)
PAM: Pluggable authentication module.
</description>
</property>
<property>
<name>hive.server2.custom.authentication.class</name>
<value></value>
<description>
Custom authentication class. Used when property
'hive.server2.authentication' is set to 'CUSTOM'. Provided class
must be a proper implementation of the interface
org.apache.hive.service.auth.PasswdAuthenticationProvider. HiveServer2
will call its Authenticate(user, passed) method to authenticate requests.
The implementation may optionally extend Hadoop's
org.apache.hadoop.conf.Configured class to grab Hive's Configuration object.
</description>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>primary/instance@realm</value>
<description>
Kerberos server principal
</description>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/path/to/hiveserver2.keytab</value>
<description>
Kerberos keytab file for server principal
</description>
</property>
<property>
<name>hive.server2.authentication.spnego.principal</name>
<value>primary/instance@realm</value>
<description>
SPNego service principal, optional,
typical value would look like HTTP/[email protected]
SPNego service principal would be used by hiveserver2 when kerberos
security is enabled
and HTTP transport mode is used.
This needs to be set only if SPNEGO is to be used in authentication.
</description>
</property>
<property>
<name>hive.server2.authentication.spnego.keytab</name>
<value>/path/to/spnego.keytab</value>
<description>
keytab file for SPNego principal, optional,
typical value would look like /etc/security/keytabs/spnego.service.keytab,
This keytab would be used by hiveserver2 when kerberos security is enabled
and HTTP transport mode is used.
This needs to be set only if SPNEGO is to be used in authentication.
SPNego authentication would be honored only if valid
hive.server2.authentication.spnego.principal
and
hive.server2.authentication.spnego.keytab
are specified
</description>
</property>
<property>
<name>hive.server2.authentication.ldap.url</name>
<value>setindatabag</value>
<description>
LDAP connection URL
</description>
</property>
<property>
<name>hive.server2.authentication.ldap.baseDN</name>
<value>setindatabag</value>
<description>
LDAP base DN
</description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
<description>
Setting this property to true will have HiveServer2 execute
Hive operations as the user making the calls to it.
</description>
</property>
<property>
<name>hive.execution.engine</name>
<value>mr</value>
<description>
Chooses execution engine. Options are: mr (Map reduce, default) or tez
(hadoop 2 only)
</description>
</property>
<property>
<name>hive.mapjoin.optimized.hashtable</name>
<value>true</value>
<description>Whether Hive should use a memory-optimized hash table for
MapJoin.
Only works on Tez, because memory-optimized hash table cannot be serialized.
</description>
</property>
<property>
<name>hive.mapjoin.optimized.hashtable.wbsize</name>
<value>10485760</value>
<description>Optimized hashtable (see hive.mapjoin.optimized.hashtable)
uses a chain of buffers to store data.
This is one buffer size. Hashtable may be slightly faster if this is
larger,
but for small joins unnecessary memory will be allocated and then trimmed.
</description>
</property>
<property>
<name>hive.prewarm.enabled</name>
<value>false</value>
<description>
Enables container prewarm for tez (hadoop 2 only)
</description>
</property>
<property>
<name>hive.prewarm.numcontainers</name>
<value>10</value>
<description>
Controls the number of containers to prewarm for tez (hadoop 2 only)
</description>
</property>
<property>
<name>hive.server2.table.type.mapping</name>
<value>CLASSIC</value>
<description>
This setting reflects how HiveServer2 will report the table types for JDBC
and other
client implementations that retrieve the available tables and supported
table types
HIVE : Exposes Hive's native table types like MANAGED_TABLE,
EXTERNAL_TABLE, VIRTUAL_VIEW
CLASSIC : More generic types like TABLE and VIEW
</description>
</property>
<property>
<name>hive.server2.thrift.sasl.qop</name>
<value>auth</value>
<description>Sasl QOP value; Set it to one of following values to enable
higher levels of
protection for HiveServer2 communication with clients.
"auth" - authentication only (default)
"auth-int" - authentication plus integrity protection
"auth-conf" - authentication plus integrity and confidentiality protection
This is applicable only if HiveServer2 is configured to use Kerberos
authentication.
</description>
</property>
<property>
<name>hive.tez.container.size</name>
<value>-1</value>
<description>By default tez will spawn containers of the size of a
mapper. This can be used to overwrite.</description>
</property>
<property>
<name>hive.tez.java.opts</name>
<value></value>
<description>By default tez will use the java opts from map tasks. This
can be used to overwrite.</description>
</property>
<property>
<name>hive.tez.log.level</name>
<value>INFO</value>
<description>
The log level to use for tasks executing as part of the DAG.
Used only if hive.tez.java.opts is used to configure java opts.
</description>
</property>
<property>
<name>hive.tez.smb.number.waves</name>
<value>1</value>
<description>The number of waves in which to run the SMB
(sort-merge-bucket) join.
Account for cluster being occupied. Ideally should be 1 wave.
</description>
</property>
<property>
<name>hive.tez.cpu.vcores</name>
<value>-1</value>
<description>By default Tez will ask for however many CPUs MapReduce is
configured to use per container.
This can be used to overwrite the default.
</description>
</property>
<property>
<name>hive.tez.auto.reducer.parallelism</name>
<value>false</value>
<description>Turn on Tez' auto reducer parallelism feature. When enabled,
Hive will still estimate data sizes and set parallelism estimates.
Tez will sample source vertices' output sizes and adjust the estimates at
runtime as necessary.
</description>
</property>
<property>
<name>hive.auto.convert.join</name>
<value>true</value>
<description>
</description>
</property>
<property>
<name>hive.auto.convert.join.noconditionaltask</name>
<value>true</value>
<description>
</description>
</property>
<property>
<name>hive.auto.convert.join.noconditionaltask.size</name>
<value>1</value>
<description>
</description>
</property>
<property>
<name>hive.vectorized.execution.enabled</name>
<value>true</value>
<description>This flag should be set to true to enable vectorized mode of
query execution. The default value is false.
</description>
</property>
<property>
<name>hive.vectorized.execution.reduce.enabled</name>
<value>false</value>
<description>This flag should be set to true to enable vectorized mode of
the reduce-side of query execution. The default value is true.
</description>
</property>
<property>
<name>hive.cbo.enable</name>
<value>true</value>
<description>When true, the cost based optimizer, which uses the Calcite
framework, will be enabled.
</description>
</property>
<property>
<name>hive.fetch.task.conversion</name>
<value>more</value>
<description>Some select queries can be converted to a single FETCH task,
minimizing latency.
Currently the query should be single sourced not having any subquery and
should not have any aggregations or distincts
(which incur RS – ReduceSinkOperator, requiring a MapReduce task), lateral
views and joins.
</description>
</property>
<property>
<name>hive.fetch.task.conversion.threshold</name>
<value>1073741824</value>
<description>Input threshold (in bytes) for applying
hive.fetch.task.conversion.
If target table is native, input length is calculated by summation of file
lengths.
If it's not native, the storage handler for the table can optionally
implement the org.apache.hadoop.hive.ql.metadata.InputEstimator interface.
A negative threshold means hive.fetch.task.conversion is applied without
any input length threshold.
</description>
</property>
<property>
<name>hive.fetch.task.aggr</name>
<value>false</value>
<description>Aggregation queries with no group-by clause (for example,
select count(*) from src) execute final aggregations in a single reduce task.
If this parameter is set to true, Hive delegates the final aggregation
stage to a fetch task, possibly decreasing the query time.
</description>
</property>
<property>
<name>hive.spark.job.monitor.timeout</name>
<value>60</value>
<description>Timeout for job monitor to get Spark job state.
</description>
</property>
<property>
<name>hive.spark.client.future.timeout</name>
<value>60</value>
<description>Timeout for requests from Hive client to remote Spark driver.
</description>
</property>
<property>
<name>hive.spark.client.connect.timeout</name>
<value>1000</value>
<description>Timeout for remote Spark driver in connecting back to Hive
client.
</description>
</property>
<property>
<name>hive.spark.client.channel.log.level</name>
<value></value>
<description>Channel logging level for remote Spark driver. One of DEBUG,
ERROR, INFO, TRACE, WARN. If unset, TRACE is chosen.
</description>
</property>
<property>
<name>hive.server2.tez.default.queues</name>
<value></value>
<description>
A list of comma separated values corresponding to yarn queues of the same
name.
When hive server 2 is launched in tez mode, this configuration needs to be
set
for multiple tez sessions to run in parallel on the cluster.
</description>
</property>
<property>
<name>hive.server2.tez.sessions.per.default.queue</name>
<value>1</value>
<description>
A positive integer that determines the number of tez sessions that should be
launched on each of the queues specified by
"hive.server2.tez.default.queues".
Determines the parallelism on each queue.
</description>
</property>
<property>
<name>hive.server2.tez.initialize.default.sessions</name>
<value>false</value>
<description>
This flag is used in hive server 2 to enable a user to use hive server 2
without
turning on tez for hive server 2. The user could potentially want to run
queries
over tez without the pool of sessions.
</description>
</property>
<property>
<name>hive.support.sql11.reserved.keywords</name>
<value>true</value>
<description>Whether to enable support for SQL2011 reserved keywords.
When enabled, will support (part of) SQL2011 reserved keywords.
</description>
</property>
<property>
<name>hive.aux.jars.path</name>
<value></value>
<description>A comma separated list (with no spaces) of the jar
files</description>
</property>
</configuration>
{noformat}
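Since only a handful of the properties above bear on authorization, a short script can pull them out of a file in this format. The following is a minimal Python sketch using only the standard library. The function names (load_properties, auth_summary) and the embedded SAMPLE are illustrative assumptions, and the key list is simply the set of authorization-related property names that appear in the configuration quoted above.

```python
import xml.etree.ElementTree as ET

# Authorization-related property names taken from the configuration above.
AUTH_KEYS = (
    "hive.security.authorization.enabled",
    "hive.security.metastore.authorization.manager",
    "hive.metastore.pre.event.listeners",
    "hive.security.metastore.authorization.auth.reads",
    "hive.metastore.authorization.storage.checks",
)

# Small stand-in for a hive-site.xml file (illustrative only).
SAMPLE = """
<configuration>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.security.metastore.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
  </property>
  <property>
    <name>hive.metastore.pre.event.listeners</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def auth_summary(props):
    """Pick out the authorization-related keys; missing keys map to None."""
    return {key: props.get(key) for key in AUTH_KEYS}


if __name__ == "__main__":
    for key, value in auth_summary(load_properties(SAMPLE)).items():
        print(f"{key} = {value}")
```

To run this against a real file, the text can be read from disk and passed to load_properties; keys that are absent from the file are reported as None rather than guessed.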
Best regards.