Hi Hive Team,

As per my understanding, in Hive you can create two kinds of tables:
managed and external.

In the case of a managed table, Hive owns the data, so when you drop the
table the underlying data is deleted as well.

In the case of an external table, Hive does not own the data, so when you
drop such a table only the metadata is deleted; the underlying data is left
intact.
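
For example, my mental model of the difference is roughly the following
(hypothetical table names and paths, just to illustrate):

        -- Managed table: Hive owns the files under its warehouse directory,
        -- so DROP TABLE removes both the metadata and the data.
        CREATE TABLE tweets_managed (id BIGINT, text STRING);
        DROP TABLE tweets_managed;   -- data files are deleted

        -- External table: Hive only records metadata for files that live
        -- elsewhere, so DROP TABLE removes the metadata and leaves the files.
        CREATE EXTERNAL TABLE tweets_ext (id BIGINT, text STRING)
        LOCATION '/data/some/existing/folder/';
        DROP TABLE tweets_ext;       -- files under LOCATION are untouched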

Recently, however, I have observed that you cannot create an external table
over an HDFS location on which you do not have write (modification)
permission. I completely fail to understand this.

Use case: it is quite common that the data you are churning is huge and
read-only. So, to churn such data via Hive, would you have to copy it all to
a location on which you have write permission?

Please help.

My data is located in an HDFS folder
(/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/), on which I
have only read-only permission, and I am trying to execute the following
command:

CREATE EXTERNAL TABLE tweets_raw (
        id BIGINT,
        created_at STRING,
        source STRING,
        favorited BOOLEAN,
        retweet_count INT,
        retweeted_status STRUCT<
                text:STRING,
                users:STRUCT<screen_name:STRING,name:STRING>>,
        entities STRUCT<
                urls:ARRAY<STRUCT<expanded_url:STRING>>,
                user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
                hashtags:ARRAY<STRUCT<text:STRING>>>,
        text STRING,
        user1 STRUCT<
                screen_name:STRING,
                name:STRING,
                friends_count:INT,
                followers_count:INT,
                statuses_count:INT,
                verified:BOOLEAN,
                utc_offset:STRING, -- was INT but nulls are strings
                time_zone:STRING>,
        in_reply_to_screen_name STRING,
        year INT,
        month INT,
        day INT,
        hour INT
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
LOCATION '/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/';

It throws the following error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:java.security.AccessControlException: Permission denied: user=sandeep, access=WRITE, inode="/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw":hdfs:hdfs:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1771)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1755)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1729)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8348)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:1978)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1443)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)



-- 
Regards,
Sandeep Giri,
+1-(347) 781-4573 (US)
+91-953-899-8962 (IN)
www.CloudxLab.com  (A Hadoop cluster for practicing)
