Hi

[margusja@bigdata29 ~]$ hdfs dfs -count /tmp/files_10k
           1       100000             588895 /tmp/files_10k
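For anyone who wants to reproduce this, here is a minimal sketch (my reconstruction, not the original commands) of how a directory with that many small one-column files could be generated; the file names and the default count are assumptions:

```shell
# Sketch: create N tiny files, each holding a single integer, then push the
# directory to HDFS so it can back an external table.
N=${N:-1000}                 # the test in this thread used 100000 files
mkdir -p files_10k
for i in $(seq 1 "$N"); do
  echo "$i" > "files_10k/$i" # one value per file matches the (i int) schema
done
# Upload only if an HDFS client is actually on this machine.
if command -v hdfs >/dev/null; then
  hdfs dfs -put -f files_10k /tmp/files_10k
fi
```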

Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
Transaction isolation: TRANSACTION_REPEATABLE_READ
1: jdbc:hive2://bigdata29.webmedia.int:10000/> create external table files_10k (i int) row format delimited fields terminated by '\t' location '/tmp/files_10k';
Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0)
1: jdbc:hive2://bigdata29.webmedia.int:10000/>


In the namenode log there are loads of lines like this:

2016-05-17 01:57:55,408 INFO ipc.Server (Server.java:saslProcess(1386)) - Auth successful for hive/bigdata29.webmedia....@testhadoop.com (auth:KERBEROS)
2016-05-17 01:57:55,409 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(135)) - Authorization successful for margusja (auth:PROXY) via hive/bigdata29.webmedia....@testhadoop.com (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol

In hiveserver2.log there are loads of lines like this:

2016-05-17 01:58:40,202 INFO [org.apache.hadoop.util.JvmPauseMonitor$Monitor@6704df84]: util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause in JVM or host machine (eg GC): pause of approximately 1221ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1251ms

2016-05-17 02:00:12,021 INFO [org.apache.hadoop.util.JvmPauseMonitor$Monitor@6704df84]: util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause in JVM or host machine (eg GC): pause of approximately 1455ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1946ms
2016-05-17 02:00:13,963 INFO [org.apache.hadoop.util.JvmPauseMonitor$Monitor@6704df84]: util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause in JVM or host machine (eg GC): pause of approximately 1441ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1928ms
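One way to confirm the per-file hypothesis is to count the authorization lines the NameNode logged while the statement ran. This is a hypothetical check; the log path below is an assumption and will differ per install:

```shell
# Count per-file authorization decisions for the proxied user. If the count
# is close to the number of files under /tmp/files_10k, the check is per file.
LOG=${LOG:-/var/log/hadoop/hdfs/hadoop-hdfs-namenode-bigdata29.log}  # assumed path
if [ -r "$LOG" ]; then
  grep -c 'Authorization successful for margusja' "$LOG"
else
  echo "namenode log not found at $LOG"
fi
```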


Now I disable Ranger and try again:

Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
Transaction isolation: TRANSACTION_REPEATABLE_READ
4: jdbc:hive2://bigdata29.webmedia.int:10000/> create external table files_10k (i int) row format delimited fields terminated by '\t' location '/tmp/files_10k';
No rows affected (1.399 seconds)
4: jdbc:hive2://bigdata29.webmedia.int:10000/>

Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 48 780

On 17/05/16 01:24, Don Bosco Durai wrote:
There is an implicit check done by HiveServer2 to make sure the user has access to the external files. You are correct, at the HDFS level, each file permission is individually checked.

What sort of error are you getting in the Hive log file? And is there any error on the HDFS side?

Thanks

Bosco


From: Margus Roo <mar...@roo.ee <mailto:mar...@roo.ee>>
Reply-To: <user@ranger.incubator.apache.org <mailto:user@ranger.incubator.apache.org>>
Date: Monday, May 16, 2016 at 7:36 AM
To: <user@ranger.incubator.apache.org <mailto:user@ranger.incubator.apache.org>>
Subject: Can not create hive2 external table

    Hi

    In case I try to create an external table, and there are for
    example 100 000 files in the location I point to in the Hive DDL,
    and Ranger authorization is enabled, then I get various errors in
    the Hive log. Mainly they are GC timeouts.
    In the HDFS namenode log I can see loads of authorization rows. I
    think Ranger is doing a check for every single file?!

    In case I disable Ranger authorization for Hive, then it creates
    the external table.

    So - am I doing something wrong?

--
    Margus (margusja) Roo
    http://margus.roo.ee
    skype: margusja
    +372 51 48 780

