[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995693#comment-16995693 ] Hive QA commented on HIVE-9736: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12739895/HIVE-9736.8.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19918/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19918/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19918/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2019-12-13 15:11:21.427 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-19918/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2019-12-13 15:11:21.430 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 25efd17 HIVE-22327: Repl: Ignore read-only transactions in notification log (Denys Kuzmenko reviewed by mahesh kumar behera and Peter Vary) + git clean -f -d Removing standalone-metastore/metastore-server/src/gen/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 25efd17 HIVE-22327: Repl: Ignore read-only transactions in notification log (Denys Kuzmenko reviewed by mahesh kumar behera and Peter Vary) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2019-12-13 15:11:22.847 + rm -rf ../yetus_PreCommit-HIVE-Build-19918 + mkdir ../yetus_PreCommit-HIVE-Build-19918 + git gc + cp -R . ../yetus_PreCommit-HIVE-Build-19918 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-19918/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/common/src/java/org/apache/hadoop/hive/common/FileUtils.java: does not exist in index error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not exist in index error: a/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java: does not exist in index error: a/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java: does not exist in index error: a/shims/common/src/main/java/org/apache/hadoop/fs/DefaultFileAccess.java: does not exist in index error: a/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java: does not exist in index error: 
a/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: does not exist in index error: patch failed: common/src/java/org/apache/hadoop/hive/common/FileUtils.java:25 Falling back to three-way merge... Applied patch to 'common/src/java/org/apache/hadoop/hive/common/FileUtils.java' with conflicts. error: patch failed: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:1634 Falling back to three-way merge... Applied patch to 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' cleanly. error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java:18 Falling back to three-way merge... Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java' with conflicts. error: patch failed: shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:29 Falling back to three-way merge... Applied patch to 'shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java' with conflicts. error: patch failed: shims/common/src/main/java/org/apache/hadoop/fs/DefaultFileAccess.java:18 Falling back to three-way merge... Applied p
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995221#comment-16995221 ] Mithun Radhakrishnan commented on HIVE-9736: [~zshao], sorry I missed your comment. You're right about this not being faster today, since the DFSClient currently just loops on the client side. The intention was for the file-list to be pushed down to the name-node, so as to avoid the loop. We intended to request that the {{DFSClient}} implement this. :/ > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Affects Versions: 1.2.1 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan >Priority: Major > Labels: TODOC1.2 > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch, > HIVE-9736.8.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752550#comment-16752550 ] Hive QA commented on HIVE-9736: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12739895/HIVE-9736.8.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15795/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15795/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15795/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2019-01-25 18:31:06.721 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-15795/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2019-01-25 18:31:06.724 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 6d4b19b HIVE-11708 : Logical operators raises ClassCastExceptions with NULL (Ryu Kobayashi via Ashutosh Chauhan) + git clean -f -d Removing standalone-metastore/metastore-server/src/gen/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 6d4b19b HIVE-11708 : Logical operators raises ClassCastExceptions with NULL (Ryu Kobayashi via Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2019-01-25 18:31:07.350 + rm -rf ../yetus_PreCommit-HIVE-Build-15795 + mkdir ../yetus_PreCommit-HIVE-Build-15795 + git gc + cp -R . ../yetus_PreCommit-HIVE-Build-15795 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-15795/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/common/src/java/org/apache/hadoop/hive/common/FileUtils.java: does not exist in index error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not exist in index error: a/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java: does not exist in index error: a/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java: does not exist in index error: a/shims/common/src/main/java/org/apache/hadoop/fs/DefaultFileAccess.java: does not exist in index error: a/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java: does not exist in index error: 
a/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: does not exist in index error: patch failed: common/src/java/org/apache/hadoop/hive/common/FileUtils.java:25 Falling back to three-way merge... Applied patch to 'common/src/java/org/apache/hadoop/hive/common/FileUtils.java' with conflicts. error: patch failed: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:1634 Falling back to three-way merge... Applied patch to 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' cleanly. error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java:18 Falling back to three-way merge... Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java' with conflicts. error: patch failed: shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:29 Falling back to three-way merge... Applied patch to 'shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java' with conflicts. error: patch failed: shims/common/src/main/java/org/apache/hadoop/fs/DefaultFileAccess.java:18 Falling back to three-way merge... Applied patch to 'shims/common/src/main/java/org/apache/hadoop
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752547#comment-16752547 ] Zheng Shao commented on HIVE-9736: -- Are we sure that FileSystem.listStatus(Path[]) is faster than a loop over FileSystem.listStatus(Path)? The following code is in the Hadoop FileSystem class. The HDFS DistributedFileSystem class didn't override this method.
{code:java}
public FileStatus[] listStatus(Path[] files, PathFilter filter) throws FileNotFoundException, IOException {
  ArrayList results = new ArrayList();
  for(int i = 0; i < files.length; ++i) {
    this.listStatus(results, files[i], filter);
  }
  return (FileStatus[])results.toArray(new FileStatus[results.size()]);
}
{code}
> StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Affects Versions: 1.2.1 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan >Priority: Major > Labels: TODOC1.2 > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch, > HIVE-9736.8.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
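The point raised in the comment above can be illustrated with a small, self-contained sketch. It does not use Hadoop: {{CountingFs}} and both helper methods are hypothetical stand-ins, written only to show that an array overload implemented as a client-side loop issues the same number of per-path calls as an explicit loop in the caller.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchListingSketch {

    /** Hypothetical stand-in for a file system client; counts simulated per-path round trips. */
    public static class CountingFs {
        int perPathCalls = 0;

        // Stand-in for a single-path listing: one simulated round trip per call.
        public List<String> listStatus(String dir) {
            perPathCalls++;
            List<String> out = new ArrayList<>();
            out.add(dir + "/part-00000");
            return out;
        }

        // Array overload shaped like the quoted base-class method: a plain client-side loop.
        public List<String> listStatus(String[] dirs) {
            List<String> results = new ArrayList<>();
            for (String dir : dirs) {
                results.addAll(listStatus(dir));
            }
            return results;
        }
    }

    /** Per-path calls made when the caller loops over the directories itself. */
    public static int callsWithExplicitLoop(String[] dirs) {
        CountingFs fs = new CountingFs();
        for (String dir : dirs) {
            fs.listStatus(dir);
        }
        return fs.perPathCalls;
    }

    /** Per-path calls made when the caller uses the array overload instead. */
    public static int callsWithArrayOverload(String[] dirs) {
        CountingFs fs = new CountingFs();
        fs.listStatus(dirs);
        return fs.perPathCalls;
    }

    public static void main(String[] args) {
        String[] dirs = {
            "/t/dt=20150101/region=us",
            "/t/dt=20150101/region=uk",
            "/t/dt=20150101/region=in"
        };
        System.out.println(callsWithExplicitLoop(dirs));
        System.out.println(callsWithArrayOverload(dirs));
    }
}
```

In this sketch both counts are equal, so any saving would have to come from a file system that overrides the array overload with a genuinely batched request.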
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588814#comment-14588814 ] Sushanth Sowmyan commented on HIVE-9736: Looks like we have a regression : org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition is failing while it shouldn't. This happened in the 9th May run as well. Error Message : expected:<1> but was:<0>
Stacktrace:
{noformat}
java.lang.AssertionError: expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.dropPartitionByOtherUser(TestStorageBasedMetastoreAuthorizationDrops.java:202)
at org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition(TestStorageBasedMetastoreAuthorizationDrops.java:172)
{noformat}
[~mithun], if we can look at this and resolve this, we can get this into 1.2.1, but if not, then I'm afraid this will have to be deferred out of branch-1.2, and make it in 1.3/2.0 . > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Affects Versions: 1.2.1 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Labels: TODOC1.2 > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch, > HIVE-9736.8.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. 
Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588792#comment-14588792 ] Hive QA commented on HIVE-9736: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12739895/HIVE-9736.8.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9008 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4275/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4275/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4275/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12739895 - PreCommit-HIVE-TRUNK-Build > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Affects Versions: 1.2.1 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Labels: TODOC1.2 > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch, > HIVE-9736.8.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. 
Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536981#comment-14536981 ] Hive QA commented on HIVE-9736: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12731062/HIVE-9736.7.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8912 tests executed *Failed tests:* {noformat} TestSparkClient - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition org.apache.hive.jdbc.TestSSL.testSSLFetchHttp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3833/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3833/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3833/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12731062 - PreCommit-HIVE-TRUNK-Build > StorageBasedAuthProvider should batch namenode-calls where possible. 
> > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Affects Versions: 1.2.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Labels: TODOC1.2 > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535883#comment-14535883 ] Sushanth Sowmyan commented on HIVE-9736: Removing fix version of 1.2.0 in preparation of release, since this is not a blocker for 1.2.0. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Affects Versions: 1.2.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Labels: TODOC1.2 > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535528#comment-14535528 ] Sushanth Sowmyan commented on HIVE-9736: I did not find this in the precommit queue, so I've manually added it in now : build#3815 should test this. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Labels: TODOC1.2 > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534035#comment-14534035 ] Lefty Leverenz commented on HIVE-9736: -- Doc note: This adds configuration parameter *hive.authprovider.hdfs.liststatus.batch.size* to HiveConf.java, so it needs to be documented in the wiki (for whatever release it ends up in). * [Configuration Properties -- Authentication/Authorization | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Authentication/Authorization] > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Labels: TODOC1.2 > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
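The thread does not show how the patch consumes *hive.authprovider.hdfs.liststatus.batch.size*. As a rough illustration only, a batch-size setting of this kind is commonly applied by splitting the list of partition directories into chunks of at most that many entries. The class and method names below are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSizeSketch {

    /**
     * Splits items into consecutive batches of at most batchSize elements.
     * Hypothetical helper; the real patch may batch differently.
     */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> dirs = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            dirs.add("/t/dt=20150101/region=r" + i);
        }
        // With an illustrative batch size of 2, five directories fall into three batches.
        System.out.println(partition(dirs, 2).size());
    }
}
```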
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532252#comment-14532252 ] Sushanth Sowmyan commented on HIVE-9736: Sorry about the null-check, yup, I meant to check statuses.next(), but looks like that's not necessary. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531961#comment-14531961 ] Chris Nauroth commented on HIVE-9736: - I apologize for missing this in my code review. I'm +1 (non-binding) for patch v7 pending a fresh test run. I reran these tests locally and they passed, although they were also passing with the prior patch for me. Mithun, thank you for updating the patch. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch, HIVE-9736.7.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531885#comment-14531885 ] Mithun Radhakrishnan commented on HIVE-9736: @[~sushanth]: Quick question about the null-check: If the {{statuses}} are a result of {{FileSystem.listStatus(Path[])}}, then I don't see them being null, or returning null from {{FileStatus.getPath()}}. I think I might have missed the point you made. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531851#comment-14531851 ] Mithun Radhakrishnan commented on HIVE-9736: [~spena], [~sushanth], thanks for reporting the bug. Sorry for the inconvenience. I'll update the patch and see if that sorts things out. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531809#comment-14531809 ] Sushanth Sowmyan commented on HIVE-9736: Also, since this is a performance patch rather than an outage or regression, and since it is not trivial either, I'm marking it as tentative for inclusion in 1.2.0, i.e. if it gets done in time, we will include it, but if not, we mark it for 1.2.1. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531800#comment-14531800 ] Sushanth Sowmyan commented on HIVE-9736: The problem, as Sergio mentions, is that accessMethod is instantiated via reflection as a method that can take a Path and a FsAction. In the call, however, it is called with a FileStatus and an FsAction. To wit, this will fix it: {noformat} - accessMethod.invoke(fs, statuses.next(), combine(actions)); + accessMethod.invoke(fs, statuses.next().getPath(), combine(actions)); {noformat} This is easily fixed as a one-line fix, but I feel we need more testing. At the very least, I can see a case for a nullcheck past what I just mentioned. At this time, I recommend we close HIVE-10638 as a DUPLICATE, revert HIVE-9736, reopen it, add this fix in, run through the precommit tests again, and then get it in. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
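The mismatch described in the comment above can be reproduced without Hadoop. The following is a minimal sketch, assuming stand-in classes (`StubPath`, `StubFileStatus`, `StubFsAction`, `StubFileSystem` are hypothetical names, not the real Hadoop types): a method looked up reflectively with a path-typed parameter rejects a status object, and accepts the result of `getPath()`.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for the Hadoop types named in the thread; not the real classes.
class StubPath {
    final String name;
    StubPath(String name) { this.name = name; }
}

class StubFileStatus {
    private final StubPath path;
    StubFileStatus(StubPath path) { this.path = path; }
    StubPath getPath() { return path; }
}

enum StubFsAction { READ, WRITE, EXECUTE }

class StubFileSystem {
    // Same shape as described in the thread: access takes a path and an action.
    public void access(StubPath path, StubFsAction action) {
        // A real file system would ask the namenode here; the stub accepts everything.
    }
}

class AccessMethodDemo {
    // Looks up access(StubPath, StubFsAction) reflectively and invokes it with firstArg.
    static String tryInvoke(Object firstArg) throws Exception {
        Method accessMethod =
            StubFileSystem.class.getMethod("access", StubPath.class, StubFsAction.class);
        try {
            accessMethod.invoke(new StubFileSystem(), firstArg, StubFsAction.READ);
            return "ok";
        } catch (IllegalArgumentException e) {
            // Thrown by Method.invoke when an argument is not of the declared parameter type.
            return "IllegalArgumentException";
        }
    }

    public static void main(String[] args) throws Exception {
        StubFileStatus status = new StubFileStatus(new StubPath("/user/hive/warehouse/a"));
        System.out.println(tryInvoke(status));           // status object: rejected
        System.out.println(tryInvoke(status.getPath())); // path: accepted
    }
}
```

Because the check happens inside `Method.invoke` at run time, the compiler cannot catch this kind of error, which is consistent with the call for more testing in the comment above.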
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531722#comment-14531722 ] Sushanth Sowmyan commented on HIVE-9736: Hi Sergio, thanks for the catch, have filed https://issues.apache.org/jira/browse/HIVE-10638 for the same. [~mithun], could you please look at that issue? I will look through it too. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531704#comment-14531704 ] Sergio Peña commented on HIVE-9736: --- Hi [~mithun] This patch is causing the above tests to fail due to the change to {{Hadoop23Shims.checkFileAccess(FileSystem fs, Iterator statuses, EnumSet actions)}}. The line that fails is {{accessMethod.invoke(fs, statuses.next(), combine(actions));}} I am running Hadoop 2.6.0, and the FileSystem.access() method accepts a Path and an FsAction. When I run the code that checks path permissions, I get this error: {noformat} hive> explain select * from a join b on a.id = b.id; FAILED: SemanticException Unable to determine if hdfs://localhost:9000/user/hive/warehouse/a is read only: java.lang.IllegalArgumentException: argument type mismatch {noformat} Is there a follow-up jira for this error? > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528724#comment-14528724 ] Chris Nauroth commented on HIVE-9736: - [~sushanth], thank you for your review and the commit! > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 1.2.0 > > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528696#comment-14528696 ] Sushanth Sowmyan commented on HIVE-9736: +1 : Have looked through patch and it makes sense. Tests pass, and I trust Chris' judgement on this for a more detailed verification. :) Will commit to master and branch-1.2 > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528348#comment-14528348 ] Hive QA commented on HIVE-9736: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12730299/HIVE-9736.6.patch {color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8895 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition 
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3731/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3731/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3731/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 24 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12730299 - PreCommit-HIVE-TRUNK-Build > StorageBasedAuthProvider should batch namenode-calls where possible. 
> > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527506#comment-14527506 ] Chris Nauroth commented on HIVE-9736: - I verified with both {{-Phadoop-2}} and {{-Phadoop-1}}. Thanks again, Mithun! > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527362#comment-14527362 ] Chris Nauroth commented on HIVE-9736: - Just as a reminder, we were asked to check the build with {{-Phadoop-1}}. I can volunteer to do that, but I think we'll need one more final revision of the patch intended to be committed. I'm +1 (non-binding) for the changes shown in the last patch though, so if it's just a rebase, then that wouldn't change. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527348#comment-14527348 ] Chris Nauroth commented on HIVE-9736: - I figured we could make {{DefaultFileAccess#combine}} public, and then {{Hadoop23Shims}} could call it. hive-shims-0.23 already has a dependency on hive-shims-common. However, if there is a detail that I'm missing, then I wouldn't intend to hold up the patch over making that change. +1 (non-binding) from me, and I defer to you on what's best to do with {{combine}} right now. Thank you for the patch, and thank you for responding to the code review feedback! > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527314#comment-14527314 ] Mithun Radhakrishnan commented on HIVE-9736: Thanks for the review, Chris. Would it be alright if we moved the {{combine()}} code to a common place as part of a separate JIRA? I didn't do this here because both call-sites are in different packages, and adding a dependency would be involved. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522338#comment-14522338 ] Sushanth Sowmyan commented on HIVE-9736: (Note that given that HIVE-9681 has now been committed, you'll have to unsquish 9681 out of this for any further patch updates if you want tests to rerun) > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520892#comment-14520892 ] Chris Nauroth commented on HIVE-9736: - I have one remaining comment in Review Board suggesting a possible reusable {{combine}} method for combining {{FsAction}} values instead of duplicating the logic. Aside from that very minor thing, I'm basically +1 (non-binding) for the patch. However, I still couldn't get the consolidated v5 patch to apply to master, so I couldn't check a build with {{-Phadoop-1}}. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520852#comment-14520852 ] Hive QA commented on HIVE-9736: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12729095/HIVE-9736.5.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3655/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3655/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3655/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-3655/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + 
[[ -z master ]] + [[ -d apache-git-master-source ]] + [[ ! -d apache-git-master-source/.git ]] + [[ ! -d apache-git-master-source ]] + cd apache-git-master-source + git fetch origin + git reset --hard HEAD HEAD is now at 3f5659f HIVE-10235 Loop optimization for SIMD in ColumnDivideColumn.txt (chengxiang, reviewed by Gopal V) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/security/authorization/HiveMultiPartitionAuthorizationProviderBase.java + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 3f5659f HIVE-10235 Loop optimization for SIMD in ColumnDivideColumn.txt (chengxiang, reviewed by Gopal V) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12729095 - PreCommit-HIVE-TRUNK-Build > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. 
Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520508#comment-14520508 ] Sushanth Sowmyan commented on HIVE-9736: [~thejas], could you please review this patch? I'm waiting on HIVE-9681 test passing before committing that, so this patch should be treated in succession after that. (see .4.patch for this patch itself) Also, as a general note, HIVE-10444 is in-queue as well, and hopefully this patch does not create any new -Phadoop-1 problems. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch, HIVE-9736.5.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518943#comment-14518943 ] Sushanth Sowmyan commented on HIVE-9736: Hi, just so this gets into the precommit queue, could you upload a HIVE-9736.5.patch which is really the combination of HIVE-9681 and HIVE-9736.4.patch and set this jira to patch-available? When committing it, I'll be sure to use the .4.patch, even uploading a new .6.patch which is its equivalent to make it clear for future JIRA visitors, but this would make the precommit queue pick it up. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, > HIVE-9736.4.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518834#comment-14518834 ] Mithun Radhakrishnan commented on HIVE-9736: Hello, Chris. bq. ... we can combine the multiple actions by using FsAction#or, and then call accessMethod.invoke just once... Yikes! I might've missed incorporating that suggestion by accident. Thank you for following up. I'll update the patch shortly. > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518695#comment-14518695 ] Chris Nauroth commented on HIVE-9736: - Hi [~mithun]. Thank you for uploading a new patch. I was unable to apply patch v3 to the master branch. Does it need to be rebased, or should I be working with a different branch? There was one suggestion I made on Review Board that still isn't implemented. In {{Hadoop23Shims#checkFileAccess}}, we can combine the multiple {{actions}} by using {{FsAction#or}}, and then call {{accessMethod.invoke}} just once to do the check in a single RPC (per file). Were you planning to make this change, or is there a reason you decided not to do it? Aside from that, I can see all of my other feedback has been addressed. Thanks again! > StorageBasedAuthProvider should batch namenode-calls where possible. > > > Key: HIVE-9736 > URL: https://issues.apache.org/jira/browse/HIVE-9736 > Project: Hive > Issue Type: Bug > Components: Metastore, Security >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch > > > Consider a table partitioned by 2 keys (dt, region). Say a dt partition could > have 1 associated regions. Consider that the user does: > {code:sql} > ALTER TABLE my_table DROP PARTITION (dt='20150101'); > {code} > As things stand now, {{StorageBasedAuthProvider}} will make individual > {{DistributedFileSystem.listStatus()}} calls for each partition-directory, > and authorize each one separately. It'd be faster to batch the calls, and > examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
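The suggestion in the comment above, OR-ing the requested actions together so that one check per file suffices, can be sketched as follows. This is an illustration only: `Perm` is a hypothetical stand-in modelled on Hadoop's `FsAction` (whose ordinal doubles as an rwx bitmask), and `combine` here is written for the example, not copied from the patch.

```java
import java.util.EnumSet;

// Stand-in modelled on Hadoop's FsAction: ordinal bits are r=4, w=2, x=1.
enum Perm {
    NONE, EXECUTE, WRITE, WRITE_EXECUTE, READ, READ_EXECUTE, READ_WRITE, ALL;

    private static final Perm[] VALUES = values();

    // Union of two permissions, computed on the ordinal bitmask.
    Perm or(Perm that) {
        return VALUES[ordinal() | that.ordinal()];
    }
}

class CombineDemo {
    // Hypothetical helper in the spirit of the combine method discussed in the thread:
    // folds a set of actions into the single action that implies all of them.
    static Perm combine(EnumSet<Perm> actions) {
        Perm result = Perm.NONE;
        for (Perm action : actions) {
            result = result.or(action);
        }
        return result;
    }

    public static void main(String[] args) {
        // One combined value means one access check instead of one per action.
        System.out.println(combine(EnumSet.of(Perm.READ, Perm.WRITE))); // READ_WRITE
    }
}
```

With the combined value in hand, a caller would issue a single access check per file rather than looping over the individual actions.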
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492966#comment-14492966 ]

Chris Nauroth commented on HIVE-9736:

Thank you for the rebased patch. It looks great to me overall. I've entered a few comments in ReviewBoard for your consideration regarding consolidation of RPC calls and a few other minor things.

https://reviews.apache.org/r/31615/
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486260#comment-14486260 ]

Mithun Radhakrishnan commented on HIVE-9736:

@[~cnauroth]: Good to meet you, sir. I'd value your input on this change, given that you've worked on the SBAP already.

bq. Great ideas in this patch!

Aww, shucks... You're only saying that because it's true. ;p

I should have a rebased version for you shortly. I'd better sort HIVE-9674 out first.
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486193#comment-14486193 ]

Chris Nauroth commented on HIVE-9736:

Hi [~mithun]. Great ideas in this patch! I'd be happy to help code review (non-binding) on a rebased version of the patch. I'll watch for it. Thanks!
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486159#comment-14486159 ]

Mithun Radhakrishnan commented on HIVE-9736:

Ok, I'd better rebase this change.
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486151#comment-14486151 ]

Sushanth Sowmyan commented on HIVE-9736:

I've not looked at this patch in detail yet. But I'd also like to point to some refactoring [~cnauroth] did recently in HIVE-10223 for you to look at to see that we gel with some of Chris's work.
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342835#comment-14342835 ]

Mithun Radhakrishnan commented on HIVE-9736:

The review on RB is [r/31615|https://reviews.apache.org/r/31615/]. Apologies for the delay.
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335645#comment-14335645 ]

Thejas M Nair commented on HIVE-9736:

[~mithun] Can you please upload the patch to reviewboard ?
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334157#comment-14334157 ]

Mithun Radhakrishnan commented on HIVE-9736:

Tagging [~thejas], since this marches all over the security-work (and SBAP). Sorry the patch looks a little big... The actual changes aren't, really.