[jira] [Commented] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook

2019-05-21 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845002#comment-16845002
 ] 

Aihua Xu commented on HIVE-21718:
-

[~ngangam] Sorry for the late reply. I will take a look.

> Improvement performance of UpdateInputAccessTimeHook
> 
>
> Key: HIVE-21718
> URL: https://issues.apache.org/jira/browse/HIVE-21718
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 2.1.1
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21718.2.patch, HIVE-21718.patch
>
>
> Currently, Hive does not update the lastAccessTime property for any entities 
> when a query accesses them. Thus it has not been possible to know when a table 
> was last accessed.
> Hive does provide a configurable hook in HS2 that is executed as a pre-query 
> hook before the query runs. However, this hook is inefficient because, for 
> each table or partition whose access time it updates, it executes an 
> "alter table ... " command internally. This is bad because:
> 1) For a query touching thousands of partitions, this hook takes a very long 
> time to update them.
> 2) Meanwhile, it holds up the original query from executing.
> So even though we do not recommend using the hook, because the reward is too 
> little (having lastAccessTime updated), we realize there is no other means to 
> achieve this.
> Also, we can improve the performance of the hook significantly by adding a 
> new thrift API on HMS to update the lastAccessTime on the database rows 
> directly instead of going to the HMS front end for one entity at a time 
> (leading to thousands of HMS calls that lead to multiple thousands of calls 
> to the database).
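
The cost difference described in the issue can be illustrated with a minimal sketch. It does not use any Hive API: FakeMetastore, updateOne and updateBatch are hypothetical names invented for this example, and the only thing modelled is the number of round trips. It assumes one call per entity for the current hook and a single call for the proposed batch API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AccessTimeBatching {
    /** Hypothetical stand-in for the metastore; it only counts round trips. */
    static class FakeMetastore {
        int calls = 0;
        final Map<String, Integer> lastAccessTime = new LinkedHashMap<>();

        // One round trip per entity, like issuing one "alter table ..." each time.
        void updateOne(String entity, int time) {
            calls++;
            lastAccessTime.put(entity, time);
        }

        // One round trip for the whole batch (assumed shape of a batch API).
        void updateBatch(Map<String, Integer> times) {
            calls++;
            lastAccessTime.putAll(times);
        }
    }

    /** Updates every entity individually and returns the number of calls made. */
    static int perEntity(FakeMetastore ms, List<String> entities, int now) {
        for (String entity : entities) {
            ms.updateOne(entity, now);
        }
        return ms.calls;
    }

    /** Collects all updates first, sends them once, and returns the number of calls made. */
    static int batched(FakeMetastore ms, List<String> entities, int now) {
        Map<String, Integer> times = new LinkedHashMap<>();
        for (String entity : entities) {
            times.put(entity, now);
        }
        ms.updateBatch(times);
        return ms.calls;
    }

    public static void main(String[] args) {
        List<String> partitions = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            partitions.add("db.tbl/part=" + i); // made-up partition names
        }
        int now = (int) (System.currentTimeMillis() / 1000);
        System.out.println("per-entity calls: " + perEntity(new FakeMetastore(), partitions, now));
        System.out.println("batched calls: " + batched(new FakeMetastore(), partitions, now));
    }
}
```

The call count grows with the number of partitions in the first case and stays constant in the second, which is the improvement the description argues for. How the real patch shapes its thrift call is not shown here.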



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook

2019-05-14 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839822#comment-16839822
 ] 

Naveen Gangam commented on HIVE-21718:
--

Review posted to RB at https://reviews.apache.org/r/70645/



[jira] [Commented] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook

2019-05-14 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839819#comment-16839819
 ] 

Naveen Gangam commented on HIVE-21718:
--

[~aihuaxu] [~ychena] [~daijy] Could you please review this ? Thanks



[jira] [Commented] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook

2019-05-14 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839803#comment-16839803
 ] 

Hive QA commented on HIVE-21718:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12968701/HIVE-21718.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16008 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17211/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17211/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17211/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12968701 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook

2019-05-14 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839781#comment-16839781
 ] 

Hive QA commented on HIVE-21718:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 48s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 56s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 16s{color} | {color:blue} standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 30s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 47s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 46s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 16s{color} | {color:red} standalone-metastore/metastore-common: The patch generated 1 new + 388 unchanged - 0 fixed = 389 total (was 388) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 40s{color} | {color:red} standalone-metastore/metastore-server: The patch generated 5 new + 1638 unchanged - 0 fixed = 1643 total (was 1638) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 49s{color} | {color:red} ql: The patch generated 1 new + 208 unchanged - 0 fixed = 209 total (was 208) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 26s{color} | {color:red} standalone-metastore/metastore-server generated 3 new + 181 unchanged - 0 fixed = 184 total (was 181) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  1m  5s{color} | {color:red} standalone-metastore_metastore-common generated 2 new + 45 unchanged - 0 fixed = 47 total (was 45) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 42s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  org.apache.hadoop.hive.metastore.MetaStoreDirectSql.updateLastAccessTime(Map, int) concatenates strings using + in a loop  At MetaStoreDirectSql.java:[line 545] |
|  |  org.apache.hadoop.hive.metastore.MetaStoreDirectSql.updateLastAccessTime(Map, int) passes a nonconstant String to an execute or addBatch method on an SQL statement  At MetaStoreDirectSql.java:[line 558] |
|  |  org.apache.hadoop.hive.metastore.MetaStoreDirectSql.updateLastAccessTime(Map, int) makes inefficient use of keySet iterator instead of entrySet iterator  At MetaStoreDirectSql.java:[line 536] |
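
The three FindBugs findings above name common Java patterns, and a minimal sketch of the usual remedies may help a reviewer. This is not the code from the patch: the class and method names are hypothetical, the Map<Long, Integer> key and value types are assumed (the report shows only a raw Map), and the SQL text is illustrative only. No database connection is opened. The sketch only shows constant SQL text with placeholders, iteration over entrySet, and StringBuilder in place of + inside a loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpdateSqlSketch {
    // Constant SQL text with placeholders; values are bound separately,
    // which avoids passing a nonconstant String to execute or addBatch.
    // Table and column names here are illustrative assumptions.
    static final String UPDATE_SQL =
        "UPDATE TBLS SET LAST_ACCESS_TIME = ? WHERE TBL_ID = ?";

    /** Builds one parameter row per map entry, iterating entrySet rather than keySet. */
    static List<Object[]> bindRows(Map<Long, Integer> accessTimes) {
        List<Object[]> rows = new ArrayList<>();
        for (Map.Entry<Long, Integer> e : accessTimes.entrySet()) {
            // Order matches the placeholders: time first, then id.
            rows.add(new Object[] { e.getValue(), e.getKey() });
        }
        return rows;
    }

    /** Builds a summary string with StringBuilder rather than + in a loop. */
    static String describe(Map<Long, Integer> accessTimes) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Long, Integer> e : accessTimes.entrySet()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Long, Integer> accessTimes = new LinkedHashMap<>();
        accessTimes.put(1L, 100);
        accessTimes.put(2L, 200);
        System.out.println(UPDATE_SQL);
        System.out.println("rows to bind: " + bindRows(accessTimes).size());
        System.out.println(describe(accessTimes));
    }
}
```

With a real JDBC connection, each row would typically be bound on a java.sql.PreparedStatement created from UPDATE_SQL, followed by addBatch() per row and a single executeBatch().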
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP 

[jira] [Commented] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook

2019-05-14 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839634#comment-16839634
 ] 

Hive QA commented on HIVE-21718:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12968685/HIVE-21718.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17209/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17209/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17209/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/slf4j/jul-to-slf4j/1.7.10/jul-to-slf4j-1.7.10.jar(org/slf4j/bridge/SLF4JBridgeHandler.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/DispatcherType.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/Filter.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/FilterChain.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/FilterConfig.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/ServletException.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/ServletRequest.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/ServletResponse.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/annotation/WebFilter.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/http/HttpServletRequest.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-runner/9.3.25.v20180904/jetty-runner-9.3.25.v20180904.jar(javax/servlet/http/HttpServletResponse.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/classification/target/hive-classification-4.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceAudience$LimitedPrivate.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/classification/target/hive-classification-4.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceStability$Unstable.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/ByteArrayOutputStream.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/OutputStream.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/Closeable.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/AutoCloseable.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/Flushable.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(javax/xml/bind/annotation/XmlRootElement.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/commons/commons-exec/1.1/commons-exec-1.1.jar(org/apache/commons/exec/ExecuteException.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/security/PrivilegedExceptionAction.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/ExecutionException.class)]]
[loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/TimeoutException.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/3.1.0/hadoop-common-3.1.0.jar(org/apache/hadoop/fs/FileSystem.class)]]
[loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/shims/common/target/hive-shims-common-4.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/shims/HadoopShimsSecure.class)]]
[loading