[jira] [Commented] (HIVE-21865) Add verification for Tez engine before starting Tez sessions

2020-07-03 Thread shuangxiao liu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151185#comment-17151185
 ] 

shuangxiao liu commented on HIVE-21865:
---

This is also true when the engine is Spark.

> Add verification for Tez engine before starting Tez sessions
> 
>
> Key: HIVE-21865
> URL: https://issues.apache.org/jira/browse/HIVE-21865
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-21865.1.patch
>
>
> Hive-3.1.1.
> Here is the log of starting HS2 when engine is MR:
> {code}
> 2019-06-12T06:08:05,115  WARN [main] server.HiveServer2: Error starting HiveServer2 on attempt 1, will retry in 6ms
> java.lang.NoClassDefFoundError: org/apache/tez/dag/api/TezConfiguration
>   at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession$AbstractTriggerValidator.startTriggerValidator(TezSessionPoolSession.java:74)
>   at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.initTriggers(TezSessionPoolManager.java:207)
>   at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:114)
>   at org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:860)
>   at org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:843)
>   at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:766)
>   at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1058)
>   at org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:144)
>   at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1326)
>   at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1170)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.ClassNotFoundException: org.apache.tez.dag.api.TezConfiguration
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 16 more
> {code}
> HS2 starts correctly, but the exception above annoys customers.
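
One way to read the request in the issue title is a guard that checks the configured engine, and whether the Tez classes are loadable at all, before HiveServer2 touches any Tez session code. The sketch below is illustrative only: `TezStartupGuard`, `isClassAvailable`, and `shouldStartTezSessions` are hypothetical names, the engine is passed in as a plain string, and none of this is taken from HIVE-21865.1.patch.

```java
/** Hypothetical startup guard; not the code from the attached patch. */
final class TezStartupGuard {
    // Class named in the NoClassDefFoundError shown in the log above.
    static final String TEZ_CONFIGURATION_CLASS = "org.apache.tez.dag.api.TezConfiguration";

    private TezStartupGuard() {}

    /** Returns true if the class can be located on the classpath, without initializing it. */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, TezStartupGuard.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Start Tez sessions only when the engine is Tez and its classes are present. */
    static boolean shouldStartTezSessions(String executionEngine) {
        return "tez".equalsIgnoreCase(executionEngine)
                && isClassAvailable(TEZ_CONFIGURATION_CLASS);
    }

    public static void main(String[] args) {
        // With the engine set to MR the guard skips Tez start-up entirely.
        System.out.println(shouldStartTezSessions("mr"));
    }
}
```

With a check of this shape, an MR or Spark deployment without the Tez jars would skip the session pool start-up instead of logging the stack trace.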



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-21865) Add verification for Tez engine before starting Tez sessions

2020-07-03 Thread shuangxiao liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shuangxiao liu reassigned HIVE-21865:
-

Assignee: Oleksiy Sayankin  (was: shuangxiao liu)






[jira] [Assigned] (HIVE-21865) Add verification for Tez engine before starting Tez sessions

2020-07-03 Thread shuangxiao liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shuangxiao liu reassigned HIVE-21865:
-

Assignee: shuangxiao liu  (was: Oleksiy Sayankin)

> Add verification for Tez engine before starting Tez sessions
> 
>
> Key: HIVE-21865
> URL: https://issues.apache.org/jira/browse/HIVE-21865
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: shuangxiao liu
>Priority: Major
> Attachments: HIVE-21865.1.patch
>
>
> Hive-3.1.1.
> Here is the log of starting HS2 when engine is MR:
> {code}
> 2019-06-12T06:08:05,115  WARN [main] server.HiveServer2: Error starting HiveServer2 on attempt 1, will retry in 6ms
> java.lang.NoClassDefFoundError: org/apache/tez/dag/api/TezConfiguration
>   at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession$AbstractTriggerValidator.startTriggerValidator(TezSessionPoolSession.java:74)
>   at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.initTriggers(TezSessionPoolManager.java:207)
>   at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:114)
>   at org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:860)
>   at org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:843)
>   at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:766)
>   at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1058)
>   at org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:144)
>   at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1326)
>   at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1170)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.ClassNotFoundException: org.apache.tez.dag.api.TezConfiguration
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 16 more
> {code}
> HS2 starts correctly, but the exception above annoys customers.





[jira] [Updated] (HIVE-23801) TestMiniLlapLocalCliDriver[replication_metrics_ingest] is flaky

2020-07-03 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-23801:
--
Summary: TestMiniLlapLocalCliDriver[replication_metrics_ingest] is flaky  
(was: Disable flaky test: 
TestMiniLlapLocalCliDriver[replication_metrics_ingest])

> TestMiniLlapLocalCliDriver[replication_metrics_ingest] is flaky
> ---
>
> Key: HIVE-23801
> URL: https://issues.apache.org/jira/browse/HIVE-23801
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Peter Vary
>Priority: Major
>
> This test is flaky. See: 
> [http://ci.hive.apache.org/job/hive-flaky-check/62/console]
> {code:java}
> 21:59:19  [INFO] ---
> 21:59:19  [INFO]  T E S T S
> 21:59:19  [INFO] ---
> 21:59:19  [INFO] Running org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
> 22:01:56  [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 144.366 s <<< FAILURE! - in org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
> 22:01:56  [ERROR] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[replication_metrics_ingest]  Time elapsed: 124.174 s  <<< FAILURE!
> 22:01:56  java.lang.AssertionError: 
> 22:01:56  Client Execution succeeded but contained differences (error code = 1) after executing replication_metrics_ingest.q 
> 22:01:56  76c76
> 22:01:56  < 3 repl2   1
> 22:01:56  ---
> 22:01:56  > 2 repl2   1
>  {code}





[jira] [Commented] (HIVE-23801) TestMiniLlapLocalCliDriver[replication_metrics_ingest] is flaky

2020-07-03 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151176#comment-17151176
 ] 

Peter Vary commented on HIVE-23801:
---

Disabled it until it is fixed. Please run 
http://ci.hive.apache.org/job/hive-flaky-check/62 before enabling again to 
confirm that it is fixed.

Thanks, Peter 






[jira] [Work logged] (HIVE-22634) Improperly SemanticException when filter is optimized to False on a partition table

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22634?focusedWorklogId=454490=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454490
 ]

ASF GitHub Bot logged work on HIVE-22634:
-

Author: ASF GitHub Bot
Created on: 04/Jul/20 02:30
Start Date: 04/Jul/20 02:30
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 edited a comment on pull request #865:
URL: https://github.com/apache/hive/pull/865#issuecomment-653355058


   +1 @WangGuangxin you can reopen the pr to trigger a new test



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 454490)
Time Spent: 0.5h  (was: 20m)

> Improperly SemanticException when filter is optimized to False on a partition 
> table
> ---
>
> Key: HIVE-22634
> URL: https://issues.apache.org/jira/browse/HIVE-22634
> Project: Hive
>  Issue Type: Improvement
>Reporter: EdisonWang
>Assignee: EdisonWang
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-22634.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a filter is optimized to False on a partitioned table, Hive improperly 
> throws a SemanticException reporting that no partition predicate was 
> found.
> The steps to reproduce are:
> {code:java}
> set hive.strict.checks.no.partition.filter=true;
> CREATE TABLE test(id int, name string)PARTITIONED BY (`date` string);
> select * from test where `date` = '20191201' and 1<>1;
> {code}
>  
> The above SQL throws a "Queries against partitioned tables without a 
> partition filter" exception.
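
The behaviour the reporter is asking for can be pictured as a strict-mode check that treats a filter folded to constant false as harmless, since no partition would be scanned. The sketch below is a simplified model and an assumption about the intent; `StrictPartitionCheck` and `FoldedFilter` are hypothetical names and do not mirror Hive's optimizer classes or the attached patch.

```java
/** Hypothetical model of the strict partition-filter check; not Hive's implementation. */
final class StrictPartitionCheck {
    /** Simplified stand-in for the state of a query's filter after constant folding. */
    enum FoldedFilter { ALWAYS_FALSE, HAS_PARTITION_PREDICATE, NO_PARTITION_PREDICATE }

    private StrictPartitionCheck() {}

    /** Returns true when the query should be rejected under the strict check. */
    static boolean violatesStrictCheck(boolean strictChecksEnabled, FoldedFilter filter) {
        if (!strictChecksEnabled) {
            return false;
        }
        if (filter == FoldedFilter.ALWAYS_FALSE) {
            // A filter that is always false reads no partitions, so there is nothing to guard.
            return false;
        }
        return filter == FoldedFilter.NO_PARTITION_PREDICATE;
    }

    public static void main(String[] args) {
        // Models: where `date` = '20191201' and 1<>1, folded to constant false.
        System.out.println(violatesStrictCheck(true, FoldedFilter.ALWAYS_FALSE));
    }
}
```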





[jira] [Resolved] (HIVE-13875) Beeline ignore where clause when it is the last line of file and missing a EOL hence give wrong query result

2020-07-03 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-13875.

Fix Version/s: 2.0.0
   1.3.0
   Resolution: Fixed

> Beeline ignore where clause when it is the last line of file and missing a 
> EOL hence give wrong query result
> 
>
> Key: HIVE-13875
> URL: https://issues.apache.org/jira/browse/HIVE-13875
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: Lu Ji
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
>
> Steps to reproduce:
> Say we have a simple table:
> {code}
> select * from lji.lu_test;
> +---+--+--+
> | lu_test.name  | lu_test.country  |
> +---+--+--+
> | john  | us   |
> | hong  | cn   |
> +---+--+--+
> 2 rows selected (0.04 seconds)
> {code}
> We have a simple query in a file. Note that this file is missing the final EOL.
> {code}
> cat -A test.hql
> use lji;$
> select * from lu_test$
> where country='us';[lji@~]$
> {code}
> Then, if we execute the file with both the Hive CLI and Beeline + HS2, we get 
> different results.
> {code}
> [lji@~]$ hive -f test.hql
> WARNING: Use "yarn jar" to launch YARN applications.
> Logging initialized using configuration in 
> file:/etc/hive/2.3.4.7-4/0/hive-log4j.properties
> OK
> Time taken: 1.624 seconds
> OK
> john    us
> Time taken: 1.482 seconds, Fetched: 1 row(s)
> [lji@~]$ beeline -u "jdbc:hive2://XXX:1/default;principal=hive/_HOST@XXX" 
> -f test.hql
> WARNING: Use "yarn jar" to launch YARN applications.
> Connecting to jdbc:hive2://XXXl:1/default;principal=hive/_HOST@XXX
> Connected to: Apache Hive (version 1.2.1.2.3.4.7-4)
> Driver: Hive JDBC (version 1.2.1.2.3.4.7-4)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://XXX> use lji;
> No rows affected (0.06 seconds)
> 0: jdbc:hive2://XXX> select * from lu_test
> 0: jdbc:hive2://XXX> where 
> country='us';+---+--+--+
> | lu_test.name  | lu_test.country  |
> +---+--+--+
> | john  | us   |
> | hong  | cn   |
> +---+--+--+
> 2 rows selected (0.073 seconds)
> 0: jdbc:hive2://XXX>
> Closing: 0: jdbc:hive2://XXX:1/default;principal=hive/_HOST@XXX
> {code}
> Obviously, Beeline gave the wrong result: it ignored the where clause on the 
> last line.
> I know it is quite weird for a file to be missing the final EOL, but for whatever 
> reason, we have quite a few files in this state.
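
For reference, a missing final EOL does not by itself force the last line to be lost. The sketch below says nothing about Beeline's internals or how the bug was fixed; it only shows that the standard `BufferedReader.readLine()` returns an unterminated last line. `ScriptLines` and `readLines` are hypothetical names used for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; demonstrates java.io behaviour, not Beeline's script reader. */
final class ScriptLines {
    private ScriptLines() {}

    /** Splits a script into lines; the last line is kept even without a trailing EOL. */
    static List<String> readLines(String script) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(script))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same content as test.hql above: note there is no newline after the last statement.
        String script = "use lji;\nselect * from lu_test\nwhere country='us';";
        for (String line : readLines(script)) {
            System.out.println(line);
        }
    }
}
```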





[jira] [Work logged] (HIVE-23727) Improve SQLOperation log handling when cleanup

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23727?focusedWorklogId=454484=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454484
 ]

ASF GitHub Bot logged work on HIVE-23727:
-

Author: ASF GitHub Bot
Created on: 04/Jul/20 00:28
Start Date: 04/Jul/20 00:28
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1149:
URL: https://github.com/apache/hive/pull/1149


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   





Issue Time Tracking
---

Worklog Id: (was: 454484)
Time Spent: 1h 40m  (was: 1.5h)

> Improve SQLOperation log handling when cleanup
> --
>
> Key: HIVE-23727
> URL: https://issues.apache.org/jira/browse/HIVE-23727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> SQLOperation checks _if (shouldRunAsync() && state != 
> OperationState.CANCELED && state != OperationState.TIMEDOUT)_ before canceling the 
> background task. When that condition is true, the state cannot be OperationState.CANCELED, so 
> the logging guarded by state == OperationState.CANCELED can never happen.
>  
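
The argument in the description can be checked mechanically: no state both satisfies the quoted condition and equals CANCELED. The sketch below assumes a reduced `OperationState` enum (Hive's real enum has more values) and hypothetical names `CancelCheck`, `entersCancelBranch`, and `canceledLogReachable`.

```java
/** Hypothetical model of the quoted condition; the enum is a reduced stand-in. */
final class CancelCheck {
    enum OperationState { RUNNING, FINISHED, CANCELED, TIMEDOUT }

    private CancelCheck() {}

    /** Mirrors the condition quoted in the issue description. */
    static boolean entersCancelBranch(boolean shouldRunAsync, OperationState state) {
        return shouldRunAsync
                && state != OperationState.CANCELED
                && state != OperationState.TIMEDOUT;
    }

    /** True if some state both enters the branch and equals CANCELED. */
    static boolean canceledLogReachable() {
        for (OperationState state : OperationState.values()) {
            if (entersCancelBranch(true, state) && state == OperationState.CANCELED) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(canceledLogReachable());
    }
}
```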





[jira] [Work logged] (HIVE-23727) Improve SQLOperation log handling when cleanup

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23727?focusedWorklogId=454483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454483
 ]

ASF GitHub Bot logged work on HIVE-23727:
-

Author: ASF GitHub Bot
Created on: 04/Jul/20 00:27
Start Date: 04/Jul/20 00:27
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 closed pull request #1149:
URL: https://github.com/apache/hive/pull/1149


   





Issue Time Tracking
---

Worklog Id: (was: 454483)
Time Spent: 1.5h  (was: 1h 20m)

> Improve SQLOperation log handling when cleanup
> --
>
> Key: HIVE-23727
> URL: https://issues.apache.org/jira/browse/HIVE-23727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> SQLOperation checks _if (shouldRunAsync() && state != 
> OperationState.CANCELED && state != OperationState.TIMEDOUT)_ before canceling the 
> background task. When that condition is true, the state cannot be OperationState.CANCELED, so 
> the logging guarded by state == OperationState.CANCELED can never happen.
>  





[jira] [Commented] (HIVE-23801) Disable flaky test: TestMiniLlapLocalCliDriver[replication_metrics_ingest]

2020-07-03 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151117#comment-17151117
 ] 

Peter Vary commented on HIVE-23801:
---

CC: [~anishek], [~aasha]

> Disable flaky test: TestMiniLlapLocalCliDriver[replication_metrics_ingest]
> --
>
> Key: HIVE-23801
> URL: https://issues.apache.org/jira/browse/HIVE-23801
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Peter Vary
>Priority: Major
>
> This test is flaky. See: 
> [http://ci.hive.apache.org/job/hive-flaky-check/62/console]
> {code:java}
> 21:59:19  [INFO] ---
> 21:59:19  [INFO]  T E S T S
> 21:59:19  [INFO] ---
> 21:59:19  [INFO] Running org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
> 22:01:56  [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 144.366 s <<< FAILURE! - in org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
> 22:01:56  [ERROR] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[replication_metrics_ingest]  Time elapsed: 124.174 s  <<< FAILURE!
> 22:01:56  java.lang.AssertionError: 
> 22:01:56  Client Execution succeeded but contained differences (error code = 1) after executing replication_metrics_ingest.q 
> 22:01:56  76c76
> 22:01:56  < 3 repl2   1
> 22:01:56  ---
> 22:01:56  > 2 repl2   1
>  {code}





[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454344=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454344
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 09:38
Start Date: 03/Jul/20 09:38
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449486010



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -1788,6 +2082,58 @@ public void addPartitionToCache(String catName, String dbName, String tblName, P
     }
   }
 
+  public void addPrimaryKeysToCache(String catName, String dbName, String tblName, List keys) {
+    try {
+      cacheLock.readLock().lock();

Review comment:
   This is fine. The method below (cachePrimaryKeys) takes the write lock; 
the read lock here is only used to get the tblWrapper object.
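
The locking pattern described in this comment can be sketched as two levels of locks: a cache-wide read lock to look up the wrapper, and a write lock taken inside the wrapper for the mutation. That the wrapper owns a separate lock is an assumption of this sketch (a `ReentrantReadWriteLock` cannot be upgraded from read to write by the same thread, so one shared lock would block). `TwoLevelCache`, `Wrapper`, and `addTable` are hypothetical names, not SharedCache's actual members.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical two-level locking sketch; not the SharedCache implementation. */
final class TwoLevelCache {
    /** Per-table wrapper guarded by its own lock (stand-in for TableWrapper). */
    static final class Wrapper {
        private final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();
        private final List<String> primaryKeys = new ArrayList<>();

        void cachePrimaryKeys(List<String> keys) {
            tableLock.writeLock().lock();
            try {
                primaryKeys.clear();
                primaryKeys.addAll(keys);
            } finally {
                tableLock.writeLock().unlock();
            }
        }

        List<String> getPrimaryKeys() {
            tableLock.readLock().lock();
            try {
                return new ArrayList<>(primaryKeys);
            } finally {
                tableLock.readLock().unlock();
            }
        }
    }

    private final ReentrantReadWriteLock cacheLock = new ReentrantReadWriteLock();
    private final Map<String, Wrapper> tableCache = new HashMap<>();

    void addTable(String tableKey) {
        cacheLock.writeLock().lock();
        try {
            tableCache.put(tableKey, new Wrapper());
        } finally {
            cacheLock.writeLock().unlock();
        }
    }

    /** The cache-level read lock only protects the lookup; the wrapper locks its own state. */
    boolean addPrimaryKeysToCache(String tableKey, List<String> keys) {
        cacheLock.readLock().lock();
        try {
            Wrapper wrapper = tableCache.get(tableKey);
            if (wrapper == null) {
                return false;
            }
            wrapper.cachePrimaryKeys(keys);
            return true;
        } finally {
            cacheLock.readLock().unlock();
        }
    }

    List<String> listPrimaryKeys(String tableKey) {
        cacheLock.readLock().lock();
        try {
            Wrapper wrapper = tableCache.get(tableKey);
            return wrapper == null ? new ArrayList<>() : wrapper.getPrimaryKeys();
        } finally {
            cacheLock.readLock().unlock();
        }
    }
}
```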







Issue Time Tracking
---

Worklog Id: (was: 454344)
Time Spent: 1h 50m  (was: 1h 40m)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently table constraints are not cached. Hive pulls all constraints 
> from the tables involved in a query, which results in multiple DB reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc.). The effort 
> to cache this is small, as it's just another table component.





[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454299=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454299
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 06:37
Start Date: 03/Jul/20 06:37
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449401898



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -1870,6 +2228,122 @@ public void removePartitionsFromCache(String catName, String dbName, String tblN
     return parts;
   }
 
+  public List listCachedPrimaryKeys(String catName, String dbName, String tblName) {
+    List keys = new ArrayList<>();
+    try {
+      cacheLock.readLock().lock();
+      TableWrapper tblWrapper = tableCache.getIfPresent(CacheUtils.buildTableKey(catName, dbName, tblName));
+      if (tblWrapper != null) {
+        keys = tblWrapper.getPrimaryKeys();
+      }
+    } finally {
+      cacheLock.readLock().unlock();
+    }
+    return keys;
+  }
+
+  public List listCachedForeignKeys(String catName, String foreignDbName, String foreignTblName,
+      String parentDbName, String parentTblName) {
+    List keys = new ArrayList<>();
+    try {
+      cacheLock.readLock().lock();
+      TableWrapper tblWrapper = tableCache.getIfPresent(CacheUtils.buildTableKey(catName, foreignDbName, foreignTblName));
+      if (tblWrapper != null) {
+        keys = tblWrapper.getForeignKeys();
+      }
+    } finally {
+      cacheLock.readLock().unlock();
+    }
+
+    // filter out required foreign keys based on parent db/tbl name
+    if (!StringUtils.isEmpty(parentTblName) && !StringUtils.isEmpty(parentDbName)) {

Review comment:
   Even if tblWrapper is null, keys will be an empty list, and hence an empty 
list will be returned. So this should be fine, right?
   
   If we move it into the if block above, and the list is not empty, we 
will hold the read lock for longer (though only milliseconds 
or even less); that's why I added it below.
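
The trade-off in this comment, copying under the read lock and filtering after releasing it, can be sketched as follows. `ForeignKeyLookup` and its nested `ForeignKey` type are hypothetical stand-ins for illustration; they are not the SharedCache or metastore API types from the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.stream.Collectors;

/** Hypothetical sketch: snapshot under the read lock, filter outside it. */
final class ForeignKeyLookup {
    /** Minimal stand-in for a foreign key entry. */
    static final class ForeignKey {
        final String parentDb;
        final String parentTable;
        final String name;

        ForeignKey(String parentDb, String parentTable, String name) {
            this.parentDb = parentDb;
            this.parentTable = parentTable;
            this.name = name;
        }
    }

    private final ReentrantReadWriteLock cacheLock = new ReentrantReadWriteLock();
    private final List<ForeignKey> cached = new ArrayList<>();

    void add(ForeignKey key) {
        cacheLock.writeLock().lock();
        try {
            cached.add(key);
        } finally {
            cacheLock.writeLock().unlock();
        }
    }

    List<ForeignKey> list(String parentDb, String parentTable) {
        List<ForeignKey> keys;
        cacheLock.readLock().lock();
        try {
            // Only the snapshot is taken while holding the lock.
            keys = new ArrayList<>(cached);
        } finally {
            cacheLock.readLock().unlock();
        }
        // An empty snapshot simply falls through the filter and stays empty.
        if (isEmpty(parentDb) || isEmpty(parentTable)) {
            return keys;
        }
        return keys.stream()
                .filter(k -> parentDb.equals(k.parentDb) && parentTable.equals(k.parentTable))
                .collect(Collectors.toList());
    }

    private static boolean isEmpty(String s) {
        return s == null || s.isEmpty();
    }
}
```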







Issue Time Tracking
---

Worklog Id: (was: 454299)
Time Spent: 1.5h  (was: 1h 20m)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently table constraints are not cached. Hive pulls all constraints 
> from the tables involved in a query, which results in multiple DB reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc.). The effort 
> to cache this is small, as it's just another table component.





[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454333
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 09:05
Start Date: 03/Jul/20 09:05
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449387162



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -543,10 +557,30 @@ static void prewarm(RawStore rawStore) {
 tableColStats = rawStore.getTableColumnStatistics(catName, dbName, tblName, colNames, CacheUtils.HIVE_ENGINE);
 Deadline.stopTimer();
   }
+  Deadline.startTimer("getPrimaryKeys");
+  primaryKeys = rawStore.getPrimaryKeys(catName, dbName, tblName);
+  Deadline.stopTimer();
+  cacheObjects.setPrimaryKeys(primaryKeys);
+
+  Deadline.startTimer("getForeignKeys");
+  foreignKeys = rawStore.getForeignKeys(catName, null, null, dbName, tblName);

Review comment:
   Then should we store foreign key mappings against the parent db and table for 
quick access (otherwise we will be scanning all the dbs/tables in the cache)? 
   
   This also means we will be keeping two copies, one with the parent table and 
another with the foreign table.







Issue Time Tracking
---

Worklog Id: (was: 454333)
Time Spent: 1h 40m  (was: 1.5h)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently table constraints are not cached. Hive pulls all constraints 
> from the tables involved in a query, which results in multiple DB reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc.). The effort 
> to cache this is small, as it's just another table component.





[jira] [Work logged] (HIVE-23799) Fix AcidUtils.parseBaseOrDeltaBucketFilename handling of data loaded by LOAD DATA

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23799?focusedWorklogId=454416=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454416
 ]

ASF GitHub Bot logged work on HIVE-23799:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 12:32
Start Date: 03/Jul/20 12:32
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #1204:
URL: https://github.com/apache/hive/pull/1204


   * Removed copyNumber attribute
   * Used parseDelta which does not try to access the actual file
   * Fixed handling of files like "table/delta_002_002_/12_0"
   * Added test for this case
   





Issue Time Tracking
---

Worklog Id: (was: 454416)
Remaining Estimate: 0h
Time Spent: 10m

> Fix AcidUtils.parseBaseOrDeltaBucketFilename handling of data loaded by LOAD 
> DATA
> -
>
> Key: HIVE-23799
> URL: https://issues.apache.org/jira/browse/HIVE-23799
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the {{AcidUtils.parseBaseOrDeltaBucketFilename}} considers files 
> loaded to the table as original base files. We should fix that.
> Also by checking the code for {{AcidUtils.parseBaseOrDeltaBucketFilename}}, I 
> have found 2 things:
> * The attribute {{copyNumber}} is not used anymore, so we should remove it
> * The version of the {{parsedDelta}} we use here tries to check if the files 
> are in raw format, or not. We do not need this information here, so we can 
> use a different implementation of {{parseDelta}}, and avoid a remote call and 
> file read.
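
To illustrate the second bullet, a delta directory name can be interpreted from the string alone, with no filesystem call. The sketch below assumes the naming scheme `delta_<minWriteId>_<maxWriteId>[_<statementId>]`; `DeltaNameParser` and `ParsedDeltaName` are hypothetical names, and this is not the `parseDelta` implementation in AcidUtils.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical name-only parser; performs no file access. */
final class DeltaNameParser {
    // Assumed scheme: delta_<minWriteId>_<maxWriteId>[_<statementId>]
    private static final Pattern DELTA = Pattern.compile("delta_(\\d+)_(\\d+)(?:_(\\d+))?");

    /** Values recovered from the directory name only. */
    static final class ParsedDeltaName {
        final long minWriteId;
        final long maxWriteId;
        final int statementId; // -1 when the name carries no statement id

        ParsedDeltaName(long minWriteId, long maxWriteId, int statementId) {
            this.minWriteId = minWriteId;
            this.maxWriteId = maxWriteId;
            this.statementId = statementId;
        }
    }

    private DeltaNameParser() {}

    /** Returns null when the directory name does not look like a delta directory. */
    static ParsedDeltaName parse(String dirName) {
        Matcher m = DELTA.matcher(dirName);
        if (!m.matches()) {
            return null;
        }
        long min = Long.parseLong(m.group(1));
        long max = Long.parseLong(m.group(2));
        int statementId = m.group(3) == null ? -1 : Integer.parseInt(m.group(3));
        return new ParsedDeltaName(min, max, statementId);
    }

    public static void main(String[] args) {
        ParsedDeltaName parsed = parse("delta_0000002_0000002_0000");
        System.out.println(parsed.minWriteId + " " + parsed.maxWriteId + " " + parsed.statementId);
    }
}
```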





[jira] [Updated] (HIVE-23799) Fix AcidUtils.parseBaseOrDeltaBucketFilename handling of data loaded by LOAD DATA

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23799:
--
Labels: pull-request-available  (was: )

> Fix AcidUtils.parseBaseOrDeltaBucketFilename handling of data loaded by LOAD 
> DATA
> -
>
> Key: HIVE-23799
> URL: https://issues.apache.org/jira/browse/HIVE-23799
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h





[jira] [Assigned] (HIVE-23799) Fix AcidUtils.parseBaseOrDeltaBucketFilename handling of data loaded by LOAD DATA

2020-07-03 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-23799:
-


> Fix AcidUtils.parseBaseOrDeltaBucketFilename handling of data loaded by LOAD 
> DATA
> -
>
> Key: HIVE-23799
> URL: https://issues.apache.org/jira/browse/HIVE-23799
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major





[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454385
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 11:15
Start Date: 03/Jul/20 11:15
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449528837



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -867,6 +909,77 @@ private void updateTableColStats(RawStore rawStore, String catName, String dbNam
       }
     }
 
+    private void updateTableForeignKeys(RawStore rawStore, String catName, String dbName, String tblName) {
+      LOG.debug("CachedStore: updating cached foreign keys objects for catalog: {}, database: {}, table: {}", catName,
+          dbName, tblName);
+      try {
+        Deadline.startTimer("getForeignKeys");
+        List<SQLForeignKey> fks = rawStore.getForeignKeys(catName, null, null, dbName, tblName);
+        Deadline.stopTimer();
+        sharedCache.refreshForeignKeysInCache(StringUtils.normalizeIdentifier(catName),
+            StringUtils.normalizeIdentifier(dbName), StringUtils.normalizeIdentifier(tblName), fks);
+        LOG.debug("CachedStore: updated cached foreign keys objects for catalog: {}, database: {}, table: {}", catName,
+            dbName, tblName);
+      } catch (MetaException e) {
+        LOG.info("Updating CachedStore: unable to read foreign keys of catalog: " + catName + ", database: "

Review comment:
   Fixed.







Issue Time Tracking
---

Worklog Id: (was: 454385)
Time Spent: 2h 40m  (was: 2.5h)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently table constraints are not cached. Hive will pull all constraints 
> from the tables involved in a query, which results in multiple db reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc.). The effort 
> to cache this is small as it's just another table component.





[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454384=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454384
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 11:14
Start Date: 03/Jul/20 11:14
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449528675



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -514,6 +655,130 @@ public boolean containsPartition(List<String> partVals) {
       return containsPart;
     }
 
+    public void removeConstraint(String name) {
+      try {
+        tableLock.writeLock().lock();
+        Object constraint = null;
+        MemberName mn = null;
+        Class constraintClass = null;
+        if (this.primaryKeyCache.containsKey(name)) {
+          constraint = this.primaryKeyCache.remove(name);
+          mn = MemberName.PRIMARY_KEY_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLPrimaryKey.class;
+        } else if (this.foreignKeyCache.containsKey(name)) {
+          constraint = this.foreignKeyCache.remove(name);
+          mn = MemberName.FOREIGN_KEY_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLForeignKey.class;
+        } else if (this.notNullConstraintCache.containsKey(name)) {
+          constraint = this.notNullConstraintCache.remove(name);
+          mn = MemberName.NOTNULL_CONSTRAINT_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLNotNullConstraint.class;
+        } else if (this.uniqueConstraintCache.containsKey(name)) {
+          constraint = this.uniqueConstraintCache.remove(name);
+          mn = MemberName.UNIQUE_CONSTRAINT_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLUniqueConstraint.class;
+        }
+
+        if(constraint == null) {
+          LOG.debug("Constraint: " + name + " does not exist in cache.");
+          return;
+        }
+        int size = getObjectSize(constraintClass, constraint);
+        updateMemberSize(mn, -1 * size, SizeMode.Delta);
+
+      } finally {
+        tableLock.writeLock().unlock();
+      }
+    }
+
+    public void refreshPrimaryKeys(List<SQLPrimaryKey> keys) {
+      Map<String, SQLPrimaryKey> newKeys = new ConcurrentHashMap<>();
+      try {
+        tableLock.writeLock().lock();
+        int size = 0;
+        for (SQLPrimaryKey key : keys) {
+          if (this.memberCacheDirty[MemberName.PRIMARY_KEY_CACHE.ordinal()].compareAndSet(true, false)) {

Review comment:
   Updated the name of the variable. It is used during the refresh operation. 
If the flag for a particular object cache is set to true, the cache was updated 
after the last refresh operation and should be refreshed now; otherwise, the 
current refresh operation will not modify/refresh the cache.
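
The dirty-flag handshake discussed in this comment can be sketched as follows. This is a minimal illustration and not the SharedCache code: the class `DirtyFlagCacheSketch` and its methods are hypothetical, locking is omitted, and the values are plain strings. It mirrors the behaviour visible in the quoted hunk, where a refresh that finds the flag set clears it and returns without replacing the cache. The final patch may differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration of a dirty flag guarding a background refresh. */
final class DirtyFlagCacheSketch {
  private final AtomicBoolean dirty = new AtomicBoolean(false);
  private volatile Map<String, String> cache = new ConcurrentHashMap<>();

  /** A direct write to the cache marks it dirty. */
  void put(String name, String value) {
    cache.put(name, value);
    dirty.set(true);
  }

  /**
   * Replaces the cache with a snapshot unless a write happened since the last refresh.
   * Returns true if the snapshot was applied, false if the refresh was skipped.
   */
  boolean refresh(Map<String, String> snapshot) {
    // compareAndSet(true, false) succeeds only if the flag was set; it also clears the flag,
    // so the next refresh is allowed to go through.
    if (dirty.compareAndSet(true, false)) {
      return false;
    }
    cache = new ConcurrentHashMap<>(snapshot);
    return true;
  }

  String get(String name) {
    return cache.get(name);
  }
}
```

In this sketch the first refresh after a write is skipped, so a snapshot read before the write cannot overwrite the newer entry. The following refresh applies normally.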







Issue Time Tracking
---

Worklog Id: (was: 454384)
Time Spent: 2.5h  (was: 2h 20m)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h





[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454383=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454383
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 11:11
Start Date: 03/Jul/20 11:11
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449527172



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -514,6 +655,130 @@ public boolean containsPartition(List<String> partVals) {
       return containsPart;
     }
 
+    public void removeConstraint(String name) {
+      try {
+        tableLock.writeLock().lock();
+        Object constraint = null;
+        MemberName mn = null;
+        Class constraintClass = null;
+        if (this.primaryKeyCache.containsKey(name)) {

Review comment:
   Fixed.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -514,6 +655,130 @@ public boolean containsPartition(List<String> partVals) {
       return containsPart;
     }
 
+    public void removeConstraint(String name) {
+      try {
+        tableLock.writeLock().lock();
+        Object constraint = null;
+        MemberName mn = null;
+        Class constraintClass = null;
+        if (this.primaryKeyCache.containsKey(name)) {
+          constraint = this.primaryKeyCache.remove(name);
+          mn = MemberName.PRIMARY_KEY_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLPrimaryKey.class;
+        } else if (this.foreignKeyCache.containsKey(name)) {
+          constraint = this.foreignKeyCache.remove(name);
+          mn = MemberName.FOREIGN_KEY_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLForeignKey.class;
+        } else if (this.notNullConstraintCache.containsKey(name)) {
+          constraint = this.notNullConstraintCache.remove(name);
+          mn = MemberName.NOTNULL_CONSTRAINT_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLNotNullConstraint.class;
+        } else if (this.uniqueConstraintCache.containsKey(name)) {
+          constraint = this.uniqueConstraintCache.remove(name);
+          mn = MemberName.UNIQUE_CONSTRAINT_CACHE;
+          this.memberCacheDirty[mn.ordinal()].set(true);
+          constraintClass = SQLUniqueConstraint.class;
+        }
+
+        if(constraint == null) {
+          LOG.debug("Constraint: " + name + " does not exist in cache.");
+          return;
+        }
+        int size = getObjectSize(constraintClass, constraint);
+        updateMemberSize(mn, -1 * size, SizeMode.Delta);
+
+      } finally {
+        tableLock.writeLock().unlock();
+      }
+    }
+
+    public void refreshPrimaryKeys(List<SQLPrimaryKey> keys) {
+      Map<String, SQLPrimaryKey> newKeys = new ConcurrentHashMap<>();
+      try {
+        tableLock.writeLock().lock();
+        int size = 0;
+        for (SQLPrimaryKey key : keys) {
+          if (this.memberCacheDirty[MemberName.PRIMARY_KEY_CACHE.ordinal()].compareAndSet(true, false)) {
+            LOG.debug("Skipping primary key cache update for table: " + getTable().getTableName()
+                + "; the primary keys we have is dirty.");
+            return;
+          }
+          newKeys.put(key.getPk_name(), key);
+          size += getObjectSize(SQLPrimaryKey.class, key);
+        }
+        primaryKeyCache = newKeys;
+        updateMemberSize(MemberName.PRIMARY_KEY_CACHE, size, SizeMode.Snapshot);

Review comment:
   done.







Issue Time Tracking
---

Worklog Id: (was: 454383)
Time Spent: 2h 20m  (was: 2h 10m)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h

[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454379
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 11:07
Start Date: 03/Jul/20 11:07
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449525799



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -2497,26 +2599,82 @@ long getPartsFound() {
 
   @Override public List<SQLPrimaryKey> getPrimaryKeys(String catName, String dbName, String tblName)
       throws MetaException {
-    // TODO constraintCache
-    return rawStore.getPrimaryKeys(catName, dbName, tblName);
+    catName = normalizeIdentifier(catName);
+    dbName = StringUtils.normalizeIdentifier(dbName);
+    tblName = StringUtils.normalizeIdentifier(tblName);
+    if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && rawStore.isActiveTransaction())) {
+      return rawStore.getPrimaryKeys(catName, dbName, tblName);
+    }
+
+    Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName);
+    if (tbl == null) {
+      // The table containing the primary keys is not yet loaded in cache
+      return rawStore.getPrimaryKeys(catName, dbName, tblName);
+    }
+    List<SQLPrimaryKey> keys = sharedCache.listCachedPrimaryKeys(catName, dbName, tblName);

Review comment:
   Done. Keys are now fetched from rawStore if the cache returns an empty or null result.
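
The fallback described in this comment can be illustrated with a minimal sketch. The names are hypothetical: `CacheFallbackSketch`, its `rawStore` function, and the `rawStoreReads` counter stand in for the real CachedStore and RawStore types, and keys are plain strings. It only shows the read path: serve from the cache when it has a non-empty result, otherwise read from the backing store.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of a cache read that falls back to the backing store. */
final class CacheFallbackSketch {
  private final Map<String, List<String>> cachedKeys = new HashMap<>();
  private final Function<String, List<String>> rawStore; // stand-in for the backing store
  int rawStoreReads = 0; // counts fallbacks, for demonstration only

  CacheFallbackSketch(Function<String, List<String>> rawStore) {
    this.rawStore = rawStore;
  }

  void cache(String table, List<String> keys) {
    cachedKeys.put(table, keys);
  }

  List<String> getPrimaryKeys(String table) {
    List<String> keys = cachedKeys.get(table);
    // A null or empty cached result is not trusted: read from the backing store instead.
    if (keys == null || keys.isEmpty()) {
      rawStoreReads++;
      return rawStore.apply(table);
    }
    return keys;
  }
}
```

Treating an empty cached list like a miss costs one extra store read for tables that really have no primary keys, but it avoids returning an empty answer when the constraints simply have not been loaded into the cache yet.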







Issue Time Tracking
---

Worklog Id: (was: 454379)
Time Spent: 2h 10m  (was: 2h)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h





[jira] [Work logged] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-07-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?focusedWorklogId=454377=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454377
 ]

ASF GitHub Bot logged work on HIVE-22015:
-

Author: ASF GitHub Bot
Created on: 03/Jul/20 11:06
Start Date: 03/Jul/20 11:06
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r449525538



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -867,6 +909,77 @@ private void updateTableColStats(RawStore rawStore, String catName, String dbNam
       }
     }
 
+    private void updateTableForeignKeys(RawStore rawStore, String catName, String dbName, String tblName) {
+      LOG.debug("CachedStore: updating cached foreign keys objects for catalog: {}, database: {}, table: {}", catName,
+          dbName, tblName);
+      try {
+        Deadline.startTimer("getForeignKeys");
+        List<SQLForeignKey> fks = rawStore.getForeignKeys(catName, null, null, dbName, tblName);
+        Deadline.stopTimer();
+        sharedCache.refreshForeignKeysInCache(StringUtils.normalizeIdentifier(catName),
+            StringUtils.normalizeIdentifier(dbName), StringUtils.normalizeIdentifier(tblName), fks);
+        LOG.debug("CachedStore: updated cached foreign keys objects for catalog: {}, database: {}, table: {}", catName,
+            dbName, tblName);
+      } catch (MetaException e) {

Review comment:
   done.







Issue Time Tracking
---

Worklog Id: (was: 454377)
Time Spent: 2h  (was: 1h 50m)

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h


