[jira] [Created] (HIVE-20989) JDBC: The GetOperationStatus + log can block query progress via sleep()
Gopal V created HIVE-20989:

Summary: JDBC: The GetOperationStatus + log can block query progress via sleep()
Key: HIVE-20989
URL: https://issues.apache.org/jira/browse/HIVE-20989
Project: Hive
Issue Type: Bug
Reporter: Gopal V

There is an exponential sleep operation inside the CLIService which can end up adding tens of seconds to a query which has already completed.

{code}
"HiveServer2-Handler-Pool: Thread-9373" #9373 prio=5 os_prio=0 tid=0x7f4d5e72d800 nid=0xb634a waiting on condition [0x7f28d06a5000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hive.service.cli.CLIService.progressUpdateLog(CLIService.java:506)
        at org.apache.hive.service.cli.CLIService.getOperationStatus(CLIService.java:480)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.GetOperationStatus(ThriftCLIService.java:695)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1757)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1742)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

The sleep loop is on the server side.

{code}
  private static final long PROGRESS_MAX_WAIT_NS = 30 * 1000000000l;
  private JobProgressUpdate progressUpdateLog(boolean isProgressLogRequested, Operation operation, HiveConf conf) {
...
    long startTime = System.nanoTime();
    int timeOutMs = 8;
    try {
      while (sessionState.getProgressMonitor() == null && !operation.isDone()) {
        long remainingMs = (PROGRESS_MAX_WAIT_NS - (System.nanoTime() - startTime)) / 1000000l;
        if (remainingMs <= 0) {
          LOG.debug("timed out and hence returning progress log as NULL");
          return new JobProgressUpdate(ProgressMonitor.NULL);
        }
        Thread.sleep(Math.min(remainingMs, timeOutMs));
        timeOutMs <<= 1;
      }
{code}

After about 16 seconds of execution of the query, the timeOutMs is 16384 ms, which means the next sleep cycle is for min(30 - 17, 16) = 13 seconds.

If the query finishes on the 17th second, the JDBC server will only respond after the 30th second, when it will check operation.isDone() and return.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
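
The arithmetic in the HIVE-20989 report can be checked with a small simulation. The sketch below is not Hive code: the class and method names are hypothetical, it assumes the 30-second cap and the 8 ms initial timeout described in the report, and it replaces the real clock with a counter so that nothing actually sleeps.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulates the doubling sleep schedule from the report without sleeping. */
final class ProgressWaitSketch {
    static final long MAX_WAIT_MS = 30_000;   // 30 s cap, as described in the report
    static final long INITIAL_TIMEOUT_MS = 8; // first sleep, as in the quoted loop

    /** Sleep durations (ms) taken if the progress monitor never shows up. */
    static List<Long> sleepSchedule() {
        List<Long> sleeps = new ArrayList<>();
        long elapsedMs = 0;
        long timeOutMs = INITIAL_TIMEOUT_MS;
        while (true) {
            long remainingMs = MAX_WAIT_MS - elapsedMs;
            if (remainingMs <= 0) {
                break; // the real loop returns a NULL progress update here
            }
            long sleepMs = Math.min(remainingMs, timeOutMs);
            sleeps.add(sleepMs);
            elapsedMs += sleepMs;
            timeOutMs <<= 1; // exponential backoff
        }
        return sleeps;
    }

    /** Delay between the query finishing at finishMs and the sleep in progress ending. */
    static long addedLatencyMs(long finishMs) {
        long elapsedMs = 0;
        for (long sleepMs : sleepSchedule()) {
            elapsedMs += sleepMs;
            if (elapsedMs >= finishMs) {
                return elapsedMs - finishMs; // isDone() is only re-checked after the sleep
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("sleeps (ms): " + sleepSchedule());
        System.out.println("added latency for a 17 s query (ms): " + addedLatencyMs(17_000));
    }
}
```

Under these assumptions the first eleven sleeps add up to 16376 ms, so the twelfth sleep is capped by the remaining budget rather than by the doubled timeout, and a query that completes at 17 s is only noticed at 30 s.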
[jira] [Created] (HIVE-20988) Wrong results for group by queries with primary key on multiple columns
Vineet Garg created HIVE-20988:

Summary: Wrong results for group by queries with primary key on multiple columns
Key: HIVE-20988
URL: https://issues.apache.org/jira/browse/HIVE-20988
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Vineet Garg
Assignee: Vineet Garg
[GitHub] hive pull request #498: HIVE-20987: split Druid Tests and start nodes on ran...
GitHub user b-slim opened a pull request:

    https://github.com/apache/hive/pull/498

    HIVE-20987: split Druid Tests and start nodes on random ports

    Change-Id: I89009fd8a79a85b26bcc080c34a07577125f0110

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/b-slim/hive HIVE-20987

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/498.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #498

commit 959438cf82a522f6cf573b8df2b7850762ea6c7b
Author: Slim Bouguerra
Date:   2018-11-30T03:35:55Z

    HIVE-20987: split Druid Tests and start nodes on random ports

    Change-Id: I89009fd8a79a85b26bcc080c34a07577125f0110

---
[jira] [Created] (HIVE-20987) Split Druid Tests to avoid Timeouts
slim bouguerra created HIVE-20987:

Summary: Split Druid Tests to avoid Timeouts
Key: HIVE-20987
URL: https://issues.apache.org/jira/browse/HIVE-20987
Project: Hive
Issue Type: Test
Reporter: slim bouguerra
Assignee: slim bouguerra

Currently the Druid tests fail with timeout issues.
I am planning on splitting the tests into at least 2 batches to avoid timeouts.
I will also tweak the test code to pick random ports for the Druid nodes, to minimize the port collisions that we saw before.
Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/#review210953
---

Ship it!

Ship It!

- Naveen Gangam

On Nov. 27, 2018, 7:18 a.m., Vihang Karajgaonkar wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69054/
> ---
>
> (Updated Nov. 27, 2018, 7:18 a.m.)
>
>
> Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.
>
>
> Bugs: HIVE-20740
>     https://issues.apache.org/jira/browse/HIVE-20740
>
>
> Repository: hive-git
>
>
> Description
> ---
>
> HIVE-20740 : Remove global lock in ObjectStore.setConf method
>
>
> Diffs
> ---
>
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java 5a88550f0625a7ec1890df7f54e7fa579f58fff4
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 5cb0a887e672f49739f5b648e608fba66de06326
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 455ffc3887e62fa503cc3fa28255702ea9da3cc0
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 570281b54fa236d5bb568b4ded9b166ef367f613
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java PRE-CREATION
>   standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java af9efd98ea210335c6ac1d3da8624e02aadc2706
>
>
> Diff: https://reviews.apache.org/r/69054/diff/6/
>
>
> Testing
> ---
>
>
> Thanks,
>
> Vihang Karajgaonkar
>
>
[jira] [Created] (HIVE-20986) Add TransactionalValidationListener to HMS preListeners only when ACID support is enabled
Karthik Manamcheri created HIVE-20986:

Summary: Add TransactionalValidationListener to HMS preListeners only when ACID support is enabled
Key: HIVE-20986
URL: https://issues.apache.org/jira/browse/HIVE-20986
Project: Hive
Issue Type: Improvement
Reporter: Karthik Manamcheri
Assignee: Adam Holley

We add the TransactionalValidationListener to the preListeners in HMS unconditionally.

{code:java}
public void init() throws MetaException {
  ..
  preListeners.add(0, new TransactionalValidationListener(conf));
  ..
}{code}

This causes some performance issues because the listener is called even when it is not needed. Let's add a condition around this and add the listener only if transactional support is enabled.
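
The change proposed in HIVE-20986 amounts to guarding the registration with a flag. The sketch below only illustrates that shape: `PreListenerSketch`, `PreListener`, `TransactionalValidation` and the `acidEnabled` parameter are hypothetical stand-ins, not HMS classes or configuration keys, and the issue does not say which setting should drive the check.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for the HMS init logic; not Hive code. */
final class PreListenerSketch {
    /** Hypothetical minimal listener interface. */
    interface PreListener {
        String name();
    }

    /** Hypothetical stand-in for TransactionalValidationListener. */
    static final class TransactionalValidation implements PreListener {
        @Override
        public String name() {
            return "TransactionalValidationListener";
        }
    }

    /**
     * Builds the pre-listener list. The validation listener is put first,
     * but only when ACID support is enabled (assumed to be a plain boolean here).
     */
    static List<PreListener> initPreListeners(boolean acidEnabled, List<PreListener> configured) {
        List<PreListener> preListeners = new ArrayList<>(configured);
        if (acidEnabled) {
            preListeners.add(0, new TransactionalValidation());
        }
        return preListeners;
    }
}
```

With `acidEnabled` set to false the list contains only the configured listeners, so the validation hook is never invoked on the request path.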
[jira] [Created] (HIVE-20985) If select operator inputs are temporary columns vectorization may reuse some of them as output
Zoltan Haindrich created HIVE-20985:

Summary: If select operator inputs are temporary columns vectorization may reuse some of them as output
Key: HIVE-20985
URL: https://issues.apache.org/jira/browse/HIVE-20985
Project: Hive
Issue Type: Bug
Components: Vectorization
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich
[jira] [Created] (HIVE-20984) Hive cannot call MapReduce, please tell me where there is a configuration problem
yuxuqi created HIVE-20984:

Summary: Hive cannot call MapReduce, please tell me where there is a configuration problem
Key: HIVE-20984
URL: https://issues.apache.org/jira/browse/HIVE-20984
Project: Hive
Issue Type: Bug
Components: Beeline
Affects Versions: 1.1.0
Environment: CDH 5.14.2
             hive 1.1.0
Reporter: yuxuqi
Attachments: image-2018-11-29-19-58-56-026.png

yarn.acl.enable                                              true
yarn.admin.acl                                               *
yarn.resourcemanager.address                                 master:8032
yarn.resourcemanager.admin.address                           master:8033
yarn.resourcemanager.scheduler.address                       master:8030
yarn.resourcemanager.resource-tracker.address                master:8031
yarn.resourcemanager.webapp.address                          master:8088
yarn.resourcemanager.webapp.https.address                    master:8090
yarn.resourcemanager.client.thread-count                     50
yarn.resourcemanager.scheduler.client.thread-count           50
yarn.resourcemanager.admin.client.thread-count               1
yarn.scheduler.minimum-allocation-mb                         1024
yarn.scheduler.increment-allocation-mb                       512
yarn.scheduler.maximum-allocation-mb                         31647
yarn.scheduler.minimum-allocation-vcores                     1
yarn.scheduler.increment-allocation-vcores                   1
yarn.scheduler.maximum-allocation-vcores                     48
yarn.resourcemanager.amliveliness-monitor.interval-ms        1000
yarn.am.liveness-monitor.expiry-interval-ms                  60
yarn.resourcemanager.am.max-attempts                         2
yarn.resourcemanager.container.liveness-monitor.interval-ms  60
yarn.resourcemanager.nm.liveness-monitor.interval-ms         1000
yarn.nm.liveness-monitor.expiry-interval-ms                  60
yarn.resourcemanager.resource-tracker.client.thread-count    50
yarn.application.classpath                                   $HADOOP_CLIENT_CONF_DIR,$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
yarn.resourcemanager.scheduler.class                         org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.resourcemanager.max-completed-applications              1
yarn.nodemanager.remote-app-log-dir                          /tmp/logs
yarn.nodemanager.remote-app-log-dir-suffix                   logs
[jira] [Created] (HIVE-20983) Vectorization: Scale up small hashtables, when collisions are detected
Gopal V created HIVE-20983:

Summary: Vectorization: Scale up small hashtables, when collisions are detected
Key: HIVE-20983
URL: https://issues.apache.org/jira/browse/HIVE-20983
Project: Hive
Issue Type: Bug
Reporter: Gopal V

Hive's hashtable estimates are getting better with HyperLogLog stats in place, but an accurate estimate does not always result in a low number of collisions.

The hashtables which contain a very small number of items tend to lose their O(1) lookup performance when there are collisions.

Since collisions are easy to detect within the fast hashtable implementation, rehashing to a larger size will help these small hashtables avoid collisions and go back to O(1) perf.
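
The idea in HIVE-20983 (detect a collision during insert, then rehash into a larger table) can be illustrated with a toy open-addressing set. This is only a sketch under assumptions, not Hive's fast hashtable implementation: the class name, the probe threshold `MAX_PROBES`, the growth cap `MAX_CAPACITY` and the hash function are all chosen for illustration.

```java
/**
 * Illustrative open-addressing set of longs that doubles its capacity when an
 * insert probes too far. Hypothetical sketch; not Hive's fast hashtable code.
 */
final class GrowOnCollisionSet {
    private static final int MAX_PROBES = 2;         // assumed collision threshold
    private static final int MAX_CAPACITY = 1 << 20; // assumed growth cap

    private long[] keys;
    private boolean[] used;
    private int size;

    GrowOnCollisionSet(int initialCapacity) {
        if (initialCapacity <= 0 || Integer.bitCount(initialCapacity) != 1
                || initialCapacity > MAX_CAPACITY) {
            throw new IllegalArgumentException("capacity must be a power of two <= " + MAX_CAPACITY);
        }
        keys = new long[initialCapacity];
        used = new boolean[initialCapacity];
    }

    int size() { return size; }

    int capacity() { return keys.length; }

    boolean contains(long key) {
        int mask = keys.length - 1;
        int slot = hash(key) & mask;
        for (int probes = 0; probes < keys.length; probes++) {
            if (!used[slot]) return false;      // an empty slot ends the probe chain
            if (keys[slot] == key) return true;
            slot = (slot + 1) & mask;
        }
        return false;
    }

    void add(long key) {
        while (!tryInsert(key)) {
            grow(); // probe chain too long: rehash into a table twice the size
        }
    }

    /** Returns false when the key would land more than MAX_PROBES slots from home. */
    private boolean tryInsert(long key) {
        boolean atMax = keys.length >= MAX_CAPACITY;
        int limit = atMax ? keys.length : Math.min(MAX_PROBES + 1, keys.length);
        int mask = keys.length - 1;
        int slot = hash(key) & mask;
        for (int probes = 0; probes < limit; probes++) {
            if (!used[slot]) {
                used[slot] = true;
                keys[slot] = key;
                size++;
                return true;
            }
            if (keys[slot] == key) return true; // already present
            slot = (slot + 1) & mask;
        }
        if (atMax) throw new IllegalStateException("table full at maximum capacity");
        return false;
    }

    private void grow() {
        long[] oldKeys = keys;
        boolean[] oldUsed = used;
        keys = new long[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) add(oldKeys[i]); // may itself trigger a further grow
        }
    }

    private static int hash(long key) {
        // Multiplicative (Fibonacci-style) hash; the upper 32 bits are used.
        return (int) ((key * 0x9E3779B97F4A7C15L) >>> 32);
    }
}
```

Below the capacity cap, every stored key sits within `MAX_PROBES` slots of its home slot, so looking up a present key touches a bounded number of slots. The trade-off is memory: the table may end up sparser than a load-factor policy would leave it, which seems acceptable for the very small tables the issue describes.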