[jira] [Created] (HIVE-21524) Impala Engine
David Mollitor created HIVE-21524: - Summary: Impala Engine Key: HIVE-21524 URL: https://issues.apache.org/jira/browse/HIVE-21524 Project: Hive Issue Type: New Feature Affects Versions: 4.0.0 Reporter: David Mollitor
Now that Impala has "dedicated coordinator" capability, it could be interesting to pair HiveServer2 instances with Impala dedicated coordinators on the same host. A client could request an 'impala' execution engine and subsequent queries would be routed to the local coordinator.
{code:sql}
set hive.execution.engine=impala;
{code}
This would allow clients seamless access to both capabilities without needing different connections or drivers. Hive would also be a central location for auditing and authorization.
https://www.cloudera.com/documentation/enterprise/latest/topics/impala_dedicated_coordinator.html
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21581) Remove Lock in GetInputSummary
David Mollitor created HIVE-21581: - Summary: Remove Lock in GetInputSummary Key: HIVE-21581 URL: https://issues.apache.org/jira/browse/HIVE-21581 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Fix For: 4.0.0 Now that Hive compile lock has been relaxed in [HIVE-20535], remove the {{getInputSummary}} lock: [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2459] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21515) Improvement to MoveTrash Facilities
David Mollitor created HIVE-21515: - Summary: Improvement to MoveTrash Facilities Key: HIVE-21515 URL: https://issues.apache.org/jira/browse/HIVE-21515 Project: Hive Issue Type: Improvement Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-21515.1.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments
David Mollitor created HIVE-21414: - Summary: Hive JSON SerDe Does Not Properly Handle Field Comments Key: HIVE-21414 URL: https://issues.apache.org/jira/browse/HIVE-21414 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Field comments are handed to the JSON SerDe from HMS and then are ignored. The result is that all field comments are 'from deserializer' and cannot be changed. For example, Avro SerDe handles comments: https://github.com/apache/hive/blob/release-1.1.0/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java#L133 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21466) Increase Default Size of SPLIT_MAXSIZE
David Mollitor created HIVE-21466: - Summary: Increase Default Size of SPLIT_MAXSIZE Key: HIVE-21466 URL: https://issues.apache.org/jira/browse/HIVE-21466 Project: Hive Issue Type: Improvement Components: Configuration Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-21466.1.patch
{code:java}
MAPREDMAXSPLITSIZE(FileInputFormat.SPLIT_MAXSIZE, 256000000L, "", true),
{code}
[https://github.com/apache/hive/blob/8d4300a02691777fc96f33861ed27e64fed72f2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L682]
This field specifies the maximum size of each MR (and possibly other engines') split. This number should be a multiple of the HDFS block size. The maximum is implemented as follows: each block is added to the split, and if the split grows to be larger than the maximum allowed, the split is submitted to the cluster and a new split is opened. So, imagine the following scenario:
* HDFS block size of 16 bytes
* Maximum size of 40 bytes
This will produce a split with 3 blocks. After two blocks the split holds 2 x 16 = 32 bytes, which is still under the 40-byte maximum, so a third block is added and the split ends up holding 3 x 16 = 48 bytes. So, while many operators would assume a split of 2 blocks, the actual split is 3 blocks. Setting the maximum split size to a multiple of the HDFS block size will make this behavior less confusing.
The current setting is ~256MB, and when it was introduced the default HDFS block size was 64MB. That is a factor of 4x. However, HDFS block sizes are now 128MB by default, so I propose setting this to 4x128MB. The larger splits (fewer tasks) should give a nice performance boost on modern hardware.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
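The arithmetic in the scenario above can be checked with a toy model. This is only a sketch of the behavior as the description states it, not Hive or Hadoop code: the class and method names are hypothetical, and it assumes a split is closed once its size reaches or exceeds the maximum (the exact comparison used by the real input format is not shown in this message).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the split-building behavior described above; not Hive or Hadoop code. */
class SplitSketch {

  /**
   * Adds whole blocks to the current split and closes the split once its size
   * reaches or exceeds maxSplitSize (assumed comparison).
   * Returns the number of blocks that ended up in each split.
   */
  static List<Integer> blocksPerSplit(int blockCount, long blockSize, long maxSplitSize) {
    List<Integer> splits = new ArrayList<>();
    int blocksInSplit = 0;
    long splitSize = 0;
    for (int i = 0; i < blockCount; i++) {
      blocksInSplit++;
      splitSize += blockSize;
      if (splitSize >= maxSplitSize) {
        splits.add(blocksInSplit);
        blocksInSplit = 0;
        splitSize = 0;
      }
    }
    if (blocksInSplit > 0) {
      splits.add(blocksInSplit); // trailing, partially filled split
    }
    return splits;
  }

  public static void main(String[] args) {
    // 16-byte blocks, 40-byte maximum: each split holds 3 blocks (48 bytes).
    System.out.println(blocksPerSplit(6, 16, 40)); // prints [3, 3]
    // 16-byte blocks, 32-byte maximum (a multiple of the block size): 2 blocks each.
    System.out.println(blocksPerSplit(6, 16, 32)); // prints [2, 2, 2]
  }
}
```

Under that assumption, a maximum that is a multiple of the block size gives splits whose block count matches what an operator would expect.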
[jira] [Created] (HIVE-21469) Review of ZooKeeperHiveLockManager
David Mollitor created HIVE-21469: - Summary: Review of ZooKeeperHiveLockManager Key: HIVE-21469 URL: https://issues.apache.org/jira/browse/HIVE-21469 Project: Hive Issue Type: Improvement Components: Locking Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-21469.1.patch
A lot of sins in this class to resolve:
{code:java}
@Override
public void setContext(HiveLockManagerCtx ctx) throws LockException {
  try {
    curatorFramework = CuratorFrameworkSingleton.getInstance(conf);
    parent = conf.getVar(HiveConf.ConfVars.HIVE_ZOOKEEPER_NAMESPACE);
    try{
      curatorFramework.create().withMode(CreateMode.PERSISTENT).forPath("/" + parent, new byte[0]);
    } catch (Exception e) {
      // ignore if the parent already exists
      if (!(e instanceof KeeperException) || ((KeeperException)e).code() != KeeperException.Code.NODEEXISTS) {
        LOG.warn("Unexpected ZK exception when creating parent node /" + parent, e);
      }
    }
{code}
Every time a new session is created and this {{setContext}} method is called, it attempts to create the root node. I have seen that, even though the root node exists, a create-node action is written to the ZK logs. Check first if the node exists before trying to create it.
{code:java}
try {
  curatorFramework.delete().forPath(zLock.getPath());
} catch (InterruptedException ie) {
  curatorFramework.delete().forPath(zLock.getPath());
}
{code}
There have historically been quite a few bugs regarding leaked locks. The Driver signals the session {{Thread}} by performing an interrupt. That interrupt can happen at any time and it can kill a create/delete action within the ZK framework. Here we can see one workaround for this: if the ZK action is interrupted, simply do it again. Well, what if it's interrupted yet again? The lock will be leaked anyway. Also, when the {{InterruptedException}} is thrown, the thread's interrupted flag is cleared. The flag is not restored in this code, so we lose the fact that this thread has been interrupted.
{code:java}
if (tryNum > 1) {
  Thread.sleep(sleepTime);
}
unlockPrimitive(hiveLock, parent, curatorFramework);
break;
} catch (Exception e) {
  if (tryNum >= numRetriesForUnLock) {
    String name = ((ZooKeeperHiveLock)hiveLock).getPath();
    throw new LockException("Node " + name + " can not be deleted after " + numRetriesForUnLock + " attempts.", e);
  }
}
{code}
Relatedly, the sleep here may be interrupted, but we still need to delete the lock (again, for fear of leaking it). This sleep should be uninterruptible. If the lock has to be deleted and there is a problem, interrupting the sleep will cause the code to eventually exit and the lock will be leaked.
The class also needs a good deal more TLC.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
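The two points about interrupts (a sleep that cannot be cut short, and an interrupted flag that is restored afterwards) can be sketched with the JDK alone. This is an illustrative helper, not the attached patch; the class and method names are hypothetical. Guava ships a comparable utility, {{Uninterruptibles.sleepUninterruptibly}}.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative helper (hypothetical name); not part of ZooKeeperHiveLockManager. */
class UninterruptibleSleep {

  /**
   * Sleeps for the full duration even if the thread is interrupted along the way,
   * then restores the interrupted flag so callers can still observe the interrupt.
   */
  static void sleepUninterruptibly(long millis) {
    boolean interrupted = false;
    long remainingNanos = TimeUnit.MILLISECONDS.toNanos(millis);
    final long deadline = System.nanoTime() + remainingNanos;
    try {
      while (remainingNanos > 0) {
        try {
          TimeUnit.NANOSECONDS.sleep(remainingNanos);
          return; // slept the whole remaining time
        } catch (InterruptedException e) {
          interrupted = true; // remember the interrupt, keep sleeping
          remainingNanos = deadline - System.nanoTime();
        }
      }
    } finally {
      if (interrupted) {
        Thread.currentThread().interrupt(); // restore the flag that the exception cleared
      }
    }
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt(); // simulate the Driver interrupting this thread
    sleepUninterruptibly(50);
    System.out.println("interrupted flag preserved: " + Thread.currentThread().isInterrupted());
  }
}
```

With a helper like this, the retry loop could pause between unlock attempts without losing either the pause or the record of the interrupt.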
[jira] [Created] (HIVE-21433) Doc: Remove Reference to hive.stats.avg.row.size
David Mollitor created HIVE-21433: - Summary: Doc: Remove Reference to hive.stats.avg.row.size Key: HIVE-21433 URL: https://issues.apache.org/jira/browse/HIVE-21433 Project: Hive Issue Type: Improvement Components: Documentation Affects Versions: 4.0.0 Reporter: David Mollitor [https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties] Remove reference to {{hive.stats.avg.row.size}}. I think it's been replaced by {{hive.stats.max.variable.length}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21425) Use newDirectExecutorService for getInputSummary
David Mollitor created HIVE-21425: - Summary: Use newDirectExecutorService for getInputSummary Key: HIVE-21425 URL: https://issues.apache.org/jira/browse/HIVE-21425 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor
{code:java|title=Utilities.java}
int numExecutors = getMaxExecutorsForInputListing(ctx.getConf(), pathNeedProcess.size());
if (numExecutors > 1) {
  LOG.info("Using {} threads for getContentSummary", numExecutors);
  executor = Executors.newFixedThreadPool(numExecutors,
      new ThreadFactoryBuilder().setDaemon(true)
          .setNameFormat("Get-Input-Summary-%d").build());
} else {
  executor = null;
}
{code}
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2482-L2490
Instead of using a 'null' {{ExecutorService}}, use Guava's {{DirectExecutorService}} and remove the special casing for a 'null' value.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
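A rough sketch of the idea follows. To stay dependency-free it uses a small JDK-only stand-in for the direct executor (in Hive the real thing would presumably come from Guava's {{MoreExecutors.newDirectExecutorService()}}); the class names and the {{sumLengths}} workload are hypothetical and only illustrate that the calling code no longer needs a 'null' branch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Sketch only: names are hypothetical and this is not the Utilities.java code. */
class DirectExecutorSketch {

  /** JDK-only stand-in for a direct executor: every task runs on the submitting thread. */
  static final class DirectExecutorService extends AbstractExecutorService {
    private volatile boolean shutdown;
    @Override public void execute(Runnable command) { command.run(); }
    @Override public void shutdown() { shutdown = true; }
    @Override public List<Runnable> shutdownNow() { shutdown = true; return Collections.emptyList(); }
    @Override public boolean isShutdown() { return shutdown; }
    @Override public boolean isTerminated() { return shutdown; }
    @Override public boolean awaitTermination(long timeout, TimeUnit unit) { return shutdown; }
  }

  /** Always returns a usable executor, so callers never have to test for null. */
  static ExecutorService newExecutor(int numExecutors) {
    return numExecutors > 1
        ? Executors.newFixedThreadPool(numExecutors)
        : new DirectExecutorService();
  }

  /** Toy workload: the same submit/get code path serves both executor kinds. */
  static int sumLengths(List<String> paths, int numExecutors) {
    ExecutorService executor = newExecutor(numExecutors);
    try {
      List<Future<Integer>> futures = new ArrayList<>();
      for (String path : paths) {
        Callable<Integer> task = path::length;
        futures.add(executor.submit(task));
      }
      int total = 0;
      for (Future<Integer> future : futures) {
        total += future.get();
      }
      return total;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> paths = java.util.Arrays.asList("/a", "/bc");
    System.out.println(sumLengths(paths, 1)); // runs inline on the calling thread
    System.out.println(sumLengths(paths, 4)); // runs on a fixed pool
  }
}
```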
[jira] [Created] (HIVE-21426) Remove Utilities Global Random
David Mollitor created HIVE-21426: - Summary: Remove Utilities Global Random Key: HIVE-21426 URL: https://issues.apache.org/jira/browse/HIVE-21426 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253 Remove global {{Random}} object in favor of {{ThreadLocalRandom}}. {quote} ThreadLocalRandom is initialized with an internally generated seed that may not otherwise be modified. When applicable, use of ThreadLocalRandom rather than shared Random objects in concurrent programs will typically encounter much less overhead and contention. {quote} https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
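As a minimal illustration of the swap (the class and method names here are hypothetical, not the actual field in Utilities.java), a caller asks {{ThreadLocalRandom.current()}} for a value instead of sharing one static {{Random}}:

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical example; not the code in Utilities.java. */
class RandomSketch {

  /** Returns a pseudo-random int in [0, bound) without touching a shared Random instance. */
  static int nextId(int bound) {
    // current() hands back the generator owned by the calling thread,
    // so concurrent callers do not contend on a single seed.
    return ThreadLocalRandom.current().nextInt(bound);
  }

  public static void main(String[] args) {
    System.out.println(nextId(1000));
  }
}
```

One caveat worth keeping in mind: a {{ThreadLocalRandom}} cannot be seeded, so code that relies on a reproducible sequence would need a different approach.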
[jira] [Created] (HIVE-21748) HBase Operations Can Fail When Using MAPREDLOCAL
David Mollitor created HIVE-21748: - Summary: HBase Operations Can Fail When Using MAPREDLOCAL Key: HIVE-21748 URL: https://issues.apache.org/jira/browse/HIVE-21748 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor
https://github.com/apache/hive/blob/5634140b2beacdac20ceec8c73ff36bce5675ef8/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java#L258-L262
{code:java|title=HBaseStorageHandler.java}
if (this.configureInputJobProps) {
  LOG.info("Configuring input job properties");
  ...
  try {
    addHBaseDelegationToken(jobConf);
  } catch (IOException | MetaException e) {
    throw new IllegalStateException("Error while configuring input job properties", e);
  }
} else {
  LOG.info("Configuring output job properties");
  ...
}
{code}
What we can see here is that the HBase delegation token is only created when there is an input job (reading from an HBase source). If a particular stage of a query has no HBase input, only HBase output, the delegation token is not created and the stage fails.
{code:none|title=Error Message in HS2 Log} 2019-05-17 10:24:55,036 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-388]: Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238) at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89) at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924) at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} You can tell it will fail because an HDFS Token will be created, but it will not report an HBASE token in the HS2 logs. The following is an example of a proper setup. If it is missing the HBASE_AUTH_TOKEN it will fail because it will try to initiate Kerberos handshake and fail. 
{code:none|title=Logging of a Proper Run} 2019-05-17 10:36:15,593 INFO org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Submitting tokens for job: job_1557858663665_0048 2019-05-17 10:36:15,593 INFO org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HDFS_DELEGATION_TOKEN, Service: 10.17.101.237:8020, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive/host-10-17-102-135.coe.cloudera@example.com, renewer=yarn, realUser=, issueDate=1558114574357, maxDate=1558719374357, sequenceNumber=75, masterKeyId=4) 2019-05-17 10:36:15,593 INFO org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HBASE_AUTH_TOKEN, Service: 9b282733-7927-4785-92ea-dad419f6f055, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@b1) 2019-05-17 10:36:15,859 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: [HiveServer2-Background-Pool: Thread-455]: Submitted application application_1557858663665_0048 {code} Error message in the Local MapReduce log. {code:none|title=Error message} 2019-05-10 07:43:24,875 WARN [htable-pool2-t1]: security.UserGroupInformation (UserGroupInformation.java:doAs(1927)) - PriviledgedActionException as:hive (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] 2019-05-10 07:43:24,876 WARN [htable-pool2-t1]: ipc.RpcClientImpl (RpcClientImpl.java:run(675)) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] 2019-05-10 07:43:24,876 ERROR [htable-pool2-t1]: ipc.RpcClientImpl (RpcClientImpl.java:run(685)) - SASL authentication failed. The most likely cause is missing
[jira] [Created] (HIVE-21747) Remove Dependency on org.cliffc.high_scale_lib.Counter
David Mollitor created HIVE-21747: - Summary: Remove Dependency on org.cliffc.high_scale_lib.Counter Key: HIVE-21747 URL: https://issues.apache.org/jira/browse/HIVE-21747 Project: Hive Issue Type: Improvement Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor
[https://github.com/apache/hive/blob/5634140b2beacdac20ceec8c73ff36bce5675ef8/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java#L327]
{code:java}
static {
  try {
    counterClass = Class.forName("org.cliffc.high_scale_lib.Counter");
  } catch (ClassNotFoundException cnfe) {
    // this dependency is removed for HBase 1.0
  }
}
{code}
I think this _counterClass_ stuff can be removed now that Hive is firmly on HBase 1.0+.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21792) Hive Indexes... Again
David Mollitor created HIVE-21792: - Summary: Hive Indexes... Again Key: HIVE-21792 URL: https://issues.apache.org/jira/browse/HIVE-21792 Project: Hive Issue Type: New Feature Components: Indexing Reporter: David Mollitor
Hive had an implementation of indexing that was made somewhat obsolete by the introduction of columnar file formats with their own internal indexing. I propose that Hive introduce indexing again.
# Column Index: Stored in HBase
# Full-Text Index: Stored in Solr
The basic idea is that the key in HBase is the record and the value is the relative file path of the data in the Hive table. Performing an INSERT statement creates the index for each record.
https://dev.mysql.com/doc/refman/8.0/en/create-index.html
When generating the explain plan, only the files involved in the query are considered. This would prevent having to scan large amounts of data for the typical BI tools when the set of data is known to be very small.
{code:sql}
-- Quick retrieval of small sets of records
select * from user where userid=27;
-- Full scans
select count(1) from user;
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21727) Allow For Ordinal Substitution
David Mollitor created HIVE-21727: - Summary: Allow For Ordinal Substitution Key: HIVE-21727 URL: https://issues.apache.org/jira/browse/HIVE-21727 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Impala allows for ordinal substitution. Add a compatible feature to Hive to allow Hive to be more compatible with Impala. Allows for more of a drop-in replacement. [IMPALA-8548] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21655) Add Re-Try to LdapSearchFactory
David Mollitor created HIVE-21655: - Summary: Add Re-Try to LdapSearchFactory Key: HIVE-21655 URL: https://issues.apache.org/jira/browse/HIVE-21655 Project: Hive Issue Type: Improvement Components: Authentication Affects Versions: 4.0.0, 3.2.0 Environment: It may be the case that LDAP service is temporarily unreachable. Please implement a re-try facility here: https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/auth/ldap/LdapSearchFactory.java#L41 Reporter: David Mollitor -- This message was sent by Atlassian JIRA (v7.6.3#76005)
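One possible shape for such a re-try facility is sketched below. It is deliberately generic and makes several assumptions: the helper name, the fixed pause, and the use of unchecked exceptions are all illustrative, whereas the real LDAP calls would surface checked naming exceptions and might want a different back-off policy.

```java
import java.util.function.Supplier;

/** Generic retry sketch (hypothetical helper); not the LdapSearchFactory code. */
class RetrySketch {

  /**
   * Runs the action up to maxAttempts times, pausing between failed attempts.
   * Rethrows the last failure if every attempt fails or the pause is interrupted.
   */
  static <T> T withRetry(Supplier<T> action, int maxAttempts, long pauseMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.get();
      } catch (RuntimeException e) {
        last = e; // remember the failure and try again
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt(); // keep the interrupt visible to callers
          break;
        }
      }
    }
    throw last;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    String result = withRetry(() -> {
      if (++calls[0] < 3) {
        throw new IllegalStateException("service temporarily unreachable");
      }
      return "connected";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```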
[jira] [Created] (HIVE-22078) Upgrade arrow version to 0.14.1
David Mollitor created HIVE-22078: - Summary: Upgrade arrow version to 0.14.1 Key: HIVE-22078 URL: https://issues.apache.org/jira/browse/HIVE-22078 Project: Hive Issue Type: Task Affects Versions: 4.0.0 Reporter: David Mollitor -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (HIVE-22217) Better Logging for Hive JAR Reload
David Mollitor created HIVE-22217: - Summary: Better Logging for Hive JAR Reload Key: HIVE-22217 URL: https://issues.apache.org/jira/browse/HIVE-22217 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 2.3.6, 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult. Add logging to at least confirm which JAR files are being loaded. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22032) Allow Hive JSON SerDe To Be Case Insensitive for Field Names
David Mollitor created HIVE-22032: - Summary: Allow Hive JSON SerDe To Be Case Insensitive for Field Names Key: HIVE-22032 URL: https://issues.apache.org/jira/browse/HIVE-22032 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor https://fasterxml.github.io/jackson-databind/javadoc/2.9/com/fasterxml/jackson/databind/MapperFeature.html#ACCEPT_CASE_INSENSITIVE_PROPERTIES -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (HIVE-22445) LazySimpleSerDe toString is not Correct
David Mollitor created HIVE-22445: - Summary: LazySimpleSerDe toString is not Correct Key: HIVE-22445 URL: https://issues.apache.org/jira/browse/HIVE-22445 Project: Hive Issue Type: Improvement Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22445.1.patch {code:none} 2019-11-01T10:03:49,228 INFO [pool-23-thread-1] exec.FileSinkOperator: Using serializer : class org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe[[[B@983dd25]:[_col0, _col1]:[struct
[jira] [Created] (HIVE-22443) HBase Maven site configuration causes Hive project to get a directory named ${project.basedir}
David Mollitor created HIVE-22443: - Summary: HBase Maven site configuration causes Hive project to get a directory named ${project.basedir} Key: HIVE-22443 URL: https://issues.apache.org/jira/browse/HIVE-22443 Project: Hive Issue Type: Improvement Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Upgrade HBase versions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22444) Clean up Project POM Files
David Mollitor created HIVE-22444: - Summary: Clean up Project POM Files Key: HIVE-22444 URL: https://issues.apache.org/jira/browse/HIVE-22444 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor # Address warnings in the build process # Use DependencyManagement in Root POM for ITest (see HIVE-22426) # General POM cleanup -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22447) Update HBase Version to GA
David Mollitor created HIVE-22447: - Summary: Update HBase Version to GA Key: HIVE-22447 URL: https://issues.apache.org/jira/browse/HIVE-22447 Project: Hive Issue Type: Improvement Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Currently at: {code:none} 2.0.0-alpha4 {code} Upgrade to a GA release -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22475) Update slf4j to 1.7.25
David Mollitor created HIVE-22475: - Summary: Update slf4j to 1.7.25 Key: HIVE-22475 URL: https://issues.apache.org/jira/browse/HIVE-22475 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor Druid handler is already there. Updating will allow the entire project to be on the same version. https://github.com/apache/hive/blob/38190f3e95752c85188682d8a78d259455e173c2/itests/qtest-druid/pom.xml#L228 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22469) Lower Metastore DB Connection Pool Size in QTests
David Mollitor created HIVE-22469: - Summary: Lower Metastore DB Connection Pool Size in QTests Key: HIVE-22469 URL: https://issues.apache.org/jira/browse/HIVE-22469 Project: Hive Issue Type: Improvement Components: Test, Tests Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Hive Metastore uses the 'HikariCP' database connection pool. The default number of connections to the database is 10. For the QTests, which connect to a local DerbyDB, there need not be more than 1 connection. Any more simply adds undue overhead and, looking at the QTest logs, I see a bunch of 'connection refused' messages from HikariCP. It may be the case that the standalone DB does not support that many concurrent connections anyway. https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22462) Error Information Lost in GenericUDTFGetSplits
David Mollitor created HIVE-22462: - Summary: Error Information Lost in GenericUDTFGetSplits Key: HIVE-22462 URL: https://issues.apache.org/jira/browse/HIVE-22462 Project: Hive Issue Type: Improvement Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22462.1.patch I was recently looking at some logs from a failed unit test and saw: {code:none} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create temp table: nullCaused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create temp table: null at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits2.process(GenericUDTFGetSplits2.java:81) at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.jav {code} Error information was lost... useless 'null' string is written. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22441) Metrics Subsytem Improvements
David Mollitor created HIVE-22441: - Summary: Metrics Subsytem Improvements Key: HIVE-22441 URL: https://issues.apache.org/jira/browse/HIVE-22441 Project: Hive Issue Type: Improvement Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor
# CodahaleMetrics uses Guava LoadingCache, which is already thread-safe, and then puts an explicit lock around the structure. Use Java 8 new Map API with ConcurrentHashMap.
# Introduce Java 8 APIs
# Simplifications
# Updated unit tests to no longer include a 'sleep'
https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java#L91-L94
-- This message was sent by Atlassian Jira (v8.3.4#803005)
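The first item can be illustrated with the Java 8 {{Map}} API. The registry below is a hypothetical stand-in, not the CodahaleMetrics class: it shows how {{ConcurrentHashMap.computeIfAbsent}} creates each counter at most once without an explicit lock around the map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical counter registry; illustrates the Map API only, not CodahaleMetrics. */
class CounterRegistry {

  private final ConcurrentMap<String, LongAdder> counters = new ConcurrentHashMap<>();

  /** Creates the counter on first use (atomically) and increments it. */
  void increment(String name) {
    counters.computeIfAbsent(name, key -> new LongAdder()).increment();
  }

  /** Returns the current value, or 0 if the counter was never incremented. */
  long get(String name) {
    LongAdder counter = counters.get(name);
    return counter == null ? 0L : counter.sum();
  }

  public static void main(String[] args) {
    CounterRegistry registry = new CounterRegistry();
    registry.increment("open_connections");
    registry.increment("open_connections");
    System.out.println(registry.get("open_connections")); // prints 2
  }
}
```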
[jira] [Created] (HIVE-22493) Scheduled Query Execution Failure in Tests
David Mollitor created HIVE-22493: - Summary: Scheduled Query Execution Failure in Tests Key: HIVE-22493 URL: https://issues.apache.org/jira/browse/HIVE-22493 Project: Hive Issue Type: Improvement Reporter: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22491) Use Collections emptyList
David Mollitor created HIVE-22491: - Summary: Use Collections emptyList Key: HIVE-22491 URL: https://issues.apache.org/jira/browse/HIVE-22491 Project: Hive Issue Type: Improvement Affects Versions: 3.2.0 Environment: https://docs.oracle.com/javase/8/docs/api/?java/util/Collections.html Use Collections#emptyList instead of instantiating empty ArrayLists Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22491.1.patch -- This message was sent by Atlassian Jira (v8.3.4#803005)
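A small sketch of the substitution, with hypothetical names: when a method has nothing to return, it hands back {{Collections.emptyList()}} rather than allocating a fresh {{ArrayList}}.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical example of returning Collections.emptyList(); not code from the patch. */
class EmptyListSketch {

  /** Returns a copy of the input, or an immutable empty list when there is nothing to copy. */
  static List<String> copyOrEmpty(List<String> input) {
    if (input == null || input.isEmpty()) {
      return Collections.emptyList(); // no new ArrayList is allocated for the empty case
    }
    return new ArrayList<>(input);
  }

  public static void main(String[] args) {
    System.out.println(copyOrEmpty(null)); // prints []
  }
}
```

The returned list is immutable, so this only suits call sites that never add to the result; a caller that does would get an {{UnsupportedOperationException}}.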
[jira] [Created] (HIVE-22496) Update Hadoop Version to 3.1.1
David Mollitor created HIVE-22496: - Summary: Update Hadoop Version to 3.1.1 Key: HIVE-22496 URL: https://issues.apache.org/jira/browse/HIVE-22496 Project: Hive Issue Type: Improvement Affects Versions: 4.0.0 Reporter: David Mollitor Assignee: David Mollitor https://lists.apache.org/thread.html/8313e605c0ed0012f134cce9cc6adca738eea81feccea99c8de87cd9@%3Cgeneral.hadoop.apache.org%3E {quote} - This release is *not* yet ready for production use. Critical issues are being ironed out via testing and downstream adoption. Production users should wait for a 3.1.1/3.1.2 release. {quote} Current: {code:xml} 3.1.0 {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22494) Use System NanoTime to Measure Code Execution
David Mollitor created HIVE-22494: - Summary: Use System NanoTime to Measure Code Execution Key: HIVE-22494 URL: https://issues.apache.org/jira/browse/HIVE-22494 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime() It's designed for these use cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
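For illustration, a timing helper built on {{System.nanoTime()}} might look like the following (the class and method names are hypothetical). Unlike {{System.currentTimeMillis()}}, the value is only meaningful as a difference between two readings and is not tied to wall-clock time.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical timing helper; a sketch of the pattern, not code from Hive. */
class ElapsedTimeSketch {

  /** Runs the task and returns how long it took, in milliseconds. */
  static long timeMillis(Runnable task) {
    final long start = System.nanoTime();
    task.run();
    // Only the difference between two nanoTime() readings is meaningful.
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  public static void main(String[] args) {
    long elapsed = timeMillis(() -> {
      try {
        Thread.sleep(20);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    System.out.println("Task took " + elapsed + " ms");
  }
}
```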
[jira] [Created] (HIVE-22484) Remove Calls to printStackTrace
David Mollitor created HIVE-22484: - Summary: Remove Calls to printStackTrace Key: HIVE-22484 URL: https://issues.apache.org/jira/browse/HIVE-22484 Project: Hive Issue Type: Improvement Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor In many cases, the call to {{printStackTrace}} bypasses the logging framework; in other cases, the error stack trace is printed and the exception is re-thrown (log-and-throw is a bad pattern); and then there are some other edge cases. Remove these calls and replace them with calls to the logging framework. -- This message was sent by Atlassian Jira (v8.3.4#803005)
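The replacement pattern is sketched below. To keep the example dependency-free it uses {{java.util.logging}}, which is an assumption made for this sketch; Hive itself logs through SLF4J, where the equivalent call would be along the lines of {{LOG.warn("message", e)}}. The class and method names are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example; uses java.util.logging only to stay dependency-free. */
class LoggingSketch {

  private static final Logger LOG = Logger.getLogger(LoggingSketch.class.getName());

  /** Parses the value, logging (not printing) the failure and falling back to a default. */
  static int parseOrDefault(String value, int fallback) {
    try {
      return Integer.parseInt(value);
    } catch (NumberFormatException e) {
      // Instead of e.printStackTrace(): pass the exception to the logger so the
      // configured handlers record both the message and the stack trace.
      LOG.log(Level.WARNING, "Could not parse '" + value + "', using " + fallback, e);
      return fallback;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseOrDefault("12", 0));
    System.out.println(parseOrDefault("twelve", 7));
  }
}
```

The exception is handled where it is logged and is not re-thrown, which avoids the log-and-throw pattern mentioned above.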
[jira] [Created] (HIVE-22503) Harmonize JODA Time Version in Module hive-hcatalog-it-unit
David Mollitor created HIVE-22503: - Summary: Harmonize JODA Time Version in Module hive-hcatalog-it-unit Key: HIVE-22503 URL: https://issues.apache.org/jira/browse/HIVE-22503 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor https://github.com/apache/hive/blob/078182ade4b76e810ca945354f4897dbe36ad5c2/itests/hcatalog-unit/pom.xml#L296 The version is currently hard-coded as 2.2. See if we can get away with using the same version of Joda as the rest of the Hive project. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22529) Make Debugging Stacktrace More Explicit
David Mollitor created HIVE-22529: - Summary: Make Debugging Stacktrace More Explicit Key: HIVE-22529 URL: https://issues.apache.org/jira/browse/HIVE-22529 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor In some places, the following DEBUG logging was introduced: {code:java} LOG.debug("Message", new Exception()); {code} The purpose of this is to log the stack trace of the Thread calling this debug logging method. However, the resulting log message includes the following: {code:none} 2019-11-19T08:13:31,392 DEBUG [Thread] Logger: Message java.lang.Exception: null at {code} To the observer, it appears that there was perhaps some sort of NPE. Add a message to the Exception being generated to make it more clear that this "Exception" is for debugging purposes and not an actual error. -- This message was sent by Atlassian Jira (v8.3.4#803005)
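One way to do this, sketched with a hypothetical helper, is to build the throwable through a small factory that always attaches an explanatory message; the wording of the message is an assumption, not what the eventual patch uses.

```java
/** Hypothetical helper for debug-only stack traces; not code from the patch. */
class StackTraceSketch {

  /** Creates an exception whose message makes clear that it only carries a stack trace. */
  static Exception debugTrace(String context) {
    return new Exception("Stack trace for debugging purposes only, not an error: " + context);
  }

  public static void main(String[] args) {
    // With a logging framework this would be passed as the throwable argument,
    // for example LOG.debug("Message", debugTrace("closing session")).
    Exception trace = debugTrace("closing session");
    System.out.println(trace);
  }
}
```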
[jira] [Created] (HIVE-22569) PartitionPruner use Collections Class
David Mollitor created HIVE-22569: - Summary: PartitionPruner use Collections Class Key: HIVE-22569 URL: https://issues.apache.org/jira/browse/HIVE-22569 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22570) Review of ExprNodeDesc.java
David Mollitor created HIVE-22570: - Summary: Review of ExprNodeDesc.java Key: HIVE-22570 URL: https://issues.apache.org/jira/browse/HIVE-22570 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22570.1.patch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22571) Review of ExprNodeFieldDesc Class
David Mollitor created HIVE-22571: - Summary: Review of ExprNodeFieldDesc Class Key: HIVE-22571 URL: https://issues.apache.org/jira/browse/HIVE-22571 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22607) Session State Thread Names Are Way Too Long
David Mollitor created HIVE-22607: - Summary: Session State Thread Names Are Way Too Long Key: HIVE-22607 URL: https://issues.apache.org/jira/browse/HIVE-22607 Project: Hive Issue Type: Improvement Reporter: David Mollitor {code:none} 2019-12-08T07:15:34,107 DEBUG [bc661ab9-b44a-4ee5-8ce6-b8b7ae5a0e07 d480b94c-8773-4817-85fc-faa177308660 73ecdf15-06fd-4358-a249-22623c98d234 37e3c8de-9c77-4939-94d0-5aa45501b545 7b152899-d8da-493f-a329-a16f171bb1a3 604abf47-22ed-480c-a460-f6d44ac740ec 91fee8a8-aee4-4eee-993f-ad2927784d57 c496e18f-a681-4db8-b254-6911793e7fb2 c6069d3c-03fb-4e1d-b0be-5dd13882b086 3c360edd-9c0f-486b-99f9-dd9f8aac5fcd c2be6f69-3ef5-44e4-b8d5-16b835b0ae2f 3bd2c6d2-9cce-4497-aa07-4672b95e6f76 ccb64f47-b1d7-477b-81ed-f8baabae4ee4 8b244038-d7a3-4e11-ab2f-b42d117c8e40 8183bded-ef37-4bdc-ab17-a4c4b136401d 8b161c72-0fc2-4175-ac9f-9a0c7bf9b387 38fd3a6d-498a-4145-b77e-24e33c31edaf 70f2729d-7249-4397-b04f-22694fec391e c65a3e39-009d-4e8a-b16e-1ca0e9de7cef 1dd2b274-75bd-4204-80b9-0103fefd0227 0bab6264-40ff-4f05-8d11-0cc9ee3132c0 0e7e677b-3f24-408b-ac1e-ddeee54023b8 8fd57ca5-2128-45a1-b7c4-ec611bfdefcd 01bf788e-cf98-44e8-abaa-c9e74e782bde 9d27f963-b45b-4991-baa8-2ed2db41857d 207da47a-b3b2-4583-b654-61fc551f4eca a10d8b77-8f27-48ca-a3b5-e5cfcbf8a4c0 352e77d5-3071-4502-bb0e-a6d0df8d1cac 18697755-6e2b-4907-ae8b-11ee0ff2f057 d779ca40-0760-4b27-b0b1-c35d985359c1 ca4469aa-7be7-4eaf-9bcd-aeff6441c7c0 30ce1107-8e95-4c16-b55a-5a7c4dea0696 ee44b107-07ae-45c9-88a4-0424da8a6bcb main] session.SessionState: Updating thread name to 749f4123-716a-4534-af7f-b426fdc3ccc9 bc661ab9-b44a-4ee5-8ce6-b8b7ae5a0e07 d480b94c-8773-4817-85fc-faa177308660 73ecdf15-06fd-4358-a249-22623c98d234 37e3c8de-9c77-4939-94d0-5aa45501b545 7b152899-d8da-493f-a329-a16f171bb1a3 604abf47-22ed-480c-a460-f6d44ac740ec 91fee8a8-aee4-4eee-993f-ad2927784d57 c496e18f-a681-4db8-b254-6911793e7fb2 c6069d3c-03fb-4e1d-b0be-5dd13882b086 3c360edd-9c0f-486b-99f9-dd9f8aac5fcd c2be6f69-3ef5-44e4-b8d5-16b835b0ae2f 
3bd2c6d2-9cce-4497-aa07-4672b95e6f76 ccb64f47-b1d7-477b-81ed-f8baabae4ee4 8b244038-d7a3-4e11-ab2f-b42d117c8e40 8183bded-ef37-4bdc-ab17-a4c4b136401d 8b161c72-0fc2-4175-ac9f-9a0c7bf9b387 38fd3a6d-498a-4145-b77e-24e33c31edaf 70f2729d-7249-4397-b04f-22694fec391e c65a3e39-009d-4e8a-b16e-1ca0e9de7cef 1dd2b274-75bd-4204-80b9-0103fefd0227 0bab6264-40ff-4f05-8d11-0cc9ee3132c0 0e7e677b-3f24-408b-ac1e-ddeee54023b8 8fd57ca5-2128-45a1-b7c4-ec611bfdefcd 01bf788e-cf98-44e8-abaa-c9e74e782bde 9d27f963-b45b-4991-baa8-2ed2db41857d 207da47a-b3b2-4583-b654-61fc551f4eca a10d8b77-8f27-48ca-a3b5-e5cfcbf8a4c0 352e77d5-3071-4502-bb0e-a6d0df8d1cac 18697755-6e2b-4907-ae8b-11ee0ff2f057 d779ca40-0760-4b27-b0b1-c35d985359c1 ca4469aa-7be7-4eaf-9bcd-aeff6441c7c0 30ce1107-8e95-4c16-b55a-5a7c4dea0696 ee44b107-07ae-45c9-88a4-0424da8a6bcb main {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22605) NPE in LlapLoadGeneratorService During Tests
David Mollitor created HIVE-22605: - Summary: NPE in LlapLoadGeneratorService During Tests Key: HIVE-22605 URL: https://issues.apache.org/jira/browse/HIVE-22605 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor {code:none} java.lang.NullPointerException: null at org.apache.hadoop.hive.llap.daemon.impl.LlapLoadGeneratorService.serviceStop(LlapLoadGeneratorService.java:103) ~[classes/:?] at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220) ~[hadoop-common-3.1.0.jar:?] at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54) ~[hadoop-common-3.1.0.jar:?] at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102) [hadoop-common-3.1.0.jar:?] at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) [hadoop-common-3.1.0.jar:?] at org.apache.hadoop.hive.llap.daemon.impl.TestLlapLoadGeneratorService.testLoadGeneratorFails(TestLlapLoadGeneratorService.java:70) [test-classes/:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_102] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22614) Replace Base64 in hive-jdbc Package
David Mollitor created HIVE-22614: - Summary: Replace Base64 in hive-jdbc Package Key: HIVE-22614 URL: https://issues.apache.org/jira/browse/HIVE-22614 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22613) Replace Base64 in hive-hbase-handler
David Mollitor created HIVE-22613: - Summary: Replace Base64 in hive-hbase-handler Key: HIVE-22613 URL: https://issues.apache.org/jira/browse/HIVE-22613 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22611) Use JDK Base64 Classes
David Mollitor created HIVE-22611: - Summary: Use JDK Base64 Classes Key: HIVE-22611 URL: https://issues.apache.org/jira/browse/HIVE-22611 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor Replace dependency on thirdparty libraries with native support for Base-64 encode/decode. https://docs.oracle.com/javase/8/docs/api/java/util/Base64.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
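As a rough illustration of the replacement proposed above, the sketch below round-trips a string through the JDK's `java.util.Base64`. The class and method names (`JdkBase64Sketch`, `encode`, `decode`) are made up for this example and are not taken from the Hive code base.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; only java.util.Base64 itself is a real JDK API.
class JdkBase64Sketch {

  // Encode UTF-8 text with the basic (RFC 4648) alphabet, no line breaks.
  static String encode(String plain) {
    return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
  }

  // Decode basic Base64 back into UTF-8 text.
  static String decode(String encoded) {
    return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String encoded = encode("hive");
    System.out.println(encoded + " -> " + decode(encoded));
  }
}
```

One thing to check during such a migration is whether a call site relied on line-wrapped (MIME) output or lenient decoding; the JDK offers `Base64.getMimeEncoder()` and `Base64.getMimeDecoder()` for the MIME variant.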
[jira] [Created] (HIVE-22612) Replace Base64 in accumulo-handler Package
David Mollitor created HIVE-22612: - Summary: Replace Base64 in accumulo-handler Package Key: HIVE-22612 URL: https://issues.apache.org/jira/browse/HIVE-22612 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22615) Replace Base64 in hive-common Package
David Mollitor created HIVE-22615: - Summary: Replace Base64 in hive-common Package Key: HIVE-22615 URL: https://issues.apache.org/jira/browse/HIVE-22615 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22597) Include More Context in Database NoSuchObjectException
David Mollitor created HIVE-22597: - Summary: Include More Context in Database NoSuchObjectException Key: HIVE-22597 URL: https://issues.apache.org/jira/browse/HIVE-22597 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22597.1.patch {code:none} org.apache.hadoop.hive.metastore.api.NoSuchObjectException: default at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:717) ~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at sun.reflect.GeneratedMethodAccessor260.invoke(Unknown Source) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_102] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102] at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) ~[hive-standalone-metastore-server-4.0.0-SN {code} One can decipher that this exception is in regards to a database by looking at the stack trace, but it should be specified in the error message itself. Also, there is no catalogue information provided, so it could be a bit ambiguous. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22415) Upgrade to Java 11
David Mollitor created HIVE-22415: - Summary: Upgrade to Java 11 Key: HIVE-22415 URL: https://issues.apache.org/jira/browse/HIVE-22415 Project: Hive Issue Type: Improvement Reporter: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22417) Remove stringifyException from MetaStore
David Mollitor created HIVE-22417: - Summary: Remove stringifyException from MetaStore Key: HIVE-22417 URL: https://issues.apache.org/jira/browse/HIVE-22417 Project: Hive Issue Type: Sub-task Components: Metastore, Standalone Metastore Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22370) Remove Deprecated Fields from HiveConf
David Mollitor created HIVE-22370: - Summary: Remove Deprecated Fields from HiveConf Key: HIVE-22370 URL: https://issues.apache.org/jira/browse/HIVE-22370 Project: Hive Issue Type: Improvement Affects Versions: 3.0.0 Reporter: David Mollitor Assignee: David Mollitor Fix For: 4.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22427) PersistenceManagerProvider Logs a Warning About datanucleus.autoStartMechanismMode
David Mollitor created HIVE-22427: - Summary: PersistenceManagerProvider Logs a Warning About datanucleus.autoStartMechanismMode Key: HIVE-22427 URL: https://issues.apache.org/jira/browse/HIVE-22427 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor {code:none} WARN [pool-6-thread-2] metastore.PersistenceManagerProvider: datanucleus.autoStartMechanismMode is set to unsupported value null . Setting it to value: ignored {code} This does not need to be a WARN level logging for this scenario. Perhaps if user configures the value to some non-null value, then emit a warning, otherwise, simply emit an INFO level stating that the configuration is not set and that a reasonable default value will be used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22426) Use DependencyManagement in Root POM for itests
David Mollitor created HIVE-22426: - Summary: Use DependencyManagement in Root POM for itests Key: HIVE-22426 URL: https://issues.apache.org/jira/browse/HIVE-22426 Project: Hive Issue Type: Improvement Components: Test, Tests Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22428) Superfluous "Failed to get database" WARN Logging in ObjectStore
David Mollitor created HIVE-22428: - Summary: Superfluous "Failed to get database" WARN Logging in ObjectStore Key: HIVE-22428 URL: https://issues.apache.org/jira/browse/HIVE-22428 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22428.1.patch In my testing, I get lots of logs like this: {code:none} Line 26319: 2019-10-28T21:09:52,134 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.compdb, returning NoSuchObjectException Line 26327: 2019-10-28T21:09:52,135 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.compdb, returning NoSuchObjectException Line 26504: 2019-10-28T21:09:52,600 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.tstatsfast, returning NoSuchObjectException Line 26519: 2019-10-28T21:09:52,606 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.tstatsfast, returning NoSuchObjectException Line 26695: 2019-10-28T21:09:52,922 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.createDb, returning NoSuchObjectException Line 26703: 2019-10-28T21:09:52,923 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.createDb, returning NoSuchObjectException Line 26763: 2019-10-28T21:09:52,936 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.compdb, returning NoSuchObjectException Line 26778: 2019-10-28T21:09:52,939 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.compdb, returning NoSuchObjectException Line 26963: 2019-10-28T21:09:53,273 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.db1, returning NoSuchObjectException Line 26978: 2019-10-28T21:09:53,276 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.db2, returning NoSuchObjectException Line 26986: 2019-10-28T21:09:53,277 WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.db1, returning NoSuchObjectException Line 27018: 2019-10-28T21:09:53,300 WARN [pool-6-thread-5] metastore.ObjectStore: Failed to get database hive.db2, returning NoSuchObjectException {code} This is a superfluous log message. It might be pretty common for a database to not exist if, for example, a user fat-fingers the name of the database. The code also has the bad habit of log-and-throw. Just log or throw, not both. Since I'm looking at this class, touch up some of the other logging as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
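The "log or throw, not both" advice above can be sketched roughly as follows. Everything here is hypothetical (the class `LogOrThrowSketch`, the exception type, the in-memory map); it is not the real `ObjectStore`, only an illustration of throwing an exception that carries the catalog and database name and leaving the logging decision to the caller.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a metadata store; not Hive's ObjectStore.
class LogOrThrowSketch {

  // Hypothetical checked exception carrying the context in its message.
  static class NoSuchDatabaseException extends Exception {
    NoSuchDatabaseException(String message) {
      super(message);
    }
  }

  private final Map<String, String> databases = new HashMap<>();

  void createDatabase(String catalog, String name) {
    databases.put(catalog + "." + name, name);
  }

  // Throws without logging: a missing database is an expected outcome,
  // and the caller is in a better position to decide whether it is log-worthy.
  String getDatabase(String catalog, String name) throws NoSuchDatabaseException {
    String db = databases.get(catalog + "." + name);
    if (db == null) {
      throw new NoSuchDatabaseException("Database not found: " + catalog + "." + name);
    }
    return db;
  }
}
```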
[jira] [Created] (HIVE-22421) Improve Logging If Configuration File Not Found
David Mollitor created HIVE-22421: - Summary: Improve Logging If Configuration File Not Found Key: HIVE-22421 URL: https://issues.apache.org/jira/browse/HIVE-22421 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor {code:none} 2019-10-28T21:07:27,599 INFO [main] conf.MetastoreConf: Unable to find config file metastore-site.xml 2019-10-28T21:07:27,599 INFO [main] conf.MetastoreConf: Found configuration file null {code} Prints 'unable to find' followed by 'null'. Just print one or the other. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22423) Improve Logging In HadoopThriftAuthBridge
David Mollitor created HIVE-22423: - Summary: Improve Logging In HadoopThriftAuthBridge Key: HIVE-22423 URL: https://issues.apache.org/jira/browse/HIVE-22423 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor # Remove superfluous debug log guards # Improve messages # Improve message format -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22419) Improve Messages Emitted From HiveMetaStoreClient
David Mollitor created HIVE-22419: - Summary: Improve Messages Emitted From HiveMetaStoreClient Key: HIVE-22419 URL: https://issues.apache.org/jira/browse/HIVE-22419 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 4.0.0, 3.2.0 Reporter: David Mollitor Assignee: David Mollitor After reviewing some logs and errors emitted during a QTest run, I would like to propose some improvements to logging in {{HiveMetaStoreClient}}. * Remove duplicate logging * Remove superfluous class {{StackTraceLogger}} * Do not use contractions in public-facing error messages and logs * Make all logging side-effect free (see {{connCount}}) * Code simplification -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22424) Use PerfLogger in MetastoreDirectSqlUtils.java
David Mollitor created HIVE-22424: - Summary: Use PerfLogger in MetastoreDirectSqlUtils.java Key: HIVE-22424 URL: https://issues.apache.org/jira/browse/HIVE-22424 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 3.2.0 Reporter: David Mollitor Fix For: 4.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22425) ReplChangeManager Not Logging Database Name
David Mollitor created HIVE-22425: - Summary: ReplChangeManager Not Logging Database Name Key: HIVE-22425 URL: https://issues.apache.org/jira/browse/HIVE-22425 Project: Hive Issue Type: Improvement Affects Versions: 3.2.0 Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22425.1.patch {code:java|title=ReplChangeManager.java} LOG.debug("Repl policy is not set for database ", db.getName()); {code} The log statement is missing the placeholder '{}' so the DB name is not getting logged. -- This message was sent by Atlassian Jira (v8.3.4#803005)
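To make the effect of the missing placeholder concrete, here is a deliberately simplified stand-in for `{}` substitution. It is not SLF4J's implementation (escaping and trailing-`Throwable` handling are left out), and `PlaceholderSketch.format` is a made-up name; it only illustrates that an argument without a matching `{}` is silently dropped.

```java
// Simplified, hypothetical model of "{}" substitution in parameterized logging.
class PlaceholderSketch {

  static String format(String pattern, Object... args) {
    StringBuilder out = new StringBuilder();
    int from = 0;
    for (Object arg : args) {
      int at = pattern.indexOf("{}", from);
      if (at < 0) {
        break; // no placeholder left: remaining arguments are ignored
      }
      out.append(pattern, from, at).append(arg);
      from = at + 2;
    }
    return out.append(pattern.substring(from)).toString();
  }

  public static void main(String[] args) {
    // Without the placeholder the database name never reaches the output.
    System.out.println(format("Repl policy is not set for database ", "db1"));
    // With the placeholder it does.
    System.out.println(format("Repl policy is not set for database {}", "db1"));
  }
}
```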
[jira] [Created] (HIVE-22390) Remove Dependency on JODA Time Library
David Mollitor created HIVE-22390: - Summary: Remove Dependency on JODA Time Library Key: HIVE-22390 URL: https://issues.apache.org/jira/browse/HIVE-22390 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor Hive uses Joda time library. {quote} Joda-Time is the de facto standard date and time library for Java prior to Java SE 8. Users are now asked to migrate to java.time (JSR-310). https://www.joda.org/joda-time/ {quote} Remove this dependency from classes, POM files, and licence files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
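For a sense of what the migration looks like at a call site, the sketch below does simple date arithmetic with `java.time` only. The class and method names are invented for illustration; which Joda-Time calls Hive actually uses is not stated in the issue.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical example of date handling with java.time (JSR-310) only.
class JavaTimeSketch {

  // Parse an ISO-8601 date, shift it by a number of days, and format it again.
  static String plusDays(String isoDate, long days) {
    return LocalDate.parse(isoDate)
        .plusDays(days)
        .format(DateTimeFormatter.ISO_LOCAL_DATE);
  }

  public static void main(String[] args) {
    System.out.println(plusDays("2019-12-31", 1));
  }
}
```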
[jira] [Created] (HIVE-22402) Deprecate Hive PerfLogger
David Mollitor created HIVE-22402: - Summary: Deprecate Hive PerfLogger Key: HIVE-22402 URL: https://issues.apache.org/jira/browse/HIVE-22402 Project: Hive Issue Type: Improvement Affects Versions: 4.0.0 Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22404) Upgrade to Java 9
David Mollitor created HIVE-22404: - Summary: Upgrade to Java 9 Key: HIVE-22404 URL: https://issues.apache.org/jira/browse/HIVE-22404 Project: Hive Issue Type: Improvement Reporter: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22403) Beeline Should Print Location of Configuration Directory at Startup
David Mollitor created HIVE-22403: - Summary: Beeline Should Print Location of Configuration Directory at Startup Key: HIVE-22403 URL: https://issues.apache.org/jira/browse/HIVE-22403 Project: Hive Issue Type: Improvement Components: Beeline Affects Versions: 2.4.0, 3.2.0 Reporter: David Mollitor Beeline should print the CONF directory it is utilizing when it starts up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22547) Review txn compactor Package
David Mollitor created HIVE-22547: - Summary: Review txn compactor Package Key: HIVE-22547 URL: https://issues.apache.org/jira/browse/HIVE-22547 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor * Remove log-and-throw anti-pattern * Use parameterized logging * Add a CompactionException class to improve debug-ability * Introduce Java Optional utility * Other clean up -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22337) Improve and Expand Text-Based SerDes
David Mollitor created HIVE-22337: - Summary: Improve and Expand Text-Based SerDes Key: HIVE-22337 URL: https://issues.apache.org/jira/browse/HIVE-22337 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 4.0.0 Reporter: David Mollitor Assignee: David Mollitor Fix For: 4.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22993) Include Bloom Filter in Column Statistics to Better Estimate nDV
David Mollitor created HIVE-22993: - Summary: Include Bloom Filter in Column Statistics to Better Estimate nDV Key: HIVE-22993 URL: https://issues.apache.org/jira/browse/HIVE-22993 Project: Hive Issue Type: Improvement Components: CBO, Statistics Reporter: David Mollitor When performing an INSERT statement, Hive has no way to determine the number of distinct values since the distinct values themselves are not recorded. {code:sql} create table test_mm(`id` int, `my_dt` date); insert into test_mm values (1, "2018-10-01"), (2, "2018-10-01"), (3, "2018-10-01"), (4, "2017-10-01"), (5, "2017-10-01"), (6, "2017-10-01"), (7, "2010-10-01"), (8, "2010-10-01"), (9, "2010-10-01"), (10, "1998-10-01"), (11, "1998-10-01"), (12, "1998-10-01"); DESCRIBE FORMATTED test_mm my_dt; -- distinct_count: 4 insert into test_mm values (13, "2030-10-01"), (14, "2030-10-01"), (15, "2030-10-01"); DESCRIBE FORMATTED test_mm my_dt; -- distinct_count: 4 {code} The first INSERT statement sees that there are 0 records, so it makes sense that any distinct values are marked in the statistics. However, for the second INSERT, Hive has no idea if "2030-10-01" is distinct, so the distinct_count is unchanged. By introducing a bloom filter for column statistics, the second INSERT may be able to determine that "2030-10-01" is indeed unique and update the distinct_count accordingly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
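The idea can be sketched with a toy Bloom filter kept next to the distinct count. This is only an illustration under assumptions: the class name, the three ad hoc hash functions, and the filter size are invented here, and nothing below reflects how Hive stores column statistics. Because a Bloom filter can report false positives but never false negatives, a counter maintained this way can undercount but cannot overcount.

```java
import java.util.BitSet;

// Hypothetical toy structure: a Bloom filter that also tracks a distinct-count estimate.
class NdvBloomSketch {
  private final BitSet bits;
  private final int size;
  private long distinctCount;

  NdvBloomSketch(int size) {
    this.size = size;
    this.bits = new BitSet(size);
  }

  // Ad hoc seeded hash; a real implementation would use a stronger hash family.
  private int index(String value, int seed) {
    int h = value.hashCode() * 31 + seed * 0x9E3779B9;
    h ^= (h >>> 16);
    return Math.floorMod(h, size);
  }

  // Sets the value's bits; counts it as new only if some bit was not already set.
  void add(String value) {
    boolean possiblySeen = true;
    for (int seed = 1; seed <= 3; seed++) {
      int i = index(value, seed);
      if (!bits.get(i)) {
        possiblySeen = false;
        bits.set(i);
      }
    }
    if (!possiblySeen) {
      distinctCount++;
    }
  }

  long distinctCount() {
    return distinctCount;
  }
}
```

In the scenario above, adding "2030-10-01" after the first four dates would bump the estimate unless all of its bits happened to collide with earlier values.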
[jira] [Created] (HIVE-22914) Make Hive Connection ZK Interactions Easier to Troubleshoot
David Mollitor created HIVE-22914: - Summary: Make Hive Connection ZK Interactions Easier to Troubleshoot Key: HIVE-22914 URL: https://issues.apache.org/jira/browse/HIVE-22914 Project: Hive Issue Type: Improvement Affects Versions: 3.1.2, 4.0.0 Reporter: David Mollitor Assignee: David Mollitor Add better logging and make errors more consistent and meaningful. I was recently trying to troubleshoot an issue where the ZK namespace of the client and the HS2 were different, and it was way too difficult to diagnose. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22884) Put Reasons for Failed CBO In Explain Plan
David Mollitor created HIVE-22884: - Summary: Put Reasons for Failed CBO In Explain Plan Key: HIVE-22884 URL: https://issues.apache.org/jira/browse/HIVE-22884 Project: Hive Issue Type: Improvement Components: CBO Reporter: David Mollitor If a query cannot be processed by CBO, the reason for the failure is logged into the HiveServer2 application log. In addition, please also put this information into the EXPLAIN plan output itself. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22679) Replace Base64 in metastore-common Package
David Mollitor created HIVE-22679: - Summary: Replace Base64 in metastore-common Package Key: HIVE-22679 URL: https://issues.apache.org/jira/browse/HIVE-22679 Project: Hive Issue Type: Sub-task Reporter: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22676) Replace Base64 in hive-service Package
David Mollitor created HIVE-22676: - Summary: Replace Base64 in hive-service Package Key: HIVE-22676 URL: https://issues.apache.org/jira/browse/HIVE-22676 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22680) Replace Base64 in druid-handler Package
David Mollitor created HIVE-22680: - Summary: Replace Base64 in druid-handler Package Key: HIVE-22680 URL: https://issues.apache.org/jira/browse/HIVE-22680 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22681) Replace Base64 in hcatalog-webhcat Package
David Mollitor created HIVE-22681: - Summary: Replace Base64 in hcatalog-webhcat Package Key: HIVE-22681 URL: https://issues.apache.org/jira/browse/HIVE-22681 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22678) Run Eclipse Cleanup Against hive-accumulo-handler Module
David Mollitor created HIVE-22678: - Summary: Run Eclipse Cleanup Against hive-accumulo-handler Module Key: HIVE-22678 URL: https://issues.apache.org/jira/browse/HIVE-22678 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22677) Run Eclipse Cleanup Against Hive Project
David Mollitor created HIVE-22677: - Summary: Run Eclipse Cleanup Against Hive Project Key: HIVE-22677 URL: https://issues.apache.org/jira/browse/HIVE-22677 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22683) Run Eclipse Cleanup Against beeline Module
David Mollitor created HIVE-22683: - Summary: Run Eclipse Cleanup Against beeline Module Key: HIVE-22683 URL: https://issues.apache.org/jira/browse/HIVE-22683 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22683.1.patch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22684) Run Eclipse Cleanup Against hbase-handler Module
David Mollitor created HIVE-22684: - Summary: Run Eclipse Cleanup Against hbase-handler Module Key: HIVE-22684 URL: https://issues.apache.org/jira/browse/HIVE-22684 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22685) TestHiveSqlDateTimeFormatter Now Broken with New Year 2020
David Mollitor created HIVE-22685: - Summary: TestHiveSqlDateTimeFormatter Now Broken with New Year 2020 Key: HIVE-22685 URL: https://issues.apache.org/jira/browse/HIVE-22685 Project: Hive Issue Type: Bug Reporter: David Mollitor Unit test is now broken (n)(n):( {code:java} //Tests for these patterns would need changing every decade if done in the above way. //Thursday of the first week in an ISO year always matches the Gregorian year. checkParseTimestampIso("IY-IW-ID", "0-01-04", "iw, ", "01, " + thisYearString.substring(0, 3) + "0"); checkParseTimestampIso("I-IW-ID", "0-01-04", "iw, ", "01, " + thisYearString.substring(0, 3) + "0"); {code} {code} org.junit.ComparisonFailure: expected:<01, 20[1]0> but was:<01, 20[2]0> at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.hive.common.format.datetime.TestHiveSqlDateTimeFormatter.checkParseTimestampIso(TestHiveSqlDateTimeFormatter.java:313) at org.apache.hadoop.hive.common.format.datetime.TestHiveSqlDateTimeFormatter.testParseTimestamp(TestHiveSqlDateTimeFormatter.java:287) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
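One possible way to keep a decade-dependent check like the one above independent of the wall clock is to resolve the decade from an injectable `java.time.Clock`. This is a sketch under assumptions: `DecadeSketch` and `resolveOneDigitYear` are hypothetical names, and it is not claimed to be how the formatter or the eventual fix works.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.Year;
import java.time.ZoneOffset;

// Hypothetical helper: fill in the missing digits of a one-digit year from a supplied clock.
class DecadeSketch {

  static int resolveOneDigitYear(int lastDigit, Clock clock) {
    int currentYear = Year.now(clock).getValue();
    return currentYear / 10 * 10 + lastDigit; // keep the current decade, replace the last digit
  }

  public static void main(String[] args) {
    Clock endOf2019 = Clock.fixed(Instant.parse("2019-12-31T12:00:00Z"), ZoneOffset.UTC);
    Clock startOf2020 = Clock.fixed(Instant.parse("2020-01-01T12:00:00Z"), ZoneOffset.UTC);
    System.out.println(resolveOneDigitYear(0, endOf2019));
    System.out.println(resolveOneDigitYear(0, startOf2020));
  }
}
```

With a fixed clock, the expected value can be written as a literal in the test and does not change when the calendar rolls over.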
[jira] [Created] (HIVE-22673) Replace Base64 in contrib Package
David Mollitor created HIVE-22673: - Summary: Replace Base64 in contrib Package Key: HIVE-22673 URL: https://issues.apache.org/jira/browse/HIVE-22673 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-22673.1.patch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22674) Replace Base64 in serde Package
David Mollitor created HIVE-22674: - Summary: Replace Base64 in serde Package Key: HIVE-22674 URL: https://issues.apache.org/jira/browse/HIVE-22674 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22675) Replace Base64 in hive-standalone-metastore Package
David Mollitor created HIVE-22675: - Summary: Replace Base64 in hive-standalone-metastore Package Key: HIVE-22675 URL: https://issues.apache.org/jira/browse/HIVE-22675 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23012) Beeline Has Too Many Dependencies
David Mollitor created HIVE-23012: - Summary: Beeline Has Too Many Dependencies Key: HIVE-23012 URL: https://issues.apache.org/jira/browse/HIVE-23012 Project: Hive Issue Type: Improvement Environment: * jetty-server * ORC client libraries * HBase client libraries * Avro * Something called 'twill' Please investigate cleaning up the POM file and cutting down the number of dependencies. Reporter: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23017) Remove Superfluous 'Transient' Keyword From FetchTask
David Mollitor created HIVE-23017: - Summary: Remove Superfluous 'Transient' Keyword From FetchTask Key: HIVE-23017 URL: https://issues.apache.org/jira/browse/HIVE-23017 Project: Hive Issue Type: Improvement Reporter: David Mollitor {code:java|title=FetchTask} public class FetchTask extends Task implements Serializable { private static final long serialVersionUID = 1L; private int maxRows = 100; private FetchOperator fetch; private ListSinkOperator sink; private int totalRows; private static transient final Logger LOG = LoggerFactory.getLogger(FetchTask.class); JobConf job = null; {code} There is no need for this {{Logger}} to be transient. Please remove it, as it is useless overhead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
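The reason the keyword is redundant is that Java serialization never writes static fields, whether or not they are marked `transient`. The sketch below demonstrates this with a made-up `Task` class (not Hive's `FetchTask`): the instance field survives a round trip, while the static field simply keeps whatever value the JVM currently holds.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; names are illustrative only.
class StaticTransientSketch {

  static class Task implements Serializable {
    private static final long serialVersionUID = 1L;
    static String shared = "before"; // static: not part of the serialized form
    int maxRows = 100;               // instance state: part of the serialized form
  }

  static byte[] serialize(Task task) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(task);
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
    return bytes.toByteArray();
  }

  static Task deserialize(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (Task) in.readObject();
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Task task = new Task();
    task.maxRows = 7;
    byte[] data = serialize(task);
    Task.shared = "after"; // changed after serialization
    Task copy = deserialize(data);
    // maxRows comes from the stream; shared is whatever the class holds now.
    System.out.println(copy.maxRows + " " + Task.shared);
  }
}
```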
[jira] [Created] (HIVE-23016) Extract JdbcConnectionParams from Utils Class
David Mollitor created HIVE-23016: - Summary: Extract JdbcConnectionParams from Utils Class Key: HIVE-23016 URL: https://issues.apache.org/jira/browse/HIVE-23016 Project: Hive Issue Type: Improvement Reporter: David Mollitor And make it its own class. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23005) Consider Default JDBC Fetch Size From HS2
David Mollitor created HIVE-23005: - Summary: Consider Default JDBC Fetch Size From HS2 Key: HIVE-23005 URL: https://issues.apache.org/jira/browse/HIVE-23005 Project: Hive Issue Type: Sub-task Components: JDBC Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23007) Server Should Return Default Fetch Size If One Is Not Sent By Client
David Mollitor created HIVE-23007: - Summary: Server Should Return Default Fetch Size If One Is Not Sent By Client Key: HIVE-23007 URL: https://issues.apache.org/jira/browse/HIVE-23007 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor Attachments: HIVE-23007.1.patch -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23171) Create Tool To Visualize Hive Parser Tree
David Mollitor created HIVE-23171: - Summary: Create Tool To Visualize Hive Parser Tree Key: HIVE-23171 URL: https://issues.apache.org/jira/browse/HIVE-23171 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23172) Quoted Backtick Columns Are Not Parsing Correctly
David Mollitor created HIVE-23172: - Summary: Quoted Backtick Columns Are Not Parsing Correctly Key: HIVE-23172 URL: https://issues.apache.org/jira/browse/HIVE-23172 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor I recently came across a weird behavior while examining failures of {{special_character_in_tabnames_2.q}} while working on HIVE-23150. I was surprised to see it fail because I couldn't see any reason why it should... it's doing pretty standard SQL statements just like every other test, but for some reason this test is just a *little bit* different from most others and it brought this issue to light. Turns out,... the parsing of table names is pretty much wrong across the board. The statement that caught my attention was this: {code:sql} DROP TABLE IF EXISTS `s/c`; {code} And here is the relevant grammar: {code:none} fragment RegexComponent : 'a'..'z' | 'A'..'Z' | '0'..'9' | '_' | PLUS | STAR | QUESTION | MINUS | DOT | LPAREN | RPAREN | LSQUARE | RSQUARE | LCURLY | RCURLY | BITWISEXOR | BITWISEOR | DOLLAR | '!' ; Identifier : (Letter | Digit) (Letter | Digit | '_')* | {allowQuotedId()}? QuotedIdentifier /* though at the language level we allow all Identifiers to be QuotedIdentifiers; at the API level only columns are allowed to be of this form */ | '`' RegexComponent+ '`' ; fragment QuotedIdentifier : '`' ( '``' | ~('`') )* '`' { setText(StringUtils.replace(getText().substring(1, getText().length() -1 ), "``", "`")); } ; {code} The mystery for me was that, for some reason, this String {{`s/c`}} was being stripped of its back-ticks. Every other test I investigated did not have this behavior, the back ticks were always preserved around the table name. The main Hive Java code base would see the back-ticks and deal with it internally. For HIVE-23150, I introduced some sanity checks and they were failing because they were expecting the back ticks to be present. With the help of HIVE-23171 I finally figured it out. 
So, what I discovered is that pretty much every table name is hitting the {{RegexComponent}} rule and the back ticks are carried forward. However, {{`s/c`}} the forward slash `/` is not allowable in {{RegexComponent}} so it hits on {{QuotedIdentifier}} rule which is trimming the back ticks. I validated this by disabling {{QuotedIdentifier}}. When I did this, {{`s/c`}} fails in error but {{`sc`}} parses successfully... because {{`sc`}} is being treated as a {{RegexComponent}}. So, if you have {{allowQuotedId}} disabled, table names can only use the characters defined in the {{RegexComponent}} rule (otherwise it errors), and it will *not* strip the back ticks. If you have {{allowQuotedId}} enabled, then if the table name has a character not specified in {{RegexComponent}}, it will identify it as a table name and it *will* strip the back ticks, if all the characters are part of {{RegexComponent}} then it will *not* strip the back ticks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
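The observed behavior can be modeled outside the lexer with two regular expressions. This is a simplified sketch, not the ANTLR grammar: the class and method names are invented, and the mapping of the token names (PLUS, STAR, QUESTION, and so on) to literal characters is an assumption made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical model of the two back-tick rules described in this issue.
class BacktickSketch {

  // '`' RegexComponent+ '`' (assumed character set for the named tokens).
  private static final Pattern REGEX_COMPONENT =
      Pattern.compile("`[a-zA-Z0-9_+*?\\-.()\\[\\]{}^|$!]+`");

  // '`' ( '``' | ~('`') )* '`'
  private static final Pattern QUOTED_IDENTIFIER = Pattern.compile("`(``|[^`])*`");

  // Returns the token text as the rest of the code base would see it.
  static String lex(String token, boolean allowQuotedId) {
    if (REGEX_COMPONENT.matcher(token).matches()) {
      return token; // back ticks are carried forward untouched
    }
    if (allowQuotedId && QUOTED_IDENTIFIER.matcher(token).matches()) {
      // Mirrors the grammar action: drop the outer back ticks, unescape doubled ones.
      return token.substring(1, token.length() - 1).replace("``", "`");
    }
    throw new IllegalArgumentException("Cannot lex identifier: " + token);
  }

  public static void main(String[] args) {
    System.out.println(lex("`sc`", true));  // back ticks kept
    System.out.println(lex("`s/c`", true)); // back ticks stripped
  }
}
```

The inconsistency is visible in the return values: whether the caller receives the back ticks depends on which characters the name happens to contain.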
[jira] [Created] (HIVE-23159) Cleanup ShowCreateTableOperation
David Mollitor created HIVE-23159: - Summary: Cleanup ShowCreateTableOperation Key: HIVE-23159 URL: https://issues.apache.org/jira/browse/HIVE-23159 Project: Hive Issue Type: Bug Reporter: David Mollitor Assignee: David Mollitor * Move StringTemplate templates to external files * Explore better leveraging StringTemplate capabilities to remove duplicate functionality in the class * General clean up and formatting -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23174) Remove TOK_TRUNCATETABLE
David Mollitor created HIVE-23174: - Summary: Remove TOK_TRUNCATETABLE Key: HIVE-23174 URL: https://issues.apache.org/jira/browse/HIVE-23174 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23177) Upgrade to ANTLR4
David Mollitor created HIVE-23177: - Summary: Upgrade to ANTLR4 Key: HIVE-23177 URL: https://issues.apache.org/jira/browse/HIVE-23177 Project: Hive Issue Type: Improvement Reporter: David Mollitor Upgrade Hive to ANTLR4; ANTLR3 lost support many moons ago. This is going to be a big lift. Many of the Hive rules use the "rule rewrite" feature, which no longer exists in ANTLR4 and must be completely re-implemented: https://stackoverflow.com/questions/14565794/antlr-4-tree-inject-rewrite-operator -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23176) Remove REGEX Column Feature
David Mollitor created HIVE-23176: - Summary: Remove REGEX Column Feature Key: HIVE-23176 URL: https://issues.apache.org/jira/browse/HIVE-23176 Project: Hive Issue Type: Improvement Reporter: David Mollitor Remove the Hive feature: REGEX Column. Hive has this interesting feature for doing REGEX to SELECT multiple columns. This needs to go. It is not SQL standard and as currently implemented, it is impossible to determine if a column identifier is a REGEX or the actual name of the column. If a column name is enclosed in back ticks then any UTF-8 character is a valid table name. [https://dev.mysql.com/doc/refman/8.0/en/identifiers.html] [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23189) Change Explain ANALYZE to Explain PROFILE
David Mollitor created HIVE-23189: - Summary: Change Explain ANALYZE to Explain PROFILE Key: HIVE-23189 URL: https://issues.apache.org/jira/browse/HIVE-23189 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor {code:none} EXPLAIN [EXTENDED|CBO|AST|DEPENDENCY|AUTHORIZATION|LOCKS|VECTORIZATION|ANALYZE] query {code} https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain#LanguageManualExplain-TheANALYZEClause In Hive, there is an {{EXPLAIN ANALYZE}} query. This can get a bit confusing because you can run an {{EXPLAIN ANALYZE}} against an {{ANALYZE TABLE}} statement, so you end up with something like: {code:sql} EXPLAIN ANALYZE ANALYZE TABLE `myTable` COMPUTE STATISTICS; {code} I would like to propose that the name be changed to {{EXPLAIN PROFILE}}. This borrows from Apache Impala, which has a {{PROFILE}} command that produces the stats that actually occurred during the query run (much like this Hive feature). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23187) Make TABLE Token Optional in ANALYZE Statement
David Mollitor created HIVE-23187: - Summary: Make TABLE Token Optional in ANALYZE Statement Key: HIVE-23187 URL: https://issues.apache.org/jira/browse/HIVE-23187 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23188) Allow STATS Token in Analyze Table
David Mollitor created HIVE-23188: - Summary: Allow STATS Token in Analyze Table Key: HIVE-23188 URL: https://issues.apache.org/jira/browse/HIVE-23188 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23186) Strict Check SemanticException Should Properly Quote Table Name
David Mollitor created HIVE-23186: - Summary: Strict Check SemanticException Should Properly Quote Table Name Key: HIVE-23186 URL: https://issues.apache.org/jira/browse/HIVE-23186 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor https://github.com/apache/hive/blob/029cab297a9ae40d249f63040721f93857398648/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java#L191-L192 {code:java} throw new SemanticException(error + " No partition predicate for Alias \"" + alias + "\" Table \"" + tab.getTableName() + "\""); {code} Use back ticks and use the database name as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
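A minimal sketch of what the quoted message could look like. The helper names are hypothetical and not part of Hive, and how the database name is obtained is left open; doubling an embedded back tick follows the {{``}} escape used by Hive's {{QuotedIdentifier}} grammar rule.

```java
// Hypothetical helpers illustrating the HIVE-23186 suggestion; not Hive code.
class QuotedNameSketch {

    // Wrap an identifier in back ticks, doubling any embedded back tick.
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    // Quote each component individually, then join with a period.
    static String qualifiedName(String dbName, String tableName) {
        return quote(dbName) + "." + quote(tableName);
    }

    // Builds the message text only; the caller would wrap it in an exception.
    static String noPartitionPredicateMessage(
            String error, String alias, String dbName, String tableName) {
        return error + " No partition predicate for Alias " + quote(alias)
            + " Table " + qualifiedName(dbName, tableName);
    }

    public static void main(String[] args) {
        System.out.println(
            noPartitionPredicateMessage("Strict check failed.", "t", "default", "my.table"));
    }
}
```

Quoting the components separately keeps a table whose name contains a period distinguishable from a database-qualified name.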
[jira] [Created] (HIVE-23150) Create a Parser that All Components Use
David Mollitor created HIVE-23150: - Summary: Create a Parser that All Components Use Key: HIVE-23150 URL: https://issues.apache.org/jira/browse/HIVE-23150 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor Create a parser for parsing (and validating) MySQL/MariaDB style object identifiers. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23149) Consistency of Parsing Object Identifiers
David Mollitor created HIVE-23149: - Summary: Consistency of Parsing Object Identifiers Key: HIVE-23149 URL: https://issues.apache.org/jira/browse/HIVE-23149 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor There needs to be better consistency in the handling of object identifiers (database, table, column, view, function, etc.). I think it makes sense to standardize on the same rules that MySQL/MariaDB use for their column names so that Hive can be more of a drop-in replacement for these. The two important things to keep in mind are: 1// Permitted characters in quoted identifiers include the full Unicode Basic Multilingual Plane (BMP), except U+ 2// If any components of a multiple-part name require quoting, quote them individually rather than quoting the name as a whole. For example, write {{`my-table`.`my-column`}}, not {{`my-table.my-column`}}. [https://dev.mysql.com/doc/refman/8.0/en/identifiers.html] [https://dev.mysql.com/doc/refman/8.0/en/identifier-qualifiers.html] That is to say: {code:sql} -- Select all rows from a table named `default.mytable` -- (Yes, the table name itself has a period in it. This is valid) SELECT * FROM `default.mytable`; -- Select all rows from database `default`, table `mytable` SELECT * FROM `default`.`mytable`; {code} This plays out in a couple of ways. There may be more, but these are the ones I know about already: 1// Hive generates incorrect syntax: [HIVE-23128] 2// Hive throws an exception if there is a period in the table name. This is an invalid response; table names may have periods in them. More likely than not, it will throw a 'table not found' exception, since the user most likely used backticks incorrectly by accident and meant to specify a db and a table separately. [HIVE-16907] Once we have the parsing figured out and support for backticks to enclose UTF-8 strings, the backend database needs to actually support the UTF-8 character set. It currently does not: [HIVE-1808] -- This message was sent by Atlassian Jira (v8.3.4#803005)
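The difference between {{`default.mytable`}} and {{`default`.`mytable`}} can be sketched as a small splitter. This is an illustration under assumptions, not the parser proposed in HIVE-23150: the class name is hypothetical, a doubled back tick inside quotes is treated as an escaped back tick, and nothing is validated beyond an unterminated quote.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of MySQL-style splitting of a multi-part identifier.
class IdentifierSplitSketch {

    // Splits on periods that are outside back ticks and removes the quoting.
    static List<String> split(String name) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        int i = 0;
        while (i < name.length()) {
            char c = name.charAt(i);
            if (quoted) {
                if (c == '`') {
                    if (i + 1 < name.length() && name.charAt(i + 1) == '`') {
                        current.append('`'); // `` inside quotes is a literal back tick
                        i += 2;
                        continue;
                    }
                    quoted = false;          // closing back tick
                } else {
                    current.append(c);       // periods are ordinary characters here
                }
            } else if (c == '`') {
                quoted = true;               // opening back tick
            } else if (c == '.') {
                parts.add(current.toString()); // separator between components
                current.setLength(0);
            } else {
                current.append(c);
            }
            i++;
        }
        if (quoted) {
            throw new IllegalArgumentException("unterminated back tick in: " + name);
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("`default.mytable`"));
        System.out.println(split("`default`.`mytable`"));
    }
}
```

With this approach a one-element result means "table in the current database" and a two-element result means "database, then table", regardless of which characters the names contain.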
[jira] [Created] (HIVE-23193) Review of Subset of Debug Logging
David Mollitor created HIVE-23193: - Summary: Review of Subset of Debug Logging Key: HIVE-23193 URL: https://issues.apache.org/jira/browse/HIVE-23193 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor bq. Better yet, use parameterized messages bq. Will outperform the first form by a factor of at least 30, in case of a disabled logging statement. http://www.slf4j.org/faq.html * Use parameterized logging where appropriate * Add logging guards {{if (Log.isDebugEnabled())}} around loops and complex debug messages Simplify the code, remove lines of code, and potentially increase performance -- This message was sent by Atlassian Jira (v8.3.4#803005)
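Why the two bullets help can be seen by counting {{toString()}} calls when debug logging is disabled. The sketch below uses a tiny stand-in logger so that it runs without dependencies; {{TinyLogger}} is hypothetical and only mimics the {{{}}} placeholder idea, it is not the SLF4J API.

```java
// Sketch of why parameterized messages and guards are cheaper when debug
// logging is off. TinyLogger is a hypothetical stand-in, not SLF4J.
class LoggingCostSketch {

    static final class TinyLogger {
        private final boolean debugEnabled;
        final StringBuilder out = new StringBuilder();

        TinyLogger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

        boolean isDebugEnabled() { return debugEnabled; }

        void debug(String message) {
            if (debugEnabled) out.append(message).append('\n');
        }

        // The argument is only converted to a String if debug is enabled.
        void debug(String format, Object arg) {
            if (debugEnabled) {
                out.append(format.replace("{}", String.valueOf(arg))).append('\n');
            }
        }
    }

    // Counts how often its (notionally expensive) toString() is invoked.
    static final class Expensive {
        int toStringCalls = 0;
        @Override public String toString() { toStringCalls++; return "expensive"; }
    }

    // Concatenation builds the message before the logger can check the level.
    static int callsWithConcatenation(TinyLogger log) {
        Expensive e = new Expensive();
        log.debug("value: " + e);
        return e.toStringCalls;
    }

    // Parameterized form defers formatting to the logger.
    static int callsWithParameter(TinyLogger log) {
        Expensive e = new Expensive();
        log.debug("value: {}", e);
        return e.toStringCalls;
    }

    // A guard skips the whole loop when debug is disabled.
    static int callsWithGuard(TinyLogger log, int n) {
        Expensive e = new Expensive();
        if (log.isDebugEnabled()) {
            for (int i = 0; i < n; i++) {
                log.debug("item " + i + ": " + e);
            }
        }
        return e.toStringCalls;
    }

    public static void main(String[] args) {
        TinyLogger off = new TinyLogger(false);
        System.out.println(callsWithConcatenation(off));
        System.out.println(callsWithParameter(off));
        System.out.println(callsWithGuard(off, 3));
    }
}
```

The guard is worth keeping only where the work inside it is more than a single parameterized call, such as a loop or a message that needs computation to build.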
[jira] [Created] (HIVE-23194) Use Queue Instead of List for CollectOperator
David Mollitor created HIVE-23194: - Summary: Use Queue Instead of List for CollectOperator Key: HIVE-23194 URL: https://issues.apache.org/jira/browse/HIVE-23194 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor https://github.com/apache/hive/blob/d6948a28ab3e34e5116591a60a96bdf031185e47/ql/src/java/org/apache/hadoop/hive/ql/exec/CollectOperator.java#L85-L88 {code:java|title=CollectOperator.java} rowList = new ArrayList(); ... } else { result.o = rowList.remove(0); result.oi = standardRowInspector; } {code} Removing from the head of an {{ArrayList}} is an expensive operation because it needs to shift all of the elements down in the array for each call. Better to use a {{Queue}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
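A minimal sketch of the suggested change, assuming an {{ArrayDeque}} as the {{Queue}} implementation (the issue does not name one) and using hypothetical class and method names. Both methods return rows in the same order; only the cost of taking the head element differs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch comparing head removal from a List and from a Queue; not Hive code.
class RowBufferSketch {

    // List version: each remove(0) shifts every remaining element down by one.
    static List<Object> drainList(List<Object> rowList) {
        List<Object> out = new ArrayList<>();
        while (!rowList.isEmpty()) {
            out.add(rowList.remove(0));
        }
        return out;
    }

    // Queue version: poll() takes the head without shifting anything.
    // Note that ArrayDeque does not accept null elements.
    static List<Object> drainQueue(Queue<Object> rowQueue) {
        List<Object> out = new ArrayList<>();
        Object row;
        while ((row = rowQueue.poll()) != null) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        Queue<Object> rows = new ArrayDeque<>();
        rows.add("row1");
        rows.add("row2");
        System.out.println(drainQueue(rows));
    }
}
```

If the rows being buffered can legitimately be {{null}}, a {{LinkedList}} used through the {{Queue}} interface would be the alternative, since it permits null elements.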
[jira] [Created] (HIVE-23182) Semantic Exception: rule Identifier failed predicate allowQuotedId
David Mollitor created HIVE-23182: - Summary: Semantic Exception: rule Identifier failed predicate allowQuotedId Key: HIVE-23182 URL: https://issues.apache.org/jira/browse/HIVE-23182 Project: Hive Issue Type: Improvement Reporter: David Mollitor Querying a Hive Table (via Hiveserver2) with Column Masking enabled via Ranger Hive Plugin returns with an error. {code:none} [42000]: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.parse.ParseException: line 1:62 rule Identifier failed predicate: {allowQuotedId()}? line 1:74 rule Identifier failed predicate: {allowQuotedId()}? line 1:94 rule Identifier failed predicate: {allowQuotedId()}? line 1:117 rule Identifier failed predicate: {allowQuotedId()}? {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23183) Make TABLE Token Optional in TRUNCATE Statement
David Mollitor created HIVE-23183: - Summary: Make TABLE Token Optional in TRUNCATE Statement Key: HIVE-23183 URL: https://issues.apache.org/jira/browse/HIVE-23183 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor It's optional in MySQL, let's make it optional for Hive too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23258) Remove BoneCP Connection Pool
David Mollitor created HIVE-23258: - Summary: Remove BoneCP Connection Pool Key: HIVE-23258 URL: https://issues.apache.org/jira/browse/HIVE-23258 Project: Hive Issue Type: Improvement Reporter: David Mollitor Assignee: David Mollitor {quote} BoneCP is a Java JDBC connection pool implementation that is tuned for high performance by minimizing lock contention to give greater throughput for your application ... but SHOULD NOW BE CONSIDERED DEPRECATED in favour of HikariCP. {quote} https://github.com/wwadge/bonecp The default in Hive 3.x is already HikariCP, so just remove BoneCP in 4.x https://github.com/apache/hive/blob/branch-3.1/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java#L392 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23079) Remove Calls to printStackTrace in Module hive-serde
David Mollitor created HIVE-23079: - Summary: Remove Calls to printStackTrace in Module hive-serde Key: HIVE-23079 URL: https://issues.apache.org/jira/browse/HIVE-23079 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23077) Remove Calls to printStackTrace in Module hive-exec
David Mollitor created HIVE-23077: - Summary: Remove Calls to printStackTrace in Module hive-exec Key: HIVE-23077 URL: https://issues.apache.org/jira/browse/HIVE-23077 Project: Hive Issue Type: Sub-task Reporter: David Mollitor Assignee: David Mollitor Only one "tricky" change: throw an Exception instead of calling {{printStackTrace}} in the static Driver loader, as suggested by the reference here: https://github.com/mariadb-corporation/mariadb-connector-j/blob/3bc66153b51aca188afc50ff35a0123f16c099ed/src/main/java/org/mariadb/jdbc/Driver.java#L72 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23078) Remove HiveDriver SecurityManager Check
David Mollitor created HIVE-23078: - Summary: Remove HiveDriver SecurityManager Check Key: HIVE-23078 URL: https://issues.apache.org/jira/browse/HIVE-23078 Project: Hive Issue Type: Improvement Components: JDBC Reporter: David Mollitor Assignee: David Mollitor {code:java|title=HiveDriver.java} public HiveDriver() { // TODO Auto-generated constructor stub SecurityManager security = System.getSecurityManager(); if (security != null) { security.checkWrite("foobah"); } } {code} Not sure why it needs to write a file called "foobah" but I checked out some other JDBC drivers and they do nothing like this. Remove this check; remove the constructor. -- This message was sent by Atlassian Jira (v8.3.4#803005)