[jira] [Created] (HIVE-21524) Impala Engine

2019-03-27 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21524:
-

 Summary: Impala Engine
 Key: HIVE-21524
 URL: https://issues.apache.org/jira/browse/HIVE-21524
 Project: Hive
  Issue Type: New Feature
Affects Versions: 4.0.0
Reporter: David Mollitor


Now that Impala has "dedicated coordinator" capability, it could be interesting 
to pair HiveServer2 instances with Impala dedicated coordinators on the same 
localhost.  A client could request an 'impala' execution engine and subsequent 
queries would be routed to the local coordinator.

{code:sql}
set hive.execution.engine=impala;
{code}

This would allow clients seamless access to both capabilities without needing 
different connections or drivers.  Hive would also be a central location for 
auditing and authorization.

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_dedicated_coordinator.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21581) Remove Lock in GetInputSummary

2019-04-04 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21581:
-

 Summary: Remove Lock in GetInputSummary
 Key: HIVE-21581
 URL: https://issues.apache.org/jira/browse/HIVE-21581
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Fix For: 4.0.0


Now that the Hive compile lock has been relaxed in [HIVE-20535], remove the 
{{getInputSummary}} lock:

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2459]

 

 





[jira] [Created] (HIVE-21515) Improvement to MoveTrash Facilities

2019-03-26 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21515:
-

 Summary: Improvement to MoveTrash Facilities
 Key: HIVE-21515
 URL: https://issues.apache.org/jira/browse/HIVE-21515
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-21515.1.patch







[jira] [Created] (HIVE-21414) Hive JSON SerDe Does Not Properly Handle Field Comments

2019-03-08 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21414:
-

 Summary: Hive JSON SerDe Does Not Properly Handle Field Comments
 Key: HIVE-21414
 URL: https://issues.apache.org/jira/browse/HIVE-21414
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


Field comments are handed to the JSON SerDe from HMS and then are ignored.  The 
result is that all field comments are 'from deserializer' and cannot be changed.

For comparison, the Avro SerDe does handle field comments:

https://github.com/apache/hive/blob/release-1.1.0/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java#L133





[jira] [Created] (HIVE-21466) Increase Default Size of SPLIT_MAXSIZE

2019-03-18 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21466:
-

 Summary: Increase Default Size of SPLIT_MAXSIZE
 Key: HIVE-21466
 URL: https://issues.apache.org/jira/browse/HIVE-21466
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-21466.1.patch

{code:java}
 MAPREDMAXSPLITSIZE(FileInputFormat.SPLIT_MAXSIZE, 256000000L, "", true),
{code}
[https://github.com/apache/hive/blob/8d4300a02691777fc96f33861ed27e64fed72f2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L682]

This field specifies the maximum size of each MR (and possibly other engines') 
split.

This number should be a multiple of the HDFS block size. The maximum is 
implemented as follows: each block is added to the split, and once the split 
grows larger than the maximum allowed, the split is submitted to the cluster 
and a new split is opened.

So, imagine the following scenario:
 * HDFS block size of 16 bytes
 * Maximum size of 40 bytes

This will produce a split with 3 blocks. After two blocks the split holds 
(2x16) = 32 bytes, which is still under the maximum, so another block is 
added, giving (3x16) = 48 bytes in the split. So, while many operators would 
assume a split of 2 blocks, the actual split has 3 blocks. Setting the maximum 
split size to a multiple of the HDFS block size will make this behavior less 
confusing.

The current setting is ~256MB and when this was introduced, the default HDFS 
block size was 64MB. That is a factor of 4x. However, now HDFS block sizes are 
128MB by default, so I propose setting this to 4x128MB.  The larger splits 
(fewer tasks) should give a nice performance boost for modern hardware.
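
The block-accumulation behavior described above can be illustrated with a small model. This is only a sketch of the behavior as described in this issue, not the actual Hadoop or Hive split code: the class and method names are hypothetical, and the comparison used (the split closes once it reaches or exceeds the maximum) is an assumption.

```java
/** Toy model of split building: whole blocks are added until the maximum is reached. */
final class SplitSizeModel {

  /**
   * Returns how many whole blocks end up in the first split.
   * Assumption: the split is closed once its size reaches or exceeds maxSplitSize.
   */
  static int blocksInFirstSplit(long blockSize, long maxSplitSize) {
    if (blockSize <= 0 || maxSplitSize <= 0) {
      throw new IllegalArgumentException("sizes must be positive");
    }
    long splitSize = 0;
    int blocks = 0;
    while (splitSize < maxSplitSize) {
      splitSize += blockSize; // a block is never divided between splits
      blocks++;
    }
    return blocks;
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    // 16-byte blocks with a 40-byte maximum (the scenario from the text)
    System.out.println(blocksInFirstSplit(16, 40));
    // maximum is an exact multiple of the block size
    System.out.println(blocksInFirstSplit(16, 32));
    // proposed setting: 4 x 128MB with 128MB blocks
    System.out.println(blocksInFirstSplit(128 * mb, 4 * 128 * mb));
  }
}
```

Under this model, a maximum that is an exact multiple of the block size yields exactly that many blocks per split, which is the less surprising behavior the proposal is after.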





[jira] [Created] (HIVE-21469) Review of ZooKeeperHiveLockManager

2019-03-18 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21469:
-

 Summary: Review of ZooKeeperHiveLockManager
 Key: HIVE-21469
 URL: https://issues.apache.org/jira/browse/HIVE-21469
 Project: Hive
  Issue Type: Improvement
  Components: Locking
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-21469.1.patch

A lot of sins in this class to resolve:

{code:java}
  @Override
  public void setContext(HiveLockManagerCtx ctx) throws LockException {
    try {
      curatorFramework = CuratorFrameworkSingleton.getInstance(conf);
      parent = conf.getVar(HiveConf.ConfVars.HIVE_ZOOKEEPER_NAMESPACE);
      try {
        curatorFramework.create().withMode(CreateMode.PERSISTENT).forPath("/" + parent, new byte[0]);
      } catch (Exception e) {
        // ignore if the parent already exists
        if (!(e instanceof KeeperException)
            || ((KeeperException)e).code() != KeeperException.Code.NODEEXISTS) {
          LOG.warn("Unexpected ZK exception when creating parent node /" + parent, e);
        }
      }
{code}

Every time a new session is created and this {{setContext}} method is called, 
it attempts to create the root node.  I have seen that, even though the root 
node exists, a create-node action is written to the ZK logs.  Check first if 
the node exists before trying to create it.

{code:java}
  try {
    curatorFramework.delete().forPath(zLock.getPath());
  } catch (InterruptedException ie) {
    curatorFramework.delete().forPath(zLock.getPath());
  }
{code}

There have historically been quite a few bugs regarding leaked locks.  The 
Driver will signal the session {{Thread}} by performing an interrupt.  That 
interrupt can happen at any time and it can kill a create/delete action within 
the ZK framework.  We can see one example of a workaround for this above: if 
the ZK action is interrupted, simply do it again.  Well, what if it's 
interrupted yet again?  The lock will be leaked anyway.  Also, by the time the 
{{InterruptedException}} is caught in the catch block, the thread's interrupted 
flag has been cleared.  The flag is not reset in this code and therefore we 
lose the fact that this thread has been interrupted.
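
One way to address both points (retrying more than once, and not losing the interrupt) could look roughly like the following. This is a minimal sketch in plain Java, not the actual patch: {{InterruptibleAction}} and {{runUninterruptibly}} are hypothetical names standing in for the ZooKeeper delete call and its wrapper.

```java
final class InterruptSafeRetry {

  /** Hypothetical stand-in for a ZooKeeper call that may be interrupted. */
  interface InterruptibleAction {
    void run() throws InterruptedException;
  }

  /**
   * Runs the action until it completes without interruption or maxAttempts is
   * used up, then restores the interrupt flag if any attempt was interrupted.
   * Returns true if the action completed.
   */
  static boolean runUninterruptibly(InterruptibleAction action, int maxAttempts) {
    boolean interrupted = false;
    try {
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
          action.run();
          return true;
        } catch (InterruptedException e) {
          interrupted = true; // remember the interrupt and try again
        }
      }
      return false;
    } finally {
      if (interrupted) {
        // Re-set the flag so callers can still observe the interrupt.
        Thread.currentThread().interrupt();
      }
    }
  }
}
```

The interrupt is deferred rather than swallowed: the cleanup gets several chances to finish, and the caller still sees the interrupted status afterwards.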

{code:java}
      if (tryNum > 1) {
        Thread.sleep(sleepTime);
      }
      unlockPrimitive(hiveLock, parent, curatorFramework);
      break;
    } catch (Exception e) {
      if (tryNum >= numRetriesForUnLock) {
        String name = ((ZooKeeperHiveLock)hiveLock).getPath();
        throw new LockException("Node " + name + " can not be deleted after "
            + numRetriesForUnLock + " attempts.",
            e);
      }
    }
{code}

Relatedly, the sleep here may be interrupted, but we still need to delete the 
lock (again, for fear of leaking it).  This sleep should be uninterruptible.  
If the lock needs to be deleted and there's a problem, interrupting the sleep 
will cause the code to eventually exit and the lock will be leaked.
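
An uninterruptible sleep can be sketched in plain Java as below. Guava offers {{Uninterruptibles.sleepUninterruptibly}} for the same purpose; the class here is a hypothetical, self-contained illustration of the pattern rather than the code that would go into Hive.

```java
import java.util.concurrent.TimeUnit;

final class UninterruptibleSleep {

  /**
   * Sleeps for the full duration even if the thread is interrupted, then
   * restores the interrupt flag so the interrupt is not lost.
   */
  static void sleepUninterruptibly(long duration, TimeUnit unit) {
    boolean interrupted = false;
    try {
      long remainingNanos = unit.toNanos(duration);
      final long deadline = System.nanoTime() + remainingNanos;
      while (true) {
        try {
          // A non-positive remaining time makes sleep() return immediately.
          TimeUnit.NANOSECONDS.sleep(remainingNanos);
          return;
        } catch (InterruptedException e) {
          interrupted = true;
          remainingNanos = deadline - System.nanoTime();
        }
      }
    } finally {
      if (interrupted) {
        Thread.currentThread().interrupt();
      }
    }
  }
}
```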

It also requires a bunch more TLC.





[jira] [Created] (HIVE-21433) Doc: Remove Reference to hive.stats.avg.row.size

2019-03-12 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21433:
-

 Summary: Doc: Remove Reference to hive.stats.avg.row.size
 Key: HIVE-21433
 URL: https://issues.apache.org/jira/browse/HIVE-21433
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 4.0.0
Reporter: David Mollitor


[https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties]

 

Remove the reference to {{hive.stats.avg.row.size}}.  I think it's been 
replaced by {{hive.stats.max.variable.length}}.





[jira] [Created] (HIVE-21425) Use newDirectExecutorService for getInputSummary

2019-03-11 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21425:
-

 Summary: Use newDirectExecutorService for getInputSummary
 Key: HIVE-21425
 URL: https://issues.apache.org/jira/browse/HIVE-21425
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


{code:java|title=Utilities.java}
  int numExecutors = getMaxExecutorsForInputListing(ctx.getConf(), pathNeedProcess.size());
  if (numExecutors > 1) {
    LOG.info("Using {} threads for getContentSummary", numExecutors);
    executor = Executors.newFixedThreadPool(numExecutors,
        new ThreadFactoryBuilder().setDaemon(true)
            .setNameFormat("Get-Input-Summary-%d").build());
  } else {
    executor = null;
  }
{code}

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L2482-L2490

Instead of using a 'null' {{ExecutorService}}, use Guava's direct executor 
service ({{MoreExecutors.newDirectExecutorService()}}) and remove the special 
casing for a 'null' value.
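
To illustrate why a direct executor removes the need for null checks, here is a self-contained sketch. It uses a small plain-Java stand-in ({{SameThreadExecutorService}}, a hypothetical name) instead of the Guava class so that it runs without dependencies; {{InputSummaryExecutors}} and its methods are also hypothetical and are not the Hive code.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class InputSummaryExecutors {

  /** Plain-Java stand-in for a direct executor: tasks run on the submitting thread. */
  static final class SameThreadExecutorService extends AbstractExecutorService {
    private volatile boolean shutdown;

    @Override public void execute(Runnable command) { command.run(); }
    @Override public void shutdown() { shutdown = true; }
    @Override public List<Runnable> shutdownNow() {
      shutdown = true;
      return Collections.emptyList();
    }
    @Override public boolean isShutdown() { return shutdown; }
    @Override public boolean isTerminated() { return shutdown; }
    @Override public boolean awaitTermination(long timeout, TimeUnit unit) { return shutdown; }
  }

  /** Always returns a usable executor, so callers need no null check. */
  static ExecutorService create(int numExecutors) {
    if (numExecutors > 1) {
      AtomicInteger counter = new AtomicInteger();
      return Executors.newFixedThreadPool(numExecutors, runnable -> {
        Thread thread = new Thread(runnable, "Get-Input-Summary-" + counter.getAndIncrement());
        thread.setDaemon(true);
        return thread;
      });
    }
    return new SameThreadExecutorService();
  }

  /** Submits one task and reports which thread ran it; one code path serves both cases. */
  static String threadNameFor(int numExecutors) {
    ExecutorService executor = create(numExecutors);
    try {
      Callable<String> task = () -> Thread.currentThread().getName();
      return executor.submit(task).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      executor.shutdown();
    }
  }
}
```

With a single executor requested, the task runs on the calling thread; with more, it runs on a pool thread. The submitting code is identical in both cases.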





[jira] [Created] (HIVE-21426) Remove Utilities Global Random

2019-03-11 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21426:
-

 Summary: Remove Utilities Global Random
 Key: HIVE-21426
 URL: https://issues.apache.org/jira/browse/HIVE-21426
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L253

Remove global {{Random}} object in favor of {{ThreadLocalRandom}}.

{quote}
ThreadLocalRandom is initialized with an internally generated seed that may not 
otherwise be modified. When applicable, use of ThreadLocalRandom rather than 
shared Random objects in concurrent programs will typically encounter much less 
overhead and contention.
{quote}

https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html
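
As a small illustration of the replacement, a call site that currently draws from a shared {{Random}} could instead look like this. The class and method names are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

final class RandomIds {

  /** Returns a pseudo-random value in [0, bound) without touching a shared Random. */
  static int nextId(int bound) {
    // current() returns the generator belonging to the calling thread.
    return ThreadLocalRandom.current().nextInt(bound);
  }
}
```

{{ThreadLocalRandom.current()}} should be called at the point of use rather than cached in a static field, since the instance is tied to the calling thread.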





[jira] [Created] (HIVE-21748) HBase Operations Can Fail When Using MAPREDLOCAL

2019-05-17 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21748:
-

 Summary: HBase Operations Can Fail When Using MAPREDLOCAL
 Key: HIVE-21748
 URL: https://issues.apache.org/jira/browse/HIVE-21748
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


https://github.com/apache/hive/blob/5634140b2beacdac20ceec8c73ff36bce5675ef8/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java#L258-L262

{code:java|title=HBaseStorageHandler.java}
if (this.configureInputJobProps) {
  LOG.info("Configuring input job properties");
...
  try {
    addHBaseDelegationToken(jobConf);
  } catch (IOException | MetaException e) {
    throw new IllegalStateException("Error while configuring input job properties", e);
  }
} else {
  LOG.info("Configuring output job properties");
...
}
{code}

What we can see here is that the HBase delegation token is only created when 
there is an input job (reading from an HBase source).  For a particular stage 
of a query, if there is no HBase input, only HBase output, then the delegation 
token is not created and the stage will fail.

{code:none|title=Error Message in HS2 Log}
2019-05-17 10:24:55,036 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-388]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
    at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
    at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}


You can tell it will fail because an HDFS token will be created, but the HS2 
logs will not report an HBase token.  The following is an example of a proper 
setup.  If the HBASE_AUTH_TOKEN is missing, the job will fail because it will 
try to initiate a Kerberos handshake, which cannot succeed.

{code:none|title=Logging of a Proper Run}
2019-05-17 10:36:15,593 INFO  org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Submitting tokens for job: job_1557858663665_0048
2019-05-17 10:36:15,593 INFO  org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HDFS_DELEGATION_TOKEN, Service: 10.17.101.237:8020, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive/host-10-17-102-135.coe.cloudera@example.com, renewer=yarn, realUser=, issueDate=1558114574357, maxDate=1558719374357, sequenceNumber=75, masterKeyId=4)
2019-05-17 10:36:15,593 INFO  org.apache.hadoop.mapreduce.JobSubmitter: [HiveServer2-Background-Pool: Thread-455]: Kind: HBASE_AUTH_TOKEN, Service: 9b282733-7927-4785-92ea-dad419f6f055, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@b1)
2019-05-17 10:36:15,859 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: [HiveServer2-Background-Pool: Thread-455]: Submitted application application_1557858663665_0048
{code}

The error message in the local MapReduce log:

{code:none|title=Error message}
2019-05-10 07:43:24,875 WARN  [htable-pool2-t1]: security.UserGroupInformation 
(UserGroupInformation.java:doAs(1927)) - PriviledgedActionException as:hive 
(auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed 
[Caused by GSSException: No valid credentials provided (Mechanism level: Failed 
to find any Kerberos tgt)]
2019-05-10 07:43:24,876 WARN  [htable-pool2-t1]: ipc.RpcClientImpl 
(RpcClientImpl.java:run(675)) - Exception encountered while connecting to the 
server : javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2019-05-10 07:43:24,876 ERROR [htable-pool2-t1]: ipc.RpcClientImpl 
(RpcClientImpl.java:run(685)) - SASL authentication failed. The most likely 
cause is missing 

[jira] [Created] (HIVE-21747) Remove Dependency on org.cliffc.high_scale_lib.Counter

2019-05-17 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21747:
-

 Summary: Remove Dependency on org.cliffc.high_scale_lib.Counter
 Key: HIVE-21747
 URL: https://issues.apache.org/jira/browse/HIVE-21747
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


[https://github.com/apache/hive/blob/5634140b2beacdac20ceec8c73ff36bce5675ef8/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java#L327]

 

{code:java}
  static {
    try {
      counterClass = Class.forName("org.cliffc.high_scale_lib.Counter");
    } catch (ClassNotFoundException cnfe) {
      // this dependency is removed for HBase 1.0
    }
  }
{code}

I think this _counterClass_ stuff can be removed now that Hive is firmly on 
HBase 1.0+.





[jira] [Created] (HIVE-21792) Hive Indexes... Again

2019-05-24 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21792:
-

 Summary: Hive Indexes... Again
 Key: HIVE-21792
 URL: https://issues.apache.org/jira/browse/HIVE-21792
 Project: Hive
  Issue Type: New Feature
  Components: Indexing
Reporter: David Mollitor


Hive had an implementation of indexing that was made somewhat obsolete given 
the introduction of columnar file formats with their own internal indexing.

I propose that Hive introduce Indexing again.

# Column Index: Stored in HBase
# Full-Text Index: Stored in Solr

The basic idea is that the key in HBase is the record and the value is the 
relative file path of the data in the Hive table.

Performing an INSERT statement creates the index for each record.

https://dev.mysql.com/doc/refman/8.0/en/create-index.html

When generating the explain plan, only the files involved in the query are 
considered.

This would prevent having to scan large amounts of data for the typical BI 
tools when the set of data is known to be very small.

{code:sql}
-- Quick retrieval of small sets of records
select * from user where userid=27;

-- Full scans
select count(1) from user;
{code}
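
The key-to-file-path lookup described above can be modeled in a few lines. This is only a toy sketch under stated assumptions: an in-memory map stands in for the HBase table, and the class and method names are hypothetical rather than a proposed Hive API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ColumnIndexSketch {

  // Stand-in for the HBase table: indexed value -> relative paths of files containing it.
  private final Map<String, Set<String>> pathsByValue = new HashMap<>();

  /** Called for each inserted record: remember which file holds the value. */
  void recordInsert(String columnValue, String relativePath) {
    pathsByValue.computeIfAbsent(columnValue, key -> new TreeSet<>()).add(relativePath);
  }

  /** Returns only the files that need to be scanned for an equality predicate. */
  Set<String> filesFor(String columnValue) {
    Set<String> paths = pathsByValue.get(columnValue);
    if (paths == null) {
      return Collections.emptySet();
    }
    return Collections.unmodifiableSet(paths);
  }
}
```

A query such as the {{userid=27}} lookup would then consult {{filesFor("27")}} and read only those files, while a full scan would bypass the index entirely.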





[jira] [Created] (HIVE-21727) Allow For Ordinal Substitution

2019-05-14 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21727:
-

 Summary: Allow For Ordinal Substitution 
 Key: HIVE-21727
 URL: https://issues.apache.org/jira/browse/HIVE-21727
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


Impala allows for ordinal substitution.  Add a compatible feature to Hive to 
make Hive more compatible with Impala and more of a drop-in replacement.

[IMPALA-8548]





[jira] [Created] (HIVE-21655) Add Re-Try to LdapSearchFactory

2019-04-26 Thread David Mollitor (JIRA)
David Mollitor created HIVE-21655:
-

 Summary: Add Re-Try to LdapSearchFactory
 Key: HIVE-21655
 URL: https://issues.apache.org/jira/browse/HIVE-21655
 Project: Hive
  Issue Type: Improvement
  Components: Authentication
Affects Versions: 4.0.0, 3.2.0
 Environment: It may be the case that the LDAP service is temporarily 
unreachable.  Please implement a re-try facility here:

https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/auth/ldap/LdapSearchFactory.java#L41
Reporter: David Mollitor








[jira] [Created] (HIVE-22078) Upgrade arrow version to 0.14.1

2019-08-02 Thread David Mollitor (JIRA)
David Mollitor created HIVE-22078:
-

 Summary: Upgrade arrow version to 0.14.1
 Key: HIVE-22078
 URL: https://issues.apache.org/jira/browse/HIVE-22078
 Project: Hive
  Issue Type: Task
Affects Versions: 4.0.0
Reporter: David Mollitor






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22217) Better Logging for Hive JAR Reload

2019-09-18 Thread David Mollitor (Jira)
David Mollitor created HIVE-22217:
-

 Summary: Better Logging for Hive JAR Reload
 Key: HIVE-22217
 URL: https://issues.apache.org/jira/browse/HIVE-22217
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 2.3.6, 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.

Add logging to at least confirm which JAR files are being loaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22032) Allow Hive JSON SerDe To Be Case Insensitive for Field Names

2019-07-22 Thread David Mollitor (JIRA)
David Mollitor created HIVE-22032:
-

 Summary: Allow Hive JSON SerDe To Be Case Insensitive for Field 
Names
 Key: HIVE-22032
 URL: https://issues.apache.org/jira/browse/HIVE-22032
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


https://fasterxml.github.io/jackson-databind/javadoc/2.9/com/fasterxml/jackson/databind/MapperFeature.html#ACCEPT_CASE_INSENSITIVE_PROPERTIES





[jira] [Created] (HIVE-22445) LazySimpleSerDe toString is not Correct

2019-11-01 Thread David Mollitor (Jira)
David Mollitor created HIVE-22445:
-

 Summary: LazySimpleSerDe toString is not Correct
 Key: HIVE-22445
 URL: https://issues.apache.org/jira/browse/HIVE-22445
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22445.1.patch

{code:none}
2019-11-01T10:03:49,228  INFO [pool-23-thread-1] exec.FileSinkOperator: Using 
serializer : class 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe[[[B@983dd25]:[_col0, 
_col1]:[struct

[jira] [Created] (HIVE-22443) HBase Maven site configuration causes Hive project to get a directory named ${project.basedir}

2019-11-01 Thread David Mollitor (Jira)
David Mollitor created HIVE-22443:
-

 Summary: HBase Maven site configuration causes Hive project to get 
a directory named ${project.basedir}
 Key: HIVE-22443
 URL: https://issues.apache.org/jira/browse/HIVE-22443
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor


Upgrade HBase versions





[jira] [Created] (HIVE-22444) Clean up Project POM Files

2019-11-01 Thread David Mollitor (Jira)
David Mollitor created HIVE-22444:
-

 Summary: Clean up Project POM Files
 Key: HIVE-22444
 URL: https://issues.apache.org/jira/browse/HIVE-22444
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


# Address warnings in the build process
# Use DependencyManagement in Root POM for ITest (see HIVE-22426)
# General POM cleanup





[jira] [Created] (HIVE-22447) Update HBase Version to GA

2019-11-01 Thread David Mollitor (Jira)
David Mollitor created HIVE-22447:
-

 Summary: Update HBase Version to GA
 Key: HIVE-22447
 URL: https://issues.apache.org/jira/browse/HIVE-22447
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


Currently at:

{code:none}
2.0.0-alpha4
{code}

Upgrade to a GA release





[jira] [Created] (HIVE-22475) Update slf4j to 1.7.25

2019-11-08 Thread David Mollitor (Jira)
David Mollitor created HIVE-22475:
-

 Summary: Update slf4j to 1.7.25
 Key: HIVE-22475
 URL: https://issues.apache.org/jira/browse/HIVE-22475
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


The Druid handler is already on this version.  Updating will allow the entire 
project to be on the same version.

https://github.com/apache/hive/blob/38190f3e95752c85188682d8a78d259455e173c2/itests/qtest-druid/pom.xml#L228





[jira] [Created] (HIVE-22469) Lower Metastore DB Connection Pool Size in QTests

2019-11-07 Thread David Mollitor (Jira)
David Mollitor created HIVE-22469:
-

 Summary: Lower Metastore DB Connection Pool Size in QTests
 Key: HIVE-22469
 URL: https://issues.apache.org/jira/browse/HIVE-22469
 Project: Hive
  Issue Type: Improvement
  Components: Test, Tests
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


Hive Metastore uses the 'HikariCP' database connection pool.  The default 
number of connections to the database is 10.  For the QTests, connecting to a 
local DerbyDB, there need not be more than 1 connection.  Any more simply adds 
undue overhead, and looking at the QTest logs, I see a bunch of 'connection 
refused' messages from HikariCP.  It may be the case that the standalone DB 
does not support that many concurrent connections anyway.


https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore





[jira] [Created] (HIVE-22462) Error Information Lost in GenericUDTFGetSplits

2019-11-05 Thread David Mollitor (Jira)
David Mollitor created HIVE-22462:
-

 Summary: Error Information Lost in GenericUDTFGetSplits
 Key: HIVE-22462
 URL: https://issues.apache.org/jira/browse/HIVE-22462
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22462.1.patch

I was recently looking at some logs from a failed unit test and saw:

 
{code:none}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create temp table: null
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits2.process(GenericUDTFGetSplits2.java:81)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.jav
{code}

The error information was lost; a useless 'null' string is written instead.
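
One common way such a 'null' ends up in a message is when the text is built from {{getMessage()}} of an exception that carries no message, and the cause itself is not chained. Whether that is what happens in GenericUDTFGetSplits is an assumption here, not something confirmed by the log; the sketch below only contrasts the two patterns, using hypothetical names and plain {{RuntimeException}} in place of {{HiveException}}.

```java
final class ErrorContext {

  /** Lossy: only the (possibly null) message of the cause survives. */
  static RuntimeException lossy(Exception cause) {
    return new RuntimeException("Failed to create temp table: " + cause.getMessage());
  }

  /** Preserving: the cause is chained, so its type and stack trace are kept. */
  static RuntimeException preserving(Exception cause) {
    return new RuntimeException("Failed to create temp table", cause);
  }
}
```

With the chained form, the log shows a "Caused by:" section naming the original exception type and where it was thrown, even when that exception has no message of its own.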





[jira] [Created] (HIVE-22441) Metrics Subsytem Improvements

2019-10-31 Thread David Mollitor (Jira)
David Mollitor created HIVE-22441:
-

 Summary: Metrics Subsytem Improvements
 Key: HIVE-22441
 URL: https://issues.apache.org/jira/browse/HIVE-22441
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


# CodahaleMetrics uses a Guava LoadingCache, which is already thread-safe, and 
then puts an explicit lock around the structure.  Use the Java 8 Map API with 
ConcurrentHashMap instead.
# Introduce Java 8 APIs
# Simplifications
# Update unit tests to no longer include a 'sleep'

https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java#L91-L94
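
The first item can be illustrated with a short sketch of the Java 8 Map API on a {{ConcurrentHashMap}}. This is not the CodahaleMetrics code: the class name is hypothetical and {{LongAdder}} stands in for the real metric objects.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

final class CounterRegistry {

  private final ConcurrentMap<String, LongAdder> counters = new ConcurrentHashMap<>();

  /** Creates the counter on first use and increments it; no explicit lock is needed. */
  long increment(String name) {
    // computeIfAbsent is atomic on ConcurrentHashMap: one counter per name.
    LongAdder counter = counters.computeIfAbsent(name, key -> new LongAdder());
    counter.increment();
    return counter.sum();
  }
}
```

The create-if-missing step and the lookup collapse into a single atomic call, which is what makes the surrounding lock unnecessary.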






[jira] [Created] (HIVE-22493) Scheduled Query Execution Failure in Tests

2019-11-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-22493:
-

 Summary: Scheduled Query Execution Failure in Tests
 Key: HIVE-22493
 URL: https://issues.apache.org/jira/browse/HIVE-22493
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor








[jira] [Created] (HIVE-22491) Use Collections emptyList

2019-11-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-22491:
-

 Summary: Use Collections emptyList
 Key: HIVE-22491
 URL: https://issues.apache.org/jira/browse/HIVE-22491
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
 Environment: 
https://docs.oracle.com/javase/8/docs/api/?java/util/Collections.html

Use Collections#emptyList instead of instantiating empty ArrayLists
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22491.1.patch
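
A minimal sketch of the substitution follows; the class and method names are hypothetical. One caveat worth keeping in mind is that the list returned by {{Collections.emptyList()}} is immutable, so this only suits call sites that never add to the returned list.

```java
import java.util.Collections;
import java.util.List;

final class EmptyResults {

  /** Returns the shared immutable empty list instead of allocating a new ArrayList. */
  static List<String> namesOrEmpty(List<String> names) {
    if (names == null) {
      return Collections.emptyList();
    }
    return names;
  }
}
```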







[jira] [Created] (HIVE-22496) Update Hadoop Version to 3.1.1

2019-11-14 Thread David Mollitor (Jira)
David Mollitor created HIVE-22496:
-

 Summary: Update Hadoop Version to 3.1.1
 Key: HIVE-22496
 URL: https://issues.apache.org/jira/browse/HIVE-22496
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: David Mollitor
Assignee: David Mollitor


https://lists.apache.org/thread.html/8313e605c0ed0012f134cce9cc6adca738eea81feccea99c8de87cd9@%3Cgeneral.hadoop.apache.org%3E

{quote}
   - This release is *not* yet ready for production use. Critical issues
   are being ironed out via testing and downstream adoption. Production
users should wait for a 3.1.1/3.1.2 release.
{quote}

Current:
{code:xml}
3.1.0
{code}





[jira] [Created] (HIVE-22494) Use System NanoTime to Measure Code Execution

2019-11-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-22494:
-

 Summary: Use System NanoTime to Measure Code Execution
 Key: HIVE-22494
 URL: https://issues.apache.org/jira/browse/HIVE-22494
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime()

{{System.nanoTime()}} is designed for measuring elapsed time.
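
For illustration, timing a piece of code with {{System.nanoTime()}} might look like the following. The helper class is hypothetical.

```java
import java.util.concurrent.TimeUnit;

final class ElapsedTime {

  /** Runs the task and returns how long it took, in whole milliseconds. */
  static long elapsedMillis(Runnable task) {
    long start = System.nanoTime();
    task.run();
    // Only the difference between two nanoTime() readings is meaningful.
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }
}
```

Unlike {{System.currentTimeMillis()}}, the value is not tied to wall-clock time, so adjustments to the system clock do not distort the measurement.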





[jira] [Created] (HIVE-22484) Remove Calls to printStackTrace

2019-11-12 Thread David Mollitor (Jira)
David Mollitor created HIVE-22484:
-

 Summary: Remove Calls to printStackTrace
 Key: HIVE-22484
 URL: https://issues.apache.org/jira/browse/HIVE-22484
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


In many cases, the call to {{printStackTrace}} bypasses the logging framework.  
In other cases, the error stack trace is printed and the exception is re-thrown 
(log-and-throw is a bad pattern).  There are also some other edge cases.

Remove these calls and replace them with calls to the logging framework.
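
As a rough illustration of the replacement pattern, the sketch below handles the failure where it is caught and logs it once, with the stack trace, through a logger. It uses {{java.util.logging}} only so that it is self-contained; Hive itself logs through SLF4J, and the class and method here are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingInsteadOfPrint {

  private static final Logger LOG = Logger.getLogger(LoggingInsteadOfPrint.class.getName());

  /** Handles the failure here: log once with the stack trace, and do not re-throw. */
  static int parseOrDefault(String text, int defaultValue) {
    try {
      return Integer.parseInt(text);
    } catch (NumberFormatException e) {
      // Passing the exception to the logger keeps the stack trace in the log output.
      LOG.log(Level.WARNING, "Could not parse '" + text + "', using default " + defaultValue, e);
      return defaultValue;
    }
  }
}
```

If the exception is going to be re-thrown instead, the logging call should be dropped and the cause chained, so that the error is reported once by whoever finally handles it.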





[jira] [Created] (HIVE-22503) Harmonize JODA Time Version in Module hive-hcatalog-it-unit

2019-11-15 Thread David Mollitor (Jira)
David Mollitor created HIVE-22503:
-

 Summary: Harmonize JODA Time Version in Module 
hive-hcatalog-it-unit
 Key: HIVE-22503
 URL: https://issues.apache.org/jira/browse/HIVE-22503
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


https://github.com/apache/hive/blob/078182ade4b76e810ca945354f4897dbe36ad5c2/itests/hcatalog-unit/pom.xml#L296

Currently hard-coded as version 2.2

See if we can get away with using the same version of Joda as the rest of the 
Hive project.





[jira] [Created] (HIVE-22529) Make Debugging Stacktrace More Explicit

2019-11-22 Thread David Mollitor (Jira)
David Mollitor created HIVE-22529:
-

 Summary: Make Debugging Stacktrace More Explicit
 Key: HIVE-22529
 URL: https://issues.apache.org/jira/browse/HIVE-22529
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


In some places, the following DEBUG logging was introduced:

{code:java}
LOG.debug("Message", new Exception());
{code}

The purpose of this is to log the stack trace of the Thread calling this debug 
logging method.  However, the resulting log message includes the following:

{code:none}
2019-11-19T08:13:31,392 DEBUG [Thread] Logger: Message
java.lang.Exception: null
 at 
{code}

To the observer, it appears that there was perhaps some sort of NPE.  Add a 
message to the Exception being generated to make it clearer that this 
"Exception" is for debugging purposes and not an actual error.

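A minimal sketch of the idea, assuming a small helper is acceptable (the class, method, and message wording are hypothetical):

```java
final class DebugStackTrace {

  /** Builds a throwable whose message states that it only exists to capture the call stack. */
  static Exception callSite(String context) {
    return new Exception("Stack trace for debugging only (not an error): " + context);
  }
}
```

A call site would then read something like {{LOG.debug("Message", DebugStackTrace.callSite("closing session"))}}, and the log line would start with the explanatory message instead of {{java.lang.Exception: null}}. Note that the helper method itself appears as the top frame of the captured trace.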





[jira] [Created] (HIVE-22569) PartitionPruner use Collections Class

2019-12-03 Thread David Mollitor (Jira)
David Mollitor created HIVE-22569:
-

 Summary: PartitionPruner use Collections Class
 Key: HIVE-22569
 URL: https://issues.apache.org/jira/browse/HIVE-22569
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22570) Review of ExprNodeDesc.java

2019-12-03 Thread David Mollitor (Jira)
David Mollitor created HIVE-22570:
-

 Summary: Review of ExprNodeDesc.java
 Key: HIVE-22570
 URL: https://issues.apache.org/jira/browse/HIVE-22570
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22570.1.patch







[jira] [Created] (HIVE-22571) Review of ExprNodeFieldDesc Class

2019-12-03 Thread David Mollitor (Jira)
David Mollitor created HIVE-22571:
-

 Summary: Review of ExprNodeFieldDesc Class
 Key: HIVE-22571
 URL: https://issues.apache.org/jira/browse/HIVE-22571
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22607) Session State Thread Names Are Way Too Long

2019-12-09 Thread David Mollitor (Jira)
David Mollitor created HIVE-22607:
-

 Summary: Session State Thread Names Are Way Too Long
 Key: HIVE-22607
 URL: https://issues.apache.org/jira/browse/HIVE-22607
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


{code:none}
2019-12-08T07:15:34,107 DEBUG [bc661ab9-b44a-4ee5-8ce6-b8b7ae5a0e07 
d480b94c-8773-4817-85fc-faa177308660 73ecdf15-06fd-4358-a249-22623c98d234 
37e3c8de-9c77-4939-94d0-5aa45501b545 7b152899-d8da-493f-a329-a16f171bb1a3 
604abf47-22ed-480c-a460-f6d44ac740ec 91fee8a8-aee4-4eee-993f-ad2927784d57 
c496e18f-a681-4db8-b254-6911793e7fb2 c6069d3c-03fb-4e1d-b0be-5dd13882b086 
3c360edd-9c0f-486b-99f9-dd9f8aac5fcd c2be6f69-3ef5-44e4-b8d5-16b835b0ae2f 
3bd2c6d2-9cce-4497-aa07-4672b95e6f76 ccb64f47-b1d7-477b-81ed-f8baabae4ee4 
8b244038-d7a3-4e11-ab2f-b42d117c8e40 8183bded-ef37-4bdc-ab17-a4c4b136401d 
8b161c72-0fc2-4175-ac9f-9a0c7bf9b387 38fd3a6d-498a-4145-b77e-24e33c31edaf 
70f2729d-7249-4397-b04f-22694fec391e c65a3e39-009d-4e8a-b16e-1ca0e9de7cef 
1dd2b274-75bd-4204-80b9-0103fefd0227 0bab6264-40ff-4f05-8d11-0cc9ee3132c0 
0e7e677b-3f24-408b-ac1e-ddeee54023b8 8fd57ca5-2128-45a1-b7c4-ec611bfdefcd 
01bf788e-cf98-44e8-abaa-c9e74e782bde 9d27f963-b45b-4991-baa8-2ed2db41857d 
207da47a-b3b2-4583-b654-61fc551f4eca a10d8b77-8f27-48ca-a3b5-e5cfcbf8a4c0 
352e77d5-3071-4502-bb0e-a6d0df8d1cac 18697755-6e2b-4907-ae8b-11ee0ff2f057 
d779ca40-0760-4b27-b0b1-c35d985359c1 ca4469aa-7be7-4eaf-9bcd-aeff6441c7c0 
30ce1107-8e95-4c16-b55a-5a7c4dea0696 ee44b107-07ae-45c9-88a4-0424da8a6bcb main] 
session.SessionState: Updating thread name to 
749f4123-716a-4534-af7f-b426fdc3ccc9 bc661ab9-b44a-4ee5-8ce6-b8b7ae5a0e07 
d480b94c-8773-4817-85fc-faa177308660 73ecdf15-06fd-4358-a249-22623c98d234 
37e3c8de-9c77-4939-94d0-5aa45501b545 7b152899-d8da-493f-a329-a16f171bb1a3 
604abf47-22ed-480c-a460-f6d44ac740ec 91fee8a8-aee4-4eee-993f-ad2927784d57 
c496e18f-a681-4db8-b254-6911793e7fb2 c6069d3c-03fb-4e1d-b0be-5dd13882b086 
3c360edd-9c0f-486b-99f9-dd9f8aac5fcd c2be6f69-3ef5-44e4-b8d5-16b835b0ae2f 
3bd2c6d2-9cce-4497-aa07-4672b95e6f76 ccb64f47-b1d7-477b-81ed-f8baabae4ee4 
8b244038-d7a3-4e11-ab2f-b42d117c8e40 8183bded-ef37-4bdc-ab17-a4c4b136401d 
8b161c72-0fc2-4175-ac9f-9a0c7bf9b387 38fd3a6d-498a-4145-b77e-24e33c31edaf 
70f2729d-7249-4397-b04f-22694fec391e c65a3e39-009d-4e8a-b16e-1ca0e9de7cef 
1dd2b274-75bd-4204-80b9-0103fefd0227 0bab6264-40ff-4f05-8d11-0cc9ee3132c0 
0e7e677b-3f24-408b-ac1e-ddeee54023b8 8fd57ca5-2128-45a1-b7c4-ec611bfdefcd 
01bf788e-cf98-44e8-abaa-c9e74e782bde 9d27f963-b45b-4991-baa8-2ed2db41857d 
207da47a-b3b2-4583-b654-61fc551f4eca a10d8b77-8f27-48ca-a3b5-e5cfcbf8a4c0 
352e77d5-3071-4502-bb0e-a6d0df8d1cac 18697755-6e2b-4907-ae8b-11ee0ff2f057 
d779ca40-0760-4b27-b0b1-c35d985359c1 ca4469aa-7be7-4eaf-9bcd-aeff6441c7c0 
30ce1107-8e95-4c16-b55a-5a7c4dea0696 ee44b107-07ae-45c9-88a4-0424da8a6bcb main
{code}





[jira] [Created] (HIVE-22605) NPE in LlapLoadGeneratorService During Tests

2019-12-09 Thread David Mollitor (Jira)
David Mollitor created HIVE-22605:
-

 Summary: NPE in LlapLoadGeneratorService During Tests
 Key: HIVE-22605
 URL: https://issues.apache.org/jira/browse/HIVE-22605
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


{code:none}
java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.llap.daemon.impl.LlapLoadGeneratorService.serviceStop(LlapLoadGeneratorService.java:103)
 ~[classes/:?]
at 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220) 
~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54) 
~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
 [hadoop-common-3.1.0.jar:?]
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) 
[hadoop-common-3.1.0.jar:?]
at 
org.apache.hadoop.hive.llap.daemon.impl.TestLlapLoadGeneratorService.testLoadGeneratorFails(TestLlapLoadGeneratorService.java:70)
 [test-classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_102]
{code}





[jira] [Created] (HIVE-22614) Replace Base64 in hive-jdbc Package

2019-12-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-22614:
-

 Summary: Replace Base64 in hive-jdbc Package
 Key: HIVE-22614
 URL: https://issues.apache.org/jira/browse/HIVE-22614
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22613) Replace Base64 in hive-hbase-handler

2019-12-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-22613:
-

 Summary: Replace Base64 in hive-hbase-handler
 Key: HIVE-22613
 URL: https://issues.apache.org/jira/browse/HIVE-22613
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22611) Use JDK Base64 Classes

2019-12-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-22611:
-

 Summary: Use JDK Base64 Classes
 Key: HIVE-22611
 URL: https://issues.apache.org/jira/browse/HIVE-22611
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


Replace the dependency on third-party libraries with the JDK's native support 
for Base64 encode/decode.

https://docs.oracle.com/javase/8/docs/api/java/util/Base64.html
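
A short sketch of what the replacement looks like with {{java.util.Base64}}. The class and method names here are illustrative only; the Hive call sites are not shown in this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class JdkBase64Example {
  private JdkBase64Example() {}

  /** Encodes text as Base64 using the JDK's basic (RFC 4648) encoder. */
  static String encode(String text) {
    return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
  }

  /** Decodes Base64 produced by the basic encoder back into text. */
  static String decode(String encoded) {
    return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String encoded = encode("hive");
    System.out.println(encoded);         // aGl2ZQ==
    System.out.println(decode(encoded)); // hive
  }
}
```

One thing to check during such a migration: the basic JDK decoder rejects characters outside the Base64 alphabet, including line breaks, whereas {{Base64.getMimeDecoder()}} ignores them. Call sites that receive line-wrapped input may need the MIME variant.
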





[jira] [Created] (HIVE-22612) Replace Base64 in accumulo-handler Package

2019-12-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-22612:
-

 Summary: Replace Base64 in accumulo-handler Package
 Key: HIVE-22612
 URL: https://issues.apache.org/jira/browse/HIVE-22612
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22615) Replace Base64 in hive-common Package

2019-12-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-22615:
-

 Summary: Replace Base64 in hive-common Package
 Key: HIVE-22615
 URL: https://issues.apache.org/jira/browse/HIVE-22615
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22597) Include More Context in Database NoSuchObjectException

2019-12-07 Thread David Mollitor (Jira)
David Mollitor created HIVE-22597:
-

 Summary: Include More Context in Database NoSuchObjectException
 Key: HIVE-22597
 URL: https://issues.apache.org/jira/browse/HIVE-22597
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22597.1.patch

{code:none}
org.apache.hadoop.hive.metastore.api.NoSuchObjectException: default
at 
org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:717) 
~[hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at sun.reflect.GeneratedMethodAccessor260.invoke(Unknown Source) ~[?:?]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_102]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
~[hive-standalone-metastore-server-4.0.0-SN
{code}

The stack trace shows that this exception concerns a database, but the error 
message itself should say so.  The message also omits the catalog name, so it 
can be ambiguous which database was not found.
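
A sketch of the kind of message this asks for. {{NoSuchObjectExampleException}} is a stand-in defined here so the example compiles by itself; it is not the metastore's Thrift-generated class, and the message format is an assumption.

```java
// Stand-in for the metastore exception so the sketch compiles on its own.
final class NoSuchObjectExampleException extends Exception {
  NoSuchObjectExampleException(String message) {
    super(message);
  }
}

final class DatabaseLookupMessages {
  private DatabaseLookupMessages() {}

  /** Names the object type, the catalog, and the database in the message. */
  static String databaseNotFound(String catalogName, String databaseName) {
    return "Database not found: catalog=" + catalogName + ", database=" + databaseName;
  }

  public static void main(String[] args) {
    Exception e = new NoSuchObjectExampleException(databaseNotFound("hive", "default"));
    // Compare with the original message, which was just "default".
    System.out.println(e.getMessage());
  }
}
```
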





[jira] [Created] (HIVE-22415) Upgrade to Java 11

2019-10-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22415:
-

 Summary: Upgrade to Java 11
 Key: HIVE-22415
 URL: https://issues.apache.org/jira/browse/HIVE-22415
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor








[jira] [Created] (HIVE-22417) Remove stringifyException from MetaStore

2019-10-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22417:
-

 Summary: Remove stringifyException from MetaStore
 Key: HIVE-22417
 URL: https://issues.apache.org/jira/browse/HIVE-22417
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore, Standalone Metastore
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22370) Remove Deprecated Fields from HiveConf

2019-10-18 Thread David Mollitor (Jira)
David Mollitor created HIVE-22370:
-

 Summary: Remove Deprecated Fields from HiveConf
 Key: HIVE-22370
 URL: https://issues.apache.org/jira/browse/HIVE-22370
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: David Mollitor
Assignee: David Mollitor
 Fix For: 4.0.0








[jira] [Created] (HIVE-22427) PersistenceManagerProvider Logs a Warning About datanucleus.autoStartMechanismMode

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22427:
-

 Summary: PersistenceManagerProvider Logs a Warning About 
datanucleus.autoStartMechanismMode
 Key: HIVE-22427
 URL: https://issues.apache.org/jira/browse/HIVE-22427
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


{code:none}
WARN [pool-6-thread-2] metastore.PersistenceManagerProvider: 
datanucleus.autoStartMechanismMode is set to unsupported value null . Setting 
it to value: ignored
{code}

This scenario does not warrant WARN-level logging.  Perhaps emit a warning only 
if the user configures the value to some non-null value; otherwise, simply emit 
an INFO-level message stating that the configuration is not set and that a 
reasonable default value will be used.
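
The proposed rule can be sketched as a small decision function. This is an assumption-laden illustration: it uses {{java.util.logging.Level}} as a stand-in for the logging framework's levels, the class name is hypothetical, and it treats {{ignored}} as the only supported value because that is what the log line above falls back to.

```java
import java.util.Optional;
import java.util.logging.Level;

final class AutoStartModeLogging {
  // Assumed from the log output above: the value the code falls back to.
  static final String SUPPORTED = "ignored";

  private AutoStartModeLogging() {}

  /** Picks the level for reporting the configured value; empty means nothing to report. */
  static Optional<Level> levelFor(String configured) {
    if (configured == null) {
      return Optional.of(Level.INFO);      // not set: a default is applied, nothing is wrong
    }
    if (SUPPORTED.equalsIgnoreCase(configured)) {
      return Optional.empty();             // already the supported value
    }
    return Optional.of(Level.WARNING);     // explicitly set to something unsupported
  }

  public static void main(String[] args) {
    System.out.println(levelFor(null));
    System.out.println(levelFor("ignored"));
    System.out.println(levelFor("Checked"));
  }
}
```
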





[jira] [Created] (HIVE-22426) Use DependencyManagement in Root POM for itests

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22426:
-

 Summary: Use DependencyManagement in Root POM for itests
 Key: HIVE-22426
 URL: https://issues.apache.org/jira/browse/HIVE-22426
 Project: Hive
  Issue Type: Improvement
  Components: Test, Tests
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22428) Superfluous "Failed to get database" WARN Logging in ObjectStore

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22428:
-

 Summary: Superfluous "Failed to get database" WARN Logging in 
ObjectStore
 Key: HIVE-22428
 URL: https://issues.apache.org/jira/browse/HIVE-22428
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22428.1.patch

In my testing, I get lots of logs like this:

{code:none}
Line 26319: 2019-10-28T21:09:52,134  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.compdb, returning 
NoSuchObjectException
Line 26327: 2019-10-28T21:09:52,135  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.compdb, returning 
NoSuchObjectException
Line 26504: 2019-10-28T21:09:52,600  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.tstatsfast, returning 
NoSuchObjectException
Line 26519: 2019-10-28T21:09:52,606  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.tstatsfast, returning 
NoSuchObjectException
Line 26695: 2019-10-28T21:09:52,922  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.createDb, returning 
NoSuchObjectException
Line 26703: 2019-10-28T21:09:52,923  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.createDb, returning 
NoSuchObjectException
Line 26763: 2019-10-28T21:09:52,936  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.compdb, returning 
NoSuchObjectException
Line 26778: 2019-10-28T21:09:52,939  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.compdb, returning 
NoSuchObjectException
Line 26963: 2019-10-28T21:09:53,273  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.db1, returning 
NoSuchObjectException
Line 26978: 2019-10-28T21:09:53,276  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.db2, returning 
NoSuchObjectException
Line 26986: 2019-10-28T21:09:53,277  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.db1, returning 
NoSuchObjectException
Line 27018: 2019-10-28T21:09:53,300  WARN [pool-6-thread-5] 
metastore.ObjectStore: Failed to get database hive.db2, returning 
NoSuchObjectException
{code}

This is a superfluous log message.  It might be pretty common for a database to 
not exist if, for example, a user fat-fingers the name of the database.  The 
code also has the bad habit of log-and-throw.  Just log or throw, not both.

Since I'm looking at this class, touch up some of the other logging as well.
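
A sketch of the throw-only alternative to log-and-throw. Everything here is hypothetical: the map stands in for the backing store, and {{NoSuchDatabaseException}} stands in for the metastore's exception type.

```java
import java.util.Map;

final class ThrowOnlyLookup {
  /** Hypothetical stand-in for the metastore's NoSuchObjectException. */
  static final class NoSuchDatabaseException extends Exception {
    NoSuchDatabaseException(String message) {
      super(message);
    }
  }

  private ThrowOnlyLookup() {}

  /** Returns the stored location of a database, or throws if it does not exist. */
  static String getDatabase(Map<String, String> store, String catalog, String name)
      throws NoSuchDatabaseException {
    String key = catalog + "." + name;
    String location = store.get(key);
    if (location == null) {
      // No WARN here: the exception carries the context, and the caller
      // decides whether a missing database is worth logging at all.
      throw new NoSuchDatabaseException("No such database: " + key);
    }
    return location;
  }
}
```

With this shape, a miss produces exactly one record of the problem, at whichever layer finally handles the exception.
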





[jira] [Created] (HIVE-22421) Improve Logging If Configuration File Not Found

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22421:
-

 Summary: Improve Logging If Configuration File Not Found
 Key: HIVE-22421
 URL: https://issues.apache.org/jira/browse/HIVE-22421
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


{code:none}
2019-10-28T21:07:27,599  INFO [main] conf.MetastoreConf: Unable to find config 
file metastore-site.xml
2019-10-28T21:07:27,599  INFO [main] conf.MetastoreConf: Found configuration 
file null
{code}

Prints 'unable to find' followed by 'null'.  Just print one or the other.





[jira] [Created] (HIVE-22423) Improve Logging In HadoopThriftAuthBridge

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22423:
-

 Summary: Improve Logging In HadoopThriftAuthBridge
 Key: HIVE-22423
 URL: https://issues.apache.org/jira/browse/HIVE-22423
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


# Remove superfluous debug log guards
# Improve messages
# Improve message format





[jira] [Created] (HIVE-22419) Improve Messages Emitted From HiveMetaStoreClient

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22419:
-

 Summary: Improve Messages Emitted From HiveMetaStoreClient
 Key: HIVE-22419
 URL: https://issues.apache.org/jira/browse/HIVE-22419
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Affects Versions: 4.0.0, 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor


After reviewing some logs and errors emitted during a QTest run, I would like 
to propose some improvements to logging in {{HiveMetaStoreClient}}. 

* Remove duplicate logging
* Remove superfluous class {{StackTraceLogger}}
* Do not use contractions in public-facing error messages and logs
* Make all logging side-effect free (see {{connCount}})
* Code simplification





[jira] [Created] (HIVE-22424) User PerfLogger in MetastoreDirectSqlUtils.java

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22424:
-

 Summary: User PerfLogger in MetastoreDirectSqlUtils.java
 Key: HIVE-22424
 URL: https://issues.apache.org/jira/browse/HIVE-22424
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Affects Versions: 3.2.0
Reporter: David Mollitor
 Fix For: 4.0.0








[jira] [Created] (HIVE-22425) ReplChangeManager Not Logging Database Name

2019-10-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22425:
-

 Summary: ReplChangeManager Not Logging Database Name
 Key: HIVE-22425
 URL: https://issues.apache.org/jira/browse/HIVE-22425
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22425.1.patch

{code:java|title=ReplChangeManager.java}
LOG.debug("Repl policy is not set for database ", db.getName());
{code}

The log statement is missing the placeholder '{}' so the DB name is not getting 
logged.
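
The effect can be reproduced without a logging library. The {{format}} method below is a simplified stand-in for placeholder-style message formatting, written for this illustration only; it is not SLF4J's implementation. It shows that an argument with no matching {{{}}} is silently dropped.

```java
final class PlaceholderDemo {
  private PlaceholderDemo() {}

  /** Simplified stand-in: each "{}" in the pattern consumes one argument. */
  static String format(String pattern, Object... args) {
    StringBuilder out = new StringBuilder();
    int from = 0;
    int argIndex = 0;
    while (argIndex < args.length) {
      int at = pattern.indexOf("{}", from);
      if (at < 0) {
        break; // no placeholder left: remaining arguments are dropped
      }
      out.append(pattern, from, at).append(args[argIndex++]);
      from = at + 2;
    }
    return out.append(pattern.substring(from)).toString();
  }

  public static void main(String[] args) {
    // As reported: the database name never appears.
    System.out.println(format("Repl policy is not set for database ", "db1"));
    // With the placeholder added: the name is included.
    System.out.println(format("Repl policy is not set for database {}", "db1"));
  }
}
```
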





[jira] [Created] (HIVE-22390) Remove Dependency on JODA Time Library

2019-10-22 Thread David Mollitor (Jira)
David Mollitor created HIVE-22390:
-

 Summary: Remove Dependency on JODA Time Library
 Key: HIVE-22390
 URL: https://issues.apache.org/jira/browse/HIVE-22390
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


Hive uses the Joda-Time library.

{quote}
Joda-Time is the de facto standard date and time library for Java prior to Java 
SE 8. Users are now asked to migrate to java.time (JSR-310).

https://www.joda.org/joda-time/
{quote}

Remove this dependency from classes, POM files, and licence files.
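
As a rough illustration of what a migrated call site might look like, here is a date parse-and-add written with {{java.time}}. The class and method are hypothetical; the Joda-Time equivalent is given only in a comment and is not taken from Hive's code.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class JavaTimeMigration {
  private static final DateTimeFormatter FORMAT = DateTimeFormatter.ISO_LOCAL_DATE;

  private JavaTimeMigration() {}

  /**
   * Parses an ISO date, adds days, and formats the result.
   * A Joda-Time version would use org.joda.time.LocalDate in the same way.
   */
  static String plusDays(String text, int days) {
    return LocalDate.parse(text, FORMAT).plusDays(days).format(FORMAT);
  }

  public static void main(String[] args) {
    System.out.println(plusDays("2019-10-31", 1)); // 2019-11-01
  }
}
```

Both libraries use immutable value types, so many call sites translate almost mechanically; formatter pattern letters and time zone handling are the places that usually need a closer look.
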





[jira] [Created] (HIVE-22402) Deprecate Hive PerfLogger

2019-10-25 Thread David Mollitor (Jira)
David Mollitor created HIVE-22402:
-

 Summary: Deprecate Hive PerfLogger
 Key: HIVE-22402
 URL: https://issues.apache.org/jira/browse/HIVE-22402
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22404) Upgrade to Java 9

2019-10-25 Thread David Mollitor (Jira)
David Mollitor created HIVE-22404:
-

 Summary: Upgrade to Java 9
 Key: HIVE-22404
 URL: https://issues.apache.org/jira/browse/HIVE-22404
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor








[jira] [Created] (HIVE-22403) Beeline Should Print Location of Configuration Directory at Startup

2019-10-25 Thread David Mollitor (Jira)
David Mollitor created HIVE-22403:
-

 Summary: Beeline Should Print Location of Configuration Directory 
at Startup
 Key: HIVE-22403
 URL: https://issues.apache.org/jira/browse/HIVE-22403
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Affects Versions: 2.4.0, 3.2.0
Reporter: David Mollitor


Beeline should print the CONF directory it is utilizing when it starts up.





[jira] [Created] (HIVE-22547) Review txn compactor Package

2019-11-26 Thread David Mollitor (Jira)
David Mollitor created HIVE-22547:
-

 Summary: Review txn compactor Package
 Key: HIVE-22547
 URL: https://issues.apache.org/jira/browse/HIVE-22547
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


* Remove log-and-throw anti-pattern
* Use parameterized logging
* Add a CompactionException class to improve debug-ability
* Introduce Java Optional utility
* Other clean up
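
Two of the items above, a dedicated exception class and {{Optional}}, can be sketched together. All names and signatures below are hypothetical; the issue does not describe the actual shape of {{CompactionException}} or where {{Optional}} is introduced.

```java
import java.util.Optional;

final class CompactionSketch {
  /** Hypothetical dedicated exception type that carries compaction context. */
  static final class CompactionException extends Exception {
    CompactionException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  private CompactionSketch() {}

  /** Describes the target; the partition is optional because a table may be unpartitioned. */
  static String describe(String table, String partitionName) {
    Optional<String> partition = Optional.ofNullable(partitionName);
    return partition.map(p -> table + "/" + p).orElse(table);
  }

  // Placeholder for the real work; fails on an empty table name.
  private static void doCompact(String table) {
    if (table.isEmpty()) {
      throw new IllegalArgumentException("empty table name");
    }
  }

  static void compact(String table, String partitionName) throws CompactionException {
    try {
      doCompact(table);
    } catch (RuntimeException e) {
      // Wrap with context and rethrow; no logging here, so the failure is reported once.
      throw new CompactionException("Compaction failed for " + describe(table, partitionName), e);
    }
  }
}
```
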





[jira] [Created] (HIVE-22337) Improve and Expand Text-Based SerDes

2019-10-14 Thread David Mollitor (Jira)
David Mollitor created HIVE-22337:
-

 Summary: Improve and Expand Text-Based SerDes
 Key: HIVE-22337
 URL: https://issues.apache.org/jira/browse/HIVE-22337
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 4.0.0
Reporter: David Mollitor
Assignee: David Mollitor
 Fix For: 4.0.0








[jira] [Created] (HIVE-22993) Include Bloom Filter in Column Statistics to Better Estimate nDV

2020-03-06 Thread David Mollitor (Jira)
David Mollitor created HIVE-22993:
-

 Summary: Include Bloom Filter in Column Statistics to Better 
Estimate nDV
 Key: HIVE-22993
 URL: https://issues.apache.org/jira/browse/HIVE-22993
 Project: Hive
  Issue Type: Improvement
  Components: CBO, Statistics
Reporter: David Mollitor


When performing an INSERT statement, Hive has no way to determine the number of 
distinct values since the distinct values themselves are not recorded.

{code:sql}
create table test_mm(`id` int, `my_dt` date);

insert into test_mm values (1, "2018-10-01"), (2, "2018-10-01"), (3, 
"2018-10-01"),
(4, "2017-10-01"), (5, "2017-10-01"), (6, "2017-10-01"),
(7, "2010-10-01"), (8, "2010-10-01"), (9, "2010-10-01"),
(10, "1998-10-01"), (11, "1998-10-01"), (12, "1998-10-01");

DESCRIBE FORMATTED test_mm my_dt;
-- distinct_count: 4

insert into test_mm values (13, "2030-10-01"), (14, "2030-10-01"), (15, 
"2030-10-01");

DESCRIBE FORMATTED test_mm my_dt;
-- distinct_count: 4
{code}

The first INSERT statement sees that there are 0 records, so it makes sense 
that any distinct values are recorded in the statistics.  However, for the 
second INSERT, Hive has no idea if "2030-10-01" is distinct, so the 
distinct_count is unchanged.  By introducing a bloom filter for column 
statistics, the second INSERT may be able to determine that "2030-10-01" is 
indeed a new value and update the distinct_count accordingly.
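
A toy sketch of the idea, not a proposal for Hive's statistics format. The class name, bit-array size, and hash scheme are all assumptions chosen for brevity. A Bloom filter never reports a previously seen value as new, but it can occasionally mistake a new value for a seen one, so the count can only undershoot.

```java
import java.util.BitSet;

final class BloomNdvSketch {
  private static final int BITS = 1 << 16; // filter size, arbitrary for this sketch
  private static final int HASHES = 3;     // bit positions set per value

  private final BitSet bits = new BitSet(BITS);
  private long distinctCount;

  /** Double hashing: derive the i-th bit position from two base hashes. */
  private static int index(String value, int i) {
    int h1 = value.hashCode();
    int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1; // odd step
    return Math.floorMod(h1 + i * h2, BITS);
  }

  /** False means definitely not seen; true means seen or a false positive. */
  boolean mightContain(String value) {
    for (int i = 0; i < HASHES; i++) {
      if (!bits.get(index(value, i))) {
        return false;
      }
    }
    return true;
  }

  /** Records a value and returns true if it was counted as new. */
  boolean observe(String value) {
    if (mightContain(value)) {
      return false;
    }
    for (int i = 0; i < HASHES; i++) {
      bits.set(index(value, i));
    }
    distinctCount++;
    return true;
  }

  long distinctCount() {
    return distinctCount;
  }
}
```

In the scenario above, the second INSERT would call {{observe("2030-10-01")}} against the filter stored with the column statistics and increment distinct_count only when the filter reports the value as unseen.
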





[jira] [Created] (HIVE-22914) Make Hive Connection ZK Interactions Easier to Troubleshoot

2020-02-20 Thread David Mollitor (Jira)
David Mollitor created HIVE-22914:
-

 Summary: Make Hive Connection ZK Interactions Easier to 
Troubleshoot
 Key: HIVE-22914
 URL: https://issues.apache.org/jira/browse/HIVE-22914
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.1.2, 4.0.0
Reporter: David Mollitor
Assignee: David Mollitor


Add better logging and make errors more consistent and meaningful.

I was recently troubleshooting an issue where the ZK namespaces of the client 
and the HS2 were different, and it was far too difficult to diagnose.





[jira] [Created] (HIVE-22884) Put Reasons for Failed CBO In Explain Plan

2020-02-12 Thread David Mollitor (Jira)
David Mollitor created HIVE-22884:
-

 Summary: Put Reasons for Failed CBO In Explain Plan
 Key: HIVE-22884
 URL: https://issues.apache.org/jira/browse/HIVE-22884
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: David Mollitor


If a query cannot be processed by CBO, the reason for the failure is logged 
into the HiveServer2 application log.  In addition, please also put this 
information into the EXPLAIN plan output itself.





[jira] [Created] (HIVE-22679) Replace Base64 in metastore-common Package

2019-12-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22679:
-

 Summary: Replace Base64 in metastore-common Package
 Key: HIVE-22679
 URL: https://issues.apache.org/jira/browse/HIVE-22679
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor








[jira] [Created] (HIVE-22676) Replace Base64 in hive-service Package

2019-12-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22676:
-

 Summary: Replace Base64 in hive-service Package
 Key: HIVE-22676
 URL: https://issues.apache.org/jira/browse/HIVE-22676
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22680) Replace Base64 in druid-handler Package

2019-12-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22680:
-

 Summary: Replace Base64 in druid-handler Package
 Key: HIVE-22680
 URL: https://issues.apache.org/jira/browse/HIVE-22680
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22681) Replace Base64 in hcatalog-webhcat Package

2019-12-29 Thread David Mollitor (Jira)
David Mollitor created HIVE-22681:
-

 Summary: Replace Base64 in hcatalog-webhcat Package
 Key: HIVE-22681
 URL: https://issues.apache.org/jira/browse/HIVE-22681
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22678) Run Eclipse Cleanup Against hive-accumulo-handler Module

2019-12-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22678:
-

 Summary: Run Eclipse Cleanup Against hive-accumulo-handler Module
 Key: HIVE-22678
 URL: https://issues.apache.org/jira/browse/HIVE-22678
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22677) Run Eclipse Cleanup Against Hive Project

2019-12-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22677:
-

 Summary: Run Eclipse Cleanup Against Hive Project
 Key: HIVE-22677
 URL: https://issues.apache.org/jira/browse/HIVE-22677
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22683) Run Eclipse Cleanup Against beeline Module

2019-12-30 Thread David Mollitor (Jira)
David Mollitor created HIVE-22683:
-

 Summary: Run Eclipse Cleanup Against beeline Module
 Key: HIVE-22683
 URL: https://issues.apache.org/jira/browse/HIVE-22683
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22683.1.patch







[jira] [Created] (HIVE-22684) Run Eclipse Cleanup Against hbase-handler Module

2019-12-30 Thread David Mollitor (Jira)
David Mollitor created HIVE-22684:
-

 Summary: Run Eclipse Cleanup Against hbase-handler Module
 Key: HIVE-22684
 URL: https://issues.apache.org/jira/browse/HIVE-22684
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22685) TestHiveSqlDateTimeFormatter Now Broken with New Year 2020

2019-12-30 Thread David Mollitor (Jira)
David Mollitor created HIVE-22685:
-

 Summary: TestHiveSqlDateTimeFormatter Now Broken with New Year 2020
 Key: HIVE-22685
 URL: https://issues.apache.org/jira/browse/HIVE-22685
 Project: Hive
  Issue Type: Bug
Reporter: David Mollitor


The unit test is now broken:

{code:java}
//Tests for these patterns would need changing every decade if done in the 
above way.
//Thursday of the first week in an ISO year always matches the Gregorian 
year.
checkParseTimestampIso("IY-IW-ID", "0-01-04", "iw, ", "01, " + 
thisYearString.substring(0, 3) + "0");
checkParseTimestampIso("I-IW-ID", "0-01-04", "iw, ", "01, " + 
thisYearString.substring(0, 3) + "0");
{code}

{code}
org.junit.ComparisonFailure: expected:<01, 20[1]0> but was:<01, 20[2]0>
at org.junit.Assert.assertEquals(Assert.java:115)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.hadoop.hive.common.format.datetime.TestHiveSqlDateTimeFormatter.checkParseTimestampIso(TestHiveSqlDateTimeFormatter.java:313)
at 
org.apache.hadoop.hive.common.format.datetime.TestHiveSqlDateTimeFormatter.testParseTimestamp(TestHiveSqlDateTimeFormatter.java:287)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
{code}
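
One plausible explanation for the mismatch, offered as an assumption and not confirmed by this report: the issue was filed on 2019-12-30, a date whose Gregorian year is 2019 but whose ISO week-based year is already 2020. A test that derives the decade from the Gregorian year would then expect 2010 while ISO-based parsing yields 2020. The sketch below only demonstrates that date arithmetic with {{java.time}}; the class and method names are illustrative.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

final class IsoWeekYearDemo {
  private IsoWeekYearDemo() {}

  static int gregorianYear(LocalDate date) {
    return date.getYear();
  }

  static int isoWeekYear(LocalDate date) {
    return date.get(IsoFields.WEEK_BASED_YEAR);
  }

  /** Same string manipulation as in the quoted test: first three digits plus "0". */
  static String decadeStart(int year) {
    return String.valueOf(year).substring(0, 3) + "0";
  }

  public static void main(String[] args) {
    LocalDate reported = LocalDate.of(2019, 12, 30);
    System.out.println(decadeStart(gregorianYear(reported))); // 2010
    System.out.println(decadeStart(isoWeekYear(reported)));   // 2020
  }
}
```
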





[jira] [Created] (HIVE-22673) Replace Base64 in contrib Package

2019-12-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22673:
-

 Summary: Replace Base64 in contrib Package
 Key: HIVE-22673
 URL: https://issues.apache.org/jira/browse/HIVE-22673
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-22673.1.patch







[jira] [Created] (HIVE-22674) Replace Base64 in serde Package

2019-12-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22674:
-

 Summary: Replace Base64 in serde Package
 Key: HIVE-22674
 URL: https://issues.apache.org/jira/browse/HIVE-22674
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-22675) Replace Base64 in hive-standalone-metastore Package

2019-12-28 Thread David Mollitor (Jira)
David Mollitor created HIVE-22675:
-

 Summary: Replace Base64 in hive-standalone-metastore Package
 Key: HIVE-22675
 URL: https://issues.apache.org/jira/browse/HIVE-22675
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-23012) Beeline Has Too Many Dependencies

2020-03-11 Thread David Mollitor (Jira)
David Mollitor created HIVE-23012:
-

 Summary: Beeline Has Too Many Dependencies
 Key: HIVE-23012
 URL: https://issues.apache.org/jira/browse/HIVE-23012
 Project: Hive
  Issue Type: Improvement
 Environment: * jetty-server
* ORC client libraries
* HBase client libraries
* Avro
* Something called 'twill'

Please investigate cleaning up the POM file and cutting down the number of 
dependencies.
Reporter: David Mollitor








[jira] [Created] (HIVE-23017) Remove Superfluous 'Transient' Keyword From FetchTask

2020-03-12 Thread David Mollitor (Jira)
David Mollitor created HIVE-23017:
-

 Summary: Remove Superfluous 'Transient' Keyword From FetchTask
 Key: HIVE-23017
 URL: https://issues.apache.org/jira/browse/HIVE-23017
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


{code:java|title=FetchTask}
public class FetchTask extends Task implements Serializable {
  private static final long serialVersionUID = 1L;
  private int maxRows = 100;
  private FetchOperator fetch;
  private ListSinkOperator sink;
  private int totalRows;
  private static transient final Logger LOG = 
LoggerFactory.getLogger(FetchTask.class);
  JobConf job = null;
{code}

There is no need for this {{Logger}} to be transient: the field is {{static}}, 
and static fields are never serialized, so the keyword has no effect.  Please 
remove it as it is useless overhead.
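
To illustrate the point, here is a minimal sketch. The class names are hypothetical stand-ins for {{FetchTask}}, and a plain {{Object}} stands in for the logger. It asks the JDK which fields default serialization would write; the static field is absent with or without {{transient}}.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo, not Hive code: the two nested classes differ only in the
// 'transient' modifier on a static field.
final class StaticTransientDemo {

  static final class WithTransient implements Serializable {
    private static final long serialVersionUID = 1L;
    private static transient final Object LOG = new Object();
    private int maxRows = 100;
  }

  static final class WithoutTransient implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final Object LOG = new Object();
    private int maxRows = 100;
  }

  // Names of the fields that default Java serialization would write for cls.
  static List<String> serializedFieldNames(Class<?> cls) {
    List<String> names = new ArrayList<>();
    for (ObjectStreamField field : ObjectStreamClass.lookup(cls).getFields()) {
      names.add(field.getName());
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(serializedFieldNames(WithTransient.class));
    System.out.println(serializedFieldNames(WithoutTransient.class));
  }
}
```

Both calls should list only {{maxRows}}, which is why dropping the keyword changes nothing about what gets serialized.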





[jira] [Created] (HIVE-23016) Extract JdbcConnectionParams from Utils Class

2020-03-12 Thread David Mollitor (Jira)
David Mollitor created HIVE-23016:
-

 Summary: Extract JdbcConnectionParams from Utils Class
 Key: HIVE-23016
 URL: https://issues.apache.org/jira/browse/HIVE-23016
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


Extract {{JdbcConnectionParams}} from the {{Utils}} class and make it its own 
class.





[jira] [Created] (HIVE-23005) Consider Default JDBC Fetch Size From HS2

2020-03-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-23005:
-

 Summary: Consider Default JDBC Fetch Size From HS2
 Key: HIVE-23005
 URL: https://issues.apache.org/jira/browse/HIVE-23005
 Project: Hive
  Issue Type: Sub-task
  Components: JDBC
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-23007) Server Should Return Default Fetch Size If One Is Not Sent By Client

2020-03-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-23007:
-

 Summary: Server Should Return Default Fetch Size If One Is Not 
Sent By Client
 Key: HIVE-23007
 URL: https://issues.apache.org/jira/browse/HIVE-23007
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor
 Attachments: HIVE-23007.1.patch







[jira] [Created] (HIVE-23171) Create Tool To Visualize Hive Parser Tree

2020-04-09 Thread David Mollitor (Jira)
David Mollitor created HIVE-23171:
-

 Summary: Create Tool To Visualize Hive Parser Tree
 Key: HIVE-23171
 URL: https://issues.apache.org/jira/browse/HIVE-23171
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-23172) Quoted Backtick Columns Are Not Parsing Correctly

2020-04-09 Thread David Mollitor (Jira)
David Mollitor created HIVE-23172:
-

 Summary: Quoted Backtick Columns Are Not Parsing Correctly
 Key: HIVE-23172
 URL: https://issues.apache.org/jira/browse/HIVE-23172
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


I recently came across a weird behavior while examining failures of 
{{special_character_in_tabnames_2.q}} while working on HIVE-23150. I was 
surprised to see it fail because I couldn't see any reason why it should... 
it's doing pretty standard SQL statements just like every other test, but for 
some reason this test is just a *little bit* different from most others and 
it brought this issue to light.

It turns out the parsing of table names is pretty much wrong across the board.

The statement that caught my attention was this:
{code:sql}
DROP TABLE IF EXISTS `s/c`;
{code}
And here is the relevant grammar:
{code:none}
fragment
RegexComponent
: 'a'..'z' | 'A'..'Z' | '0'..'9' | '_'
| PLUS | STAR | QUESTION | MINUS | DOT
| LPAREN | RPAREN | LSQUARE | RSQUARE | LCURLY | RCURLY
| BITWISEXOR | BITWISEOR | DOLLAR | '!'
;

Identifier
:
(Letter | Digit) (Letter | Digit | '_')*
| {allowQuotedId()}? QuotedIdentifier
  /* though at the language level we allow all Identifiers to be QuotedIdentifiers;
     at the API level only columns are allowed to be of this form */
| '`' RegexComponent+ '`'
;

fragment
QuotedIdentifier 
:
'`'  ( '``' | ~('`') )* '`'
{ setText(StringUtils.replace(getText().substring(1, getText().length() - 1), "``", "`")); }
;
{code}
The mystery for me was that, for some reason, the string {{`s/c`}} was being 
stripped of its back ticks. Every other test I investigated did not have this 
behavior; the back ticks were always preserved around the table name. The main 
Hive Java code base would see the back ticks and deal with them internally. For 
HIVE-23150, I introduced some sanity checks and they were failing because they 
were expecting the back ticks to be present.

With the help of HIVE-23171 I finally figured it out. What I discovered is 
that pretty much every table name is hitting the {{RegexComponent}} rule and 
the back ticks are carried forward. However, in {{`s/c`}} the forward slash 
{{/}} is not allowed by {{RegexComponent}}, so the name matches the 
{{QuotedIdentifier}} rule instead, which trims the back ticks.

I validated this by disabling {{QuotedIdentifier}}. When I did this, {{`s/c`}} 
fails with an error but {{`sc`}} parses successfully... because {{`sc`}} is 
being matched by the {{RegexComponent}} alternative.

So, if you have {{allowQuotedId}} disabled, table names can only use the 
characters defined in the {{RegexComponent}} rule (anything else is an error), 
and the back ticks are *not* stripped. If you have {{allowQuotedId}} enabled 
and the table name contains a character not specified in {{RegexComponent}}, 
it is matched by {{QuotedIdentifier}} and the back ticks *are* stripped. If 
all the characters are part of {{RegexComponent}}, the back ticks are *not* 
stripped.
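
The behavior described above can be modeled in a few lines of Java. This is only a sketch of what I observed, not the ANTLR lexer itself: the class and method names are made up, and I am assuming the token names in {{RegexComponent}} ({{PLUS}}, {{STAR}}, {{BITWISEXOR}}, etc.) map to the usual characters.

```java
import java.util.regex.Pattern;

// Hypothetical model of the observed lexer behavior; not Hive code.
final class BacktickLexSketch {

  // '`' RegexComponent+ '`' : only the characters listed in the grammar fragment.
  private static final Pattern REGEX_COMPONENT_ID =
      Pattern.compile("`[a-zA-Z0-9_+*?\\-.()\\[\\]{}^|$!]+`");

  // '`' ( '``' | ~('`') )* '`' : anything between back ticks, '``' is an escape.
  private static final Pattern QUOTED_ID = Pattern.compile("`(``|[^`])*`");

  // Returns the token text the parser would hand to the Java code base.
  static String tokenText(String input, boolean allowQuotedId) {
    if (REGEX_COMPONENT_ID.matcher(input).matches()) {
      return input; // back ticks are carried forward
    }
    if (allowQuotedId && QUOTED_ID.matcher(input).matches()) {
      // Mirrors the setText(...) action: trim the back ticks, unescape '``'.
      return input.substring(1, input.length() - 1).replace("``", "`");
    }
    throw new IllegalArgumentException("cannot lex identifier: " + input);
  }

  public static void main(String[] args) {
    System.out.println(tokenText("`sc`", true));  // back ticks kept
    System.out.println(tokenText("`s/c`", true)); // back ticks stripped
  }
}
```

The same kind of name comes back in two different shapes depending on which characters it contains, which is exactly what tripped the sanity checks.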





[jira] [Created] (HIVE-23159) Cleanup ShowCreateTableOperation

2020-04-08 Thread David Mollitor (Jira)
David Mollitor created HIVE-23159:
-

 Summary: Cleanup ShowCreateTableOperation
 Key: HIVE-23159
 URL: https://issues.apache.org/jira/browse/HIVE-23159
 Project: Hive
  Issue Type: Bug
Reporter: David Mollitor
Assignee: David Mollitor


* Move StringTemplate templates to external files
* Explore better leveraging StringTemplate capabilities to remove duplicate 
functionality in the class
* General clean up and formatting





[jira] [Created] (HIVE-23174) Remove TOK_TRUNCATETABLE

2020-04-09 Thread David Mollitor (Jira)
David Mollitor created HIVE-23174:
-

 Summary: Remove TOK_TRUNCATETABLE
 Key: HIVE-23174
 URL: https://issues.apache.org/jira/browse/HIVE-23174
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-23177) Upgrade to ANTLR4

2020-04-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-23177:
-

 Summary: Upgrade to ANTLR4
 Key: HIVE-23177
 URL: https://issues.apache.org/jira/browse/HIVE-23177
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


Upgrade Hive to ANTLR4; ANTLR3 lost support many moons ago.

This is going to be a big lift.  Many of the Hive rules use the "rule rewrite" 
feature, which no longer exists in ANTLR4, so it must be completely 
re-implemented:

https://stackoverflow.com/questions/14565794/antlr-4-tree-inject-rewrite-operator





[jira] [Created] (HIVE-23176) Remove REGEX Column Feature

2020-04-10 Thread David Mollitor (Jira)
David Mollitor created HIVE-23176:
-

 Summary: Remove REGEX Column Feature
 Key: HIVE-23176
 URL: https://issues.apache.org/jira/browse/HIVE-23176
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


Remove the Hive feature: REGEX Column.

 

Hive has this interesting feature for using a REGEX to SELECT multiple columns.  
This needs to go.  It is not standard SQL and, as currently implemented, it is 
impossible to determine if a column identifier is a REGEX or the actual name of 
the column.  If a column name is enclosed in back ticks, then any UTF-8 
character is valid in the name.
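
A small sketch of the ambiguity, using plain {{java.util.regex}} rather than Hive's own matching (which may differ in details such as case handling). The class, the method names and the column list are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; not Hive code.
final class RegexColumnAmbiguity {

  // Columns selected when the identifier is interpreted as a regular expression.
  static List<String> selectAsRegex(String identifier, List<String> columns) {
    Pattern pattern = Pattern.compile(identifier);
    List<String> selected = new ArrayList<>();
    for (String column : columns) {
      if (pattern.matcher(column).matches()) {
        selected.add(column);
      }
    }
    return selected;
  }

  // Columns selected when the identifier is interpreted as a literal column name.
  static List<String> selectAsName(String identifier, List<String> columns) {
    List<String> selected = new ArrayList<>();
    for (String column : columns) {
      if (column.equals(identifier)) {
        selected.add(column);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<String> columns = List.of("key", "k.y", "value");
    System.out.println(selectAsRegex("k.y", columns)); // matches key and k.y
    System.out.println(selectAsName("k.y", columns));  // matches k.y only
  }
}
```

A back-ticked identifier such as {{`k.y`}} is both a legal column name and a legal pattern, and the two readings select different columns.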

 

[https://dev.mysql.com/doc/refman/8.0/en/identifiers.html]

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select]





[jira] [Created] (HIVE-23189) Change Explain ANALYZE to Explain PROFILE

2020-04-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-23189:
-

 Summary: Change Explain ANALYZE to Explain PROFILE
 Key: HIVE-23189
 URL: https://issues.apache.org/jira/browse/HIVE-23189
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


{code:none}
EXPLAIN [EXTENDED|CBO|AST|DEPENDENCY|AUTHORIZATION|LOCKS|VECTORIZATION|ANALYZE] 
query
{code}

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain#LanguageManualExplain-TheANALYZEClause

In Hive, there is an {{EXPLAIN ANALYZE}} query.  This can get a bit confusing 
because you can run an {{EXPLAIN ANALYZE}} against an {{ANALYZE TABLE}} 
statement, so you end up with something like:

{code:sql}
EXPLAIN ANALYZE ANALYZE TABLE `myTable` COMPUTE STATISTICS;
{code}

I would like to propose that the name be changed to {{EXPLAIN PROFILE}}.  This 
borrows from Apache Impala because it has a {{PROFILE}} command which produces 
the stats that actually occurred during the query run (much like this Hive 
feature).






[jira] [Created] (HIVE-23187) Make TABLE Token Optional in ANALYZE Statement

2020-04-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-23187:
-

 Summary: Make TABLE Token Optional in ANALYZE Statement
 Key: HIVE-23187
 URL: https://issues.apache.org/jira/browse/HIVE-23187
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-23188) Allow STATS Token in Analyze Table

2020-04-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-23188:
-

 Summary: Allow STATS Token in Analyze Table
 Key: HIVE-23188
 URL: https://issues.apache.org/jira/browse/HIVE-23188
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-23186) Strict Check SemanticException Should Properly Quote Table Name

2020-04-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-23186:
-

 Summary: Strict Check SemanticException Should Properly Quote 
Table Name
 Key: HIVE-23186
 URL: https://issues.apache.org/jira/browse/HIVE-23186
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


https://github.com/apache/hive/blob/029cab297a9ae40d249f63040721f93857398648/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java#L191-L192

{code:java}
throw new SemanticException(error + " No partition predicate for Alias \""
    + alias + "\" Table \"" + tab.getTableName() + "\"");
{code}

Use back ticks and use the database name as well.
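
Roughly what I have in mind, as a standalone sketch. The class and helper names are hypothetical, the database name is passed in as a plain string, and doubling an embedded back tick is an assumption borrowed from the {{QuotedIdentifier}} grammar rule.

```java
// Hypothetical helper; not the actual PartitionPruner change.
final class QuotedNameSketch {

  // Wraps an identifier in back ticks, doubling any back tick inside it.
  static String quote(String identifier) {
    return "`" + identifier.replace("`", "``") + "`";
  }

  // Builds the error text with a fully qualified, back-ticked table name.
  static String message(String error, String alias, String dbName, String tableName) {
    return error + " No partition predicate for Alias \"" + alias + "\" Table "
        + quote(dbName) + "." + quote(tableName);
  }

  public static void main(String[] args) {
    System.out.println(message("Strict check:", "t", "default", "my-table"));
  }
}
```

Quoting each part separately keeps the message unambiguous even when a name contains a period.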





[jira] [Created] (HIVE-23150) Create a Parser that All Components Use

2020-04-06 Thread David Mollitor (Jira)
David Mollitor created HIVE-23150:
-

 Summary: Create a Parser that All Components Use
 Key: HIVE-23150
 URL: https://issues.apache.org/jira/browse/HIVE-23150
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor


Create a parser for parsing (and validating) MySQL/MariaDB style object 
identifiers.





[jira] [Created] (HIVE-23149) Consistency of Parsing Object Identifiers

2020-04-06 Thread David Mollitor (Jira)
David Mollitor created HIVE-23149:
-

 Summary: Consistency of Parsing Object Identifiers
 Key: HIVE-23149
 URL: https://issues.apache.org/jira/browse/HIVE-23149
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


There needs to be better consistency in the handling of object identifiers 
(databases, tables, columns, views, functions, etc.).  I think it makes sense 
to standardize on the same rules that MySQL/MariaDB use for their column names 
so that Hive can be more of a drop-in replacement for these.
 
The two important things to keep in mind are:
 
1// Permitted characters in quoted identifiers include the full Unicode Basic 
Multilingual Plane (BMP), except U+
 
2// If any components of a multiple-part name require quoting, quote them 
individually rather than quoting the name as a whole. For example, write 
{{`my-table`.`my-column`}}, not {{`my-table.my-column`}}.  
 
[https://dev.mysql.com/doc/refman/8.0/en/identifiers.html]
[https://dev.mysql.com/doc/refman/8.0/en/identifier-qualifiers.html]  

 
That is to say:
 
{code:sql}
-- Select all rows from a table named `default.mytable`
-- (Yes, the table name itself has a period in it. This is valid)
SELECT * FROM `default.mytable`;
 
-- Select all rows from database `default`, table `mytable`
SELECT * FROM `default`.`mytable`;  
{code}
 
This plays out in a couple of ways.  There may be more, but these are the ones 
I know about already:
 
1// Hive generates incorrect syntax: [HIVE-23128]
 
2// Hive throws an exception if there is a period in the table name.  This is 
an invalid response: table names may contain a period. With correct parsing, 
the query will more likely than not end in a 'table not found' exception, 
since the user most likely used backticks incorrectly and meant to specify a 
database and a table separately. [HIVE-16907]

Once we have the parsing figured out and support for backticks to enclose UTF-8 
strings, then the backend database needs to actually support the UTF-8 
character set.  It currently does not: [HIVE-1808]
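
To make the quoting rule concrete, here is a minimal sketch of splitting a multi-part name into its components. It is not the proposed Hive parser: the class name is hypothetical, and it assumes {{``}} is the escape for a back tick inside a quoted part and that unquoted parts simply run up to the next period.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not Hive code.
final class IdentifierPartsSketch {

  // Splits a possibly qualified name into parts, honoring back-tick quoting.
  static List<String> split(String name) {
    List<String> parts = new ArrayList<>();
    int i = 0;
    while (true) {
      StringBuilder part = new StringBuilder();
      if (i < name.length() && name.charAt(i) == '`') {
        i++; // skip the opening back tick
        while (true) {
          if (i >= name.length()) {
            throw new IllegalArgumentException("unterminated back tick: " + name);
          }
          char c = name.charAt(i++);
          if (c != '`') {
            part.append(c);
          } else if (i < name.length() && name.charAt(i) == '`') {
            part.append('`'); // '``' is an escaped back tick
            i++;
          } else {
            break; // closing back tick
          }
        }
      } else {
        while (i < name.length() && name.charAt(i) != '.') {
          char c = name.charAt(i++);
          if (c == '`') {
            throw new IllegalArgumentException("unexpected back tick: " + name);
          }
          part.append(c);
        }
      }
      if (part.length() == 0) {
        throw new IllegalArgumentException("empty identifier part: " + name);
      }
      parts.add(part.toString());
      if (i == name.length()) {
        return parts;
      }
      if (name.charAt(i) != '.') {
        throw new IllegalArgumentException("expected '.' at index " + i + ": " + name);
      }
      i++; // skip the separator
    }
  }

  public static void main(String[] args) {
    System.out.println(split("`default`.`mytable`")); // two parts
    System.out.println(split("`default.mytable`"));   // one part containing a period
  }
}
```

With this reading, a period inside back ticks is part of the name and only a period outside them separates database from table.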





[jira] [Created] (HIVE-23193) Review of Subset of Debug Logging

2020-04-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-23193:
-

 Summary: Review of Subset of Debug Logging
 Key: HIVE-23193
 URL: https://issues.apache.org/jira/browse/HIVE-23193
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


bq. Better yet, use parameterized messages
bq.  Will outperform the first form by a factor of at least 30, in case of a 
disabled logging statement.

http://www.slf4j.org/faq.html

* Use parameterized logging where appropriate
* Add logging guards {{if (Log.isDebugEnabled())}} around loops and complex 
debug messages

Simplify the code, remove lines of code, and potentially increase performance.
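
A runnable sketch of the difference. To keep it self-contained it uses {{java.util.logging}} instead of SLF4J, so {{LOG.log(Level.FINE, "... {0}", arg)}} stands in for SLF4J's {{LOG.debug("... {}", arg)}} and {{isLoggable(Level.FINE)}} for {{isDebugEnabled()}}. The class names are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo using java.util.logging as a stand-in for SLF4J.
final class ParameterizedLoggingSketch {

  private static final Logger LOG =
      Logger.getLogger(ParameterizedLoggingSketch.class.getName());

  // An argument whose toString() we pretend is costly; it counts its calls.
  static final class Expensive {
    int toStringCalls = 0;

    @Override
    public String toString() {
      toStringCalls++;
      return "expensive";
    }
  }

  // Returns toString() call counts for {concatenated, parameterized, guarded}
  // while the FINE (debug) level is disabled.
  static int[] toStringCallsWhenDisabled() {
    LOG.setLevel(Level.INFO);

    Expensive concatenated = new Expensive();
    LOG.fine("value: " + concatenated); // message is built, then discarded

    Expensive parameterized = new Expensive();
    LOG.log(Level.FINE, "value: {0}", parameterized); // formatting is deferred

    Expensive guarded = new Expensive();
    if (LOG.isLoggable(Level.FINE)) { // guard for loops and complex messages
      LOG.fine("value: " + guarded);
    }

    return new int[] {
        concatenated.toStringCalls, parameterized.toStringCalls, guarded.toStringCalls};
  }

  public static void main(String[] args) {
    int[] calls = toStringCallsWhenDisabled();
    System.out.println(calls[0] + " " + calls[1] + " " + calls[2]);
  }
}
```

Only the concatenated form pays for building the message when the level is off; the parameterized and guarded forms skip that work.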





[jira] [Created] (HIVE-23194) Use Queue Instead of List for CollectOperator

2020-04-13 Thread David Mollitor (Jira)
David Mollitor created HIVE-23194:
-

 Summary: Use Queue Instead of List for CollectOperator
 Key: HIVE-23194
 URL: https://issues.apache.org/jira/browse/HIVE-23194
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


https://github.com/apache/hive/blob/d6948a28ab3e34e5116591a60a96bdf031185e47/ql/src/java/org/apache/hadoop/hive/ql/exec/CollectOperator.java#L85-L88

{code:java|title=CollectOperator.java}
   rowList = new ArrayList();
...
} else {
  result.o = rowList.remove(0);
  result.oi = standardRowInspector;
}
{code}

Removing from the head of an {{ArrayList}} is an expensive operation because it 
needs to shift all of the remaining elements down in the array on each call.  
It is better to use a {{Queue}}.
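
A sketch of the two approaches side by side. The class and method names are hypothetical, and {{ArrayDeque}} is just one possible {{Queue}} implementation. Note that {{ArrayDeque}} rejects {{null}} elements, so this assumes rows are never {{null}}.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical comparison; not the actual CollectOperator change.
final class CollectQueueSketch {

  // Current approach: remove(0) shifts every remaining element on each call.
  static List<Object> drainWithList(List<Object> rows) {
    List<Object> rowList = new ArrayList<>(rows);
    List<Object> drained = new ArrayList<>();
    while (!rowList.isEmpty()) {
      drained.add(rowList.remove(0));
    }
    return drained;
  }

  // Proposed approach: poll() takes the head without moving other elements.
  static List<Object> drainWithQueue(List<Object> rows) {
    Queue<Object> rowQueue = new ArrayDeque<>(rows);
    List<Object> drained = new ArrayList<>();
    while (!rowQueue.isEmpty()) {
      drained.add(rowQueue.poll());
    }
    return drained;
  }

  public static void main(String[] args) {
    List<Object> rows = List.of("a", "b", "c");
    System.out.println(drainWithList(rows));
    System.out.println(drainWithQueue(rows));
  }
}
```

Both return the rows in insertion order; the difference is that draining n rows costs O(n^2) element moves with the list and O(n) with the queue.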





[jira] [Created] (HIVE-23182) Semantic Exception: rule Identifier failed predicate allowQuotedId

2020-04-11 Thread David Mollitor (Jira)
David Mollitor created HIVE-23182:
-

 Summary: Semantic Exception: rule Identifier failed predicate 
allowQuotedId
 Key: HIVE-23182
 URL: https://issues.apache.org/jira/browse/HIVE-23182
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


Querying a Hive Table (via Hiveserver2) with Column Masking enabled via Ranger 
Hive Plugin returns with an error.

{code:none}
[42000]: Error while compiling statement: FAILED: SemanticException 
org.apache.hadoop.hive.ql.parse.ParseException: line 1:62 rule Identifier 
failed predicate: {allowQuotedId()}? line 1:74 rule Identifier failed 
predicate: {allowQuotedId()}? line 1:94 rule Identifier failed predicate: 
{allowQuotedId()}? line 1:117 rule Identifier failed predicate: 
{allowQuotedId()}?
{code}







[jira] [Created] (HIVE-23183) Make TABLE Token Optional in TRUNCATE Statement

2020-04-12 Thread David Mollitor (Jira)
David Mollitor created HIVE-23183:
-

 Summary: Make TABLE Token Optional in TRUNCATE Statement
 Key: HIVE-23183
 URL: https://issues.apache.org/jira/browse/HIVE-23183
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


It's optional in MySQL, let's make it optional for Hive too.





[jira] [Created] (HIVE-23258) Remove BoneCP Connection Pool

2020-04-20 Thread David Mollitor (Jira)
David Mollitor created HIVE-23258:
-

 Summary: Remove BoneCP Connection Pool
 Key: HIVE-23258
 URL: https://issues.apache.org/jira/browse/HIVE-23258
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor


{quote}
BoneCP is a Java JDBC connection pool implementation that is tuned for high 
performance by minimizing lock contention to give greater throughput for your 
application ... but SHOULD NOW BE CONSIDERED DEPRECATED in favour of HikariCP.
{quote}

https://github.com/wwadge/bonecp

The default in Hive 3.x is already HikariCP, so just remove BoneCP in 4.x

https://github.com/apache/hive/blob/branch-3.1/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java#L392





[jira] [Created] (HIVE-23079) Remove Calls to printStackTrace in Module hive-serde

2020-03-25 Thread David Mollitor (Jira)
David Mollitor created HIVE-23079:
-

 Summary: Remove Calls to printStackTrace in Module hive-serde
 Key: HIVE-23079
 URL: https://issues.apache.org/jira/browse/HIVE-23079
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor








[jira] [Created] (HIVE-23077) Remove Calls to printStackTrace in Module hive-exec

2020-03-25 Thread David Mollitor (Jira)
David Mollitor created HIVE-23077:
-

 Summary: Remove Calls to printStackTrace in Module hive-exec
 Key: HIVE-23077
 URL: https://issues.apache.org/jira/browse/HIVE-23077
 Project: Hive
  Issue Type: Sub-task
Reporter: David Mollitor
Assignee: David Mollitor


Only one "tricky" change: throw an exception instead of calling 
{{printStackTrace}} in the static Driver loader, as the reference here does:

https://github.com/mariadb-corporation/mariadb-connector-j/blob/3bc66153b51aca188afc50ff35a0123f16c099ed/src/main/java/org/mariadb/jdbc/Driver.java#L72
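
The pattern looks roughly like this. It is a sketch with a hypothetical do-nothing driver, not the Hive or MariaDB code; the exception type and message are my own choice for illustration.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

// Hypothetical driver used only to show the static registration pattern.
final class SketchDriver implements Driver {

  static {
    try {
      DriverManager.registerDriver(new SketchDriver());
    } catch (SQLException e) {
      // Fail class initialization loudly instead of calling e.printStackTrace().
      throw new IllegalStateException("Could not register driver", e);
    }
  }

  @Override
  public Connection connect(String url, Properties info) {
    return null; // a real driver would open a connection for URLs it accepts
  }

  @Override
  public boolean acceptsURL(String url) {
    return url != null && url.startsWith("jdbc:sketch:");
  }

  @Override
  public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
    return new DriverPropertyInfo[0];
  }

  @Override
  public int getMajorVersion() {
    return 0;
  }

  @Override
  public int getMinorVersion() {
    return 1;
  }

  @Override
  public boolean jdbcCompliant() {
    return false;
  }

  @Override
  public Logger getParentLogger() throws SQLFeatureNotSupportedException {
    throw new SQLFeatureNotSupportedException("no java.util.logging parent logger");
  }
}
```

An exception thrown from a static initializer surfaces to the caller as an {{ExceptionInInitializerError}}, so a failed registration is visible right away rather than buried in stderr.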





[jira] [Created] (HIVE-23078) Remove HiveDriver SecurityManager Check

2020-03-25 Thread David Mollitor (Jira)
David Mollitor created HIVE-23078:
-

 Summary: Remove HiveDriver SecurityManager Check
 Key: HIVE-23078
 URL: https://issues.apache.org/jira/browse/HIVE-23078
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: David Mollitor
Assignee: David Mollitor


{code:java|title=HiveDriver.java}
  public HiveDriver() {
    // TODO Auto-generated constructor stub
    SecurityManager security = System.getSecurityManager();
    if (security != null) {
      security.checkWrite("foobah");
    }
  }
{code}

Not sure why it needs to check write access to a file called "foobah", but I 
checked out some other JDBC drivers and they do nothing like this.  Remove this 
check; remove the constructor.




