[jira] [Created] (HIVE-27201) Inconsistency between session Hive and thread-local Hive may cause HS2 deadlock

2023-03-31 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-27201:
--

 Summary: Inconsistency between session Hive and thread-local Hive 
may cause HS2 deadlock
 Key: HIVE-27201
 URL: https://issues.apache.org/jira/browse/HIVE-27201
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Zhihua Deng
Assignee: Zhihua Deng


A HiveServer2 server handler thread can switch to processing an operation from 
another session. In that case, the Hive object cached in the ThreadLocal is not 
the same as the Hive object in SessionState, and it can be referenced by another 
session. 

If two handlers swap their sessions to process a DatabaseMetaData request, and 
HiveMetastoreClientFactory obtains the Hive object via Hive.get(), a deadlock 
can occur.
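
The mismatch described above can be illustrated with a minimal sketch. The 
classes HiveHandle, Session and HandlerSketch are hypothetical stand-ins, not 
Hive's real classes, and the attach() guard is only one possible way to keep the 
two references aligned; the actual fix may look different.

```java
// Hypothetical stand-ins: these are NOT Hive's real classes.
final class HiveHandle {
    final String owner;
    HiveHandle(String owner) { this.owner = owner; }
}

final class Session {
    // Plays the role of the Hive object held by SessionState.
    final HiveHandle hive;
    Session(String name) { this.hive = new HiveHandle(name); }
}

final class HandlerSketch {
    // Per-thread cache, playing the role of the Hive object cached in a ThreadLocal.
    static final ThreadLocal<HiveHandle> CACHED = new ThreadLocal<>();

    // True when the thread-local object is the one owned by the given session.
    static boolean isConsistent(Session session) {
        return CACHED.get() == session.hive;
    }

    // One possible guard (an assumption): re-point the thread-local at the
    // session's object before running an operation on behalf of that session.
    static void attach(Session session) {
        CACHED.set(session.hive);
    }
}
```

A handler thread that last served session A and then picks up an operation of 
session B sees isConsistent(B) == false until attach(B) is called.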



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27179) HS2 WebUI throws NPE when JspFactory loaded from jetty-runner

2023-03-27 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-27179:
--

 Summary: HS2 WebUI throws NPE when JspFactory loaded from 
jetty-runner
 Key: HIVE-27179
 URL: https://issues.apache.org/jira/browse/HIVE-27179
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Zhihua Deng


In HIVE-17088, we resolved an NPE thrown from the HS2 WebUI by introducing 
javax.servlet.jsp-api. It works as expected when the javax.servlet.jsp-api jar 
takes precedence over the jetty-runner jar, but in some environments the NPE is 
still thrown when opening the HS2 web UI:
{noformat}
java.lang.NullPointerException
 at org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:286)
 at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
 at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1443)
 at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
 at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
...{noformat}
The JspFactory.getDefaultFactory() loaded from jetty-runner just returns null.





[jira] [Created] (HIVE-27139) Log details when hiveserver2.sh doing sanity check with the process id

2023-03-14 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-27139:
--

 Summary: Log details when hiveserver2.sh doing sanity check with 
the process id
 Key: HIVE-27139
 URL: https://issues.apache.org/jira/browse/HIVE-27139
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


Since HIVE-22193, HiveServer2 always persists its process id into a file. If 
another process reuses the same pid, restarting HiveServer2 fails. In that 
case, print the details of the process, and delete the old pid file when 
HiveServer2 is decommissioned. 





[jira] [Created] (HIVE-27091) Add double quotes for tables in PartitionProjectionEvaluator

2023-02-17 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-27091:
--

 Summary: Add double quotes for tables in 
PartitionProjectionEvaluator
 Key: HIVE-27091
 URL: https://issues.apache.org/jira/browse/HIVE-27091
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Zhihua Deng
Assignee: Zhihua Deng


When PartitionProjectionEvaluator requests partitions against PostgreSQL, an 
exception is thrown:
{noformat}
javax.jdo.JDODataStoreException: Error executing SQL query "select 
"SDS"."LOCATION","PARTITIONS"."CREATE_TIME","SDS"."SD_ID","PARTITIONS"."PART_ID"
 from PARTITIONS left outer join SDS on PARTITIONS."SD_ID" = SDS."SD_ID"   left 
outer join SERDES on SDS."SERDE_ID" = SERDES."SERDE_ID" where "PART_ID" in 
(92731,92732,92733,92734,92735,92736) order by "PART_NAME" asc".
…
Caused by: org.postgresql.util.PSQLException: ERROR: relation "partitions" does 
not exist{noformat}
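
PostgreSQL folds unquoted identifiers to lower case, which is why the unquoted 
PARTITIONS in the query above is looked up as "partitions". A minimal sketch of 
a quoting helper follows; SqlIdentifiers and its methods are hypothetical names 
for illustration and are not part of PartitionProjectionEvaluator.

```java
// Hypothetical helper, not part of Hive: quotes SQL identifiers so that
// the database keeps their case instead of folding them.
final class SqlIdentifiers {
    // Wraps an identifier in double quotes, doubling any embedded quote.
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    // Builds a quoted table.column reference.
    static String qualified(String table, String column) {
        return quote(table) + "." + quote(column);
    }
}
```

With such a helper, the FROM clause would be built from quote("PARTITIONS") 
rather than from the bare table name.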





[jira] [Created] (HIVE-26965) Docker image for Apache Hive

2023-01-18 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26965:
--

 Summary: Docker image for Apache Hive
 Key: HIVE-26965
 URL: https://issues.apache.org/jira/browse/HIVE-26965
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


This work provides a Docker image for Hive and tracks further improvements.





[jira] [Created] (HIVE-26794) Explore changing TxnHandler#connPoolMutex to NoPoolConnectionPool

2022-11-30 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26794:
--

 Summary: Explore changing TxnHandler#connPoolMutex to 
NoPoolConnectionPool
 Key: HIVE-26794
 URL: https://issues.apache.org/jira/browse/HIVE-26794
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


Instead of creating a fixed-size connection pool for TxnHandler#MutexAPI, the 
pool can be a NoPoolConnectionPool, because: 
 * TxnHandler#MutexAPI is primarily designed to provide coarse-grained mutex 
support to maintenance tasks running inside the Metastore; these tasks are not 
user-facing.
 * A fixed-size connection pool, the same as the pool used in ObjectStore, is a 
waste for the non-leader instances in the warehouse. 

NoPoolConnectionPool provides connections on demand, and TxnHandler#MutexAPI 
only uses the getConnection method to fetch a connection from the pool, so it 
is feasible to change the pool to NoPoolConnectionPool. This would make HMS 
more scalable.





[jira] [Created] (HIVE-26773) Update Avro version to 1.10.2

2022-11-23 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26773:
--

 Summary: Update Avro version to 1.10.2
 Key: HIVE-26773
 URL: https://issues.apache.org/jira/browse/HIVE-26773
 Project: Hive
  Issue Type: Improvement
  Components: Avro
Reporter: Zhihua Deng


Update the Avro version to 1.10.2; there is a transitive dependency on Velocity.





[jira] [Created] (HIVE-26667) Incompatible expression deserialization against latest HMS

2022-10-24 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26667:
--

 Summary: Incompatible expression deserialization against latest HMS
 Key: HIVE-26667
 URL: https://issues.apache.org/jira/browse/HIVE-26667
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Zhihua Deng


When an old Hive Metastore client issues listPartitionsByExpr against the 
latest HMS, an exception is thrown:

 
{noformat}
MetaException(message:Unable to find class: 
)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_by_expr_result$get_partitions_by_expr_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_by_expr_result$get_partitions_by_expr_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_by_expr_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions_by_expr(ThriftHiveMetastore.java:3273)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_by_expr(ThriftHiveMetastore.java:3260)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByExpr(HiveMetaStoreClient.java:1488){noformat}
This is caused by a gap between the old client and the server in 
(de)serializing the expression. The old client does not write the expression's 
class type into the serialized bytes, while the server reads the class type 
from the serialized bytes first, which causes the failure. Other APIs that 
(de)serialize expressions may be affected as well.

 





[jira] [Created] (HIVE-26644) Introduce auto sizing in HMS

2022-10-18 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26644:
--

 Summary: Introduce auto sizing in HMS
 Key: HIVE-26644
 URL: https://issues.apache.org/jira/browse/HIVE-26644
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng
Assignee: Zhihua Deng


HMS should be able to auto-size itself based on the enabled features, for 
example by matching server thread pool sizes to HMS connection pool sizes, or 
by using larger pool sizes on compaction-disabled instances for better 
performance. 





[jira] [Created] (HIVE-26617) Remove some useless properties

2022-10-10 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26617:
--

 Summary: Remove some useless properties
 Key: HIVE-26617
 URL: https://issues.apache.org/jira/browse/HIVE-26617
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


Some properties in HiveConf or MetastoreConf are not used at all; it is better 
to clean them up:
 * hive.metastore.initial.metadata.count.enabled
 * hive.timedout.txn.reaper.start
 * metastore.acid.housekeeper.start
 * metastore.initial.metadata.count.enabled





[jira] [Created] (HIVE-26561) Fix test TestMiniLlapLocalCliDriver#stats_part2

2022-09-26 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26561:
--

 Summary: Fix test TestMiniLlapLocalCliDriver#stats_part2
 Key: HIVE-26561
 URL: https://issues.apache.org/jira/browse/HIVE-26561
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Zhihua Deng


The test is flaky; it sometimes fails with: 
{noformat}
Caused by: org.apache.derby.iapi.error.StandardException: Invalid character string format for type DECIMAL.
        at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.iapi.types.DataType.invalidFormat(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.iapi.types.DataType.setValue(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.exe.ac29cfd09cx0183x5e87xdb0ax2168460057f.e4(Unknown Source) ~[?:?]
        at org.apache.derby.impl.services.reflect.DirectCall.invoke(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) ~[derby-10.14.2.0.jar:?]
        at org.apache.hive.com.zaxxer.hikari.pool.HikariProxyResultSet.next(HikariProxyResultSet.java) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
        at org.datanucleus.store.rdbms.query.ForwardQueryResult.initialise(ForwardQueryResult.java:93) ~[datanucleus-rdbms-5.2.10.jar:?]
        at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:687) ~[datanucleus-rdbms-5.2.10.jar:?]
        at org.datanucleus.store.query.Query.executeQuery(Query.java:1975) ~[datanucleus-core-5.2.10.jar:?]
        at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:818) ~[datanucleus-rdbms-5.2.10.jar:?]
        at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:433) ~[datanucleus-api-jdo-5.2.8.jar:?]{noformat}





[jira] [Created] (HIVE-26553) Decrease the overhead of Metastore benchmarks

2022-09-21 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26553:
--

 Summary: Decrease the overhead of Metastore benchmarks
 Key: HIVE-26553
 URL: https://issues.apache.org/jira/browse/HIVE-26553
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


When running the Metastore micro-benchmarks, every partition-related method 
has to add new partitions before measuring. This adds a lot of overhead when 
running with a large number of partitions.





[jira] [Created] (HIVE-26539) Prevent unsafe deserialization in PartitionExpressionForMetastore

2022-09-15 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26539:
--

 Summary: Prevent unsafe deserialization in 
PartitionExpressionForMetastore
 Key: HIVE-26539
 URL: https://issues.apache.org/jira/browse/HIVE-26539
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng








[jira] [Created] (HIVE-26538) MetastoreDefaultTransformer should revise the location when it's empty

2022-09-15 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26538:
--

 Summary: MetastoreDefaultTransformer should revise the location 
when it's empty
 Key: HIVE-26538
 URL: https://issues.apache.org/jira/browse/HIVE-26538
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


An empty table location is treated as null in several places, such as:

[https://github.com/apache/hive/blob/82f319773cb2361a98963e861fb903ab8eecd9c4/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L2367]

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java#L729]
  

MetastoreDefaultTransformer should revise the empty location when 
altering/creating tables.





[jira] [Created] (HIVE-26509) Introduce dynamic leader election in HMS

2022-09-01 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26509:
--

 Summary: Introduce dynamic leader election in HMS
 Key: HIVE-26509
 URL: https://issues.apache.org/jira/browse/HIVE-26509
 Project: Hive
  Issue Type: New Feature
  Components: Standalone Metastore
Reporter: Zhihua Deng


Since HIVE-21841, a leader HMS is selected by configuring 
metastore.housekeeping.leader.hostname on startup. This approach saves us from 
running duplicate HMS housekeeping tasks cluster-wide. 

In this Jira, we introduce a dynamic leader election that uses the Hive lock. 
Once an HMS owns the lock, it becomes the leader, carries out the housekeeping 
tasks, and sends heartbeats to renew the lock before it times out. If the 
leader fails to reclaim the lock, it stops any tasks it has already started. 
The election events are audited. This gives us a more dynamic leader when the 
original one goes down, or in the public cloud where the property may not be 
configured properly, and it reduces the leader's burden by running these tasks 
among different leaders.
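
The acquire/heartbeat cycle described above can be sketched with an in-memory 
lease. This is only an illustration under stated assumptions: LeaseLockSketch 
is a hypothetical class, the lock lives in memory rather than in the Hive lock 
tables, and time is passed in by the caller to keep the example deterministic.

```java
// Hypothetical in-memory stand-in for the lock used by the election.
final class LeaseLockSketch {
    private String holder;      // id of the current leader, or null
    private long expiresAtMs;   // lease expiry, in caller-supplied time
    private final long leaseMs;

    LeaseLockSketch(long leaseMs) { this.leaseMs = leaseMs; }

    // Acquires the lock if it is free or its lease has expired.
    // Returns true when the candidate is the leader after the call.
    synchronized boolean tryAcquire(String candidate, long nowMs) {
        if (holder == null || nowMs >= expiresAtMs) {
            holder = candidate;
            expiresAtMs = nowMs + leaseMs;
            return true;
        }
        return holder.equals(candidate);
    }

    // Renews the lease. Returns false when the caller no longer owns the
    // lock, in which case it should stop its housekeeping tasks.
    synchronized boolean heartbeat(String candidate, long nowMs) {
        if (candidate.equals(holder) && nowMs < expiresAtMs) {
            expiresAtMs = nowMs + leaseMs;
            return true;
        }
        return false;
    }
}
```

An instance that misses its heartbeat window loses the lease, and the next 
instance calling tryAcquire becomes the leader.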





[jira] [Created] (HIVE-26494) Fix flaky test TestJdbcWithMiniHS2 testHttpRetryOnServerIdleTimeout

2022-08-24 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26494:
--

 Summary: Fix flaky test TestJdbcWithMiniHS2 
testHttpRetryOnServerIdleTimeout
 Key: HIVE-26494
 URL: https://issues.apache.org/jira/browse/HIVE-26494
 Project: Hive
  Issue Type: Test
Reporter: Zhihua Deng


The TestJdbcWithMiniHS2#testHttpRetryOnServerIdleTimeout fails on master:

[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/1362/tests]

It can be fixed by setting hive.server2.thrift.http.max.idle.time to a value 
larger than 5ms.

Flaky check: http://ci.hive.apache.org/job/hive-flaky-check/585/





[jira] [Created] (HIVE-26402) HiveSchemaTool does not honor metastore-site.xml

2022-07-18 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26402:
--

 Summary: HiveSchemaTool does not honor metastore-site.xml
 Key: HIVE-26402
 URL: https://issues.apache.org/jira/browse/HIVE-26402
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Zhihua Deng


When using the following script to initialize the metastore schema,

 
{code:java}
export HIVE_CONF_DIR='/path/to/metastore_conf'
./bin/schematool -dbType mysql -initSchema{code}
the schematool command fails even though there is a valid metastore-site.xml 
under the config path; it tries to initialize the default embedded db.
{noformat}
Metastore connection URL:     jdbc:derby:;databaseName=metastore_db;create=true
Metastore connection Driver :     org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:     APP
Initializing the schema to: 4.0.0-alpha-2{noformat}
 

 





[jira] [Created] (HIVE-26400) Provide a self-contained docker

2022-07-16 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26400:
--

 Summary: Provide a self-contained docker
 Key: HIVE-26400
 URL: https://issues.apache.org/jira/browse/HIVE-26400
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Zhihua Deng
Assignee: Zhihua Deng








[jira] [Created] (HIVE-26322) Upgrade gson to 2.9.0 due to CVE

2022-06-13 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26322:
--

 Summary: Upgrade gson to 2.9.0 due to CVE
 Key: HIVE-26322
 URL: https://issues.apache.org/jira/browse/HIVE-26322
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng








[jira] [Created] (HIVE-26058) Choose meaningful names for the Metastore pool threads

2022-03-22 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26058:
--

 Summary: Choose meaningful names for the Metastore pool threads
 Key: HIVE-26058
 URL: https://issues.apache.org/jira/browse/HIVE-26058
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


TThreadPoolServer#createDefaultExecutorService sets the thread name with:

 
{code:java}
thread.setName("TThreadPoolServer WorkerProcess-%d");  {code}
 

As a result, the logger outputs the thread name like:
{noformat}
[TThreadPoolServer WorkerProcess-%d] utils.FileUtils: Renaming 
pfile:/{noformat}
which makes it hard to identify and debug a thread.
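
One way to get distinguishable names is a ThreadFactory that substitutes a 
counter into the name. The sketch below is an assumption about how this could 
be done, not the change made in Hive; NamedThreadFactory and the prefix used in 
the usage note are hypothetical.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical factory: gives every pool thread a unique, readable name.
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicLong counter = new AtomicLong();

    NamedThreadFactory(String prefix) { this.prefix = prefix; }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task);
        // Substitute the counter instead of leaving a literal "%d" in the name.
        thread.setName(String.format("%s-%d", prefix, counter.getAndIncrement()));
        return thread;
    }
}
```

For example, new NamedThreadFactory("Metastore-HttpHandler-Pool") names its 
threads with the prefix followed by an increasing number.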

 





[jira] [Created] (HIVE-26057) Cleanup QueryWrapper

2022-03-21 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26057:
--

 Summary: Cleanup QueryWrapper
 Key: HIVE-26057
 URL: https://issues.apache.org/jira/browse/HIVE-26057
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


QueryWrapper currently implements Query, which brings in dozens of overridden 
methods that are not used anywhere in the codebase. These methods can be 
removed to keep the class simple to maintain.





[jira] [Created] (HIVE-26056) Retire the api metrics of HMSHandler

2022-03-21 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-26056:
--

 Summary: Retire the api metrics of HMSHandler
 Key: HIVE-26056
 URL: https://issues.apache.org/jira/browse/HIVE-26056
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


We already use PerfLogger to measure and log the time spent in the metastore 
Thrift APIs. This is more complete and simpler than inserting start/end 
functions in HMSHandler to do the same thing.





[jira] [Created] (HIVE-25896) Remove getThreadId from IHMSHandler

2022-01-25 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25896:
--

 Summary: Remove getThreadId from IHMSHandler
 Key: HIVE-25896
 URL: https://issues.apache.org/jira/browse/HIVE-25896
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


IHMSHandler, which is annotated as 'InterfaceAudience.Private', currently uses 
getThreadId to log thread information. The thread id can be logged 
automatically if the logger is configured properly, so the method can be 
removed to make IHMSHandler easier to maintain.





[jira] [Created] (HIVE-25892) Group HMSHandler's thread locals into a single context

2022-01-24 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25892:
--

 Summary: Group HMSHandler's thread locals into a single context
 Key: HIVE-25892
 URL: https://issues.apache.org/jira/browse/HIVE-25892
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


There are more than six ThreadLocal variables in HMSHandler. We can group them 
into a single context to improve the management of these variables and the 
code readability.
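
A minimal sketch of the grouping idea, assuming a hypothetical HandlerContext 
class; the fields shown are illustrative placeholders and do not list the 
actual ThreadLocal variables of HMSHandler.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical context object: one ThreadLocal instead of several.
final class HandlerContext {
    // Illustrative fields standing in for what were separate ThreadLocals.
    String ipAddress;
    final Map<String, Long> timers = new HashMap<>();

    private static final ThreadLocal<HandlerContext> CURRENT =
        ThreadLocal.withInitial(HandlerContext::new);

    // Returns the context of the calling thread, creating it on first use.
    static HandlerContext get() { return CURRENT.get(); }

    // Drops all per-thread state in a single call.
    static void clear() { CURRENT.remove(); }
}
```

Cleaning up then becomes a single HandlerContext.clear() call instead of one 
remove() per variable.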
 





[jira] [Created] (HIVE-25783) Provide rat check to the CI

2021-12-07 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25783:
--

 Summary: Provide rat check to the CI
 Key: HIVE-25783
 URL: https://issues.apache.org/jira/browse/HIVE-25783
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Zhihua Deng


This Jira investigates whether we can add a RAT check to the CI, to make sure 
that newly added source files contain the ASF license information. 





[jira] [Created] (HIVE-25774) Add ASF license for newly created files in standalone-metastore

2021-12-05 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25774:
--

 Summary: Add ASF license for newly created files in 
standalone-metastore
 Key: HIVE-25774
 URL: https://issues.apache.org/jira/browse/HIVE-25774
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 4.0.0
Reporter: Zhihua Deng








[jira] [Created] (HIVE-25729) ThriftUnionObjectInspector should be notified when fully inited

2021-11-21 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25729:
--

 Summary: ThriftUnionObjectInspector should be notified when fully 
inited
 Key: HIVE-25729
 URL: https://issues.apache.org/jira/browse/HIVE-25729
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Zhihua Deng


For thread safety, a ReflectionStructObjectInspector instance waits up to 3 
seconds at a time until the returned ObjectInspector is fully inited: 
{code:java}
synchronized (soi) {
  while (!soi.isFullyInited(checkedTypes)) {
//   
    soi.wait(3000);  
  }
} {code}
It seems that ThriftUnionObjectInspector is never notified when it has been 
fully inited.
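
The missing notification can be illustrated with a minimal wait/notify sketch. 
InitLatchSketch is a hypothetical class and not Hive's object inspector code; 
it only shows that calling notifyAll() after initialization lets waiters 
continue without sitting out the 3-second timeout.

```java
// Hypothetical sketch of the wait/notify handshake, not Hive's actual code.
final class InitLatchSketch {
    private boolean fullyInited;

    // Called by the thread that finishes initialization.
    synchronized void markInited() {
        fullyInited = true;
        notifyAll(); // wake waiters now instead of letting wait(3000) time out
    }

    // Called by threads that need the fully initialized object.
    // Returns false if the waiting thread was interrupted.
    synchronized boolean awaitInited() {
        try {
            while (!fullyInited) {
                wait(3000);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    synchronized boolean isFullyInited() { return fullyInited; }
}
```

Without the notifyAll() call the loop still terminates, but each waiter is 
delayed until its current wait(3000) expires.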





[jira] [Created] (HIVE-25582) Empty result when using offset limit with MR

2021-09-30 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25582:
--

 Summary: Empty result when using offset limit with MR
 Key: HIVE-25582
 URL: https://issues.apache.org/jira/browse/HIVE-25582
 Project: Hive
  Issue Type: Bug
  Components: Operators
Affects Versions: 4.0.0
Reporter: Zhihua Deng
Assignee: Zhihua Deng


The _mr.ObjectCache_ caches nothing: every time the limit operator [retrieves 
the global counter from the 
cache|https://github.com/apache/hive/blob/7b3ecf617a6d46f48a3b6f77e0339fd4ad95a420/ql/src/java/org/apache/hadoop/hive/ql/exec/LimitOperator.java#L150-L161],
 a new AtomicInteger is returned. This makes _offset <= 
currentCountForAllTasksInt_ always evaluate to false, and as _offset > 0_, the 
operator skips all rows.
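
The difference between a cache that stores nothing and one that really caches 
can be sketched as follows. CounterCacheSketch and its methods are hypothetical 
names; they imitate the retrieve-with-factory style of call but are not the 
ObjectCache implementation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch, not Hive's ObjectCache.
final class CounterCacheSketch {
    private static final ConcurrentHashMap<String, Object> STORE =
        new ConcurrentHashMap<>();

    // A "cache" that stores nothing: every call builds a fresh value,
    // so a counter retrieved this way always starts from its initial state.
    static <T> T retrieveUncached(String key, Supplier<T> factory) {
        return factory.get();
    }

    // A real cache: the same key always yields the same instance.
    @SuppressWarnings("unchecked")
    static <T> T retrieveCached(String key, Supplier<T> factory) {
        return (T) STORE.computeIfAbsent(key, k -> factory.get());
    }
}
```

With retrieveUncached, increments made to the counter are lost on the next 
retrieval, so a condition that compares the offset against the counter never 
sees the counter advance.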





[jira] [Created] (HIVE-25448) Invalid partition columns when skew with distinct

2021-08-15 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25448:
--

 Summary: Invalid partition columns when skew with distinct
 Key: HIVE-25448
 URL: https://issues.apache.org/jira/browse/HIVE-25448
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Reporter: Zhihua Deng


When hive.groupby.skewindata is enabled, we spray by the grouping key and the 
distinct key if distinct is present in the first reduce sink operator.





[jira] [Created] (HIVE-25383) Make TestMarkPartitionRemote more stable

2021-07-25 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25383:
--

 Summary: Make TestMarkPartitionRemote more stable
 Key: HIVE-25383
 URL: https://issues.apache.org/jira/browse/HIVE-25383
 Project: Hive
  Issue Type: Test
  Components: Standalone Metastore
Reporter: Zhihua Deng


TestMarkPartitionRemote sometimes fails with:
{noformat}
org.apache.hadoop.hive.metastore.api.MetaException: Exception determining external table location:Default location is not available for table: file:/path/to/table
    at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:660) ~[classes/:?]
    at org.apache.hadoop.hive.metastore.HMSHandler.create_table_core(HMSHandler.java:2325) ~[classes/:?]
    at org.apache.hadoop.hive.metastore.HMSHandler.create_table_req(HMSHandler.java:2578) [classes/:?]{noformat}
[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-2441/15/tests]
[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-2473/3/tests]
 
The cause is that the table path already exists before the test is executed, 
and the TableLocationStrategy with prohibit does not allow alternate locations. 





[jira] [Created] (HIVE-25365) Insufficient privileges to show partitions when partition columns are authorized

2021-07-21 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25365:
--

 Summary: Insufficient privileges to show partitions when 
partition columns are authorized
 Key: HIVE-25365
 URL: https://issues.apache.org/jira/browse/HIVE-25365
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Zhihua Deng


When privileges on the partition columns have been granted to a user, showing 
partitions still requires the select privilege on the table, even though the 
user is able to query the partition columns.





[jira] [Created] (HIVE-25294) Optimise the metadata count queries for local mode

2021-06-28 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25294:
--

 Summary: Optimise the metadata count queries for local mode
 Key: HIVE-25294
 URL: https://issues.apache.org/jira/browse/HIVE-25294
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng
Assignee: Zhihua Deng


When the Metastore is in local mode, the client uses its own private 
HMSHandler to get the metadata, and the HMSHandler must be initialized before 
it is ready to serve. When metrics are enabled, HMSHandler counts the number 
of databases, tables and partitions, which could lead to some problems.





[jira] [Created] (HIVE-25261) RetryingHMSHandler should wrap the MetaException with short description of the target

2021-06-16 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25261:
--

 Summary: RetryingHMSHandler should wrap the MetaException with 
short description of the target
 Key: HIVE-25261
 URL: https://issues.apache.org/jira/browse/HIVE-25261
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Zhihua Deng
Assignee: Zhihua Deng


[RetryingMetaStoreClient|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L267-L276]
 relies on the message of the MetaException to decide whether to retry the 
current operation after a failure. However, RetryingHMSHandler only wraps the 
message of the cause into the MetaException, which may leave the client unable 
to retry with other metastore instances.

For example, if we get this exception:
{code:java}
Caused by: javax.jdo.JDOFatalUserException: Persistence Manager has been closed
 at org.datanucleus.api.jdo.JDOPersistenceManager.assertIsOpen(JDOPersistenceManager.java:2235)
 at org.datanucleus.api.jdo.JDOPersistenceManager.evictAll(JDOPersistenceManager.java:481)
 at org.apache.hadoop.hive.metastore.ObjectStore.rollbackTransaction(ObjectStore.java:635)
 at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:1415)
 at sun.reflect.GeneratedMethodAccessor153.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498){code}
RetryingHMSHandler throws a MetaException with the message 'Persistence 
Manager has been closed', which does not match the recoverable pattern defined 
in the client.
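
The effect can be sketched with a message-based retry check. Everything below 
is hypothetical: the RECOVERABLE pattern is an assumed example and may differ 
from the one in RetryingMetaStoreClient, and JDOFatalUserException is a local 
stand-in for the real javax.jdo class. The sketch shows that a message carrying 
only the cause's text does not match, while a short description that includes 
the cause's class name does.

```java
import java.util.regex.Pattern;

// Local stand-in for javax.jdo.JDOFatalUserException (assumption for the demo).
final class JDOFatalUserException extends RuntimeException {
    JDOFatalUserException(String message) { super(message); }
}

// Hypothetical sketch of a message-based retry decision.
final class RetryDecisionSketch {
    // Assumed recoverable pattern; the real client-side pattern may differ.
    static final Pattern RECOVERABLE =
        Pattern.compile(".*JDO[a-zA-Z]*Exception.*", Pattern.DOTALL);

    // Client side: decide from the MetaException message alone.
    static boolean isRecoverable(String metaExceptionMessage) {
        return metaExceptionMessage != null
            && RECOVERABLE.matcher(metaExceptionMessage).matches();
    }

    // Server side: keep the cause's class name in the wrapped message.
    static String describe(Throwable cause) {
        return cause.getClass().getName() + ": " + cause.getMessage();
    }
}
```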





[jira] [Created] (HIVE-25192) No need to create table directory for the non-native table

2021-06-03 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25192:
--

 Summary: No need to create table directory for the non-native table
 Key: HIVE-25192
 URL: https://issues.apache.org/jira/browse/HIVE-25192
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


When creating non-native tables such as Kudu or HBase tables, we always create 
a warehouse location for them, even though these tables may not use the 
location to store data or for the job plan. There is no need to create such a 
location. 
We should also skip getting the input summary of non-native tables in some 
cases. This avoids the OOM problem of building the hash table when the 
non-native table is on the build side.





[jira] [Created] (HIVE-25188) JsonSerDe: Unable to read the string value from a nested json

2021-06-02 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25188:
--

 Summary: JsonSerDe: Unable to read the string value from a nested 
json
 Key: HIVE-25188
 URL: https://issues.apache.org/jira/browse/HIVE-25188
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 4.0.0
Reporter: Zhihua Deng
Assignee: Zhihua Deng


Steps to reproduce:
create table json_table(data string, messageid string, publish_time bigint, 
attributes string);
 
If the data in the table is stored like:
{code:java}
{"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}}{code}
an exception is thrown when trying to deserialize the data:
 
Caused by: java.lang.IllegalArgumentException
 at com.google.common.base.Preconditions.checkArgument(Preconditions.java:108)
 at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitLeafNode(HiveJsonReader.java:374)
 at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitNode(HiveJsonReader.java:216)
 at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitStructNode(HiveJsonReader.java:327)
 at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitNode(HiveJsonReader.java:221)
 at org.apache.hadoop.hive.serde2.json.HiveJsonReader.parseStruct(HiveJsonReader.java:198)
 at org.apache.hadoop.hive.serde2.JsonSerDe.deserialize(JsonSerDe.java:181)





[jira] [Created] (HIVE-25055) Improve the exception handling in HMSHandler

2021-04-23 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25055:
--

 Summary: Improve the exception handling in HMSHandler
 Key: HIVE-25055
 URL: https://issues.apache.org/jira/browse/HIVE-25055
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng
Assignee: Zhihua Deng








[jira] [Created] (HIVE-25048) Refine the start/end functions in HMSHandler

2021-04-22 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-25048:
--

 Summary: Refine the start/end functions in HMSHandler
 Key: HIVE-25048
 URL: https://issues.apache.org/jira/browse/HIVE-25048
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng
Assignee: Zhihua Deng


Some start/end functions in the HMSHandler are incomplete. These functions 
audit user actions, monitor performance, and notify the listeners.





[jira] [Created] (HIVE-24969) Predicates are removed by PPD when left semi join followed by lateral view

2021-04-02 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24969:
--

 Summary: Predicates are removed by PPD when left semi join 
followed by lateral view
 Key: HIVE-24969
 URL: https://issues.apache.org/jira/browse/HIVE-24969
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Reporter: Zhihua Deng
Assignee: Zhihua Deng


Steps to reproduce:
{code:java}
select count(distinct logItem.triggerId)
from service_stat_log LATERAL VIEW explode(logItems) LogItemTable AS logItem
where logItem.dsp in ('delivery', 'ocpa')
and logItem.iswin = true
and logItem.adid in (
 select distinct adId
 from ad_info
 where subAccountId in (16010, 14863));  {code}
The predicates _logItem.dsp in ('delivery', 'ocpa')_ and _logItem.iswin = 
true_ are removed during PPD: JOIN -> RS -> LVJ. The JOIN has the 
candidates: logitem -> [logItem.dsp in ('delivery', 'ocpa'), logItem.iswin = 
true]. When they are pushed to the RS followed by the LVJ, none of them is 
pushed, and the candidates of logitem are finally removed by default, which 
causes the wrong result.





[jira] [Created] (HIVE-24901) Re-enable tests in TestBeeLineWithArgs

2021-03-17 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24901:
--

 Summary: Re-enable tests in TestBeeLineWithArgs
 Key: HIVE-24901
 URL: https://issues.apache.org/jira/browse/HIVE-24901
 Project: Hive
  Issue Type: Test
  Components: Test
Reporter: Zhihua Deng


Re-enable the tests in TestBeeLineWithArgs, because they are stable on master now:

http://ci.hive.apache.org/job/hive-flaky-check/219/





[jira] [Created] (HIVE-24802) Show operation log at webui

2021-02-20 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24802:
--

 Summary: Show operation log at webui
 Key: HIVE-24802
 URL: https://issues.apache.org/jira/browse/HIVE-24802
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


Currently we provide getQueryLog in HiveStatement to fetch the operation log, 
and the operation log is deleted when the operation closes (with a delay for 
canceled operations). Sometimes it is not easy for the user (JDBC) or 
administrators to dig into the details of a finished (failed) operation, so 
we present the operation log on the WebUI and keep it for some time for later 
analysis.





[jira] [Created] (HIVE-24792) Potential thread leak in Operation

2021-02-18 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24792:
--

 Summary: Potential thread leak in Operation
 Key: HIVE-24792
 URL: https://issues.apache.org/jira/browse/HIVE-24792
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Zhihua Deng


The _scheduledExecutorService_ in _Operation_ is not shut down after 
scheduling the delayed operation log cleanup, which may result in a thread 
leak in HiveServer2.
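
A minimal Java sketch of one way to avoid such a leak, assuming a single-use 
scheduler per cleanup. DelayedCleanupSketch and its methods are hypothetical 
names, not Hive code. Calling shutdown() right after schedule() still lets the 
delayed task run under the JDK default policy, after which the worker thread 
can exit:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the delayed log cleanup; this is not Hive's Operation class.
class DelayedCleanupSketch {

  // Schedules the cleanup once, then shuts the scheduler down so its worker thread can exit.
  static ScheduledExecutorService scheduleCleanup(Runnable cleanup, long delayMillis) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.schedule(cleanup, delayMillis, TimeUnit.MILLISECONDS);
    // Under the JDK default policy, delayed tasks that are already scheduled still run after shutdown().
    scheduler.shutdown();
    return scheduler;
  }

  // Returns true if the cleanup ran and the scheduler terminated within five seconds.
  static boolean cleanupRunsAndSchedulerTerminates(long delayMillis) {
    AtomicBoolean cleaned = new AtomicBoolean(false);
    ScheduledExecutorService scheduler = scheduleCleanup(() -> cleaned.set(true), delayMillis);
    try {
      return scheduler.awaitTermination(5, TimeUnit.SECONDS) && cleaned.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("cleaned and terminated: " + cleanupRunsAndSchedulerTerminates(50));
  }
}
```

Whether the actual fix shuts the executor down per operation or shares one 
executor across operations is not stated here; the sketch only shows the first 
option.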





[jira] [Created] (HIVE-24752) Returned operation's drilldown link may be broken since HIVE-23625

2021-02-08 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24752:
--

 Summary: Returned operation's drilldown link may be broken since 
HIVE-23625
 Key: HIVE-24752
 URL: https://issues.apache.org/jira/browse/HIVE-24752
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Zhihua Deng


The path spec for the query page has changed from _query_page_ to 
_query_page.html_:

 
{code:java}
webServer.addServlet("query_page", "/query_page.html", 
QueryProfileServlet.class);{code}
 

The drilldown link of the returned operation may be broken if 
hive.server2.show.operation.drilldown.link is enabled.

 

 





[jira] [Created] (HIVE-24694) Early connection close to release server resources during creating

2021-01-27 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24694:
--

 Summary: Early connection close to release server resources during 
creating
 Key: HIVE-24694
 URL: https://issues.apache.org/jira/browse/HIVE-24694
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Zhihua Deng
Assignee: Zhihua Deng


If an exception happens while we try to get the connection from HiveDriver, the 
opened transport or session may be left unclosed. Because the returned 
connection is null, we cannot call the close method to release the server 
resources (threads/connection quota). This gets worse if the user reaches the 
connection limit: subsequent calls to get a connection will fail until we 
restart HS2.
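
The intended behavior can be sketched in Java as follows. This is only an 
illustration: FakeTransport, connect and lastTransport are hypothetical names, 
and the failure is simulated. The point is that the code opening the resources 
must close them itself when a later step fails, since the caller never gets a 
handle to close:

```java
class ConnectionOpenSketch {

  // Hypothetical stand-in for the transport/session opened while connecting.
  static final class FakeTransport implements AutoCloseable {
    boolean closed;

    @Override
    public void close() {
      closed = true;
    }
  }

  // Kept only so the demo can inspect the transport after a failed connect.
  static FakeTransport lastTransport;

  // Opens a transport, then runs a later step that may fail; closes the transport on failure.
  static FakeTransport connect(boolean failOpeningSession) {
    FakeTransport transport = new FakeTransport();
    lastTransport = transport;
    try {
      if (failOpeningSession) {
        throw new IllegalStateException("simulated failure while opening the session");
      }
      return transport;
    } catch (RuntimeException e) {
      transport.close(); // release server-side resources before propagating the error
      throw e;
    }
  }

  public static void main(String[] args) {
    try {
      connect(true);
    } catch (IllegalStateException e) {
      System.out.println("connect failed, transport closed: " + lastTransport.closed);
    }
  }
}
```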





[jira] [Created] (HIVE-24666) Vectorized UDFToBoolean may unable to filter rows if input is string

2021-01-20 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24666:
--

 Summary: Vectorized UDFToBoolean may unable to filter rows if 
input is string
 Key: HIVE-24666
 URL: https://issues.apache.org/jira/browse/HIVE-24666
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Zhihua Deng
Assignee: Zhihua Deng


If we cast to boolean in a where condition to filter rows, the filter fails 
to filter rows under vectorized execution. Steps to reproduce:
{code:java}
create table vtb (key string, value string);
insert into table vtb values('0', 'val0'), ('false', 'valfalse'),('off', 
'valoff'),('no','valno'),('vk', 'valvk');
select distinct value from vtb where cast(key as boolean); {code}
It seems we don't generate a SelectColumnIsTrue to filter the rows if the 
input type of the cast is string:
 
https://github.com/apache/hive/blob/ff6f3565e50148b7bcfbcf19b970379f2bd59290/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L2995-L2996





[jira] [Created] (HIVE-24639) Raises SemanticException other than ClassCastException when filter has non-boolean expressions

2021-01-14 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24639:
--

 Summary: Raises SemanticException other than ClassCastException 
when filter has non-boolean expressions
 Key: HIVE-24639
 URL: https://issues.apache.org/jira/browse/HIVE-24639
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


Sometimes we see a ClassCastException in filters when fetching some rows of a 
table or executing the query. GenericUDFOPOr, GenericUDFOPAnd and 
FilterOperator assume that the output of their conditions is a boolean, but 
this is not guaranteed. For example: 

_select * from ccn_table where src + 1;_ 

will throw ClassCastException:
{code:java}
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to 
java.lang.Boolean
at 
org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:125)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:173)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:553)
...{code}
It would be better to validate the filter during analysis instead of at 
runtime and to provide more meaningful messages.
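
A rough Java sketch of such an up-front check, under stated assumptions: 
FilterTypeCheckSketch and validatePredicate are hypothetical names, and 
IllegalArgumentException stands in for Hive's SemanticException. The check 
rejects a predicate whose result type is not boolean before any row is 
processed:

```java
class FilterTypeCheckSketch {

  // Rejects a predicate whose result type is not boolean before any row is processed.
  // IllegalArgumentException stands in for Hive's SemanticException here.
  static void validatePredicate(String exprText, Class<?> resultType) {
    if (resultType != Boolean.class) {
      throw new IllegalArgumentException(
          "Filter expression '" + exprText + "' must be boolean, but is "
              + resultType.getSimpleName());
    }
  }

  public static void main(String[] args) {
    validatePredicate("src > 1", Boolean.class); // accepted
    try {
      validatePredicate("src + 1", Integer.class); // rejected at analysis time
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```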

 
 





[jira] [Created] (HIVE-24632) Replace with null when GenericUDFBaseCompare has a non-interpretable val

2021-01-13 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24632:
--

 Summary: Replace with null when GenericUDFBaseCompare has a 
non-interpretable val
 Key: HIVE-24632
 URL: https://issues.apache.org/jira/browse/HIVE-24632
 Project: Hive
  Issue Type: Improvement
  Components: Parser
Affects Versions: 4.0.0
Reporter: Zhihua Deng


The query

 
{code:java}
create table ccn_table(key int, value string);
set hive.cbo.enable=false;
select * from ccn_table where key > '123a'  ;
{code}
 

will scan all records (partitions), unlike older versions, as the plan 
shows: 

 
{noformat}
STAGE PLANS:
 Stage: Stage-0
   Fetch Operator
 limit: -1
 Processor Tree:
   TableScan
 alias: ccn_table
 filterExpr: (key > '123a') (type: boolean)
 Statistics: Num rows: 2 Data size: 180 Basic stats: COMPLETE Column 
stats: COMPLETE
 GatherStats: false
 Filter Operator
   isSamplingPred: false
   predicate: (key > '123a') (type: boolean)
   Statistics: Num rows: 1 Data size: 90 Basic stats: COMPLETE Column 
stats: COMPLETE
   Select Operator
 expressions: key (type: int), value (type: string)
 outputColumnNames: _col0, _col1
 Statistics: Num rows: 1 Data size: 90 Basic stats: COMPLETE Column 
stats: COMPLETE
 ListSink{noformat}
 

 

 

 

 

 

When TypeCheckProcFactory#getXpathOrFuncExprNodeDesc validates the expr 
+key > '123a'+, the operator (>) is not an equality operator (=), so the 
factory returns +key > '123a'+ as it is. However, all subclasses of 
GenericUDFBaseCompare (except GenericUDFOPEqualNS and GenericUDFOPNotEqualNS) 
return null if either child of the function is null, so it is safe to return 
a constant null when processing the expr +key > '123a'+. This will benefit 
some queries when CBO is disabled.
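
The folding rule can be illustrated with a small Java sketch. It is a 
simplification and not Hive's type-conversion logic: CompareFoldSketch and 
greaterThan are hypothetical names, and the literal is parsed as a plain int. 
A literal that cannot be interpreted as a number yields null, which plays the 
role of SQL NULL:

```java
class CompareFoldSketch {

  // Returns null (standing in for SQL NULL) when the literal cannot be read as an int,
  // otherwise the result of key > literal.
  static Boolean greaterThan(int key, String literal) {
    try {
      return key > Integer.parseInt(literal.trim());
    } catch (NumberFormatException e) {
      return null; // non-interpretable value: the whole comparison is null
    }
  }

  public static void main(String[] args) {
    System.out.println(greaterThan(200, "123"));  // an ordinary comparison
    System.out.println(greaterThan(200, "123a")); // folds to null for every key
  }
}
```

Because the result is null for every key, a filter on such a predicate can be 
decided at compile time instead of being evaluated per row.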





[jira] [Created] (HIVE-24575) VectorGroupByOperator reusing keys can lead to wrong results

2020-12-31 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24575:
--

 Summary: VectorGroupByOperator reusing keys can lead to wrong 
results
 Key: HIVE-24575
 URL: https://issues.apache.org/jira/browse/HIVE-24575
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Zhihua Deng
Assignee: Zhihua Deng


A common SQL query like
{code:java}
select category as category, count(distinct maskdid) as uv from 
dwd_internal_inc_d group by category{code}
can return a wrong result on trunk: the values of column category can be 
mixed up, and the aggregate of distinct maskdid is also wrong. 
After some debugging, we found that the problem is caused by a wrong 
byteStarts[i] when it is used to copy the current keys to the reusable keys: 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/wrapper/VectorHashKeyWrapperGeneral.java#L351-L362]
byteStarts[i] is always 0 due to Arrays.fill(byteStarts, 0). So when 
clone.byteValues[i].length >= byteValues[i].length is met, the copy into the 
reusable keys starts from 0 rather than from the real start index, for the 
length of the current keys, which causes the problem.
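
The effect of a wrong start offset can be shown with a tiny Java sketch. It 
does not reproduce VectorHashKeyWrapperGeneral: KeyCopySketch and copyKey are 
hypothetical names, and the buffer contents are made up. Copying from offset 0 
instead of the key's real start yields the bytes of a different key:

```java
import java.nio.charset.StandardCharsets;

class KeyCopySketch {

  // Copies len bytes of the shared buffer, beginning at start, into a fresh key array.
  static String copyKey(byte[] buffer, int start, int len) {
    byte[] key = new byte[len];
    System.arraycopy(buffer, start, key, 0, len);
    return new String(key, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] buffer = "foobar".getBytes(StandardCharsets.UTF_8);
    int realStart = 3; // in this made-up buffer the key "bar" lives at offset 3
    int len = 3;
    System.out.println("using start 0:        " + copyKey(buffer, 0, len));
    System.out.println("using the real start: " + copyKey(buffer, realStart, len));
  }
}
```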
 
 





[jira] [Created] (HIVE-24511) Fix typo in SerDeStorageSchemaReader

2020-12-09 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24511:
--

 Summary: Fix typo in SerDeStorageSchemaReader
 Key: HIVE-24511
 URL: https://issues.apache.org/jira/browse/HIVE-24511
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Zhihua Deng


1. Close the created classloader to release resources.
2. Provide more detailed error messages when throwing MetaException.





[jira] [Created] (HIVE-24422) Throw SemanticException when CTE alias is conflicted with table name

2020-11-24 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24422:
--

 Summary: Throw SemanticException when CTE alias is conflicted with 
table name
 Key: HIVE-24422
 URL: https://issues.apache.org/jira/browse/HIVE-24422
 Project: Hive
  Issue Type: Improvement
  Components: Parser
Reporter: Zhihua Deng


If the alias of a CTE conflicts with a table name, we use the alias to fetch 
the table rather than replacing it with the ASTNode tree, which may cause 
some confusing problems. For example:

{noformat}
create table game_info (game_name string);

with game_info as (
select distinct ext_id, dev_app_id, game_name
from game_info_extend )
select count(game_name) from game_info;{noformat}
The query returns the number of rows of the table game_info instead of 
game_info_extend. It may be better to throw an exception to avoid such cases.
 





[jira] [Created] (HIVE-24411) Make ThreadPoolExecutorWithOomHook more awareness of OutOfMemoryError

2020-11-23 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24411:
--

 Summary: Make ThreadPoolExecutorWithOomHook more awareness of 
OutOfMemoryError
 Key: HIVE-24411
 URL: https://issues.apache.org/jira/browse/HIVE-24411
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Zhihua Deng
Assignee: Zhihua Deng


Currently the ThreadPoolExecutorWithOomHook invokes the OOM hooks and stops 
HiveServer2 in case of an OutOfMemoryError when executing tasks. The exception 
is obtained by calling `future.get()`. However, that exception may never be an 
instance of OutOfMemoryError, because the error is wrapped in an 
ExecutionException; see the method report in class FutureTask.
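
The wrapping can be reproduced with the JDK alone. The Java sketch below is an 
illustration under assumptions: OomUnwrapSketch and its methods are 
hypothetical names, and the OutOfMemoryError is simulated rather than caused by 
real memory pressure. It shows that a direct instanceof check on the exception 
from Future.get() misses the error, while walking the cause chain finds it:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OomUnwrapSketch {

  // Walks the cause chain, because Future.get() reports task failures wrapped in ExecutionException.
  static boolean causedByOom(Throwable t) {
    while (t != null) {
      if (t instanceof OutOfMemoryError) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  // Runs a task that throws a simulated OutOfMemoryError and returns what Future.get() throws.
  static Throwable failureSeenByCaller() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    Runnable task = () -> {
      throw new OutOfMemoryError("simulated for illustration");
    };
    try {
      Future<?> future = pool.submit(task);
      future.get();
      return null;
    } catch (ExecutionException | InterruptedException e) {
      return e;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    Throwable failure = failureSeenByCaller();
    System.out.println("direct instanceof check: " + (failure instanceof OutOfMemoryError));
    System.out.println("cause-chain check: " + causedByOom(failure));
  }
}
```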





[jira] [Created] (HIVE-24358) Some tasks should set exception on failures

2020-11-04 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24358:
--

 Summary: Some tasks should set exception on failures
 Key: HIVE-24358
 URL: https://issues.apache.org/jira/browse/HIVE-24358
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


Some tasks do not set the exception on failure. This information helps 
beeline users figure out the problem and is useful for the configured failure 
hooks.





[jira] [Created] (HIVE-24351) Report progress to prevent merge task from timeout

2020-11-03 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24351:
--

 Summary: Report progress to prevent merge task from timeout
 Key: HIVE-24351
 URL: https://issues.apache.org/jira/browse/HIVE-24351
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


If the MergeFileTask tries to merge lots of empty files, the task may be 
terminated due to task timeout. It’s rare, but it happens. Report progress 
regularly to prevent the mapper from timing out.





[jira] [Created] (HIVE-24310) Allow specified number of deserialize errors to be ignored

2020-10-23 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24310:
--

 Summary: Allow specified number of deserialize errors to be ignored
 Key: HIVE-24310
 URL: https://issues.apache.org/jira/browse/HIVE-24310
 Project: Hive
  Issue Type: Improvement
  Components: Operators
Reporter: Zhihua Deng
Assignee: Zhihua Deng


Sometimes we see corrupted records in a user's raw data, such as one corrupted 
record in a file that contains thousands of records. The user has to either 
give up all records or replay the whole data in order to run successfully on 
Hive. We should provide a way to ignore such corrupted records. 
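
One possible shape of such a limit, sketched in Java under assumptions: 
ErrorBudgetSketch, parseRows and maxErrors are hypothetical names, and parsing 
an int stands in for deserializing a record. Corrupted rows are skipped until 
the configured number of errors is exceeded, at which point the job fails:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ErrorBudgetSketch {

  // Parses each row as an int, skipping corrupted rows until maxErrors is exceeded.
  static List<Integer> parseRows(List<String> rows, int maxErrors) {
    List<Integer> parsed = new ArrayList<>();
    int errors = 0;
    for (String row : rows) {
      try {
        parsed.add(Integer.parseInt(row));
      } catch (NumberFormatException e) {
        errors++;
        if (errors > maxErrors) {
          throw new IllegalStateException("Gave up after " + errors + " corrupted records", e);
        }
      }
    }
    return parsed;
  }

  public static void main(String[] args) {
    // One corrupted row is tolerated with a budget of 1.
    System.out.println(parseRows(Arrays.asList("1", "oops", "3"), 1));
  }
}
```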
 





[jira] [Created] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky

2020-10-09 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24248:
--

 Summary: TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
 Key: HIVE-24248
 URL: https://issues.apache.org/jira/browse/HIVE-24248
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests]
{code:java}
java.lang.AssertionError:
Client Execution succeeded but contained differences (error code = 1) after 
executing subquery_join_rewrite.q
241,244d240
< 1 1
< 1 2
< 2 1
< 2 2
245a242,243
> 2 2
{code}
 
 





[jira] [Created] (HIVE-24146) Cleanup TaskExecutionException in GenericUDTFExplode

2020-09-10 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24146:
--

 Summary: Cleanup TaskExecutionException in GenericUDTFExplode
 Key: HIVE-24146
 URL: https://issues.apache.org/jira/browse/HIVE-24146
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Zhihua Deng
Assignee: Zhihua Deng


- Remove TaskExecutionException, which may no longer be used;
- Remove the default handling in GenericUDTFExplode#process, which has already 
been verified during function initialization.





[jira] [Created] (HIVE-24107) Fix typo in ReloadFunctionsOperation

2020-09-02 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24107:
--

 Summary: Fix typo in ReloadFunctionsOperation
 Key: HIVE-24107
 URL: https://issues.apache.org/jira/browse/HIVE-24107
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


Hive.get() registers all functions because doRegisterAllFns is true, so 
Hive.get().reloadFunctions() may load all functions from the metastore twice. 
Using Hive.get(false) instead may be better.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted

2020-09-02 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24106:
--

 Summary: Abort polling on the operation state when the current 
thread is interrupted
 Key: HIVE-24106
 URL: https://issues.apache.org/jira/browse/HIVE-24106
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: Zhihua Deng


When a HiveStatement runs asynchronously as a task, for example in a thread or 
a future, interrupting the task does not stop the HiveStatement: it continues 
to poll the operation state until the operation finishes. It may be better to 
provide a way to abort the execution in such a case.
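
A minimal Java sketch of an interrupt-aware polling loop, under stated 
assumptions: InterruptiblePollSketch and pollUntilFinished are hypothetical 
names, and the BooleanSupplier stands in for the check of the operation state. 
The loop gives up as soon as the current thread is interrupted and keeps the 
interrupt flag set for the caller:

```java
import java.util.function.BooleanSupplier;

class InterruptiblePollSketch {

  // Polls until isFinished reports true; returns false if the thread is interrupted first.
  static boolean pollUntilFinished(BooleanSupplier isFinished, long intervalMillis) {
    while (!isFinished.getAsBoolean()) {
      if (Thread.currentThread().isInterrupted()) {
        return false; // abort instead of polling until the operation completes
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the flag for callers
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt(); // simulate a caller cancelling the task
    System.out.println("finished: " + pollUntilFinished(() -> false, 100));
  }
}
```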





[jira] [Created] (HIVE-24069) HiveHistory should log the task that ends abnormally

2020-08-24 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24069:
--

 Summary: HiveHistory should log the task that ends abnormally
 Key: HIVE-24069
 URL: https://issues.apache.org/jira/browse/HIVE-24069
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


When the task returns with an exitVal not equal to 0, the Executor skips 
marking the task return code and calling endTask. This may leave the history 
log incomplete for such tasks.





[jira] [Created] (HIVE-24063) SqlFunctionConverter#getHiveUDF handles cast before geting FunctionInfo

2020-08-24 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24063:
--

 Summary: SqlFunctionConverter#getHiveUDF handles cast before 
geting FunctionInfo
 Key: HIVE-24063
 URL: https://issues.apache.org/jira/browse/HIVE-24063
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


When the current SqlOperator is SqlCastFunction, 
FunctionRegistry.getFunctionInfo returns null. 
But when hive.allow.udf.load.on.demand is enabled, HiveServer2 refers to the 
metastore for the function definition, and an exception stack trace can be 
seen in the HiveServer2 log:

INFO exec.FunctionRegistry: Unable to look up default.cast in metastore
org.apache.hadoop.hive.ql.metadata.HiveException: 
NoSuchObjectException(message:Function @hive#default.cast does not exist)
 at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:5495) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfoFromMetastoreNoLock(Registry.java:788)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.Registry.getQualifiedFunctionInfo(Registry.java:657)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfo(Registry.java:351) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:597)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.translator.SqlFunctionConverter.getHiveUDF(SqlFunctionConverter.java:158)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:112)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] 
 
So it may be better to handle the explicit cast before getting the 
FunctionInfo from the Registry. Even if there is no cast in the query, the 
method handleExplicitCast returns null quickly when op.kind is not SqlKind.CAST.
 





[jira] [Created] (HIVE-24044) Implement listPartitionNames with filter or order on temporary tables

2020-08-16 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24044:
--

 Summary: Implement listPartitionNames with filter or order on 
temporary tables 
 Key: HIVE-24044
 URL: https://issues.apache.org/jira/browse/HIVE-24044
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 4.0.0
Reporter: Zhihua Deng


Temporary tables can have their own partitions, and IMetaStoreClient uses
{code:java}
List listPartitionNames(PartitionsByExprRequest request){code}
to filter or sort the results. This method can be implemented on temporary 
tables.





[jira] [Created] (HIVE-23997) Some logs in ConstantPropagateProcFactory are not straightforward

2020-08-05 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23997:
--

 Summary: Some logs in ConstantPropagateProcFactory are not 
straightforward
 Key: HIVE-23997
 URL: https://issues.apache.org/jira/browse/HIVE-23997
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Reporter: Zhihua Deng
Assignee: Zhihua Deng


Some logs in ConstantPropagateProcFactory are not easy to understand. For 
example, the query:
 select * from tbl where a = 'a1';
produces logs like this:
optimizer.ConstantPropagateProcFactory: Filter 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual@78907a46 is identified 
as a value assignment, propagate it.
 
It may be better to log like this:
optimizer.ConstantPropagateProcFactory: Filter (a = 'a1') is identified as a 
value assignment, propagate it.
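
The difference between the two log lines comes down to what is passed to the 
message. In the Java sketch below, FilterLogSketch, OpEqual and logLine are 
hypothetical names. An object without a toString() override prints as a class 
name and hash, while the expression text prints as written:

```java
class FilterLogSketch {

  // Stand-in for a UDF object that does not override toString().
  static final class OpEqual { }

  // Builds the log line from whatever describes the filter.
  static String logLine(Object filter) {
    return "Filter " + filter + " is identified as a value assignment, propagate it.";
  }

  public static void main(String[] args) {
    System.out.println(logLine(new OpEqual())); // class name and hash, hard to read
    System.out.println(logLine("(a = 'a1')"));  // the expression text
  }
}
```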





[jira] [Created] (HIVE-23893) Extract deterministic conditions for pdd when the predicate contains non-deterministic function

2020-07-21 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23893:
--

 Summary: Extract deterministic conditions for pdd when the 
predicate contains non-deterministic function
 Key: HIVE-23893
 URL: https://issues.apache.org/jira/browse/HIVE-23893
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Reporter: Zhihua Deng


Take the following query as an example, assuming unix_timestamp is 
non-deterministic before version 1.3.0:
 
SELECT
        from_unixtime(unix_timestamp(a.first_dt), 'MMdd') AS ft,
        b.game_id AS game_id,
        b.game_name AS game_name,
        count(DISTINCT a.sha1_imei) uv
FROM
        gamesdk_userprofile a
        JOIN game_info_all b ON a.appid = b.dev_app_id
WHERE
        a.date = 20200704
        AND from_unixtime(unix_timestamp(a.first_dt), 'MMdd') = 20200704
        AND b.date = 20200704
GROUP BY
        from_unixtime(unix_timestamp(a.first_dt), 'MMdd'),
        b.game_id,
        b.game_name
ORDER BY
        uv DESC
LIMIT 200;
 
The predicates (a.date = 20200704, b.date = 20200704) cannot be pushed down to 
the join op. This makes the optimizer unable to prune partitions, which may 
result in a full scan on tables gamesdk_userprofile and game_info_all.
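
The extraction proposed in the title can be sketched in Java as a filter over 
the AND-ed terms of the WHERE clause. This is an illustration only: 
DeterministicConjunctsSketch, Conjunct and pushable are hypothetical names, and 
the deterministic flag is supplied by hand instead of being derived from the 
function registry:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeterministicConjunctsSketch {

  // One AND-ed term of the WHERE clause plus a flag saying whether it is deterministic.
  static final class Conjunct {
    final String text;
    final boolean deterministic;

    Conjunct(String text, boolean deterministic) {
      this.text = text;
      this.deterministic = deterministic;
    }
  }

  // Keeps only the deterministic conjuncts; these are the candidates for pushdown.
  static List<String> pushable(List<Conjunct> conjuncts) {
    List<String> result = new ArrayList<>();
    for (Conjunct c : conjuncts) {
      if (c.deterministic) {
        result.add(c.text);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Conjunct> where = Arrays.asList(
        new Conjunct("a.date = 20200704", true),
        new Conjunct("from_unixtime(unix_timestamp(a.first_dt), 'MMdd') = 20200704", false),
        new Conjunct("b.date = 20200704", true));
    System.out.println(pushable(where));
  }
}
```

The non-deterministic term stays where it is, while the two date predicates 
become available for partition pruning.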





[jira] [Created] (HIVE-23850) Allow PPD when subject is not a column with grouping sets present

2020-07-15 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23850:
--

 Summary: Allow PPD when subject is not a column with grouping sets 
present
 Key: HIVE-23850
 URL: https://issues.apache.org/jira/browse/HIVE-23850
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Reporter: Zhihua Deng


After [HIVE-19653|https://issues.apache.org/jira/browse/HIVE-19653], filters 
with only columns and constants are pushed down, but in some cases this may 
not work well, for example:
SET hive.cbo.enable=false;
SELECT a, b, sum(s)
FROM T1
GROUP BY a, b GROUPING SETS ((a), (a, b))
HAVING upper(a) = "AAA" AND sum(s) > 100;
 
SELECT upper(a), b, sum(s)
FROM T1
GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b))
HAVING upper(a) = "AAA" AND sum(s) > 100;
 
The filters pushed down to GBY can be f(gbyKey) or gbyKey with a UDF, not 
only the column group-by keys.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23800) Make HiveServer2 oom hook interface

2020-07-03 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23800:
--

 Summary: Make HiveServer2 oom hook interface
 Key: HIVE-23800
 URL: https://issues.apache.org/jira/browse/HIVE-23800
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng








[jira] [Created] (HIVE-23797) Throwing exception when no metastore spec found in zookeeper

2020-07-01 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23797:
--

 Summary: Throwing exception when no metastore spec found in 
zookeeper
 Key: HIVE-23797
 URL: https://issues.apache.org/jira/browse/HIVE-23797
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


When service discovery is enabled for the metastore, there is a chance that 
the client finds no metastore URIs available in ZooKeeper, such as during 
metastore startup or when the client has misconfigured the path. This results 
in redundant retries and finally a MetaException with an "Unknown exception" 
message.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23727) Improve SQLOperation log handling when cleanup

2020-06-19 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23727:
--

 Summary: Improve SQLOperation log handling when cleanup
 Key: HIVE-23727
 URL: https://issues.apache.org/jira/browse/HIVE-23727
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


SQLOperation checks _if (shouldRunAsync() && state != 
OperationState.CANCELED && state != OperationState.TIMEDOUT)_ before canceling 
the background task. When the check is true, the state cannot be 
OperationState.CANCELED, so the logging under state == OperationState.CANCELED 
never happens.

 





[jira] [Created] (HIVE-23722) Emit operation's drilldown link to client

2020-06-17 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23722:
--

 Summary: Emit operation's drilldown link to client
 Key: HIVE-23722
 URL: https://issues.apache.org/jira/browse/HIVE-23722
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


Currently the HiveServer2 WebUI provides a drilldown link to many collected 
metrics and messages about an operation, but it's not easy for an end user to 
find the target URL of a submitted query. Limited knowledge of the deployment, 
HA-based environments (such as using LVS for balancing or routing), and 
multiple running queries can make things more difficult. This jira provides a 
way to emit the link to the interested end user when enabled.





[jira] [Created] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23720:
--

 Summary: Background task should be interrupted when operation 
being canceled or timeout
 Key: HIVE-23720
 URL: https://issues.apache.org/jira/browse/HIVE-23720
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Zhihua Deng


Currently SQLOperation cancels the background task only when the following 
condition is met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The condition evaluates to false when the state is OperationState.CANCELED or 
OperationState.TIMEDOUT, but operations in such states should stop their 
background tasks to release resources.
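
The behavior of the guard can be checked with a small Java sketch. It is not 
the SQLOperation code: CancelGuardSketch and guardAllowsCancel are hypothetical 
names, the enum is a trimmed local copy of the states named above, and 
shouldRunAsync() is passed in as a flag:

```java
class CancelGuardSketch {

  // Trimmed local copy of the states named in the report; Hive's real enum has more values.
  enum OperationState { RUNNING, CANCELED, TIMEDOUT }

  // The guard quoted in the report, with shouldRunAsync() passed in as a flag.
  static boolean guardAllowsCancel(boolean runAsync, OperationState state) {
    return runAsync && state != OperationState.CANCELED && state != OperationState.TIMEDOUT;
  }

  public static void main(String[] args) {
    // Prints which states let the background task be canceled under the quoted guard.
    for (OperationState state : OperationState.values()) {
      System.out.println(state + " -> " + guardAllowsCancel(true, state));
    }
  }
}
```

For CANCELED and TIMEDOUT the guard is false, so under this condition the 
background task is never canceled in exactly the states where it should be.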





[jira] [Created] (HIVE-23633) Metastore some JDO query objects do not close properly

2020-06-08 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23633:
--

 Summary: Metastore some JDO query objects do not close properly
 Key: HIVE-23633
 URL: https://issues.apache.org/jira/browse/HIVE-23633
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


After applying [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895], 
the metastore still shows a memory leak of DB resources: many StatementImpls 
are left unclosed.





[jira] [Created] (HIVE-23546) Skip authorization when user is a superuser

2020-05-25 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23546:
--

 Summary: Skip authorization when user is a superuser
 Key: HIVE-23546
 URL: https://issues.apache.org/jira/browse/HIVE-23546
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


If the current user is a superuser, there is no need to do authorization. This 
can speed up queries, especially DDL queries. For example, the superuser may 
use show partitions to determine whether it is OK to add partitions when the 
external data is ready, or to take a workflow one step further in a busy Hive 
cluster.





[jira] [Created] (HIVE-23526) Out of sequence seen in Beeline may swallow the real problem

2020-05-21 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23526:
--

 Summary: Out of sequence seen in Beeline may swallow the real 
problem 
 Key: HIVE-23526
 URL: https://issues.apache.org/jira/browse/HIVE-23526
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
 Environment: Hive 1.2.2
Reporter: Zhihua Deng


Sometimes an 'out of sequence response' message appears in Beeline, for example:

Error: org.apache.thrift.TApplicationException: CloseOperation failed: out of 
sequence response (state=08S01,code=0)
java.sql.SQLException: org.apache.thrift.TApplicationException: CloseOperation 
failed: out of sequence response
at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:198)
at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:217)
at org.apache.hive.beeline.Commands.execute(Commands.java:891)
at org.apache.hive.beeline.Commands.sql(Commands.java:713)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:976)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:816)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:774)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:487)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:470)

There are no other messages that help to figure out the cause, which makes the 
problem puzzling, because Beeline does not have concurrency problems on the 
underlying Thrift transport.






[jira] [Created] (HIVE-23269) Unsafe compares bigints and chars

2020-04-22 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23269:
--

 Summary: Unsafe compares bigints and chars
 Key: HIVE-23269
 URL: https://issues.apache.org/jira/browse/HIVE-23269
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


Comparing bigints with varchars or chars may produce wrong results, for 
example:

CREATE TABLE test_a (appid1 varchar(256), appid2 char(20));
INSERT INTO test_a VALUES ('2882303761517473127', '2882303761517473127'), 
('2882303761517473276', '2882303761517473276');

SET hive.strict.checks.type.safety=false;
SELECT appid1 FROM test_a WHERE appid1 = 2882303761517473127;
SELECT appid2 FROM test_a WHERE appid2 = 2882303761517473127;

Both queries will output the row: ('2882303761517473276','2882303761517473276')
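
One plausible explanation, which the report itself does not state, is that 
both operands are implicitly converted to double before the comparison. Near 
2.9e18 adjacent doubles are 512 apart, so the two values above collapse to the 
same double. The sketch below reproduces that effect in plain Java and 
contrasts it with an exact decimal comparison; the class and method names are 
illustrative only.

```java
import java.math.BigDecimal;

class BigintStringCompareSketch {
    // Lossy: compares after converting both sides to double
    // (assumed to mirror the implicit conversion; not confirmed by the report).
    static boolean equalAsDouble(String text, long number) {
        return Double.parseDouble(text.trim()) == (double) number;
    }

    // Exact: compares both sides as arbitrary-precision decimals.
    static boolean equalAsDecimal(String text, long number) {
        return new BigDecimal(text.trim()).compareTo(BigDecimal.valueOf(number)) == 0;
    }

    public static void main(String[] args) {
        long literal = 2882303761517473127L;
        String stored = "2882303761517473276";
        System.out.println(equalAsDouble(stored, literal));  // prints true
        System.out.println(equalAsDecimal(stored, literal)); // prints false
    }
}
```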






[jira] [Created] (HIVE-23185) Historic queries lost after HS2 restart

2020-04-13 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23185:
--

 Summary: Historic queries lost after HS2 restart
 Key: HIVE-23185
 URL: https://issues.apache.org/jira/browse/HIVE-23185
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


QueryInfoCache caches historic queries in memory. When HS2 restarts due to an 
OOM or an upgrade, those queries are no longer visible in the WebUI.





[jira] [Created] (HIVE-22989) Don't close parent classloader when session being closed

2020-03-06 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-22989:
--

 Summary: Don't close parent classloader when session being closed
 Key: HIVE-22989
 URL: https://issues.apache.org/jira/browse/HIVE-22989
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Zhihua Deng


When HiveServer2 loads UDFs, Registry uses the session-specified classloader to 
load them and caches that classloader. When the user does not set the aux jars, 
the cached classloader is the same as the session's parent classloader. In our 
case, we do not set the aux jars, but we periodically update the session's 
parent classloader to refresh user jars dynamically. Registry should do a 
sanity check when it closes the cached classloaders.
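
A minimal sketch of such a sanity check, under the assumption that the cached 
loaders are URLClassLoader instances. The helper closeCached and the class 
ClassLoaderCloseSketch are hypothetical and do not reflect Registry's actual 
code: a cached loader is closed only if it is not the session's parent 
classloader.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClassLoaderCloseSketch {
    // Closes every cached loader except the session's parent loader and
    // returns the loaders that were actually closed (hypothetical helper).
    static List<ClassLoader> closeCached(List<ClassLoader> cached, ClassLoader sessionParent) {
        List<ClassLoader> closed = new ArrayList<>();
        for (ClassLoader loader : cached) {
            if (loader == sessionParent) {
                continue; // sanity check: the parent is still in use elsewhere
            }
            if (loader instanceof URLClassLoader) {
                try {
                    ((URLClassLoader) loader).close();
                    closed.add(loader);
                } catch (IOException e) {
                    // A real implementation would log this; the sketch skips it.
                }
            }
        }
        return closed;
    }

    // Caches both a parent and a child loader, then checks that only the
    // child is closed.
    static boolean demo() {
        URLClassLoader parent = new URLClassLoader(new URL[0]);
        URLClassLoader child = new URLClassLoader(new URL[0], parent);
        List<ClassLoader> cached = Arrays.<ClassLoader>asList(parent, child);
        List<ClassLoader> closed = closeCached(cached, parent);
        return closed.size() == 1 && closed.get(0) == child;
    }
}
```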





[jira] [Created] (HIVE-22983) Address the comments on ConstantPropagate

2020-03-05 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-22983:
--

 Summary: Address the comments on ConstantPropagate
 Key: HIVE-22983
 URL: https://issues.apache.org/jira/browse/HIVE-22983
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Reporter: Zhihua Deng


ConstantPropagate traverses the DAG from root to child; a child is not 
processed until all of its parents have been visited.
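
The traversal order described here is a parents-first (topological) walk. The 
sketch below illustrates that ordering with Kahn's algorithm on a small DAG of 
made-up operator names; it is an illustration of the ordering only, not the 
ConstantPropagate implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ParentsFirstTraversalSketch {
    // Returns the nodes in an order where each node comes after all its parents.
    static List<String> visitOrder(Map<String, List<String>> children) {
        // Count how many parents of each node are still unvisited.
        Map<String, Integer> pendingParents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : children.entrySet()) {
            pendingParents.putIfAbsent(entry.getKey(), 0);
            for (String child : entry.getValue()) {
                pendingParents.merge(child, 1, Integer::sum);
            }
        }
        // Roots have no parents and can be visited immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pendingParents.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            for (String child : children.getOrDefault(node, Collections.emptyList())) {
                // A child becomes ready once its last parent has been visited.
                if (pendingParents.merge(child, -1, Integer::sum) == 0) {
                    ready.add(child);
                }
            }
        }
        return order;
    }

    // Two scans feed a join, which feeds a select (operator names are made up).
    static List<String> demo() {
        Map<String, List<String>> dag = new LinkedHashMap<>();
        dag.put("TS1", Arrays.asList("JOIN"));
        dag.put("TS2", Arrays.asList("JOIN"));
        dag.put("JOIN", Arrays.asList("SEL"));
        return visitOrder(dag);
    }
}
```

In the demo, JOIN is visited only after both TS1 and TS2, and SEL only after 
JOIN.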





[jira] [Created] (HIVE-22458) Add more constraints on showing partitions

2019-11-05 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-22458:
--

 Summary: Add more constraints on showing partitions
 Key: HIVE-22458
 URL: https://issues.apache.org/jira/browse/HIVE-22458
 Project: Hive
  Issue Type: New Feature
Reporter: Zhihua Deng


When showing the partitions of a table with thousands of partitions, all of 
them are returned and it is not easy to pick out a specific one, which makes 
show partitions hard to use. We can add where/limit/order by constraints to 
show partitions, like:

 show partitions table_name [partition_specs] where partition_field >= value 
order by partition_field desc limit n;

 





[jira] [Created] (HIVE-19818) SessionState getQueryId returns an empty string

2018-06-06 Thread Zhihua Deng (JIRA)
Zhihua Deng created HIVE-19818:
--

 Summary: SessionState getQueryId returns an empty string
 Key: HIVE-19818
 URL: https://issues.apache.org/jira/browse/HIVE-19818
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.2
Reporter: Zhihua Deng


When SQL is executed asynchronously, a new configuration based on the one the 
session holds is created and passed to the driver instance, which results in 
an empty string being returned when SessionState#getQueryId is called later on. 
This problem can be seen in HadoopJobExecHelper.java.
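
One reading of this description, not confirmed by the report, is that the 
query id is written to the copied configuration while the session still reads 
from its own. The sketch below models both configurations as plain maps to 
show the effect; the key name and the class QueryIdCopySketch are assumptions 
for illustration, not Hive's SessionState or HiveConf.

```java
import java.util.HashMap;
import java.util.Map;

class QueryIdCopySketch {
    // Key name assumed for illustration.
    static final String QUERY_ID_KEY = "hive.query.id";

    // Copies the session configuration for the driver, sets the query id on
    // the copy only, and returns what the session configuration then reports.
    static String sessionQueryIdAfterAsyncRun() {
        Map<String, String> sessionConf = new HashMap<>();
        Map<String, String> driverConf = new HashMap<>(sessionConf); // independent copy
        driverConf.put(QUERY_ID_KEY, "query-1");
        return sessionConf.getOrDefault(QUERY_ID_KEY, "");
    }
}
```

The write to the copy is never visible through the session's map, so the 
lookup falls back to the empty string.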





[jira] [Created] (HIVE-16114) NullPointerException in TezSessionPoolManager when getting the session

2017-03-05 Thread Zhihua Deng (JIRA)
Zhihua Deng created HIVE-16114:
--

 Summary: NullPointerException in TezSessionPoolManager when 
getting the session
 Key: HIVE-16114
 URL: https://issues.apache.org/jira/browse/HIVE-16114
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Zhihua Deng
Priority: Minor


Hive version: apache-hive-2.1.1
We use Hue (3.11.0) to connect to HiveServer2. When Hue starts up, it works 
with no problems. After a few hours, running the same SQL raises an exception 
about being unable to initialize TezTask.



