[jira] [Created] (HIVE-23981) Use task counter enum to get the approximate counter value

2020-08-03 Thread mahesh kumar behera (Jira)
mahesh kumar behera created HIVE-23981:
--

 Summary: Use task counter enum to get the approximate counter value
 Key: HIVE-23981
 URL: https://issues.apache.org/jira/browse/HIVE-23981
 Project: Hive
  Issue Type: Bug
Reporter: mahesh kumar behera


There are cases where the compiler misestimates the key count, which results in 
a number of hash table resizes at runtime.

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTableLoader.java#L128]

In such cases, it would be good to get the "approximate_input_records" counter 
(TEZ-4207) from upstream to compute the key count more accurately at runtime.
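
A minimal sketch of the idea, assuming the runtime counter is preferred whenever it is available and the compiler estimate is used as a fallback. The class and method names below are hypothetical and are not taken from VectorMapJoinFastHashTableLoader or the Tez API; the power-of-two sizing is likewise an assumption made for illustration.

```java
public final class KeyCountEstimate {
    private KeyCountEstimate() {}

    /**
     * Picks the key count used to size the hash table.
     * Assumption: a non-positive counter value means "counter not available".
     */
    public static long chooseKeyCount(long compilerEstimate, long approxInputRecords) {
        if (approxInputRecords <= 0) {
            // No runtime information: fall back to the compile-time estimate.
            return Math.max(compilerEstimate, 1L);
        }
        // Runtime information is available: prefer it over the estimate.
        return approxInputRecords;
    }

    /**
     * Smallest power-of-two capacity that holds keyCount entries at the
     * given load factor, capped at 2^30 so that it fits in an int.
     */
    public static int initialCapacity(long keyCount, float loadFactor) {
        long needed = (long) Math.ceil(keyCount / (double) loadFactor);
        long capped = Math.min(Math.max(needed, 1L), 1L << 30);
        long pow2 = Long.highestOneBit(capped);
        if (pow2 < capped) {
            pow2 <<= 1; // round up to the next power of two
        }
        return (int) pow2;
    }
}
```

Sizing the table once from a better estimate avoids the repeated rehashing that a too-small initial capacity causes while the table is being loaded.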

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23980) Shade guava from existing Hive modules

2020-08-03 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HIVE-23980:
--

 Summary: Shade guava from existing Hive modules
 Key: HIVE-23980
 URL: https://issues.apache.org/jira/browse/HIVE-23980
 Project: Hive
  Issue Type: Bug
Reporter: L. C. Hsieh


I'm trying to upgrade the Guava version in Spark. The JIRA ticket is SPARK-32502.

Running the tests hits an error:
{code}
sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: java.lang.IllegalAccessError: tried to access method com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; from class org.apache.hadoop.hive.ql.exec.FetchOperator
at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
{code}

I know that hive-exec does not shade Guava until HIVE-22126, but that work 
targets 4.0.0. Is there a solution for current Hive versions, e.g. Hive 2.3.7? 
Any ideas?

Thanks.
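
For context, the IllegalAccessError above is consistent with newer Guava releases no longer exposing Iterators.emptyIterator() as a public method. Independent of shading, a caller can avoid the dependency with the JDK equivalent. The sketch below only illustrates that replacement; the class name is hypothetical, and this is not the actual Hive change.

```java
import java.util.Collections;
import java.util.Iterator;

public final class EmptyIteratorExample {
    private EmptyIteratorExample() {}

    /** JDK replacement (Java 7+) for Guava's Iterators.emptyIterator(). */
    public static <T> Iterator<T> emptyIterator() {
        return Collections.emptyIterator();
    }

    public static void main(String[] args) {
        Iterator<String> it = emptyIterator();
        // An empty iterator never has a next element.
        System.out.println(it.hasNext());
    }
}
```

Shading, as asked about above, has the advantage of not requiring source changes in every affected call site.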







[jira] [Created] (HIVE-23979) Resolve spotbugs errors in JsonReporter.java, Metrics.java, and PerfLogger.java

2020-08-03 Thread Soumyakanti Das (Jira)
Soumyakanti Das created HIVE-23979:
--

 Summary: Resolve spotbugs errors in JsonReporter.java, 
Metrics.java, and PerfLogger.java
 Key: HIVE-23979
 URL: https://issues.apache.org/jira/browse/HIVE-23979
 Project: Hive
  Issue Type: New Feature
Reporter: Soumyakanti Das
Assignee: Soumyakanti Das


Resolve these spotbugs errors:
[ERROR] Found reliance on default encoding in 
org.apache.hadoop.hive.metastore.metrics.JsonReporter.report(SortedMap, 
SortedMap, SortedMap, SortedMap, SortedMap): new java.io.FileWriter(File) 
[org.apache.hadoop.hive.metastore.metrics.JsonReporter] At 
JsonReporter.java:[line 159] DM_DEFAULT_ENCODING

[ERROR] Incorrect lazy initialization of static field 
org.apache.hadoop.hive.metastore.metrics.Metrics.self in 
org.apache.hadoop.hive.metastore.metrics.Metrics.shutdown() 
[org.apache.hadoop.hive.metastore.metrics.Metrics] At Metrics.java:[lines 
79-85] LI_LAZY_INIT_STATIC

[ERROR] The method name 
org.apache.hadoop.hive.metastore.metrics.PerfLogger.PerfLogBegin(String, 
String) doesn't start with a lower case letter 
[org.apache.hadoop.hive.metastore.metrics.PerfLogger] At PerfLogger.java:[lines 
92-98] NM_METHOD_NAMING_CONVENTION

[ERROR] The method name 
org.apache.hadoop.hive.metastore.metrics.PerfLogger.PerfLogEnd(String, String) 
doesn't start with a lower case letter 
[org.apache.hadoop.hive.metastore.metrics.PerfLogger] At PerfLogger.java:[line 
106] NM_METHOD_NAMING_CONVENTION

[ERROR] The method name 
org.apache.hadoop.hive.metastore.metrics.PerfLogger.PerfLogEnd(String, String, 
String) doesn't start with a lower case letter 
[org.apache.hadoop.hive.metastore.metrics.PerfLogger] At PerfLogger.java:[lines 
116-138] NM_METHOD_NAMING_CONVENTION
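
A minimal sketch of how the first two categories of warnings are commonly addressed, assuming an explicit UTF-8 charset is acceptable for the report file and that synchronizing access to the static field is acceptable for the singleton. The class below is a hypothetical stand-in, not the actual JsonReporter or Metrics code. The NM_METHOD_NAMING_CONVENTION warnings are typically resolved by renaming the methods to start with a lower-case letter (e.g. perfLogBegin) and updating the callers.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public final class SpotbugsFixSketch {
    // LI_LAZY_INIT_STATIC: every read and write of the lazily initialized
    // static field happens under the same lock (the class monitor).
    private static SpotbugsFixSketch self;

    private SpotbugsFixSketch() {}

    public static synchronized SpotbugsFixSketch getInstance() {
        if (self == null) {
            self = new SpotbugsFixSketch();
        }
        return self;
    }

    public static synchronized void shutdown() {
        self = null;
    }

    // DM_DEFAULT_ENCODING: pass the charset explicitly instead of using
    // new FileWriter(File), which relies on the platform default encoding.
    public static void writeReport(Path file, String json) throws IOException {
        try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(json);
        }
    }
}
```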





[jira] [Created] (HIVE-23978) Enable logging with PerfLogger in HMS client

2020-08-03 Thread Soumyakanti Das (Jira)
Soumyakanti Das created HIVE-23978:
--

 Summary: Enable logging with PerfLogger in HMS client
 Key: HIVE-23978
 URL: https://issues.apache.org/jira/browse/HIVE-23978
 Project: Hive
  Issue Type: New Feature
Reporter: Soumyakanti Das


Currently we cannot use PerfLogger in HiveMetaStoreClient.java to log the 
duration of API calls. When PerfLogger.java is moved from metastore-server to 
metastore-common without changing the package definition, many tests fail, 
even though metastore-server has a dependency on metastore-common.
More analysis and investigation are needed to understand the root cause of this 
issue.

Related to [HIVE-23949|https://issues.apache.org/jira/browse/HIVE-23949]





[jira] [Created] (HIVE-23977) Consolidate partition fetch to one place

2020-08-03 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-23977:
---

 Summary: Consolidate partition fetch to one place
 Key: HIVE-23977
 URL: https://issues.apache.org/jira/browse/HIVE-23977
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Steve Carlin








[jira] [Created] (HIVE-23976) Enable vectorization for multi-col semi join reducers

2020-08-03 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-23976:
--

 Summary: Enable vectorization for multi-col semi join reducers
 Key: HIVE-23976
 URL: https://issues.apache.org/jira/browse/HIVE-23976
 Project: Hive
  Issue Type: Improvement
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis


HIVE-21196 introduces multi-column semi-join reducers in the query engine. 
However, the implementation relies on GenericUDFMurmurHash, which is not 
vectorized, so the respective operators cannot be executed in vectorized mode. 


