[
https://issues.apache.org/jira/browse/HIVE-27929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806248#comment-17806248
]
Sungwoo Park commented on HIVE-27929:
-------------------------------------
=== Build Tez 0.10.2 --> okay
pom.xml should be updated as follows. Without this update, Hive throws a
ClassNotFoundException if Tez jars appear before Hive jars in the classpath.
+    <dependency>
+      <groupId>org.apache.commons</groupId>
+      <artifactId>commons-lang3</artifactId>
+      <version>3.12.0</version>
+    </dependency>
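Because the failure depends on jar order, it can help to see which jars on a classpath contain a given class and in what order. The following is a minimal sketch, not part of the test run: the function name jars_providing and the example class name are assumptions made for illustration, and the comment above does not say which class is actually missing.

```python
import zipfile
from pathlib import Path


def jars_providing(class_name, classpath_entries):
    """Return the jar paths, in classpath order, that contain class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in classpath_entries:
        path = Path(jar)
        if path.suffix != ".jar" or not path.is_file():
            continue  # skip directories, wildcards, and missing files
        with zipfile.ZipFile(path) as zf:
            if entry in zf.namelist():
                hits.append(str(path))
    return hits
```

For example, jars_providing("org.apache.commons.lang3.StringUtils", classpath.split(":")) lists the matching jars in classpath order (StringUtils is only an example class, and classpath is a hypothetical variable holding the expanded classpath string). The first entry is the one a standard class loader would normally pick.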
=== Build Hive branch-4.0 --> okay
We use the latest commit as of Jan 12:
commit f355c82a5aa77ef1496b35c22b8ac9b84dfe1780
Author: seonggon <[email protected]>
Date: Wed Jan 10 18:15:39 2024 +0900
HIVE-27988: Don't convert FullOuterJoin with filter to MapJoin (Seonggon
Namgung, reviewed by Denys Kuzmenko)
=== Run Metastore and HiveServer2 --> okay
Use hive.execution.engine=tez and hive.execution.mode=container
=== Loading TPC-DS 1TB text data in external tables --> okay
Example:
create external table catalog_page(
cp_catalog_page_sk bigint
, cp_catalog_page_id string
, cp_start_date_sk bigint
, cp_end_date_sk bigint
, cp_department string
, cp_catalog_number int
, cp_catalog_page_number int
, cp_description string
, cp_type string
)
row format delimited fields terminated by '|'
location 'hdfs://blue0:8020/tmp/tpcds-generate/1000/catalog_page';
=== Loading TPC-DS 1TB ORC data, transactional=false --> okay
Example:
create table catalog_page
stored as orc
TBLPROPERTIES('transactional'='false', 'transactional_properties'='default')
as select * from tpcds_text_1000_tez.catalog_page;
=== Run TPC-DS --> all okay with correct results
Query 64 hangs on the first attempt (killed after running for 6000
seconds). It succeeds on the second attempt.
Configuration keys in hive-site.xml that affected correctness in earlier
versions are set as follows:
-- HIVE-26621
<property>
  <name>hive.optimize.shared.work.dppunion</name>
  <value>false</value>
</property>
<property>
  <name>hive.optimize.shared.work.dppunion.merge.eventops</name>
  <value>false</value>
</property>
<property>
  <name>hive.optimize.shared.work.downstream.merge</name>
  <value>false</value>
</property>
<property>
  <name>hive.optimize.shared.work.parallel.edge.support</name>
  <value>false</value>
</property>
<property>
  <name>hive.optimize.shared.work.merge.ts.schema</name>
  <value>false</value>
</property>
<property>
  <name>hive.optimize.cte.materialize.threshold</name>
  <value>-1</value>
</property>
<property>
  <name>hive.tez.bloom.filter.merge.threads</name>
  <value>0</value>
</property>
<property>
  <name>hive.auto.convert.anti.join</name>
  <value>false</value>
</property>
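To flip one of these keys for a single session instead of editing hive-site.xml, the properties can be turned into "set key=value;" statements. The following is a minimal sketch under stated assumptions: the helper name to_set_statements is hypothetical, the input must be bare <property> elements without an XML declaration, and whether a key can be changed per session depends on how the deployment restricts configuration.

```python
import xml.etree.ElementTree as ET

# Two of the properties listed above, copied as an example input.
FRAGMENT = """
<property>
  <name>hive.optimize.shared.work.dppunion</name>
  <value>false</value>
</property>
<property>
  <name>hive.auto.convert.anti.join</name>
  <value>false</value>
</property>
"""


def to_set_statements(xml_fragment):
    """Turn bare <property> elements into Hive 'set key=value;' lines."""
    # Wrap in a root element because the fragment has several top-level nodes.
    root = ET.fromstring("<configuration>" + xml_fragment + "</configuration>")
    statements = []
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name:  # ignore properties without a name
            statements.append(f"set {name}={value};")
    return statements


if __name__ == "__main__":
    for line in to_set_statements(FRAGMENT):
        print(line)
```

The printed lines can be pasted into a Beeline session and edited (for example, changing false to true) before rerunning a query.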
=== Test with hive.auto.convert.anti.join=true (HIVE-26659) --> okay
Query 16: correct
Query 69: correct
Query 94: correct
So, we can set hive.auto.convert.anti.join=true.
=== Test with hive.tez.bloom.filter.merge.threads=2/4 (HIVE-26655) --> okay
Query 17, hive.tez.bloom.filter.merge.threads=0: 173.713 seconds
Query 17, hive.tez.bloom.filter.merge.threads=2: 184.626 seconds
Query 17, hive.tez.bloom.filter.merge.threads=4: 184.726 seconds
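The relative difference between these runs can be worked out as below. This is a small sketch using only the three timings reported above. Each setting was timed once, so the percentages indicate a rough trend rather than a measured slowdown.

```python
# Query 17 wall-clock times in seconds, keyed by
# hive.tez.bloom.filter.merge.threads (values taken from the runs above).
TIMINGS = {0: 173.713, 2: 184.626, 4: 184.726}


def overhead_pct(seconds, baseline):
    """Percentage by which `seconds` exceeds `baseline`."""
    return (seconds - baseline) / baseline * 100.0


if __name__ == "__main__":
    baseline = TIMINGS[0]
    for threads, seconds in TIMINGS.items():
        pct = overhead_pct(seconds, baseline)
        print(f"threads={threads}: {seconds:.3f} s ({pct:+.2f}% vs threads=0)")
```

With these numbers, threads=2 and threads=4 come out about 6.3% slower than threads=0.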
=== Test with hive.optimize.shared.work.dppunion=true (HIVE-26621) --> okay
Query 2: correct
=== Loading TPC-DS 1TB ORC data, transactional=true --> fail
Error: Error while compiling statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.lockmgr.LockException(org.apache.thrift.TApplicationException: Internal error processing get_latest_txnid_in_conflict) (state=42000,code=13)
This error originates from Metastore:
24/01/13 03:32:33 ERROR thrift.ProcessFunction: Internal error processing get_latest_txnid_in_conflict
java.lang.RuntimeException: java.lang.NoClassDefFoundError: com/sun/tools/javac/util/List
    at org.apache.hadoop.hive.metastore.txn.TransactionalRetryProxy.lambda$invoke$6(TransactionalRetryProxy.java:182)
    ...
Caused by: java.lang.ClassNotFoundException: com.sun.tools.javac.util.List
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 35 more
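The missing class belongs to the javac compiler internals, which on a Java 8 JDK ship in tools.jar rather than on the default runtime classpath. One plausible cause, not confirmed in this comment, is an unintended import of com.sun.tools.javac.util.List in place of java.util.List. The following is a minimal sketch for locating such imports in a source checkout; the function name find_javac_imports and the directory passed to it are assumptions.

```python
import re
from pathlib import Path

# Matches Java import lines that pull in javac-internal packages.
JAVAC_IMPORT = re.compile(r"^\s*import\s+(static\s+)?com\.sun\.tools\.javac\.")


def find_javac_imports(source_root):
    """Return (file, line number, line) for each javac-internal import."""
    hits = []
    for path in sorted(Path(source_root).rglob("*.java")):
        with open(path, encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if JAVAC_IMPORT.match(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling find_javac_imports on the root of a Hive checkout would list candidate files to inspect. An empty result would suggest that the reference comes from somewhere else, such as a dependency jar.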
> Run TPC-DS queries and validate results correctness
> ---------------------------------------------------
>
> Key: HIVE-27929
> URL: https://issues.apache.org/jira/browse/HIVE-27929
> Project: Hive
> Issue Type: Sub-task
> Reporter: Denys Kuzmenko
> Assignee: Simhadri Govindappa
> Priority: Major
>
> release branch: *branch-4.0*
> https://github.com/apache/hive/tree/branch-4.0