[ 
https://issues.apache.org/jira/browse/HIVE-27929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806248#comment-17806248
 ] 

Sungwoo Park commented on HIVE-27929:
-------------------------------------

=== Build Tez 0.10.2 --> okay

pom.xml should be updated as follows. Without this update, Hive throws a 
ClassNotFoundException if Tez jars appear before Hive jars in the classpath.

+    <dependency>
+      <groupId>org.apache.commons</groupId>
+      <artifactId>commons-lang3</artifactId>
+      <version>3.12.0</version>
+    </dependency>

=== Build Hive branch-4.0 --> okay

We use the latest commit as of Jan 12:

commit f355c82a5aa77ef1496b35c22b8ac9b84dfe1780
Author: seonggon <[email protected]>
Date:   Wed Jan 10 18:15:39 2024 +0900

    HIVE-27988: Don't convert FullOuterJoin with filter to MapJoin (Seonggon Namgung, reviewed by Denys Kuzmenko)

=== Run Metastore and HiveServer2 --> okay

Use hive.execution.engine=tez and hive.execution.mode=container
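
A minimal hive-site.xml sketch of these two settings, assuming they are 
applied site-wide rather than per session:

```xml
<!-- Run queries on Tez, in regular containers rather than LLAP daemons -->
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>

<property>
  <name>hive.execution.mode</name>
  <value>container</value>
</property>
```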

=== Loading TPC-DS 1TB text data in external tables --> okay

Example:

create external table catalog_page(
      cp_catalog_page_sk        bigint
,     cp_catalog_page_id        string
,     cp_start_date_sk          bigint
,     cp_end_date_sk            bigint
,     cp_department             string
,     cp_catalog_number         int
,     cp_catalog_page_number    int
,     cp_description            string
,     cp_type                   string
)
row format delimited fields terminated by '|'
location 'hdfs://blue0:8020/tmp/tpcds-generate/1000/catalog_page';

=== Loading TPC-DS 1TB ORC data, transactional=false --> okay

Example:

create table catalog_page
stored as orc
TBLPROPERTIES('transactional'='false', 'transactional_properties'='default')
as select * from tpcds_text_1000_tez.catalog_page;

=== Run TPC-DS --> all okay with correct results

Query 64 gets stuck on the first attempt (killed after running for 6000 
seconds). It succeeds on the second attempt.

Some configuration keys in hive-site.xml that affected correctness in 
earlier versions:

-- HIVE-26621
<property>
  <name>hive.optimize.shared.work.dppunion</name>
  <value>false</value>
</property>

<property>
  <name>hive.optimize.shared.work.dppunion.merge.eventops</name>
  <value>false</value>
</property>

<property>
  <name>hive.optimize.shared.work.downstream.merge</name>
  <value>false</value>
</property>

<property>
  <name>hive.optimize.shared.work.parallel.edge.support</name>
  <value>false</value>
</property>

<property>
  <name>hive.optimize.shared.work.merge.ts.schema</name>
  <value>false</value>
</property>

<property>
  <name>hive.optimize.cte.materialize.threshold</name>
  <value>-1</value>
</property>

<property>
  <name>hive.tez.bloom.filter.merge.threads</name>
  <value>0</value>
</property>

<property>
  <name>hive.auto.convert.anti.join</name>
  <value>false</value>
</property>

=== Test with hive.auto.convert.anti.join=true (HIVE-26659) --> okay

Query 16: correct
Query 69: correct
Query 94: correct

So, we can set hive.auto.convert.anti.join=true.
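
Assuming the change is made in hive-site.xml like the other keys in this 
report, the entry might look like this:

```xml
<!-- HIVE-26659: anti join conversion enabled; queries 16, 69, and 94 returned correct results -->
<property>
  <name>hive.auto.convert.anti.join</name>
  <value>true</value>
</property>
```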

=== Test with hive.tez.bloom.filter.merge.threads=2/4 (HIVE-26655) --> okay

Query 17, hive.tez.bloom.filter.merge.threads=0: 173.713 seconds
Query 17, hive.tez.bloom.filter.merge.threads=2: 184.626 seconds
Query 17, hive.tez.bloom.filter.merge.threads=4: 184.726 seconds

=== Test with hive.optimize.shared.work.dppunion=true (HIVE-26621) --> okay

Query 2: correct

=== Loading TPC-DS 1TB ORC data, transactional=true --> fail

Error: Error while compiling statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.lockmgr.LockException(org.apache.thrift.TApplicationException: Internal error processing get_latest_txnid_in_conflict) (state=42000,code=13)

This error originates from Metastore:

24/01/13 03:32:33 ERROR thrift.ProcessFunction: Internal error processing get_latest_txnid_in_conflict
java.lang.RuntimeException: java.lang.NoClassDefFoundError: com/sun/tools/javac/util/List
  at org.apache.hadoop.hive.metastore.txn.TransactionalRetryProxy.lambda$invoke$6(TransactionalRetryProxy.java:182)
...
Caused by: java.lang.ClassNotFoundException: com.sun.tools.javac.util.List
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 35 more





> Run TPC-DS queries and validate results correctness
> ---------------------------------------------------
>
>                 Key: HIVE-27929
>                 URL: https://issues.apache.org/jira/browse/HIVE-27929
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Denys Kuzmenko
>            Assignee: Simhadri Govindappa
>            Priority: Major
>
> release branch: *branch-4.0*
> https://github.com/apache/hive/tree/branch-4.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
