[ https://issues.apache.org/jira/browse/HIVE-28178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-28178:
--------------------------------
Description:
Confusing things in HiveSplitGenerator:
1. Having both a conf and a jobConf:
https://github.com/apache/hive/blob/77ca03509112655e81d3e56b71c938d548da8b7c/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java#L90-L91
2. The inputInitializerContext != null check:
https://github.com/apache/hive/blob/77ca03509112655e81d3e56b71c938d548da8b7c/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java#L261C13-L261C44
The initializer context must be non-null when running under Tez; otherwise it's a serious bug in Tez.
We should not indent the whole code block because of a confusing null check.
UPDATE: the context can be null when the split generator is used in HS2, e.g. on this code path:
{code}
java.lang.NullPointerException: null
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator$SplitSerializer.<init>(HiveSplitGenerator.java:193) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.getSplitSerializer(HiveSplitGenerator.java:521) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.createEventList(HiveSplitGenerator.java:475) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:425) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:475) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplitResult(GenericUDTFGetSplits.java:254) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits2.process(GenericUDTFGetSplits2.java:78) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
{code}
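One possible shape for the cleanup in point 2 is sketched below: handle the null context in a single early branch so the rest of the method does not need to be indented under a null check. This is only an illustration under stated assumptions. InitializerContext, DEFAULT_TASKS, and resolveNumTasks are hypothetical stand-ins, not the real Tez or Hive APIs, and the fallback value is invented for the example.

```java
public class NullContextSketch {

    /** Hypothetical stand-in for the Tez initializer context; not the real interface. */
    interface InitializerContext {
        int getNumTasks();
    }

    /** Hypothetical fallback used when no context exists (e.g. when called from HS2). */
    static final int DEFAULT_TASKS = 1;

    /** Resolves the task count with an early return instead of a wrapping null check. */
    static int resolveNumTasks(InitializerContext context) {
        if (context == null) {
            // Assumption: a null context is legitimate outside a Tez application master.
            return DEFAULT_TASKS;
        }
        // From here on the context is known to be non-null, so no extra nesting is needed.
        return context.getNumTasks();
    }

    public static void main(String[] args) {
        System.out.println(resolveNumTasks(null));
        System.out.println(resolveNumTasks(() -> 8));
    }
}
```

Keeping the null case in one named branch also documents why the check exists, which a bare if-block around the whole body does not.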
was:
Confusing things in HiveSplitGenerator:
1. Having both a conf and a jobConf:
https://github.com/apache/hive/blob/77ca03509112655e81d3e56b71c938d548da8b7c/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java#L90-L91
2. The inputInitializerContext != null check:
https://github.com/apache/hive/blob/77ca03509112655e81d3e56b71c938d548da8b7c/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java#L261C13-L261C44
The initializer context must be non-null when running under Tez; otherwise it's a serious bug in Tez.
We should not indent the whole code block because of a confusing null check.
UPDATE: the context can be null when the split generator is used in HS2, e.g. on this code path:
{code}
2024-04-05T00:57:35,207 ERROR [HiveServer2-Background-Pool: Thread-1193] tez.HiveSplitGenerator: An exception was caught while running split generation, this is not recoverable
java.lang.NullPointerException: null
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator$SplitSerializer.<init>(HiveSplitGenerator.java:193) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.getSplitSerializer(HiveSplitGenerator.java:521) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.createEventList(HiveSplitGenerator.java:475) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:425) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:475) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplitResult(GenericUDTFGetSplits.java:254) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits2.process(GenericUDTFGetSplits2.java:78) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
{code}
> Cleanup some stuff in HiveSplitGenerator
> ----------------------------------------
>
> Key: HIVE-28178
> URL: https://issues.apache.org/jira/browse/HIVE-28178
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Priority: Major