[jira] [Commented] (HIVE-7765) Null pointer error with UNION ALL on partitioned tables using Tez

Alex Bush (JIRA) Mon, 29 Jun 2015 07:33:15 -0700

    [ https://issues.apache.org/jira/browse/HIVE-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605693#comment-14605693 ]


Alex Bush commented on HIVE-7765:
---------------------------------

Stack trace and log output from the error:
SELECT * FROM test5_a
UNION ALL
SELECT * FROM test5_b
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO parse.ParseDriver: Parsing command:

SELECT * FROM test5_a
UNION ALL
SELECT * FROM test5_b
15/06/29 15:11:35 [main]: INFO parse.ParseDriver: Parse Completed
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=parse start=1435587095311 end=1435587095313 duration=2 from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Starting Semantic Analysis
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for source tables
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for subqueries
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for source tables
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for subqueries
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for destination tables
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for source tables
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for subqueries
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for destination tables
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Get metadata for destination tables
15/06/29 15:11:35 [main]: INFO ql.Context: New scratch dir is hdfs://upgtst226/tmp/hive/hdp_batch/57614a3b-aa9a-4bf8-82ca-4451f72b9d28/hive_2015-06-29_15-11-35_310_293431230946478382-1
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Not invoking CBO because the statement has too few joins
15/06/29 15:11:35 [main]: INFO parse.SemanticAnalyzer: Set stats collection dir : hdfs://upgtst226/tmp/hive/hdp_batch/57614a3b-aa9a-4bf8-82ca-4451f72b9d28/hive_2015-06-29_15-11-35_310_293431230946478382-1/-ext-10002
15/06/29 15:11:35 [main]: INFO ppd.OpProcFactory: Processing for FS(6)
15/06/29 15:11:35 [main]: INFO ppd.OpProcFactory: Processing for SEL(5)
15/06/29 15:11:35 [main]: INFO ppd.OpProcFactory: Processing for UNION(4)
15/06/29 15:11:35 [main]: INFO ppd.OpProcFactory: Processing for SEL(1)
15/06/29 15:11:35 [main]: INFO ppd.OpProcFactory: Processing for TS(0)
15/06/29 15:11:35 [main]: INFO ppd.OpProcFactory: Processing for SEL(3)
15/06/29 15:11:35 [main]: INFO ppd.OpProcFactory: Processing for TS(2)
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=partition-retrieving from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=partition-retrieving start=1435587095494 end=1435587095606 duration=112 from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=partition-retrieving from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=partition-retrieving start=1435587095606 end=1435587095735 duration=129 from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner>
15/06/29 15:11:35 [main]: INFO parse.TezCompiler: Cycle free: true
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO exec.Utilities: Serializing ArrayList via kryo
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=serializePlan start=1435587095740 end=1435587095743 duration=3 from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=deserializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO exec.Utilities: Deserializing ArrayList via kryo
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=deserializePlan start=1435587095743 end=1435587095746 duration=3 from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO exec.Utilities: Serializing ArrayList via kryo
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=serializePlan start=1435587095747 end=1435587095747 duration=0 from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=deserializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO exec.Utilities: Deserializing ArrayList via kryo
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=deserializePlan start=1435587095747 end=1435587095747 duration=0 from=org.apache.hadoop.hive.ql.exec.Utilities>
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Found 0 null table scans
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Found 0 null table scans
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Found 0 null table scans
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Found 0 null table scans
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Found 0 null table scans
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Looking for table scans where optimization is applicable
15/06/29 15:11:35 [main]: INFO physical.NullScanTaskDispatcher: Found 0 null table scans
15/06/29 15:11:35 [main]: INFO physical.Vectorizer: Validating MapWork...
FAILED: NullPointerException null
15/06/29 15:11:35 [main]: ERROR ql.Driver: FAILED: NullPointerException null
java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
        at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateMapWork(Vectorizer.java:357)
        at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:321)
        at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:307)
        at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
        at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
        at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
        at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:847)
        at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:468)
        at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:223)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:733)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=compile start=1435587095308 end=1435587095766 duration=458 from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
15/06/29 15:11:35 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1435587095766 end=1435587095766 duration=0 from=org.apache.hadoop.hive.ql.Driver>




> Null pointer error with UNION ALL on partitioned tables using Tez
> -----------------------------------------------------------------
>
>                 Key: HIVE-7765
>                 URL: https://issues.apache.org/jira/browse/HIVE-7765
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0, 0.13.1
>         Environment: Tez 0.4.1, Ubuntu 12.04, Hadoop 2.4.1.
>            Reporter: Chris Dragga
>            Priority: Minor
>
> When executing a UNION ALL query in Tez over partitioned tables where at 
> least one table is empty, Hive fails to execute the query, returning the 
> message "FAILED: NullPointerException null".  No stack trace accompanies this 
> message.  Removing partitioning solves this problem, as does switching to 
> MapReduce as the execution engine.
> This can be reproduced using a variant of the example tables from the 
> "Getting Started" documentation on the Hive wiki.  To create the schema, use
> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
> CREATE TABLE empty_invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
> Then, load invites with data (e.g., using the instructions 
> [here|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations])
>  and execute the following:
> SELECT * FROM invites
> UNION ALL
> SELECT * FROM empty_invites;
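
The reproduction steps in the description can be collected into one script, sketched here. The table definitions and the query are taken from the description. The LOAD DATA statement follows the Hive "Getting Started" guide, so the sample file path and the partition value are assumptions that depend on the local Hive installation.

```sql
-- Use Tez, the engine on which the failure is reported.
SET hive.execution.engine=tez;

-- Schema from the issue description.
CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
CREATE TABLE empty_invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);

-- Assumed sample file and partition value (from the Getting Started guide);
-- adjust the path to wherever the Hive example files live locally.
LOAD DATA LOCAL INPATH './examples/files/kv2.txt'
  OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15');

-- empty_invites is deliberately left without data.
SELECT * FROM invites
UNION ALL
SELECT * FROM empty_invites;
```

According to the description, the final query is the statement that returns "FAILED: NullPointerException null" on the affected versions.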



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
