[jira] [Commented] (HIVE-13300) Hive on spark throws exception for multi-insert with join

Hive QA (JIRA) Sun, 20 Mar 2016 06:16:09 -0700

    [ https://issues.apache.org/jira/browse/HIVE-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203288#comment-15203288 ]


Hive QA commented on HIVE-13300:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12794110/HIVE-13300.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9838 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_with_join
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7319/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7319/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7319/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12794110 - PreCommit-HIVE-TRUNK-Build

> Hive on spark throws exception for multi-insert with join
> ---------------------------------------------------------
>
>                 Key: HIVE-13300
>                 URL: https://issues.apache.org/jira/browse/HIVE-13300
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 2.0.0
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-13300.patch
>
>
> For certain multi-insert queries, Hive on Spark throws a deserialization error.
> {noformat}
> create table status_updates(userid int,status string,ds string);
> create table profiles(userid int,school string,gender int);
> drop table school_summary;
> create table school_summary(school string,cnt int) partitioned by (ds string);
> drop table gender_summary;
> create table gender_summary(gender int,cnt int) partitioned by (ds string);
> insert into status_updates values (1, "status_1", "2016-03-16");
> insert into profiles values (1, "school_1", 0);
> set hive.auto.convert.join=false;
> set hive.execution.engine=spark;
> FROM (SELECT a.status, b.school, b.gender
>       FROM status_updates a JOIN profiles b
>       ON (a.userid = b.userid and
>           a.ds='2009-03-20' )
>      ) subq1
> INSERT OVERWRITE TABLE gender_summary
>   PARTITION(ds='2009-03-20')
>   SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender
> INSERT OVERWRITE TABLE school_summary
>   PARTITION(ds='2009-03-20')
>   SELECT subq1.school, COUNT(1) GROUP BY subq1.school;
> {noformat}
> Error:
> {noformat}
> 16/03/17 13:29:00 [task-result-getter-3]: WARN scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 3, localhost): java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from x1x128x0x0 with properties {serialization.sort.order.null=a, columns=reducesinkkey0, serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, serialization.sort.order=+, columns.types=int}
>       at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:279)
>       at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>       at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>       at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
>       at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>       at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
>       at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>       at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>       at org.apache.spark.scheduler.Task.run(Task.scala:89)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:724)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from x1x128x0x0 with properties {serialization.sort.order.null=a, columns=reducesinkkey0, serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, serialization.sort.order=+, columns.types=int}
>       at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:251)
>       ... 12 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException
>       at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:241)
>       at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:249)
>       ... 12 more
> Caused by: java.io.EOFException
>       at org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>       at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:597)
>       at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:288)
>       at org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:237)
>       ... 13 more
> {noformat}
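
One possible reading of the EOFException in the trace above, sketched in Python rather than taken from the Hive source. It assumes that the key dump `x1x128x0x0` lists one byte per `xN` token in decimal, and that an ascending int column in the binary-sortable layout is a one-byte null marker followed by four big-endian bytes with the sign bit flipped. The function names `parse_key_dump` and `decode_int_key` are hypothetical. Under those assumptions the key holds only three payload bytes where four are expected, which would explain why the read runs off the end of the buffer. It says nothing about why the key was written that way.

```python
import struct
from typing import Optional


def parse_key_dump(dump: str) -> bytes:
    """Turn a dump such as 'x1x128x0x0' into raw bytes.

    Assumption: every 'xN' token is one byte printed in decimal.
    """
    return bytes(int(token) for token in dump.split("x") if token)


def decode_int_key(raw: bytes) -> Optional[int]:
    """Decode one ascending int column from an assumed binary-sortable key.

    Assumed layout: 1 marker byte (0 means NULL), then 4 big-endian bytes
    whose sign bit was flipped so that byte order matches numeric order.
    """
    if len(raw) < 1:
        raise EOFError("no null-marker byte")
    if raw[0] == 0:
        return None
    payload = raw[1:5]
    if len(payload) < 4:
        # Mirrors the situation in the trace: the reader asks for a byte
        # that is not there.
        raise EOFError(f"int needs 4 bytes, only {len(payload)} available")
    unflipped = bytes([payload[0] ^ 0x80]) + payload[1:]
    return struct.unpack(">i", unflipped)[0]


if __name__ == "__main__":
    key = parse_key_dump("x1x128x0x0")
    try:
        decode_int_key(key)
    except EOFError as exc:
        print("EOF while decoding key:", exc)
```

If the assumptions hold, a well-formed key for the value 0 would be five bytes long (`x1x128x0x0x0`), one byte more than the key reported in the log.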



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
