[ 
https://issues.apache.org/jira/browse/HIVE-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-15259:
------------------------------------
    Attachment: Deserialization_HOS20.PNG
                Deserialization_HOS16.PNG

[~xuefuz], [~lirui] and [~Ferd]: could you please take a look at this problem?
My guess is that it comes from a deployment difference: previously we linked 
spark-assembly.jar into $HIVE_HOME/lib/, while with the latest code we need to 
copy all jars from $SPARK_HOME/jars/ to $HIVE_HOME/lib/.

HOS20 log
{code}
2016-11-22T00:50:41,710  INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:50:41 INFO yarn.Client: Uploading resource file:/tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_libs__3968994973591034858.zip -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0033/__spark_libs__3968994973591034858.zip
2016-11-22T00:50:42,376  INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:50:42 INFO yarn.Client: Uploading resource file:/home/apache-hive-2.2.0-SNAPSHOT-bin/lib/hive-exec-2.2.0-SNAPSHOT.jar -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0033/hive-exec-2.2.0-SNAPSHOT.jar
2016-11-22T00:50:42,542  INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:50:42 INFO yarn.Client: Uploading resource file:/tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_conf__7123360802254473357.zip -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0033/__spark_conf__.zip
{code}


HOS16 log
{code}
yarn.Client: Uploading resource file:/home/spark16/spark-1.6.2-bin-hadoop2-without-hive/lib/spark-assembly-1.6.2-hadoop2.6.0.jar -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0034/spark-assembly-1.6.2-hadoop2.6.0.jar
2016-11-22T00:55:30,145  INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:55:30 INFO yarn.Client: Uploading resource file:/home/spark16/spark16-apache-hive-2.2.0-SNAPSHOT-bin/lib/hive-exec-2.2.0-SNAPSHOT.jar -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0034/hive-exec-2.2.0-SNAPSHOT.jar
2016-11-22T00:55:30,325  INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:55:30 INFO yarn.Client: Uploading resource file:/tmp/spark-35202f4e-8054-47a1-ae50-0c2468f374f6/__spark_conf__611310910041274518.zip -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0034/__spark_conf__611310910041274518.zip
{code}

The above are log snippets from running HOS20 and HOS16.
In HOS20, it uploads 
/tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_libs__3968994973591034858.zip
 to HDFS, while 
in HOS16 it uploads 
/home/spark16/spark-1.6.2-bin-hadoop2-without-hive/lib/spark-assembly-1.6.2-hadoop2.6.0.jar
 to HDFS.
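
To compare the two runs side by side, the uploaded resources can be pulled out of the client logs. The following is only a minimal sketch: the helper name `uploaded_resources` is hypothetical, and it assumes the log entries look like the yarn.Client lines quoted in this comment. It tolerates entries wrapped at whitespace, but not paths that were split in the middle of a token.

```python
import re

# Matches the "Uploading resource <src> -> <dst>" part of a yarn.Client log line.
UPLOAD_RE = re.compile(r"Uploading resource\s+(\S+)\s+->\s+(\S+)")


def uploaded_resources(log_text):
    """Return (local_source, hdfs_destination) pairs found in the log text."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return UPLOAD_RE.findall(flat)


# One entry copied from the HOS20 log in this comment.
sample = (
    "16/11/22 00:50:42 INFO yarn.Client: Uploading resource\n"
    "file:/home/apache-hive-2.2.0-SNAPSHOT-bin/lib/hive-exec-2.2.0-SNAPSHOT.jar ->\n"
    "hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0033/"
    "hive-exec-2.2.0-SNAPSHOT.jar"
)

for src, dst in uploaded_resources(sample):
    print(src, "->", dst)
```

Running it over the full HOS20 and HOS16 logs should list, per run, which local files the YARN client shipped to the staging directory.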

It is *not* clear *which* jars Spark puts into 
/tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_libs__3968994973591034858.zip
 in HOS20. I will investigate this.
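
One way to check which jars end up in the archive is to list its entries. This is a rough sketch using Python's standard `zipfile` module; the function name `jars_in_archive` is hypothetical, and it assumes a copy of the `__spark_libs__` zip is still available locally (the temporary directory may be cleaned up after submission, in which case the copy in the HDFS staging directory would have to be fetched first).

```python
import sys
import zipfile


def jars_in_archive(zip_path):
    """Return the sorted names of the .jar entries inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(name for name in archive.namelist() if name.endswith(".jar"))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_jars.py <path to a local copy of the __spark_libs__ zip>
    for jar_name in jars_in_archive(sys.argv[1]):
        print(jar_name)
```

Comparing this list against the contents of spark-assembly-1.6.2-hadoop2.6.0.jar may show whether HOS20 ships more classes to the executors than HOS16 does.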



> The deserialization time of HOS20 is longer than what in  HOS16
> ---------------------------------------------------------------
>
>                 Key: HIVE-15259
>                 URL: https://issues.apache.org/jira/browse/HIVE-15259
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: liyunzhang_intel
>         Attachments: Deserialization_HOS16.PNG, Deserialization_HOS20.PNG
>
>
> Deploy Hive on Spark on Spark 1.6 and on Spark 2.0.
> Run a query: with the latest code (Spark 2.0) the deserialization time of a 
> task is 4 sec, while with Spark 1.6 it is 1 sec. The details 
> are in the attached pictures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
