[ https://issues.apache.org/jira/browse/HIVE-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liyunzhang_intel updated HIVE-15259:
------------------------------------
    Attachment: Deserialization_HOS20.PNG
                Deserialization_HOS16.PNG

[~xuefuz], [~lirui] and [~Ferd]: please help take a look at this problem. My guess is that it comes from a packaging change: previously we linked spark-assembly.jar into $HIVE_HOME/lib/, while with the latest code we need to copy all jars from $SPARK_HOME/jars/ to $HIVE_HOME/lib/.

HOS20 log
{code}
2016-11-22T00:50:41,710 INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:50:41 INFO yarn.Client: Uploading resource file:/tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_libs__3968994973591034858.zip -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0033/__spark_libs__3968994973591034858.zip
2016-11-22T00:50:42,376 INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:50:42 INFO yarn.Client: Uploading resource file:/home/apache-hive-2.2.0-SNAPSHOT-bin/lib/hive-exec-2.2.0-SNAPSHOT.jar -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0033/hive-exec-2.2.0-SNAPSHOT.jar
2016-11-22T00:50:42,542 INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:50:42 INFO yarn.Client: Uploading resource file:/tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_conf__7123360802254473357.zip -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0033/__spark_conf__.zip
{code}

HOS16 log
{code}
yarn.Client: Uploading resource file:/home/spark16/spark-1.6.2-bin-hadoop2-without-hive/lib/spark-assembly-1.6.2-hadoop2.6.0.jar -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0034/spark-assembly-1.6.2-hadoop2.6.0.jar
2016-11-22T00:55:30,145 INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:55:30 INFO yarn.Client: Uploading resource file:/home/spark16/spark16-apache-hive-2.2.0-SNAPSHOT-bin/lib/hive-exec-2.2.0-SNAPSHOT.jar -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0034/hive-exec-2.2.0-SNAPSHOT.jar
2016-11-22T00:55:30,325 INFO [stderr-redir-1] client.SparkClientImpl: 16/11/22 00:55:30 INFO yarn.Client: Uploading resource file:/tmp/spark-35202f4e-8054-47a1-ae50-0c2468f374f6/__spark_conf__611310910041274518.zip -> hdfs://bdpe42:8020/user/root/.sparkStaging/application_1479702875308_0034/__spark_conf__611310910041274518.zip
{code}

Above are log snippets from running HOS20 and HOS16. In HOS20, the client uploads /tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_libs__3968994973591034858.zip to HDFS, while in HOS16 it uploads /home/spark16/spark-1.6.2-bin-hadoop2-without-hive/lib/spark-assembly-1.6.2-hadoop2.6.0.jar to HDFS. It is *not* clear *which* jars Spark puts into /tmp/spark-fe2aeecc-12a7-427f-9a5d-cf6e7335bf46/__spark_libs__3968994973591034858.zip in HOS20. I will investigate this.

> The deserialization time of HOS20 is longer than what in HOS16
> ---------------------------------------------------------------
>
>                 Key: HIVE-15259
>                 URL: https://issues.apache.org/jira/browse/HIVE-15259
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: liyunzhang_intel
>         Attachments: Deserialization_HOS16.PNG, Deserialization_HOS20.PNG
>
>
> deploy Hive on Spark on spark 1.6 version and spark 2.0 version.
> run query and in latest code(with spark2.0) the deserialization time of a
> task is 4 sec while the deserialization time of spark1.6 is 1 sec. The detail
> is in attached picture.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)