[jira] [Updated] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-26 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15859:
--
Attachment: HIVE-15859.3.patch

Thanks [~KaiXu] for the clarifications. Update patch v3 to make sure we log the 
exception caught in the pipeline.
[~xuefuz], [~vanzin] could you please take a look? Thanks!

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>Assignee: Rui Li
> Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch, 
> HIVE-15859.3.patch
>
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> application log shows the driver commanded a shutdown with some unknown 
> reason, but hive's log shows Driver could not get RPC header( Expected RPC 
> header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d

[jira] [Updated] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-15 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15859:
--
Attachment: HIVE-15859.2.patch

Fix test.

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>Assignee: Rui Li
> Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch
>
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> application log shows the driver commanded a shutdown with some unknown 
> reason, but hive's log shows Driver could not get RPC header( Expected RPC 
> header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in 
> stage 3.0 (TID 2515)
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 897.0 in st

[jira] [Updated] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15859:
--
Status: Patch Available  (was: Open)

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>Assignee: Rui Li
> Attachments: HIVE-15859.1.patch
>
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> application log shows the driver commanded a shutdown with some unknown 
> reason, but hive's log shows Driver could not get RPC header( Expected RPC 
> header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in 
> stage 3.0 (TID 2515)
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 897.0 in stage 
> 3.0 (TID 2417)
> 1

[jira] [Updated] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15859:
--
Attachment: HIVE-15859.1.patch

Patch v1 based on the Livy PR.
[~KaiXu], could you test if the patch fixes your problem? Thanks.

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>Assignee: Rui Li
> Attachments: HIVE-15859.1.patch
>
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> application log shows the driver commanded a shutdown with some unknown 
> reason, but hive's log shows Driver could not get RPC header( Expected RPC 
> header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in 
> stage 3.0 (TID 2515)
> 17/02/0

[jira] [Updated] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-13 Thread KaiXu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KaiXu updated HIVE-15859:
-
Affects Version/s: (was: 2.1.1)
   2.2.0

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> application log shows the driver commanded a shutdown with some unknown 
> reason, but hive's log shows Driver could not get RPC header( Expected RPC 
> header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in 
> stage 3.0 (TID 2515)
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 897.0 in stage 
> 3.0 (TID 2417)
> 17/02/08 09:51:04 INFO executor.Executor: Executo

[jira] [Updated] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15859:
--
Description: 
Hive on Spark, failed with error:
{noformat}
2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 961(+383)/1520 
Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
channel is closed.)'
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask
{noformat}

application log shows the driver commanded a shutdown with some unknown reason, 
but hive's log shows Driver could not get RPC header( Expected RPC header, got 
org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).


{noformat}
17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in stage 
3.0 (TID 2519)
17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver commanded 
a shutdown
17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
(hsx-node1:42777) driver disconnected.
17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
192.168.1.1:42777 disassociated! Shutting down.
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in stage 
3.0 (TID 2511)
17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Shutting down remote daemon.
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote 
daemon shut down; proceeding with flushing remote transports.
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Remoting shut down.
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in stage 
3.0 (TID 2515)
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 897.0 in stage 
3.0 (TID 2417)
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1225.0 in stage 
3.0 (TID 2526)
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 905.0 in stage 
3.0 (TID 2423)
{noformat}

in hive's log,
{noformat}
2017-02-08T09:51:04,327 INFO [stderr-redir-1] client.SparkClientImpl: 17/02/08 
09:51:04 INFO scheduler.TaskSetManager: Finished task 971.0 in stage 3.0 (TID 
2218) in 5948 ms on hsx-node8 (1338/1520)
2017-02-08T09:51:04,346 INFO [stderr-redir-1] client.SparkClientImpl: 17/02/08 
09:51:04 INFO rpc.RpcDispatcher: [DriverProtocol] Closing channel due to 
exception in pipeline 
(org.apache.hive.spark.client.RemoteDriver$DriverProtocol.handle(io.netty.channel.ChannelHandlerContext,
 org.apache.hive.spark.client

[jira] [Updated] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-09 Thread KaiXu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KaiXu updated HIVE-15859:
-
Description: 
Hive on Spark, failed with error:
2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 961(+383)/1520 
Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
channel is closed.)'
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask

application log shows the driver commanded a shutdown with some unknown reason, 
but hive's log shows Driver could not get RPC header( Expected RPC header, got 
org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).


17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in stage 
3.0 (TID 2519)
17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver commanded 
a shutdown
17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
(hsx-node1:42777) driver disconnected.
17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
192.168.1.1:42777 disassociated! Shutting down.
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in stage 
3.0 (TID 2511)
17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Shutting down remote daemon.
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote 
daemon shut down; proceeding with flushing remote transports.
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Remoting shut down.
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
/mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in stage 
3.0 (TID 2515)
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 897.0 in stage 
3.0 (TID 2417)
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1225.0 in stage 
3.0 (TID 2526)
17/02/08 09:51:04 INFO executor.Executor: Executor killed task 905.0 in stage 
3.0 (TID 2423)


in hive's log,
2017-02-08T09:51:04,327 INFO [stderr-redir-1] client.SparkClientImpl: 17/02/08 
09:51:04 INFO scheduler.TaskSetManager: Finished task 971.0 in stage 3.0 (TID 
2218) in 5948 ms on hsx-node8 (1338/1520)
2017-02-08T09:51:04,346 INFO [stderr-redir-1] client.SparkClientImpl: 17/02/08 
09:51:04 INFO rpc.RpcDispatcher: [DriverProtocol] Closing channel due to 
exception in pipeline 
(org.apache.hive.spark.client.RemoteDriver$DriverProtocol.handle(io.netty.channel.ChannelHandlerContext,
 org.apache.hive.spark.client.rpc.Rpc$MessageHeader)).
2017-02-08T09:51:04,346 INFO [