[jira] [Commented] (SPARK-38079) Not waiting for configmap before starting driver

2023-05-24 Thread zuotingbing (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17726030#comment-17726030
 ] 

zuotingbing commented on SPARK-38079:
-

We are facing the same problem. Is anybody following up on this issue?

 

pod event:

Warning  FailedMount     13s   kubelet            MountVolume.SetUp failed for 
volume "hadoop-properties" : configmap 
"spark-pi-6-a63eb888484c3bf1-hadoop-config" not found
Warning  FailedMount     13s   kubelet            MountVolume.SetUp failed for 
volume "spark-conf-volume-driver" : configmap 
"spark-drv-70e59d88484c3e59-conf-map" not found

> Not waiting for configmap before starting driver
> 
>
> Key: SPARK-38079
> URL: https://issues.apache.org/jira/browse/SPARK-38079
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.0, 3.2.1
>Reporter: Ben
>Priority: Major
>
> *The problem*
> When you spark-submit to kubernetes in cluster-mode:
>  # Kubernetes creates the driver
>  # Kubernetes creates a configmap that the driver depends on
> This is a race condition. If the configmap is not created quickly enough, 
> then the driver will fail to start up properly.
> See [this stackoverflow post|https://stackoverflow.com/a/58508313] for an 
> alternate description of this problem.
>  
> *To Reproduce*
>  # Download spark 3.2.0 or 3.2.1 from 
> [https://spark.apache.org/downloads.html]
>  # Create an image with 
> {code:java}
> bin/docker-image-tool.sh{code}
>  # Spark submit one of the examples to some kubernetes instance
>  # Observe the race condition
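What the fix amounts to is waiting until the driver's ConfigMaps exist before the 
driver pod tries to mount them. A rough sketch of such a wait using the fabric8 
Kubernetes client (the client library the Spark Kubernetes backend builds on); this 
is only an illustration with placeholder names, not the actual Spark change:
{code:scala}
import io.fabric8.kubernetes.client.{DefaultKubernetesClient, KubernetesClient}

// Poll until the named ConfigMap exists, or give up after `timeoutMs`.
// Hypothetical helper for illustration; Spark's real submission code differs.
def waitForConfigMap(client: KubernetesClient,
                     namespace: String,
                     name: String,
                     timeoutMs: Long = 60000L): Boolean = {
  val deadline = System.currentTimeMillis() + timeoutMs
  var found = false
  while (!found && System.currentTimeMillis() < deadline) {
    // get() returns null while the ConfigMap has not been created yet.
    found = client.configMaps().inNamespace(namespace).withName(name).get() != null
    if (!found) Thread.sleep(500)
  }
  found
}

val client: KubernetesClient = new DefaultKubernetesClient()
// Placeholder namespace; the ConfigMap name is the one from the pod events above.
if (!waitForConfigMap(client, "default", "spark-drv-70e59d88484c3e59-conf-map")) {
  sys.error("ConfigMap never showed up; the driver would fail to mount spark-conf-volume-driver")
}
{code}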



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28014) All waiting apps will be changed to the wrong state of Running after master changed

2019-06-12 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-28014:

Summary: All waiting apps will be changed to the wrong state of Running 
after master changed  (was: All waiting apps  will be changed to the wrong 
state of Running )

> All waiting apps will be changed to the wrong state of Running after master 
> changed
> ---
>
> Key: SPARK-28014
> URL: https://issues.apache.org/jira/browse/SPARK-28014
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.3
>Reporter: zuotingbing
>Priority: Major
> Attachments: image-2019-06-12-15-36-14-211.png, 
> image-2019-06-12-15-38-45-367.png
>
>
> These waiting apps, which have been granted 0 cores, are changed to the Running 
> state after the master changes, which is a little weird.
>  
> before master changed:
> !image-2019-06-12-15-36-14-211.png!
> after master changed from zdh112 to zdh113:
> !image-2019-06-12-15-38-45-367.png!
>  
>  
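The behaviour reported here looks like a side effect of how the master finishes 
recovery after a failover. A minimal, self-contained Scala model of the suspected 
step (a paraphrase for illustration, not the actual Master code):
{code:scala}
object ApplicationState extends Enumeration {
  val WAITING, RUNNING, FINISHED, UNKNOWN = Value
}

final case class AppInfo(id: String, coresGranted: Int, var state: ApplicationState.Value)

// After a failover, every recovered WAITING app is flipped to RUNNING, even if it
// has been granted 0 cores -- which matches the screenshots above.
def completeRecoveryModel(apps: Seq[AppInfo]): Unit =
  apps.filter(_.state == ApplicationState.WAITING)
    .foreach(a => a.state = ApplicationState.RUNNING)

val apps = Seq(AppInfo("app-0001", coresGranted = 0, ApplicationState.WAITING))
completeRecoveryModel(apps)
assert(apps.head.state == ApplicationState.RUNNING) // the 0-core app now shows Running
{code}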



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28014) All waiting apps will be changed to the wrong state of Running

2019-06-12 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-28014:

Description: 
These waiting apps, which have been granted 0 cores, are changed to the Running 
state after the master changes, which is a little weird.

 

before master changed:

!image-2019-06-12-15-36-14-211.png!

after master changed from zdh112 to zdh113:

!image-2019-06-12-15-38-45-367.png!

 

 

  was:These waiting apps, which have been granted 0 cores, are changed to the 
Running state, which is a little weird.


> All waiting apps  will be changed to the wrong state of Running 
> 
>
> Key: SPARK-28014
> URL: https://issues.apache.org/jira/browse/SPARK-28014
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.3
>Reporter: zuotingbing
>Priority: Major
> Attachments: image-2019-06-12-15-36-14-211.png, 
> image-2019-06-12-15-38-45-367.png
>
>
> These waiting apps, which have been granted 0 cores, are changed to the Running 
> state after the master changes, which is a little weird.
>  
> before master changed:
> !image-2019-06-12-15-36-14-211.png!
> after master changed from zdh112 to zdh113:
> !image-2019-06-12-15-38-45-367.png!
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28014) All waiting apps will be changed to the wrong state of Running

2019-06-12 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-28014:

Attachment: image-2019-06-12-15-38-45-367.png

> All waiting apps  will be changed to the wrong state of Running 
> 
>
> Key: SPARK-28014
> URL: https://issues.apache.org/jira/browse/SPARK-28014
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.3
>Reporter: zuotingbing
>Priority: Major
> Attachments: image-2019-06-12-15-36-14-211.png, 
> image-2019-06-12-15-38-45-367.png
>
>
> These waiting apps, which have been granted 0 cores, are changed to the Running 
> state, which is a little weird.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28014) All waiting apps will be changed to the wrong state of Running

2019-06-12 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-28014:

Attachment: image-2019-06-12-15-36-14-211.png

> All waiting apps  will be changed to the wrong state of Running 
> 
>
> Key: SPARK-28014
> URL: https://issues.apache.org/jira/browse/SPARK-28014
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.3
>Reporter: zuotingbing
>Priority: Major
> Attachments: image-2019-06-12-15-36-14-211.png
>
>
> These waiting apps, which have been granted 0 cores, are changed to the Running 
> state, which is a little weird.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28014) All waiting apps will be changed to the wrong state of Running

2019-06-12 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-28014:

Priority: Major  (was: Minor)

> All waiting apps  will be changed to the wrong state of Running 
> 
>
> Key: SPARK-28014
> URL: https://issues.apache.org/jira/browse/SPARK-28014
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.3
>Reporter: zuotingbing
>Priority: Major
>
> These waiting apps, which have been granted 0 cores, are changed to the Running 
> state, which is a little weird.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28014) All waiting apps will be changed to the wrong state of Running

2019-06-12 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-28014:

Description: These waiting apps, which have been granted 0 cores, are changed to 
the Running state, which is a little weird.

> All waiting apps  will be changed to the wrong state of Running 
> 
>
> Key: SPARK-28014
> URL: https://issues.apache.org/jira/browse/SPARK-28014
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.3
>Reporter: zuotingbing
>Priority: Minor
>
> These waiting apps, which have been granted 0 cores, are changed to the Running 
> state, which is a little weird.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28014) All waiting apps will be changed to the wrong state of Running

2019-06-12 Thread zuotingbing (JIRA)
zuotingbing created SPARK-28014:
---

 Summary: All waiting apps  will be changed to the wrong state of 
Running 
 Key: SPARK-28014
 URL: https://issues.apache.org/jira/browse/SPARK-28014
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.4.3
Reporter: zuotingbing






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23191) Workers registration failes in case of network drop

2019-05-14 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835256#comment-16835256
 ] 

zuotingbing edited comment on SPARK-23191 at 5/15/19 3:07 AM:
--

See the detailed logs below; the master changed from vmax18 to vmax17.

On master vmax18, the worker was removed because the master got no heartbeat from 
it, but a heartbeat arrived soon afterwards, so vmax18 asked the worker to 
re-register (which calls tryRegisterAllMasters() and therefore also contacts 
master vmax17).

At the same time, the worker had already registered with master vmax17 when vmax17 
gained leadership.

So worker registration failed: Duplicate worker ID.

 

spark-mr-master-vmax18.log:
{code:java}
2019-03-15 20:22:09,441 INFO ZooKeeperLeaderElectionAgent: We have lost 
leadership
2019-03-15 20:22:14,544 WARN Master: Removing 
worker-20190218183101-vmax18-33129 because we got no heartbeat in 60 seconds
2019-03-15 20:22:14,544 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:14,864 WARN Master: Got heartbeat from unregistered worker 
worker-20190218183101-vmax18-33129. Asking it to re-register.
2019-03-15 20:22:14,975 ERROR Master: Leadership has been revoked -- master 
shutting down.
{code}
 

spark-mr-master-vmax17.log:
{code:java}
2019-03-15 20:22:14,870 INFO Master: Registering worker vmax18:33129 with 21 
cores, 125.0 GB RAM
2019-03-15 20:22:15,261 INFO Master: vmax18:33129 got disassociated, removing 
it.
2019-03-15 20:22:15,263 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:15,311 ERROR Inbox: Ignoring error
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /spark/master_status/worker_worker-20190218183101-vmax18-33129
{code}
 

spark-mr-worker-vmax18.log:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called{code}
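A minimal, self-contained Scala model of the master-side check that produces the 
"Duplicate worker ID" error in the race described above (an illustration only, not 
the actual Master.registerWorker code):
{code:scala}
import scala.collection.mutable

final case class WorkerInfo(id: String, host: String, port: Int)

// Toy master: rejects a registration whose worker ID is already known.
class MasterModel {
  private val idToWorker = mutable.HashMap[String, WorkerInfo]()

  def registerWorker(worker: WorkerInfo): Boolean = {
    if (idToWorker.contains(worker.id)) {
      false // "Worker registration failed: Duplicate worker ID"
    } else {
      idToWorker(worker.id) = worker
      true
    }
  }
}

val vmax17 = new MasterModel
val w = WorkerInfo("worker-20190218183101-vmax18-33129", "vmax18", 33129)
// 1. The worker registers with vmax17 when vmax17 gains leadership.
assert(vmax17.registerWorker(w))
// 2. The re-registration triggered by the old master (vmax18) reaches vmax17 again
//    and is rejected, which is the error seen in the worker log above.
assert(!vmax17.registerWorker(w))
{code}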
 

PS: this leads to another issue: the leader stays in the COMPLETING_RECOVERY state 
forever.

worker-vmax18 shut down because of the duplicate worker ID and cleared the worker's 
node in the persistence engine (we use ZooKeeper). Then the new leader 
(master-vmax17) found that the worker had died, tried to remove it, and tried to 
clear the node in ZooKeeper, but the node had already been removed during the 
worker-vmax18 shutdown, so {color:#ff}*an exception was thrown in 
completeRecovery(). The leader therefore stays in the COMPLETING_RECOVERY state 
forever.*{color}
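A similarly minimal sketch of this follow-on problem, assuming the exception 
escapes completeRecovery() before the recovery state is reset (again an 
illustration, not the actual Master code):
{code:scala}
import scala.collection.mutable

object RecoveryState extends Enumeration {
  val ALIVE, RECOVERING, COMPLETING_RECOVERY = Value
}

var masterState = RecoveryState.COMPLETING_RECOVERY

// The worker's shutdown hook has already deleted its node, so nothing is left here.
val persistedWorkerNodes = mutable.Set.empty[String]

// Stand-in for deleting the worker's ZooKeeper node; throws when the node is
// already gone (the real code sees KeeperException.NoNodeException instead).
def removeWorkerNode(id: String): Unit =
  if (!persistedWorkerNodes.remove(id))
    throw new IllegalStateException(s"NoNode for /spark/master_status/worker_$id")

try {
  removeWorkerNode("worker-20190218183101-vmax18-33129")
  masterState = RecoveryState.ALIVE // never reached in this scenario
} catch {
  case e: Exception => println(s"completeRecovery aborted: ${e.getMessage}")
}

assert(masterState == RecoveryState.COMPLETING_RECOVERY) // the new leader is stuck
{code}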

 


was (Author: zuo.tingbing9):
See the detailed logs below; the master changed from vmax18 to vmax17.

On master vmax18, the worker was removed because the master got no heartbeat from 
it, but a heartbeat arrived soon afterwards, so vmax18 asked the worker to 
re-register.

At the same time, the worker had already registered with master vmax17 when vmax17 
gained leadership.

So worker registration failed: Duplicate worker ID.

 

spark-mr-master-vmax18.log:
{code:java}
2019-03-15 20:22:09,441 INFO ZooKeeperLeaderElectionAgent: We have lost 
leadership
2019-03-15 20:22:14,544 WARN Master: Removing 
worker-20190218183101-vmax18-33129 because we got no heartbeat in 60 seconds
2019-03-15 20:22:14,544 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:14,864 WARN Master: Got heartbeat from unregistered worker 
worker-20190218183101-vmax18-33129. Asking it to re-register.
2019-03-15 20:22:14,975 ERROR Master: Leadership has been revoked -- master 
shutting down.
{code}
 

spark-mr-master-vmax17.log:
{code:java}
2019-03-15 20:22:14,870 INFO Master: Registering worker vmax18:33129 with 21 
cores, 125.0 GB RAM
2019-03-15 20:22:15,261 INFO Master: vmax18:33129 got disassociated, removing 
it.
2019-03-15 20:22:15,263 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:15,311 ERROR Inbox: Ignoring error
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /spark/master_status/worker_worker-20190218183101-vmax18-33129
{code}
 

spark-mr-worker-vmax18.log:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,895 INFO 

[jira] [Comment Edited] (SPARK-23191) Workers registration failes in case of network drop

2019-05-07 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835256#comment-16835256
 ] 

zuotingbing edited comment on SPARK-23191 at 5/8/19 2:21 AM:
-

See the detailed logs below; the master changed from vmax18 to vmax17.

On master vmax18, the worker was removed because the master got no heartbeat from 
it, but a heartbeat arrived soon afterwards, so vmax18 asked the worker to 
re-register.

At the same time, the worker had already registered with master vmax17 when vmax17 
gained leadership.

So worker registration failed: Duplicate worker ID.

 

spark-mr-master-vmax18.log:
{code:java}
2019-03-15 20:22:09,441 INFO ZooKeeperLeaderElectionAgent: We have lost 
leadership
2019-03-15 20:22:14,544 WARN Master: Removing 
worker-20190218183101-vmax18-33129 because we got no heartbeat in 60 seconds
2019-03-15 20:22:14,544 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:14,864 WARN Master: Got heartbeat from unregistered worker 
worker-20190218183101-vmax18-33129. Asking it to re-register.
2019-03-15 20:22:14,975 ERROR Master: Leadership has been revoked -- master 
shutting down.
{code}
 

spark-mr-master-vmax17.log:
{code:java}
2019-03-15 20:22:14,870 INFO Master: Registering worker vmax18:33129 with 21 
cores, 125.0 GB RAM
2019-03-15 20:22:15,261 INFO Master: vmax18:33129 got disassociated, removing 
it.
2019-03-15 20:22:15,263 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:15,311 ERROR Inbox: Ignoring error
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /spark/master_status/worker_worker-20190218183101-vmax18-33129
{code}
 

spark-mr-worker-vmax18.log:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called{code}
 

PS: this leads to another issue: the leader stays in the COMPLETING_RECOVERY state 
forever.

worker-vmax18 shut down because of the duplicate worker ID and cleared the worker's 
node in the persistence engine (we use ZooKeeper). Then the new leader 
(master-vmax17) found that the worker had died, tried to remove it, and tried to 
clear the node in ZooKeeper, but the node had already been removed during the 
worker-vmax18 shutdown, so {color:#ff}*an exception was thrown in 
completeRecovery(). The leader therefore stays in the COMPLETING_RECOVERY state 
forever.*{color}

 


was (Author: zuo.tingbing9):
See the detailed logs below; the master changed from vmax18 to vmax17.

On master vmax18, the worker was removed because the master got no heartbeat from 
it, but a heartbeat arrived soon afterwards, so vmax18 asked the worker to 
re-register.

At the same time, the worker had already registered with master vmax17 when vmax17 
gained leadership.

So worker registration failed: Duplicate worker ID.

 

spark-mr-master-vmax18.log:
{code:java}
2019-03-15 20:22:09,441 INFO ZooKeeperLeaderElectionAgent: We have lost 
leadership
2019-03-15 20:22:14,544 WARN Master: Removing 
worker-20190218183101-vmax18-33129 because we got no heartbeat in 60 seconds
2019-03-15 20:22:14,544 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:14,864 WARN Master: Got heartbeat from unregistered worker 
worker-20190218183101-vmax18-33129. Asking it to re-register.
2019-03-15 20:22:14,975 ERROR Master: Leadership has been revoked -- master 
shutting down.
{code}
 

spark-mr-master-vmax17.log:

 
{code:java}
2019-03-15 20:22:14,870 INFO Master: Registering worker vmax18:33129 with 21 
cores, 125.0 GB RAM
2019-03-15 20:22:15,261 INFO Master: vmax18:33129 got disassociated, removing 
it.
2019-03-15 20:22:15,263 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:15,311 ERROR Inbox: Ignoring error
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /spark/master_status/worker_worker-20190218183101-vmax18-33129
{code}
 

 

spark-mr-worker-vmax18.log:

 
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 

[jira] [Commented] (SPARK-23191) Workers registration failes in case of network drop

2019-05-07 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835256#comment-16835256
 ] 

zuotingbing commented on SPARK-23191:
-

See the detailed logs below; the master changed from vmax18 to vmax17.

On master vmax18, the worker was removed because the master got no heartbeat from 
it, but a heartbeat arrived soon afterwards, so vmax18 asked the worker to 
re-register.

At the same time, the worker had already registered with master vmax17 when vmax17 
gained leadership.

So worker registration failed: Duplicate worker ID.

 

spark-mr-master-vmax18.log:
{code:java}
2019-03-15 20:22:09,441 INFO ZooKeeperLeaderElectionAgent: We have lost 
leadership
2019-03-15 20:22:14,544 WARN Master: Removing 
worker-20190218183101-vmax18-33129 because we got no heartbeat in 60 seconds
2019-03-15 20:22:14,544 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:14,864 WARN Master: Got heartbeat from unregistered worker 
worker-20190218183101-vmax18-33129. Asking it to re-register.
2019-03-15 20:22:14,975 ERROR Master: Leadership has been revoked -- master 
shutting down.
{code}
 

spark-mr-master-vmax17.log:

 
{code:java}
2019-03-15 20:22:14,870 INFO Master: Registering worker vmax18:33129 with 21 
cores, 125.0 GB RAM
2019-03-15 20:22:15,261 INFO Master: vmax18:33129 got disassociated, removing 
it.
2019-03-15 20:22:15,263 INFO Master: Removing worker 
worker-20190218183101-vmax18-33129 on vmax18:33129
2019-03-15 20:22:15,311 ERROR Inbox: Ignoring error
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /spark/master_status/worker_worker-20190218183101-vmax18-33129
{code}
 

 

spark-mr-worker-vmax18.log:

 
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called{code}
 

 

PS: this leads to another issue: the leader stays in the COMPLETING_RECOVERY state 
forever.

worker-vmax18 shut down because of the duplicate worker ID and cleared the worker's 
node in the persistence engine (we use ZooKeeper). Then the new leader 
(master-vmax17) found that the worker had died, tried to remove it, and tried to 
clear the node in ZooKeeper, but the node had already been removed during the 
worker-vmax18 shutdown, so {color:#FF}*an exception was thrown in 
completeRecovery(). The leader therefore stays in the COMPLETING_RECOVERY state 
forever.*{color}

 

> Workers registration failes in case of network drop
> ---
>
> Key: SPARK-23191
> URL: https://issues.apache.org/jira/browse/SPARK-23191
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.3, 2.2.1, 2.3.0
> Environment: OS:- Centos 6.9(64 bit)
>  
>Reporter: Neeraj Gupta
>Priority: Critical
>
> We have a 3-node cluster. We were facing issues with multiple drivers running in 
> some scenarios in production.
> On further investigation we were able to reproduce the scenario, in both 1.6.3 
> and 2.2.1, with the following steps:
>  # Set up a 3-node cluster. Start the master and slaves.
>  # On any node where the worker process is running, block connections on 
> port 7077 using iptables.
> {code:java}
> iptables -A OUTPUT -p tcp --dport 7077 -j DROP
> {code}
>  # After about 10-15 seconds we get an error on the node saying it is unable to 
> connect to the master.
> {code:java}
> 2018-01-23 12:08:51,639 [rpc-client-1-1] WARN  
> org.apache.spark.network.server.TransportChannelHandler - Exception in 
> connection from 
> java.io.IOException: Connection timed out
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>     at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
>     at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
>     at 
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
>     at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
>     at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
>     at 
> 

[jira] [Comment Edited] (SPARK-23191) Workers registration failes in case of network drop

2019-04-29 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829051#comment-16829051
 ] 

zuotingbing edited comment on SPARK-23191 at 4/29/19 9:28 AM:
--

We faced the same issue in standalone HA mode. Could you please take a look at 
this issue?
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax18:7077... 
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax17:7077... 
2019-03-15 20:22:14,865 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,868 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,871 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,871 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called 
2019-03-15 20:22:14,898 INFO ShutdownHookManager: Deleting directory 
/data4/zdh/spark/tmp/spark-c578bf32-6a5e-44a5-843b-c796f44648ee 
2019-03-15 20:22:14,908 INFO ShutdownHookManager: Deleting directory 
/data3/zdh/spark/tmp/spark-7e57e77d-cbb7-47d3-a6dd-737b57788533 
2019-03-15 20:22:14,920 INFO ShutdownHookManager: Deleting directory 
/data2/zdh/spark/tmp/spark-0beebf20-abbd-4d99-a401-3ef0e88e0b05{code}
 

[~andrewor14]  [~cloud_fan] [~vanzin]


was (Author: zuo.tingbing9):
We faced the same issue in standalone HA mode. Could you please take a look at 
this issue?

[~andrewor14]  [~cloud_fan] [~vanzin]

> Workers registration failes in case of network drop
> ---
>
> Key: SPARK-23191
> URL: https://issues.apache.org/jira/browse/SPARK-23191
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.3, 2.2.1, 2.3.0
> Environment: OS:- Centos 6.9(64 bit)
>  
>Reporter: Neeraj Gupta
>Priority: Critical
>
> We have a 3-node cluster. We were facing issues with multiple drivers running in 
> some scenarios in production.
> On further investigation we were able to reproduce the scenario, in both 1.6.3 
> and 2.2.1, with the following steps:
>  # Set up a 3-node cluster. Start the master and slaves.
>  # On any node where the worker process is running, block connections on 
> port 7077 using iptables.
> {code:java}
> iptables -A OUTPUT -p tcp --dport 7077 -j DROP
> {code}
>  # After about 10-15 seconds we get an error on the node saying it is unable to 
> connect to the master.
> {code:java}
> 2018-01-23 12:08:51,639 [rpc-client-1-1] WARN  
> org.apache.spark.network.server.TransportChannelHandler - Exception in 
> connection from 
> java.io.IOException: Connection timed out
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>     at 
> 

[jira] [Commented] (SPARK-23191) Workers registration failes in case of network drop

2019-04-29 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829051#comment-16829051
 ] 

zuotingbing commented on SPARK-23191:
-

We faced the same issue in standalone HA mode. Could you please take a look at 
this issue?

[~andrewor14]  [~cloud_fan] [~vanzin]

> Workers registration failes in case of network drop
> ---
>
> Key: SPARK-23191
> URL: https://issues.apache.org/jira/browse/SPARK-23191
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.3, 2.2.1, 2.3.0
> Environment: OS:- Centos 6.9(64 bit)
>  
>Reporter: Neeraj Gupta
>Priority: Critical
>
> We have a 3-node cluster. We were facing issues with multiple drivers running in 
> some scenarios in production.
> On further investigation we were able to reproduce the scenario, in both 1.6.3 
> and 2.2.1, with the following steps:
>  # Set up a 3-node cluster. Start the master and slaves.
>  # On any node where the worker process is running, block connections on 
> port 7077 using iptables.
> {code:java}
> iptables -A OUTPUT -p tcp --dport 7077 -j DROP
> {code}
>  # After about 10-15 seconds we get an error on the node saying it is unable to 
> connect to the master.
> {code:java}
> 2018-01-23 12:08:51,639 [rpc-client-1-1] WARN  
> org.apache.spark.network.server.TransportChannelHandler - Exception in 
> connection from 
> java.io.IOException: Connection timed out
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>     at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
>     at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
>     at 
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
>     at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
>     at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
>     at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
>     at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
>     at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
>     at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
>     at java.lang.Thread.run(Thread.java:745)
> 2018-01-23 12:08:51,647 [dispatcher-event-loop-0] ERROR 
> org.apache.spark.deploy.worker.Worker - Connection to master failed! Waiting 
> for master to reconnect...
> 2018-01-23 12:08:51,647 [dispatcher-event-loop-0] ERROR 
> org.apache.spark.deploy.worker.Worker - Connection to master failed! Waiting 
> for master to reconnect...
> {code}
>  # Once we get this exception, we re-enable connections to port 7077 using
> {code:java}
> iptables -D OUTPUT -p tcp --dport 7077 -j DROP
> {code}
>  # The worker tries to register with the master again but is unable to do so. It 
> gives the following error
> {code:java}
> 2018-01-23 12:08:58,657 [worker-register-master-threadpool-2] WARN  
> org.apache.spark.deploy.worker.Worker - Failed to connect to master 
> :7077
> org.apache.spark.SparkException: Exception thrown in awaitResult:
>     at 
> org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>     at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
>     at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
>     at 
> org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1$$anon$1.run(Worker.scala:241)
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Failed to connect to :7077
>     at 
> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
>     at 
> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
>     at 

[jira] [Commented] (SPARK-16190) Worker registration failed: Duplicate worker ID

2019-03-19 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-16190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796772#comment-16796772
 ] 

zuotingbing commented on SPARK-16190:
-

I faced the same issue. The worker log is as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at spark://vmax17:7077
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 requested this worker to reconnect.
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax18:7077...
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax17:7077...
2019-03-15 20:22:14,865 INFO Worker: Master with url spark://vmax18:7077 requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 requested this worker to reconnect.
2019-03-15 20:22:14,868 INFO Worker: Not spawning another attempt to register with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,871 INFO Worker: Master with url spark://vmax18:7077 requested this worker to reconnect.
2019-03-15 20:22:14,871 INFO Worker: Not spawning another attempt to register with the master, since there is an attempt scheduled already.
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate worker ID
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process!
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called
2019-03-15 20:22:14,898 INFO ShutdownHookManager: Deleting directory /data4/zdh/spark/tmp/spark-c578bf32-6a5e-44a5-843b-c796f44648ee
2019-03-15 20:22:14,908 INFO ShutdownHookManager: Deleting directory /data3/zdh/spark/tmp/spark-7e57e77d-cbb7-47d3-a6dd-737b57788533
2019-03-15 20:22:14,920 INFO ShutdownHookManager: Deleting directory /data2/zdh/spark/tmp/spark-0beebf20-abbd-4d99-a401-3ef0e88e0b05
{code}

> Worker registration failed: Duplicate worker ID
> ---
>
> Key: SPARK-16190
> URL: https://issues.apache.org/jira/browse/SPARK-16190
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 1.6.1
>Reporter: Thomas Huang
>Priority: Minor
> Attachments: 
> spark-mqq-org.apache.spark.deploy.worker.Worker-1-slave19.out, 
> spark-mqq-org.apache.spark.deploy.worker.Worker-1-slave2.out, 
> spark-mqq-org.apache.spark.deploy.worker.Worker-1-slave7.out, 
> spark-mqq-org.apache.spark.deploy.worker.Worker-1-slave8.out
>
>
> Several workers crashed simultaneously due to this error: 
> Worker registration failed: Duplicate worker ID
> This is the worker log on one of those crashed workers:
> 16/06/24 16:28:53 INFO ExecutorRunner: Killing process!
> 16/06/24 16:28:53 INFO ExecutorRunner: Runner thread for executor 
> app-20160624003013-0442/26 interrupted
> 16/06/24 16:28:53 INFO ExecutorRunner: Killing process!
> 16/06/24 16:29:03 WARN ExecutorRunner: Failed to terminate process: 
> java.lang.UNIXProcess@31340137. This process will likely be orphaned.
> 16/06/24 16:29:03 WARN ExecutorRunner: Failed to terminate process: 
> java.lang.UNIXProcess@4d3bdb1d. This process will likely be orphaned.
> 16/06/24 16:29:03 INFO Worker: Executor app-20160624003013-0442/8 finished 
> with state KILLED
> 16/06/24 16:29:03 INFO Worker: Executor app-20160624003013-0442/26 finished 
> with state KILLED
> 16/06/24 16:29:03 INFO Worker: Cleaning up local directories for application 
> app-20160624003013-0442
> 16/06/24 16:31:18 INFO ExternalShuffleBlockResolver: Application 
> app-20160624003013-0442 removed, cleanupLocalDirs = true
> 16/06/24 16:31:18 INFO Worker: Asked to launch executor 
> 

[jira] [Comment Edited] (SPARK-16190) Worker registration failed: Duplicate worker ID

2019-03-19 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-16190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796772#comment-16796772
 ] 

zuotingbing edited comment on SPARK-16190 at 3/20/19 3:57 AM:
--

I faced the same issue. The worker log is as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 2019-03-15 20:22:14,862 INFO Worker: Master with url 
spark://vmax18:7077 requested this worker to reconnect. 2019-03-15 20:22:14,863 
INFO Worker: Connecting to master vmax18:7077... 2019-03-15 20:22:14,863 INFO 
Worker: Connecting to master vmax17:7077... 2019-03-15 20:22:14,865 INFO 
Worker: Master with url spark://vmax18:7077 requested this worker to reconnect. 
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 2019-03-15 
20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 requested this 
worker to reconnect. 2019-03-15 20:22:14,868 INFO Worker: Not spawning another 
attempt to register with the master, since there is an attempt scheduled 
already. 2019-03-15 20:22:14,871 INFO Worker: Master with url 
spark://vmax18:7077 requested this worker to reconnect. 2019-03-15 20:22:14,871 
INFO Worker: Not spawning another attempt to register with the master, since 
there is an attempt scheduled already. 2019-03-15 20:22:14,879 ERROR Worker: 
Worker registration failed: Duplicate worker ID 2019-03-15 20:22:14,891 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,891 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 2019-03-15 
20:22:14,893 INFO ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 2019-03-15 
20:22:14,894 INFO ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 2019-03-15 
20:22:14,895 INFO ExecutorRunner: Killing process! 2019-03-15 20:22:14,895 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,895 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called 
2019-03-15 20:22:14,898 INFO ShutdownHookManager: Deleting directory 
/data4/zdh/spark/tmp/spark-c578bf32-6a5e-44a5-843b-c796f44648ee 2019-03-15 
20:22:14,908 INFO ShutdownHookManager: Deleting directory 
/data3/zdh/spark/tmp/spark-7e57e77d-cbb7-47d3-a6dd-737b57788533 2019-03-15 
20:22:14,920 INFO ShutdownHookManager: Deleting directory 
/data2/zdh/spark/tmp/spark-0beebf20-abbd-4d99-a401-3ef0e88e0b05{code}
 


was (Author: zuo.tingbing9):
I faced the same issue. The worker log is as follows:
{code:java}
// code placeholder
{code}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 2019-03-15 20:22:14,862 INFO Worker: Master with url 
spark://vmax18:7077 requested this worker to reconnect. 2019-03-15 20:22:14,863 
INFO Worker: Connecting to master vmax18:7077... 2019-03-15 20:22:14,863 INFO 
Worker: Connecting to master vmax17:7077... 2019-03-15 20:22:14,865 INFO 
Worker: Master with url spark://vmax18:7077 requested this worker to reconnect. 
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 2019-03-15 
20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 requested this 
worker to reconnect. 2019-03-15 20:22:14,868 INFO Worker: Not spawning another 
attempt to register with the master, since there is an attempt scheduled 
already. 2019-03-15 20:22:14,871 INFO Worker: Master with url 
spark://vmax18:7077 requested this worker to reconnect. 2019-03-15 20:22:14,871 
INFO Worker: Not spawning another attempt to register with the master, since 
there is an attempt scheduled already. 2019-03-15 20:22:14,879 ERROR Worker: 
Worker registration failed: Duplicate worker ID 2019-03-15 20:22:14,891 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,891 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 2019-03-15 
20:22:14,893 INFO ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 

[jira] [Comment Edited] (SPARK-16190) Worker registration failed: Duplicate worker ID

2019-03-19 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-16190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796772#comment-16796772
 ] 

zuotingbing edited comment on SPARK-16190 at 3/20/19 4:02 AM:
--

I faced the same issue in standalone HA mode. The worker log is as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax18:7077... 
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax17:7077... 
2019-03-15 20:22:14,865 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,868 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,871 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,871 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called 
2019-03-15 20:22:14,898 INFO ShutdownHookManager: Deleting directory 
/data4/zdh/spark/tmp/spark-c578bf32-6a5e-44a5-843b-c796f44648ee 
2019-03-15 20:22:14,908 INFO ShutdownHookManager: Deleting directory 
/data3/zdh/spark/tmp/spark-7e57e77d-cbb7-47d3-a6dd-737b57788533 
2019-03-15 20:22:14,920 INFO ShutdownHookManager: Deleting directory 
/data2/zdh/spark/tmp/spark-0beebf20-abbd-4d99-a401-3ef0e88e0b05{code}
 


was (Author: zuo.tingbing9):
I faced the same issue. The worker log is as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax18:7077... 
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax17:7077... 
2019-03-15 20:22:14,865 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,868 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,871 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,871 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 

[jira] [Comment Edited] (SPARK-16190) Worker registration failed: Duplicate worker ID

2019-03-19 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-16190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796772#comment-16796772
 ] 

zuotingbing edited comment on SPARK-16190 at 3/20/19 3:59 AM:
--

I faced the same issue. The worker log is as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax18:7077... 
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax17:7077... 
2019-03-15 20:22:14,865 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,868 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,871 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,871 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID 2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called 
2019-03-15 20:22:14,898 INFO ShutdownHookManager: Deleting directory 
/data4/zdh/spark/tmp/spark-c578bf32-6a5e-44a5-843b-c796f44648ee 
2019-03-15 20:22:14,908 INFO ShutdownHookManager: Deleting directory 
/data3/zdh/spark/tmp/spark-7e57e77d-cbb7-47d3-a6dd-737b57788533 
2019-03-15 20:22:14,920 INFO ShutdownHookManager: Deleting directory 
/data2/zdh/spark/tmp/spark-0beebf20-abbd-4d99-a401-3ef0e88e0b05{code}
 


was (Author: zuo.tingbing9):
I faced the same issue. The worker log is as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 2019-03-15 20:22:14,862 INFO Worker: Master with url 
spark://vmax18:7077 requested this worker to reconnect. 2019-03-15 20:22:14,863 
INFO Worker: Connecting to master vmax18:7077... 2019-03-15 20:22:14,863 INFO 
Worker: Connecting to master vmax17:7077... 2019-03-15 20:22:14,865 INFO 
Worker: Master with url spark://vmax18:7077 requested this worker to reconnect. 
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 2019-03-15 
20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 requested this 
worker to reconnect. 2019-03-15 20:22:14,868 INFO Worker: Not spawning another 
attempt to register with the master, since there is an attempt scheduled 
already. 2019-03-15 20:22:14,871 INFO Worker: Master with url 
spark://vmax18:7077 requested this worker to reconnect. 2019-03-15 20:22:14,871 
INFO Worker: Not spawning another attempt to register with the master, since 
there is an attempt scheduled already. 2019-03-15 20:22:14,879 ERROR Worker: 
Worker registration failed: Duplicate worker ID 2019-03-15 20:22:14,891 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,891 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 2019-03-15 
20:22:14,893 INFO ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO 
ExecutorRunner: Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: 
Killing process! 2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: 

[jira] [Comment Edited] (SPARK-16190) Worker registration failed: Duplicate worker ID

2019-03-19 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-16190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796772#comment-16796772
 ] 

zuotingbing edited comment on SPARK-16190 at 3/20/19 4:00 AM:
--

I faced the same issue. The worker log is as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax18:7077... 
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax17:7077... 
2019-03-15 20:22:14,865 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,868 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,871 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,871 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,895 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,896 INFO ShutdownHookManager: Shutdown hook called 
2019-03-15 20:22:14,898 INFO ShutdownHookManager: Deleting directory 
/data4/zdh/spark/tmp/spark-c578bf32-6a5e-44a5-843b-c796f44648ee 
2019-03-15 20:22:14,908 INFO ShutdownHookManager: Deleting directory 
/data3/zdh/spark/tmp/spark-7e57e77d-cbb7-47d3-a6dd-737b57788533 
2019-03-15 20:22:14,920 INFO ShutdownHookManager: Deleting directory 
/data2/zdh/spark/tmp/spark-0beebf20-abbd-4d99-a401-3ef0e88e0b05{code}
 


was (Author: zuo.tingbing9):
i faced the same issue. worker log as follows:
{code:java}
2019-03-15 20:22:10,474 INFO Worker: Master has changed, new master is at 
spark://vmax17:7077 
2019-03-15 20:22:14,862 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax18:7077... 
2019-03-15 20:22:14,863 INFO Worker: Connecting to master vmax17:7077... 
2019-03-15 20:22:14,865 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect.
2019-03-15 20:22:14,865 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,868 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,868 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,871 INFO Worker: Master with url spark://vmax18:7077 
requested this worker to reconnect. 
2019-03-15 20:22:14,871 INFO Worker: Not spawning another attempt to register 
with the master, since there is an attempt scheduled already. 
2019-03-15 20:22:14,879 ERROR Worker: Worker registration failed: Duplicate 
worker ID 2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,891 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,893 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO ExecutorRunner: Killing process! 
2019-03-15 20:22:14,894 INFO 

[jira] [Comment Edited] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781160#comment-16781160
 ] 

zuotingbing edited comment on SPARK-27010 at 3/1/19 1:27 AM:
-

Currently, if we set *SPARK_MASTER_PORT=0*, we can easily find the actual 
port number in the log, which helps us get the correct *spark.master* 
address.

!2019-03-01_092511.png!
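
A minimal Scala sketch (a hypothetical, standalone example, not Spark's or Hive's actual code) of the mechanism involved: a service bound to port 0 can read back the ephemeral port the OS assigned and log it, which is the number clients such as beeline need.
{code:java}
import java.net.ServerSocket

object PortZeroExample {
  def main(args: Array[String]): Unit = {
    // Binding to port 0 asks the OS to pick any free ephemeral port.
    val socket = new ServerSocket(0)
    // getLocalPort reveals the port that was actually chosen; this is the
    // value that should appear in the log so clients know where to connect.
    val actualPort = socket.getLocalPort
    println(s"Service is listening on port $actualPort")
    socket.close()
  }
}
{code}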

 


was (Author: zuo.tingbing9):
Currently, if we set *SPARK_MASTER_PORT=0*, we can  easily find out the actual 
port number in log which would help us to better get the correct *spark.master* 
address.

!2019-03-01_090847.png!

 

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png, 2019-03-01_092511.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use when using beeline to connect..
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Attachment: 2019-03-01_092511.png

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png, 2019-03-01_092511.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use when using beeline to connect..
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Attachment: (was: 2019-03-01_090847.png)

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use when using beeline to connect..
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781160#comment-16781160
 ] 

zuotingbing commented on SPARK-27010:
-

Currently, if we set *SPARK_MASTER_PORT=0*, we can easily find the actual 
port number in the log, which helps us get the correct *spark.master* 
address.

!2019-03-01_090847.png!

 

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png, 2019-03-01_090847.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use when using beeline to connect..
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Attachment: 2019-03-01_090847.png

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png, 2019-03-01_090847.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use when using beeline to connect..
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Description: 
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we should use when using beeline to connect..

before:

!2019-02-28_170942.png!

after:

!2019-02-28_170904.png!

use beeline to connect success:

!2019-02-28_170844.png!

 

  was:
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we should use beeline to connect to.

before:

!2019-02-28_170942.png!

after:

!2019-02-28_170904.png!

use beeline to connect success:

!2019-02-28_170844.png!

 


> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use when using beeline to connect..
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27010) display the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)
zuotingbing created SPARK-27010:
---

 Summary: display the actual port number when 
hive.server2.thrift.port=0
 Key: SPARK-27010
 URL: https://issues.apache.org/jira/browse/SPARK-27010
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.4.0
Reporter: zuotingbing


Currently, if we set *hive.server2.thrift.port=0*, it is hard to find out the 
actual port number to use when connecting with beeline.

before:

!image-2019-02-28-17-00-21-251.png!

after:

!image-2019-02-28-17-03-45-779.png!

use beeline to connect successfully:

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Description: 
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we use beeline to connect.

before:

 

after:

 

use beeline to connect success:

 

 

  was:
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we use beeline to connect.

before:

!image-2019-02-28-17-00-21-251.png!

after:

!image-2019-02-28-17-03-45-779.png!

use beeline to connect success:

 

 


> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we use beeline to connect.
> before:
>  
> after:
>  
> use beeline to connect success:
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Description: 
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we should use beeline to connect to.

before:

!2019-02-28_170942.png!

after:

!2019-02-28_170904.png!

use beeline to connect success:

!2019-02-28_170844.png!

 

  was:
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we should use beeline to connect.

before:

!2019-02-28_170942.png!

after:

!2019-02-28_170904.png!

use beeline to connect success:

!2019-02-28_170844.png!

 


> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use beeline to connect to.
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Description: 
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we should use beeline to connect.

before:

!2019-02-28_170942.png!

after:

!2019-02-28_170904.png!

use beeline to connect success:

!2019-02-28_170844.png!

 

  was:
Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
actual port number which one we use beeline to connect.

before:

 

after:

 

use beeline to connect success:

 

 


> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we should use beeline to connect.
> before:
> !2019-02-28_170942.png!
> after:
> !2019-02-28_170904.png!
> use beeline to connect success:
> !2019-02-28_170844.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Attachment: 2019-02-28_170844.png

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we use beeline to connect.
> before:
> !image-2019-02-28-17-00-21-251.png!
> after:
> !image-2019-02-28-17-03-45-779.png!
> use beeline to connect success:
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Attachment: 2019-02-28_170904.png

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we use beeline to connect.
> before:
> !image-2019-02-28-17-00-21-251.png!
> after:
> !image-2019-02-28-17-03-45-779.png!
> use beeline to connect success:
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Attachment: 2019-02-28_170942.png

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2019-02-28_170844.png, 2019-02-28_170904.png, 
> 2019-02-28_170942.png
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we use beeline to connect.
> before:
> !image-2019-02-28-17-00-21-251.png!
> after:
> !image-2019-02-28-17-03-45-779.png!
> use beeline to connect success:
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27010) find out the actual port number when hive.server2.thrift.port=0

2019-02-28 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-27010:

Summary: find out the actual port number when hive.server2.thrift.port=0  
(was: display the actual port number when hive.server2.thrift.port=0)

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: SPARK-27010
> URL: https://issues.apache.org/jira/browse/SPARK-27010
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: zuotingbing
>Priority: Minor
>
> Currently, if we set *hive.server2.thrift.port=0*, it hard to find out the 
> actual port number which one we use beeline to connect.
> before:
> !image-2019-02-28-17-00-21-251.png!
> after:
> !image-2019-02-28-17-03-45-779.png!
> use beeline to connect success:
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers with freeCores>=CPUS_PER_TASK at first for better performance

2018-11-09 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Priority: Trivial  (was: Major)

> we should filter the workOffers with freeCores>=CPUS_PER_TASK at first for 
> better performance
> -
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Trivial
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers with freeCores>=CPUS_PER_TASK for better 
> performance.
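
A rough Scala sketch of the idea (the WorkerOffer case class and CPUS_PER_TASK value below are simplified stand-ins for illustration, not the actual TaskSchedulerImpl code): dropping offers that cannot hold even one task keeps the per-task scheduling loop from visiting fully busy executors.
{code:java}
// Hypothetical, simplified types for illustration only.
case class WorkerOffer(executorId: String, host: String, freeCores: Int)

object FilterOffersExample {
  // Stand-in for spark.task.cpus, assumed to be 1 here.
  val CPUS_PER_TASK = 1

  // Keep only offers that can run at least one task.
  def usableOffers(offers: Seq[WorkerOffer]): Seq[WorkerOffer] =
    offers.filter(_.freeCores >= CPUS_PER_TASK)

  def main(args: Array[String]): Unit = {
    val offers = Seq(
      WorkerOffer("exec-1", "host-a", 0), // fully busy, filtered out
      WorkerOffer("exec-2", "host-b", 4)) // still has free cores
    println(usableOffers(offers)) // List(WorkerOffer(exec-2,host-b,4))
  }
}
{code}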



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers with freeCores>=CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Summary: we should filter the workOffers with freeCores>=CPUS_PER_TASK at 
first for better performance  (was: we should filter the workOffers of which 
freeCores>=CPUS_PER_TASK at first for better performance)

> we should filter the workOffers with freeCores>=CPUS_PER_TASK at first for 
> better performance
> -
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers with freeCores>=CPUS_PER_TASK for better 
> performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>=CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Description: We should filter the workOffers with freeCores>=CPUS_PER_TASK 
for better performance.  (was: We should filter the workOffers of which 
freeCores>=CPUS_PER_TASK for better performance.)

> we should filter the workOffers of which freeCores>=CPUS_PER_TASK at first 
> for better performance
> -
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers with freeCores>=CPUS_PER_TASK for better 
> performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>=CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Summary: we should filter the workOffers of which freeCores>=CPUS_PER_TASK 
at first for better performance  (was: we should filter the workOffers of which 
freeCores>CPUS_PER_TASK at first for better performance)

> we should filter the workOffers of which freeCores>=CPUS_PER_TASK at first 
> for better performance
> -
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores>=CPUS_PER_TASK for better 
> performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Description: We should filter the workOffers of which 
freeCores>=CPUS_PER_TASK for better performance.  (was: We should filter the 
workOffers of which freeCores=0 for better performance.)

> we should filter the workOffers of which freeCores>CPUS_PER_TASK at first for 
> better performance
> 
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores>=CPUS_PER_TASK for better 
> performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>CPUS_PER_TASK at first for better performance

2018-10-29 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Summary: we should filter the workOffers of which freeCores>CPUS_PER_TASK 
at first for better performance  (was: we should filter the workOffers of which 
freeCores>0 for better performance)

> we should filter the workOffers of which freeCores>CPUS_PER_TASK at first for 
> better performance
> 
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores=0 for better performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>0 for better performance

2018-10-26 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Priority: Major  (was: Minor)

> we should filter the workOffers of which freeCores>0 for better performance
> ---
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores=0 for better performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>0 for better performance

2018-10-26 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Description: We should filter the workOffers of which freeCores=0 for 
better performance.  (was: We should filter the workOffers of which freeCores=0 
when make fake resource offers on all executors.)

> we should filter the workOffers of which freeCores>0 for better performance
> ---
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores=0 for better performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>0 for better performance

2018-10-26 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Summary: we should filter the workOffers of which freeCores>0 for better 
performance  (was: we should filter the workOffers of which freeCores>0 when 
make fake resource offers on all executors)

> we should filter the workOffers of which freeCores>0 for better performance
> ---
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores=0 when make fake resource 
> offers on all executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>0 when make fake resource offers on all executors

2018-10-26 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Component/s: (was: Spark Core)
 Scheduler

> we should filter the workOffers of which freeCores>0 when make fake resource 
> offers on all executors
> 
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores=0 when make fake resource 
> offers on all executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>0 when make fake resource offers on all executors

2018-10-26 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Attachment: 2018-10-26_162822.png

> we should filter the workOffers of which freeCores>0 when make fake resource 
> offers on all executors
> 
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Minor
> Attachments: 2018-10-26_162822.png
>
>
> We should filter the workOffers of which freeCores=0 when make fake resource 
> offers on all executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>0 when make fake resource offers on all executors

2018-10-26 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25852:

Summary: we should filter the workOffers of which freeCores>0 when make 
fake resource offers on all executors  (was: we should filter the workOffers of 
which freeCores=0 when make fake resource offers on all executors)

> we should filter the workOffers of which freeCores>0 when make fake resource 
> offers on all executors
> 
>
> Key: SPARK-25852
> URL: https://issues.apache.org/jira/browse/SPARK-25852
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.2
>Reporter: zuotingbing
>Priority: Minor
>
> We should filter the workOffers of which freeCores=0 when make fake resource 
> offers on all executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-25852) we should filter the workOffers of which freeCores=0 when make fake resource offers on all executors

2018-10-26 Thread zuotingbing (JIRA)
zuotingbing created SPARK-25852:
---

 Summary: we should filter the workOffers of which freeCores=0 when 
make fake resource offers on all executors
 Key: SPARK-25852
 URL: https://issues.apache.org/jira/browse/SPARK-25852
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.3.2
Reporter: zuotingbing


We should filter out the workOffers whose freeCores=0 when making fake resource 
offers on all executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-25451) Stages page doesn't show the right number of the total tasks

2018-09-18 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618837#comment-16618837
 ] 

zuotingbing edited comment on SPARK-25451 at 9/18/18 10:11 AM:
---

yes, thanks [~yumwang]


was (Author: zuo.tingbing9):
yes, thanks

> Stages page doesn't show the right number of the total tasks
> 
>
> Key: SPARK-25451
> URL: https://issues.apache.org/jira/browse/SPARK-25451
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: mshot.png
>
>
>  
> See the attached pic.
>   !mshot.png!
> The executor 1 has 7 tasks, but in the Stages Page the total tasks of 
> executor is 6.
>  
> to reproduce this simply start a shell:
> {code:java}
> $SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
> --total-executor-cores 2 --master spark://localhost.localdomain:7077{code}
> Run job as fellows:
> {code:java}
> sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
> executor")}.collect() {code}
>  
> Go to the stages page and you will see the Total Tasks  is not right in
> {code:java}
> Aggregated Metrics by Executor{code}
> table. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25451) Stages page doesn't show the right number of the total tasks

2018-09-18 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25451:

Target Version/s:   (was: 2.3.1)

> Stages page doesn't show the right number of the total tasks
> 
>
> Key: SPARK-25451
> URL: https://issues.apache.org/jira/browse/SPARK-25451
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: mshot.png
>
>
>  
> See the attached pic.
>   !mshot.png!
> The executor 1 has 7 tasks, but in the Stages Page the total tasks of 
> executor is 6.
>  
> to reproduce this simply start a shell:
> {code:java}
> $SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
> --total-executor-cores 2 --master spark://localhost.localdomain:7077{code}
> Run job as fellows:
> {code:java}
> sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
> executor")}.collect() {code}
>  
> Go to the stages page and you will see the Total Tasks  is not right in
> {code:java}
> Aggregated Metrics by Executor{code}
> table. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25451) Stages page doesn't show the right number of the total tasks

2018-09-18 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618837#comment-16618837
 ] 

zuotingbing commented on SPARK-25451:
-

yes, thanks

> Stages page doesn't show the right number of the total tasks
> 
>
> Key: SPARK-25451
> URL: https://issues.apache.org/jira/browse/SPARK-25451
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: mshot.png
>
>
>  
> See the attached pic.
>   !mshot.png!
> The executor 1 has 7 tasks, but in the Stages Page the total tasks of 
> executor is 6.
>  
> to reproduce this simply start a shell:
> {code:java}
> $SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
> --total-executor-cores 2 --master spark://localhost.localdomain:7077{code}
> Run job as fellows:
> {code:java}
> sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
> executor")}.collect() {code}
>  
> Go to the stages page and you will see the Total Tasks  is not right in
> {code:java}
> Aggregated Metrics by Executor{code}
> table. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25451) Stages page doesn't show the right number of the total tasks

2018-09-18 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25451:

Description: 
 

See the attached pic.

  !mshot.png!

The executor 1 has 7 tasks, but in the Stages Page the total tasks of executor 
is 6.

 

to reproduce this simply start a shell:
{code:java}
$SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
--total-executor-cores 2 --master spark://localhost.localdomain:7077{code}
Run job as fellows:
{code:java}
sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
executor")}.collect() {code}
 

Go to the stages page and you will see the Total Tasks  is not right in
{code:java}
Aggregated Metrics by Executor{code}
table. 

 

  was:
 

See the attached pic.

 

!image-2018-09-18-16-35-09-548.png!

The executor 1 has 7 tasks, but in the Stages Page the total tasks of executor 
is 6.

 

to reproduce this simply start a shell:

$SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
--total-executor-cores 2 --master spark://localhost.localdomain:7077

Run job as fellows:

 

 
{code:java}
sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
executor")}.collect() {code}
 

Go to the stages page and you will see the Total Tasks  is not right in
{code:java}
Aggregated Metrics by Executor{code}
table. 

 


> Stages page doesn't show the right number of the total tasks
> 
>
> Key: SPARK-25451
> URL: https://issues.apache.org/jira/browse/SPARK-25451
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: mshot.png
>
>
>  
> See the attached pic.
>   !mshot.png!
> The executor 1 has 7 tasks, but in the Stages Page the total tasks of 
> executor is 6.
>  
> to reproduce this simply start a shell:
> {code:java}
> $SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
> --total-executor-cores 2 --master spark://localhost.localdomain:7077{code}
> Run job as fellows:
> {code:java}
> sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
> executor")}.collect() {code}
>  
> Go to the stages page and you will see the Total Tasks  is not right in
> {code:java}
> Aggregated Metrics by Executor{code}
> table. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25451) Stages page doesn't show the right number of the total tasks

2018-09-18 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25451:

Description: 
 

See the attached pic.

 

!image-2018-09-18-16-35-09-548.png!

The executor 1 has 7 tasks, but in the Stages Page the total tasks of executor 
is 6.

 

to reproduce this simply start a shell:

$SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
--total-executor-cores 2 --master spark://localhost.localdomain:7077

Run job as fellows:

 

 
{code:java}
sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
executor")}.collect() {code}
 

Go to the stages page and you will see the Total Tasks  is not right in
{code:java}
Aggregated Metrics by Executor{code}
table. 

 

  was:
 

See the attached pic.

!image-2018-09-18-16-35-09-548.png!

The executor 1 has 7 tasks, but in the Stages Page the total tasks of executor 
is 6.

 

to reproduce this simply start a shell:

$SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
--total-executor-cores 2 --master spark://localhost.localdomain:7077

Run job as fellows:

 

 
{code:java}
sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
executor")}.collect() {code}
 

Go to the stages page and you will see the Total Tasks  is not right in
{code:java}
Aggregated Metrics by Executor{code}
table. 

 


> Stages page doesn't show the right number of the total tasks
> 
>
> Key: SPARK-25451
> URL: https://issues.apache.org/jira/browse/SPARK-25451
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: mshot.png
>
>
>  
> See the attached pic.
>  
> !image-2018-09-18-16-35-09-548.png!
> The executor 1 has 7 tasks, but in the Stages Page the total tasks of 
> executor is 6.
>  
> to reproduce this simply start a shell:
> $SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
> --total-executor-cores 2 --master spark://localhost.localdomain:7077
> Run job as fellows:
>  
>  
> {code:java}
> sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
> executor")}.collect() {code}
>  
> Go to the stages page and you will see the Total Tasks  is not right in
> {code:java}
> Aggregated Metrics by Executor{code}
> table. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-25451) Stages page doesn't show the right number of the total tasks

2018-09-18 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-25451:

Attachment: mshot.png

> Stages page doesn't show the right number of the total tasks
> 
>
> Key: SPARK-25451
> URL: https://issues.apache.org/jira/browse/SPARK-25451
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: mshot.png
>
>
>  
> See the attached pic.
> !image-2018-09-18-16-35-09-548.png!
> The executor 1 has 7 tasks, but in the Stages Page the total tasks of 
> executor is 6.
>  
> to reproduce this simply start a shell:
> $SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
> --total-executor-cores 2 --master spark://localhost.localdomain:7077
> Run job as fellows:
>  
>  
> {code:java}
> sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
> executor")}.collect() {code}
>  
> Go to the stages page and you will see the Total Tasks  is not right in
> {code:java}
> Aggregated Metrics by Executor{code}
> table. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-25451) Stages page doesn't show the right number of the total tasks

2018-09-18 Thread zuotingbing (JIRA)
zuotingbing created SPARK-25451:
---

 Summary: Stages page doesn't show the right number of the total 
tasks
 Key: SPARK-25451
 URL: https://issues.apache.org/jira/browse/SPARK-25451
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 2.3.1
Reporter: zuotingbing


 

See the attached pic.

!image-2018-09-18-16-35-09-548.png!

Executor 1 has 7 tasks, but on the Stages page the total tasks for the executor 
is 6.

 

To reproduce this, simply start a shell:

$SPARK_HOME/bin/spark-shell --executor-cores 1 --executor-memory 1g 
--total-executor-cores 2 --master spark://localhost.localdomain:7077

Run a job as follows:

 

 
{code:java}
sc.parallelize(1 to 1, 3).map{ x => throw new RuntimeException("Bad 
executor")}.collect() {code}
 

Go to the Stages page and you will see the Total Tasks value is not right in the
{code:java}
Aggregated Metrics by Executor{code}
table. 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24829) In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or spark-sql

2018-07-17 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-24829:

Summary: In Spark Thrift Server, CAST AS FLOAT inconsistent with 
spark-shell or spark-sql   (was: CAST AS FLOAT inconsistent with Hive)

> In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or 
> spark-sql 
> -
>
> Key: SPARK-24829
> URL: https://issues.apache.org/jira/browse/SPARK-24829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-07-18_110944.png, 2018-07-18_11.png
>
>
> SELECT CAST('4.56' AS FLOAT)
> the result is 4.55942779541 , it should be 4.56
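
The reported value is consistent with the 32-bit float being widened to a 64-bit double somewhere on the result path; the digits above look like that widened value with a run of 9s lost along the way. A small Scala sketch of the widening effect (an assumption about the cause, not the Thrift server's actual code):
{code:java}
object FloatCastExample {
  def main(args: Array[String]): Unit = {
    // A 32-bit float cannot represent 4.56 exactly. Printing the float rounds
    // nicely, but widening it to a double exposes the stored approximation.
    val f: Float = "4.56".toFloat
    println(f)          // 4.56
    println(f.toDouble) // 4.559999942779541
  }
}
{code}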



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24829) In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or spark-sql

2018-07-17 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-24829:

Attachment: (was: CAST-FLOAT.png)

> In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or 
> spark-sql 
> -
>
> Key: SPARK-24829
> URL: https://issues.apache.org/jira/browse/SPARK-24829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-07-18_110944.png, 2018-07-18_11.png
>
>
> SELECT CAST('4.56' AS FLOAT)
> the result is 4.55942779541 , it should be 4.56



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24829) In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or spark-sql

2018-07-17 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-24829:

Attachment: 2018-07-18_11.png

> In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or 
> spark-sql 
> -
>
> Key: SPARK-24829
> URL: https://issues.apache.org/jira/browse/SPARK-24829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-07-18_110944.png, 2018-07-18_11.png
>
>
> SELECT CAST('4.56' AS FLOAT)
> the result is 4.55942779541 , it should be 4.56



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24829) In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or spark-sql

2018-07-17 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-24829:

Attachment: 2018-07-18_110944.png

> In Spark Thrift Server, CAST AS FLOAT inconsistent with spark-shell or 
> spark-sql 
> -
>
> Key: SPARK-24829
> URL: https://issues.apache.org/jira/browse/SPARK-24829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-07-18_110944.png, 2018-07-18_11.png
>
>
> SELECT CAST('4.56' AS FLOAT)
> the result is 4.55942779541 , it should be 4.56



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24829) CAST AS FLOAT inconsistent with Hive

2018-07-17 Thread zuotingbing (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-24829:

Attachment: CAST-FLOAT.png

> CAST AS FLOAT inconsistent with Hive
> 
>
> Key: SPARK-24829
> URL: https://issues.apache.org/jira/browse/SPARK-24829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.1
>Reporter: zuotingbing
>Priority: Major
> Attachments: CAST-FLOAT.png
>
>
> SELECT CAST('4.56' AS FLOAT)
> the result is 4.55942779541 , it should be 4.56



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-24829) CAST AS FLOAT inconsistent with Hive

2018-07-17 Thread zuotingbing (JIRA)
zuotingbing created SPARK-24829:
---

 Summary: CAST AS FLOAT inconsistent with Hive
 Key: SPARK-24829
 URL: https://issues.apache.org/jira/browse/SPARK-24829
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.1
Reporter: zuotingbing


SELECT CAST('4.56' AS FLOAT)

the result is 4.55942779541, but it should be 4.56



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19250) In security cluster, spark beeline connect to hive metastore failed

2018-05-29 Thread zuotingbing (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-19250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493254#comment-16493254
 ] 

zuotingbing commented on SPARK-19250:
-

Same problem in Spark 2.2.1. But we add kinit for Kerberos before starting the 
Thrift server; beeline works well in Spark 2.0.2.
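
As a rough illustration of that workaround, a hypothetical Scala sketch that obtains the Kerberos credentials inside the process through Hadoop's UserGroupInformation instead of relying on an external kinit; the principal and keytab path are placeholders, not values from this issue.
{code:java}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

object KerberosLoginExample {
  def main(args: Array[String]): Unit = {
    // Log in from a keytab so the process holds a valid TGT before it opens
    // the metastore connection. Principal and keytab path are placeholders.
    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)
    UserGroupInformation.loginUserFromKeytab(
      "spark/example-host@EXAMPLE.COM",     // placeholder principal
      "/etc/security/keytabs/spark.keytab") // placeholder keytab path
    println(s"Logged in as ${UserGroupInformation.getCurrentUser}")
  }
}
{code}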

> In security cluster, spark beeline connect to hive metastore failed
> ---
>
> Key: SPARK-19250
> URL: https://issues.apache.org/jira/browse/SPARK-19250
> Project: Spark
>  Issue Type: Bug
>Reporter: meiyoula
>Priority: Major
>  Labels: security-issue
>
> 1. starting thriftserver in security mode, set hive.metastore.uris to hive 
> metastore uri, also hive is in security mode.
> 2. when use beeline to create table, it can't connect to hive metastore 
> successfully, occurs "Failed to find any Kerberos tgt".
> {quote}
> 2017-01-17 16:25:53,618 | ERROR | [pool-25-thread-1] | SASL negotiation 
> failure | 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:315)
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:513)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:249)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1533)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:86)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3119)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3138)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:791)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:755)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:1461)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:1014)
> at 
> org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:177)
> at org.apache.hadoop.hive.ql.metadata.Table.(Table.java:119)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$toHiveTable(HiveClientImpl.scala:803)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:430)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:430)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:430)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:284)
> at 
> org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:231)
> 
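
For context, the stack trace above fails inside TUGIAssumingTransport because no Kerberos ticket is available at the moment the metastore client is built. A minimal sketch of the usual client-side precondition, logging in from a keytab before constructing HiveMetaStoreClient; the principal, keytab path, metastore URI, and principal names below are illustrative placeholders (none of them come from this report), and this is not the fix that was applied in Spark:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.security.UserGroupInformation;

public class SecureMetastoreProbe {
  public static void main(String[] args) throws Exception {
    // Tell the Hadoop security layer we are in Kerberos mode and obtain a TGT
    // up front, so the later SASL/GSSAPI handshake has credentials to use.
    Configuration hadoopConf = new Configuration();
    hadoopConf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(hadoopConf);
    UserGroupInformation.loginUserFromKeytab(
        "spark/host.example.com@EXAMPLE.COM",      // illustrative principal
        "/etc/security/keytabs/spark.keytab");     // illustrative keytab path

    // Point at the secured metastore; all values here are placeholders.
    HiveConf hiveConf = new HiveConf();
    hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://metastore.example.com:9083");
    hiveConf.setBoolVar(HiveConf.ConfVars.METASTORE_USE_THRIFT_SASL, true);
    hiveConf.setVar(HiveConf.ConfVars.METASTORE_KERBEROS_PRINCIPAL, "hive/_HOST@EXAMPLE.COM");

    // If the login above did not take effect, this constructor fails with the
    // same "Failed to find any Kerberos tgt" error shown in the stack trace.
    HiveMetaStoreClient client = new HiveMetaStoreClient(hiveConf);
    System.out.println(client.getAllDatabases());
    client.close();
  }
}
{code}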

[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 8:02 AM:
--

cc [~vanzin]  [~rxin] [~yhuai]


was (Author: zuo.tingbing9):
cc [~vanzin]  [~rxin] Xiao Li

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}
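
For context on the log above: Curator reports SUSPENDED when the connection to one ZooKeeper server drops and LOST only when the session actually expires, and the master gives up leadership on the first of those. A minimal Curator sketch of that distinction, with an illustrative quorum string and plain println handlers standing in for real leadership handling; this is not Spark's ZooKeeperLeaderElectionAgent code:

{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LeadershipStateSketch {
  public static void main(String[] args) {
    // Illustrative quorum string; with three servers, bouncing one normally
    // only produces SUSPENDED followed by RECONNECTED, not LOST.
    CuratorFramework client = CuratorFrameworkFactory.newClient(
        "zk1:2181,zk2:2181,zk3:2181",
        new ExponentialBackoffRetry(1000, 3));

    client.getConnectionStateListenable().addListener(new ConnectionStateListener() {
      @Override
      public void stateChanged(CuratorFramework c, ConnectionState newState) {
        switch (newState) {
          case SUSPENDED:
            // Connection to the current server dropped; Curator retries the
            // other quorum members, so leadership need not be surrendered yet.
            System.out.println("suspended - waiting for reconnect");
            break;
          case RECONNECTED:
            System.out.println("reconnected - re-check leadership state");
            break;
          case LOST:
            // The session expired: leadership really is gone, step down here.
            System.out.println("lost - revoke leadership");
            break;
          default:
            break;
        }
      }
    });

    client.start();
  }
}
{code}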



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 8:01 AM:
--

cc [~vanzin]  [~rxin] Xiao Li


was (Author: zuo.tingbing9):
cc [~vanzin]  [~rxin] [gatorsmile|https://github.com/gatorsmile]

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:53 AM:
--

cc [~vanzin]  [~rxin] [gatorsmile|https://github.com/gatorsmile]


was (Author: zuo.tingbing9):
cc [~vanzin]  [gatorsmile|https://github.com/gatorsmile]

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:51 AM:
--

cc [~vanzin]  [gatorsmile|https://github.com/gatorsmile]


was (Author: zuo.tingbing9):
cc [~vanzin]  

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:49 AM:
--

cc [~vanzin]  gatorsmile


was (Author: zuo.tingbing9):
cc [~vanzin] 

cc +gatorsmile+

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:49 AM:
--

cc [~vanzin]  


was (Author: zuo.tingbing9):
cc [~vanzin]  gatorsmile

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:48 AM:
--

cc [~vanzin] 

cc @gatorsmile


was (Author: zuo.tingbing9):
cc [~vanzin] gatorsmile

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:48 AM:
--

cc [~vanzin] 

cc +gatorsmile+


was (Author: zuo.tingbing9):
cc [~vanzin] 

cc @gatorsmile

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:47 AM:
--

cc [~vanzin] gatorsmile


was (Author: zuo.tingbing9):
cc [~vanzin] 

cc [gatorsmile|https://github.com/gatorsmile]

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:45 AM:
--

cc [~vanzin] 

cc [gatorsmile|https://github.com/gatorsmile]


was (Author: zuo.tingbing9):
cc [~vanzin]  [gatorsmile|https://github.com/gatorsmile]

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:44 AM:
--

cc [~vanzin]  [gatorsmile|https://github.com/gatorsmile]


was (Author: zuo.tingbing9):
cc [~vanzin]  *[gatorsmile|https://github.com/gatorsmile]*

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:43 AM:
--

cc [~vanzin]  *[gatorsmile|https://github.com/gatorsmile]*


was (Author: zuo.tingbing9):
cc [~vanzin]  @*[gatorsmile|https://github.com/gatorsmile]*

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing edited comment on SPARK-15544 at 4/17/18 7:43 AM:
--

cc [~vanzin]  @*[gatorsmile|https://github.com/gatorsmile]*


was (Author: zuo.tingbing9):
cc [~vanzin]

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440535#comment-16440535
 ] 

zuotingbing commented on SPARK-15544:
-

cc [~vanzin]

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15544) Bouncing Zookeeper node causes Active spark master to exit

2018-04-17 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440511#comment-16440511
 ] 

zuotingbing commented on SPARK-15544:
-

The same issue still occurs in spark 2.3.0.  

see [SPARK-23530|https://issues.apache.org/jira/browse/SPARK-23530]

> Bouncing Zookeeper node causes Active spark master to exit
> --
>
> Key: SPARK-15544
> URL: https://issues.apache.org/jira/browse/SPARK-15544
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1
> Environment: Ubuntu 14.04.  Zookeeper 3.4.6 with 3-node quorum
>Reporter: Steven Lowenthal
>Priority: Major
>
> Shutting Down a single zookeeper node caused spark master to exit.  The 
> master should have connected to a second zookeeper node. 
> {code:title=log output}
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/1 on worker worker-20160524013212-10.16.28.76-59138
> 16/05/25 18:21:28 INFO master.Master: Launching executor 
> app-20160525182128-0006/2 on worker worker-20160524013204-10.16.21.217-47129
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x154dfc0426b0054, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO zookeeper.ClientCnxn: Unable to read additional data 
> from server sessionid 0x254c701f28d0053, likely server has closed socket, 
> closing socket connection and attempting reconnect
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO state.ConnectionStateManager: State change: SUSPENDED
> 16/05/26 00:16:01 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 16/05/26 00:16:01 ERROR master.Master: Leadership has been revoked -- master 
> shutting down. }}
> {code}
> spark-env.sh: 
> {code:title=spark-env.sh}
> export SPARK_LOCAL_DIRS=/ephemeral/spark/local
> export SPARK_WORKER_DIR=/ephemeral/spark/work
> export SPARK_LOG_DIR=/var/log/spark
> export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.6.3/etc/hadoop
> export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER 
> -Dspark.deploy.zookeeper.url=gn5456-zookeeper-01:2181,gn5456-zookeeper-02:2181,gn5456-zookeeper-03:2181"
> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23745) Remove the directories of the “hive.downloaded.resources.dir” when HiveThriftServer2 stopped

2018-03-20 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23745:

Description: 
!2018-03-20_164832.png!  

when start the HiveThriftServer2, we create some directories for 
hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
remove these directories. The directories could accumulate a lot.

  was:
!2018-03-20_164832.png!  

when start the HiveThriftServer2, we create some  directories for 
hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
remove these directories.The directories could accumulate a lot.


> Remove the directories of the “hive.downloaded.resources.dir” when 
> HiveThriftServer2 stopped
> 
>
> Key: SPARK-23745
> URL: https://issues.apache.org/jira/browse/SPARK-23745
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
> Environment: linux
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-20_164832.png
>
>
> !2018-03-20_164832.png!  
> when start the HiveThriftServer2, we create some directories for 
> hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
> remove these directories. The directories could accumulate a lot.
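
A minimal sketch of the cleanup this issue asks for, removing the per-server resources directory when the process shuts down; the directory path is an illustrative placeholder for whatever hive.downloaded.resources.dir resolves to, and the shutdown hook here is not the actual change made in Spark:

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ResourceDirCleanup {

  // Recursively delete a directory such as the one created for
  // hive.downloaded.resources.dir, children first, then the directory itself.
  static void deleteRecursively(File dir) throws IOException {
    if (!dir.exists()) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir.toPath())) {
      paths.sorted(Comparator.reverseOrder())
           .map(Path::toFile)
           .forEach(File::delete);
    }
  }

  public static void main(String[] args) {
    // Illustrative path; the real value is whatever
    // hive.downloaded.resources.dir was resolved to for this server instance.
    File resourcesDir = new File("/tmp/hive-resources/server-1234");
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      try {
        deleteRecursively(resourcesDir);
      } catch (IOException e) {
        e.printStackTrace();
      }
    }));
  }
}
{code}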



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23745) Remove the directories of the “hive.downloaded.resources.dir” when HiveThriftServer2 stopped

2018-03-20 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23745:

Description: 
!2018-03-20_164832.png!  

when start the HiveThriftServer2, we create some  directories for 
hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
remove these directories.The directories could accumulate a lot.

  was:
 

when start the HiveThriftServer2, we create some  directories for 
hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
remove these directories.The directories could accumulate a lot.


> Remove the directories of the “hive.downloaded.resources.dir” when 
> HiveThriftServer2 stopped
> 
>
> Key: SPARK-23745
> URL: https://issues.apache.org/jira/browse/SPARK-23745
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
> Environment: linux
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-20_164832.png
>
>
> !2018-03-20_164832.png!  
> when start the HiveThriftServer2, we create some  directories for 
> hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
> remove these directories.The directories could accumulate a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23745) Remove the directories of the “hive.downloaded.resources.dir” when HiveThriftServer2 stopped

2018-03-20 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23745:

Description: 
 

when start the HiveThriftServer2, we create some  directories for 
hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
remove these directories.The directories could accumulate a lot.

  was:
!image-2018-03-20-16-49-00-175.png!

when start the HiveThriftServer2, we create some  directories for 
hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
remove these directories.The directories could accumulate a lot.


> Remove the directories of the “hive.downloaded.resources.dir” when 
> HiveThriftServer2 stopped
> 
>
> Key: SPARK-23745
> URL: https://issues.apache.org/jira/browse/SPARK-23745
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
> Environment: linux
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-20_164832.png
>
>
>  
> when start the HiveThriftServer2, we create some  directories for 
> hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
> remove these directories.The directories could accumulate a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23745) Remove the directories of the “hive.downloaded.resources.dir” when HiveThriftServer2 stopped

2018-03-20 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23745:

Attachment: 2018-03-20_164832.png

> Remove the directories of the “hive.downloaded.resources.dir” when 
> HiveThriftServer2 stopped
> 
>
> Key: SPARK-23745
> URL: https://issues.apache.org/jira/browse/SPARK-23745
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
> Environment: linux
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-20_164832.png
>
>
> !image-2018-03-20-16-49-00-175.png!
> when start the HiveThriftServer2, we create some  directories for 
> hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
> remove these directories.The directories could accumulate a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-23745) Remove the directories of the “hive.downloaded.resources.dir” when HiveThriftServer2 stopped

2018-03-20 Thread zuotingbing (JIRA)
zuotingbing created SPARK-23745:
---

 Summary: Remove the directories of the 
“hive.downloaded.resources.dir” when HiveThriftServer2 stopped
 Key: SPARK-23745
 URL: https://issues.apache.org/jira/browse/SPARK-23745
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.0
 Environment: linux
Reporter: zuotingbing


!image-2018-03-20-16-49-00-175.png!

when start the HiveThriftServer2, we create some  directories for 
hive.downloaded.resources.dir, but when stop the HiveThriftServer2 we do not 
remove these directories.The directories could accumulate a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-06 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Description: 
  !2018-03-07_121010.png!

 

when the hive session closed, we should also cleanup the .pipeout file.

 

  was:
  !2018-03-07_121010.png!

when the hive session closed, we should also cleanup the .pipeout file.

 

 


> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-07_121010.png
>
>
>   !2018-03-07_121010.png!
>  
> when the hive session closed, we should also cleanup the .pipeout file.
>  
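
A minimal sketch of what cleaning up the .pipeout files on session close could look like, assuming the scratch directory and the session identifier are known; both values below are illustrative placeholders, and this is not the patch that was merged:

{code:java}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PipeoutCleanup {

  // Delete the <sessionId>*.pipeout files left behind in the scratch directory.
  static void cleanupPipeoutFiles(Path scratchDir, String sessionId) throws IOException {
    try (DirectoryStream<Path> files =
             Files.newDirectoryStream(scratchDir, sessionId + "*.pipeout")) {
      for (Path file : files) {
        Files.deleteIfExists(file);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Both arguments are illustrative: the scratch directory comes from
    // hive.exec.local.scratchdir and the session id from the session being closed.
    cleanupPipeoutFiles(Paths.get("/tmp/hive"), "8b2d4f3a-1111-2222-3333-444455556666");
  }
}
{code}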



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-06 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Description: 
  !2018-03-07_121010.png!

when the hive session closed, we should also cleanup the .pipeout file.

 

 

  was:
 

 

when the hive session closed, we should also cleanup the .pipeout file.

 

 


> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-07_121010.png
>
>
>   !2018-03-07_121010.png!
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-06 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Attachment: 2018-03-07_121010.png

> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-07_121010.png
>
>
>  
>  
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-06 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Attachment: (was: 2018-03-07_121010.png)

> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-07_121010.png
>
>
>  
>  
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-06 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Description: 
 

 

when the hive session closed, we should also cleanup the .pipeout file.

 

 

  was:
!2018-03-01_202415.png!

 

when the hive session closed, we should also cleanup the .pipeout file.

 

 


> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-07_121010.png
>
>
>  
>  
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-06 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Attachment: 2018-03-07_121010.png

> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-07_121010.png
>
>
> !2018-03-01_202415.png!
>  
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-06 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Attachment: (was: 2018-03-01_202415.png)

> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-07_121010.png
>
>
> !2018-03-01_202415.png!
>  
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-01 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Description: 
!2018-03-01_202415.png!

 

when the hive session closed, we should also cleanup the .pipeout file.

 

 

  was:
when the hive session closed, we should also cleanup the .pipeout file.

 

 


> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-01_202415.png
>
>
> !2018-03-01_202415.png!
>  
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-01 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-23547:

Attachment: 2018-03-01_202415.png

> Cleanup the .pipeout file when the Hive Session closed
> --
>
> Key: SPARK-23547
> URL: https://issues.apache.org/jira/browse/SPARK-23547
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: zuotingbing
>Priority: Major
> Attachments: 2018-03-01_202415.png
>
>
> when the hive session closed, we should also cleanup the .pipeout file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-23547) Cleanup the .pipeout file when the Hive Session closed

2018-03-01 Thread zuotingbing (JIRA)
zuotingbing created SPARK-23547:
---

 Summary: Cleanup the .pipeout file when the Hive Session closed
 Key: SPARK-23547
 URL: https://issues.apache.org/jira/browse/SPARK-23547
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.0
Reporter: zuotingbing


When the Hive session is closed, we should also clean up the .pipeout file.
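For illustration, a minimal sketch of the kind of cleanup this asks for, assuming the
.pipeout files live under hive.exec.local.scratchdir and carry the session identifier in
their names. The object and method names below are hypothetical, not the actual
Spark/Hive classes:

{code:java}
import java.io.File

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.conf.HiveConf.ConfVars

// Hypothetical helper: delete the session's leftover .pipeout files when the
// Hive session is closed, instead of leaving them behind until the JVM exits.
object PipeoutCleanup {
  def cleanupPipeoutFiles(hiveConf: HiveConf, sessionId: String): Unit = {
    val localScratchDir = new File(hiveConf.getVar(ConfVars.LOCALSCRATCHDIR))
    Option(localScratchDir.listFiles())
      .getOrElse(Array.empty[File])
      .filter(f => f.getName.startsWith(sessionId) && f.getName.endsWith(".pipeout"))
      .foreach { f =>
        if (!f.delete()) {
          // Deletion failure is non-fatal; just report it.
          System.err.println(s"Failed to delete pipeout file ${f.getAbsolutePath}")
        }
      }
  }
}
{code}

A call like this would typically sit in the session close path, i.e. wherever the thrift
server tears down the corresponding Hive session.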

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-25 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292314#comment-16292314
 ] 

zuotingbing edited comment on SPARK-22793 at 12/26/17 2:00 AM:
---

Yes, the master branch also has this problem.


was (Author: zuo.tingbing9):
Yes, the master branch also has this problem, but the difference between the 
master branch and 2.0 is large. Could someone help merge this to the master 
branch?

> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.2.1
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2.
> 2. Connect to the thriftserver through beeline.
> 3. Close the beeline connection.
> 4. Repeat steps 2 and 3 several times, which causes the memory leak.
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.
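To make the deleteOnExit point above concrete, here is a small sketch (class and method
names are illustrative only, not the actual Spark code) of the pattern that keeps the
cache bounded: delete the session's scratch directory eagerly when the session closes
and cancel the deferred deletion, so the path is not held in the FileSystem's
deleteOnExit set until the JVM terminates:

{code:java}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Illustrative helper: eagerly remove a session's scratch directory and drop
// its entry from the FileSystem deleteOnExit cache, so a long-running server
// does not accumulate one cached path per closed session.
object ScratchDirCleanup {
  def removeSessionScratchDir(conf: Configuration, scratchDir: Path): Unit = {
    val fs = scratchDir.getFileSystem(conf)
    if (fs.exists(scratchDir)) {
      fs.delete(scratchDir, true) // recursive delete
    }
    // Otherwise the path stays in the in-memory deleteOnExit set until JVM exit.
    fs.cancelDeleteOnExit(scratchDir)
  }
}
{code}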



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-20 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-22793:

Affects Version/s: 2.2.1

> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.2.1
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2.
> 2. Connect to the thriftserver through beeline.
> 3. Close the beeline connection.
> 4. Repeat steps 2 and 3 several times, which causes the memory leak.
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22837) Session timeout checker does not work in SessionManager

2017-12-19 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-22837:

Summary: Session timeout checker does not work in SessionManager  (was: 
Session timeout checker does not work in Hive Thrift Server)

> Session timeout checker does not work in SessionManager
> ---
>
> Key: SPARK-22837
> URL: https://issues.apache.org/jira/browse/SPARK-22837
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.2.1
>Reporter: zuotingbing
>
> Currently, 
> {code:java}
> SessionManager.init
> {code}
>  is never called, so the session timeout checker's configs 
> {code:java}
> HIVE_SERVER2_SESSION_CHECK_INTERVAL HIVE_SERVER2_IDLE_SESSION_TIMEOUT 
> HIVE_SERVER2_IDLE_SESSION_CHECK_OPERATION
> {code}
> cannot be loaded, which means the session timeout checker does not 
> work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22837) Session timeout checker does not work in Hive Thrift Server

2017-12-19 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-22837:

Description: 
Currently, 
{code:java}
SessionManager.init
{code}
 will not be called, the config 
{code:java}
HIVE_SERVER2_SESSION_CHECK_INTERVAL HIVE_SERVER2_IDLE_SESSION_TIMEOUT 
HIVE_SERVER2_IDLE_SESSION_CHECK_OPERATION
{code}
of session timeout checker can not be loaded, it cause the session timeout 
checker does not work.

  was:
Currently, 
{code:java}
SessionManager.int
{code}
 is never called, so the session timeout checker's configs 
{code:java}
HIVE_SERVER2_SESSION_CHECK_INTERVAL HIVE_SERVER2_IDLE_SESSION_TIMEOUT 
HIVE_SERVER2_IDLE_SESSION_CHECK_OPERATION
{code}
cannot be loaded, which means the session timeout checker does not 
work.


> Session timeout checker does not work in Hive Thrift Server
> ---
>
> Key: SPARK-22837
> URL: https://issues.apache.org/jira/browse/SPARK-22837
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.2.1
>Reporter: zuotingbing
>
> Currently, 
> {code:java}
> SessionManager.init
> {code}
>  is never called, so the session timeout checker's configs 
> {code:java}
> HIVE_SERVER2_SESSION_CHECK_INTERVAL HIVE_SERVER2_IDLE_SESSION_TIMEOUT 
> HIVE_SERVER2_IDLE_SESSION_CHECK_OPERATION
> {code}
> cannot be loaded, which means the session timeout checker does not 
> work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22837) Session timeout checker does not work in Hive Thrift Server

2017-12-19 Thread zuotingbing (JIRA)
zuotingbing created SPARK-22837:
---

 Summary: Session timeout checker does not work in Hive Thrift 
Server
 Key: SPARK-22837
 URL: https://issues.apache.org/jira/browse/SPARK-22837
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.1, 2.0.2
Reporter: zuotingbing


Currently, 
{code:java}
SessionManager.int
{code}
 is never called, so the session timeout checker's configs 
{code:java}
HIVE_SERVER2_SESSION_CHECK_INTERVAL HIVE_SERVER2_IDLE_SESSION_TIMEOUT 
HIVE_SERVER2_IDLE_SESSION_CHECK_OPERATION
{code}
cannot be loaded, which means the session timeout checker does not 
work.
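For context, a small sketch of the settings the timeout checker depends on (illustrative
code, not the actual fix): if SessionManager.init(hiveConf) never runs, these values are
never read and no background checker is started, so idle sessions are never cleaned up.

{code:java}
import java.util.concurrent.TimeUnit

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.conf.HiveConf.ConfVars

// Illustration only: the configuration the idle-session checker relies on.
// These are normally read when SessionManager.init(hiveConf) runs.
class SessionTimeoutSettings(hiveConf: HiveConf) {
  // How often the checker wakes up to look for idle sessions.
  val checkIntervalMs: Long =
    hiveConf.getTimeVar(ConfVars.HIVE_SERVER2_SESSION_CHECK_INTERVAL, TimeUnit.MILLISECONDS)
  // How long a session may stay idle before it is closed.
  val sessionTimeoutMs: Long =
    hiveConf.getTimeVar(ConfVars.HIVE_SERVER2_IDLE_SESSION_TIMEOUT, TimeUnit.MILLISECONDS)
  // Whether pending operations are taken into account by the idle check.
  val checkOperation: Boolean =
    hiveConf.getBoolVar(ConfVars.HIVE_SERVER2_IDLE_SESSION_CHECK_OPERATION)
}
{code}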



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-15 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292314#comment-16292314
 ] 

zuotingbing edited comment on SPARK-22793 at 12/15/17 10:48 AM:


Yes, the master branch also has this problem, but the difference between the 
master branch and 2.0 is large. Could someone help merge this to the master 
branch?


was (Author: zuo.tingbing9):
Yes, the master branch also has this problem, but the difference between the 
master branch and 2.0 is large. I am not sure this can be merged to the master branch.

> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2.
> 2. Connect to the thriftserver through beeline.
> 3. Close the beeline connection.
> 4. Repeat steps 2 and 3 several times, which causes the memory leak.
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-15 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292314#comment-16292314
 ] 

zuotingbing commented on SPARK-22793:
-

Yes, the master branch also has this problem, but the difference between the 
master branch and 2.0 is large. I am not sure this can be merged to the master branch.

> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2.
> 2. Connect to the thriftserver through beeline.
> 3. Close the beeline connection.
> 4. Repeat steps 2 and 3 several times, which causes the memory leak.
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-15 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292248#comment-16292248
 ] 

zuotingbing commented on SPARK-22793:
-

OK, I will try to check it on the master branch. Thanks.

> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2.
> 2. Connect to the thriftserver through beeline.
> 3. Close the beeline connection.
> 4. Repeat steps 2 and 3 several times, which causes the memory leak.
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-15 Thread zuotingbing (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292174#comment-16292174
 ] 

zuotingbing commented on SPARK-22793:
-


{code:java}
lazy val metadataHive: HiveClient = sharedState.metadataHive.newSession()
{code}

A HiveClient has already been created by 
{code:java}
sharedState.metadataHive
{code}
but another one is created again by
{code:java}
.newSession()
{code}
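
For readers not familiar with that code path, a simplified stand-in follows (these are
not the real Spark classes, just an illustration of the pattern described above):
forcing the shared lazy client into existence and then calling newSession() on it means
every new connection pays for an extra client/SessionState that nothing releases when
the connection closes.

{code:java}
// Simplified illustration of the pattern described in this comment; the real
// Spark classes are more involved.
trait Client {
  def newSession(): Client // each call builds a fresh client + SessionState
}

class Shared(buildClient: () => Client) {
  lazy val metadataHive: Client = buildClient() // first allocation, shared
}

class PerConnectionState(shared: Shared) {
  // Second allocation, once per beeline connection. If nothing closes it when
  // the connection ends, instances accumulate, matching the jmap observation.
  lazy val metadataHive: Client = shared.metadataHive.newSession()
}
{code}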



> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2.
> 2. Connect to the thriftserver through beeline.
> 3. Close the beeline connection.
> 4. Repeat steps 2 and 3 several times, which causes the memory leak.
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-14 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-22793:

Description: 
1. Start HiveThriftServer2.
2. Connect to the thriftserver through beeline.
3. Close the beeline connection.
4. Repeat steps 2 and 3 several times, which causes the memory leak.

We found that many directories are never removed under the paths
{code:java}
hive.exec.local.scratchdir
{code} and 
{code:java}
hive.exec.scratchdir
{code} . As we know, the scratchdir has been added to deleteOnExit when it is 
created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
terminates.

In addition, when we use 
{code:java}
jmap -histo:live [PID]
{code} to print the sizes of objects in the HiveThriftServer2 process, we can 
see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
"org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
we have closed all the beeline connections, which causes the memory leak.




  was:
1. Start HiveThriftServer2
2. Connect to the thriftserver through beeline
3. Close the beeline connection
4. Repeat steps 2 and 3 several times

We found that many directories are never removed under the paths
{code:java}
hive.exec.local.scratchdir
{code} and 
{code:java}
hive.exec.scratchdir
{code} . As we know, the scratchdir has been added to deleteOnExit when it is 
created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
terminates.

In addition, when we use 
{code:java}
jmap -histo:live [PID]
{code} to print the sizes of objects in the HiveThriftServer2 process, we can 
see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
"org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
we have closed all the beeline connections, which causes the memory leak.





> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2.
> 2. Connect to the thriftserver through beeline.
> 3. Close the beeline connection.
> 4. Repeat steps 2 and 3 several times, which causes the memory leak.
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22793) Memory leak in Spark Thrift Server

2017-12-14 Thread zuotingbing (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated SPARK-22793:

Description: 
1. Start HiveThriftServer2
2. Connect to the thriftserver through beeline
3. Close the beeline connection
4. Repeat steps 2 and 3 several times

We found that many directories are never removed under the paths
{code:java}
hive.exec.local.scratchdir
{code} and 
{code:java}
hive.exec.scratchdir
{code} . As we know, the scratchdir has been added to deleteOnExit when it is 
created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
terminates.

In addition, when we use 
{code:java}
jmap -histo:live [PID]
{code} to print the sizes of objects in the HiveThriftServer2 process, we can 
see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
"org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
we have closed all the beeline connections, which causes the memory leak.




  was:
1. Start HiveThriftServer2
2. Connect to the thriftserver through beeline
3. Close the beeline connection
4. Repeat steps 2 and 3 several times

We found that many directories are never removed under the paths
{code:java}
hive.exec.local.scratchdir
{code} and 
{code:java}
hive.exec.scratchdir
{code} . As we know, the scratchdir is added to deleteOnExit when it is 
created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
terminates.

In addition, when we use 
{code:java}
jmap -histo:live [PID]
{code} to print the sizes of objects in the HiveThriftServer2 process, we can 
see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
"org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
we have closed all the beeline connections, which causes the memory leak.





> Memory leak in Spark Thrift Server
> --
>
> Key: SPARK-22793
> URL: https://issues.apache.org/jira/browse/SPARK-22793
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2
>Reporter: zuotingbing
>Priority: Critical
>
> 1. Start HiveThriftServer2
> 2. Connect to the thriftserver through beeline
> 3. Close the beeline connection
> 4. Repeat steps 2 and 3 several times
> We found that many directories are never removed under the paths
> {code:java}
> hive.exec.local.scratchdir
> {code} and 
> {code:java}
> hive.exec.scratchdir
> {code} . As we know, the scratchdir has been added to deleteOnExit when it is 
> created, so the FileSystem deleteOnExit cache keeps growing until the JVM 
> terminates.
> In addition, when we use 
> {code:java}
> jmap -histo:live [PID]
> {code} to print the sizes of objects in the HiveThriftServer2 process, we can 
> see that instances of "org.apache.spark.sql.hive.client.HiveClientImpl" and 
> "org.apache.hadoop.hive.ql.session.SessionState" keep increasing even though 
> we have closed all the beeline connections, which causes the memory leak.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


