[jira] [Comment Edited] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Mate Szalay-Beko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071778#comment-17071778
 ] 

Mate Szalay-Beko edited comment on HDFS-15251 at 3/31/20, 1:24 PM:
---

[~jiangjianfei], [~weichiu]
I guess this logic is needed to trigger a new election that decides which 
name node should be active and which should be stand-by. I am not very familiar 
with the HDFS code, so I cannot review that part, but I can give you some 
background info about the CLOSED state.

It was introduced by ZOOKEEPER-2368. The ZooKeeper watcher gets notified when 
the connection is broken by the ZooKeeper server, and the connection state is 
DISCONNECTED in that case. The new behaviour in 3.5.5+ is that a watcher event 
is also triggered when the ZooKeeper client is the one closing the connection, 
in which case the connection state is CLOSED.

So (as far as I can tell) it is never possible to get two watcher events when 
the connection is closing. There will be only a single event, and its state 
should be either DISCONNECTED or CLOSED, depending on who initiated the closing 
of the connection. This makes the proposed patch logical. Handling this watcher 
event definitely makes sense (at least to log it).
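
To illustrate the distinction above, here is a minimal, self-contained sketch. The State enum and the closedBy helper are hypothetical names that only mirror a few of the KeeperState constants quoted in this issue; this is not the real org.apache.zookeeper API and not the proposed patch.

```java
// Illustrative sketch only: State mirrors a few KeeperState constants quoted in
// this issue. It is NOT the real org.apache.zookeeper enum.
class ConnectionStateSketch {
    enum State { DISCONNECTED, SYNC_CONNECTED, EXPIRED, CLOSED }

    /** Returns which side ended the connection, or "n/a" for other states. */
    static String closedBy(State state) {
        switch (state) {
            case DISCONNECTED:
                return "server";   // connection was broken by the ZooKeeper server
            case CLOSED:
                return "client";   // client closed the connection itself (3.5.5+)
            default:
                return "n/a";      // state is unrelated to closing the connection
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + closedBy(s));
        }
    }
}
```

A real watcher would switch on the state carried by the watcher event instead of this local enum.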

On the other hand, I am not sure what the expected behaviour of the HDFS 
failover controller is if HDFS is closing the ZooKeeper connection. When do we 
call ZooKeeper.close() on the connection in the HDFS code? I guess HDFS can do 
this e.g. during a graceful shutdown of the failover controller process. Are we 
sure we want to go to neutral mode and rejoin the election during shutdown? I 
really don't know the background, so I will let you decide.
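
One possible answer to the shutdown question, sketched under assumptions: the class, the Action enum and the shuttingDown flag below are all hypothetical and do not describe what the HDFS failover controller actually does. The idea is only that a CLOSED event seen after an intentional shutdown has started could be ignored instead of triggering a rejoin.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical policy sketch: none of these names exist in HDFS or ZooKeeper.
class ClosedEventPolicySketch {
    enum Action { IGNORE, REJOIN_ELECTION }

    // Set once the process has begun an intentional, graceful shutdown.
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);

    void beginShutdown() {
        shuttingDown.set(true);
    }

    /** Decides what to do when a CLOSED watcher event arrives. */
    Action onClosed() {
        // During shutdown the close was requested on purpose, so do nothing.
        return shuttingDown.get() ? Action.IGNORE : Action.REJOIN_ELECTION;
    }
}
```

Whether rejoining is the right reaction outside of shutdown is exactly the open question above; the sketch only shows where such a check could sit.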


was (Author: symat):
[~jiangjianfei], [~weichiu]
I guess this logic is needed to trigger a new election that decides which 
name node should be active and which should be stand-by. I am not very familiar 
with the HDFS code, so I cannot review that part, but I can give you some 
background info about the CLOSED state.

It was introduced by 
[ZOOKEEPER-2368|https://issues.apache.org/jira/browse/ZOOKEEPER-2368]. The 
ZooKeeper watcher gets notified when the connection is broken by the ZooKeeper 
server, and the connection state is DISCONNECTED in that case. The new 
behaviour in 3.5.5+ is that a watcher event is also triggered when the 
ZooKeeper client is the one closing the connection, in which case the 
connection state is CLOSED.

So (as far as I can tell) it is never possible to get two watcher events when 
the connection is closing. There will be only a single event, and its state 
should be either DISCONNECTED or CLOSED, depending on who initiated the closing 
of the connection. This makes the proposed patch logical. Handling this watcher 
event definitely makes sense (at least to log it).

On the other hand, I am not sure what the expected behaviour of the HDFS 
failover controller is when HDFS is closing the ZooKeeper connection. When do 
we call ZooKeeper.close() on the connection in the HDFS code? I guess HDFS 
might do this during a graceful shutdown of the failover controller process. 
Are we sure we want to go to neutral mode and rejoin the election during 
shutdown? I really don't know the background, so I will let you decide.

> Add new zookeeper event type case after zk updated to 3.5.x
> ---
>
> Key: HDFS-15251
> URL: https://issues.apache.org/jira/browse/HDFS-15251
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.3.0
> Reporter: Jianfei Jiang
> Assignee: Jianfei Jiang
> Priority: Major
> Attachments: HDFS-15251.001.patch, HDFS-15251.002.patch
>
>
> In ZooKeeper 3.5.x, KeeperState gains a new value named Closed, so a Closed 
> case should be added to the switch, as it is not an unexpected ZooKeeper 
> watch event state.
> {code:java}
> /** @deprecated */
> @Deprecated
> Unknown(-1),
> Disconnected(0),
> /** @deprecated */
> @Deprecated
> NoSyncConnected(1),
> SyncConnected(3),
> AuthFailed(4),
> ConnectedReadOnly(5),
> SaslAuthenticated(6),
> Expired(-112),
> Closed(7);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15251) Add new zookeeper event type case after zk updated to 3.5.x

2020-03-31 Thread Mate Szalay-Beko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071778#comment-17071778
 ] 

Mate Szalay-Beko commented on HDFS-15251:
-

[~jiangjianfei], [~weichiu]
I guess this logic is needed to trigger a new election that decides which 
name node should be active and which should be stand-by. I am not very familiar 
with the HDFS code, so I cannot review that part, but I can give you some 
background info about the CLOSED state.

It was introduced by 
[ZOOKEEPER-2368|https://issues.apache.org/jira/browse/ZOOKEEPER-2368]. The 
ZooKeeper watcher gets notified when the connection is broken by the ZooKeeper 
server, and the connection state is DISCONNECTED in that case. The new 
behaviour in 3.5.5+ is that a watcher event is also triggered when the 
ZooKeeper client is the one closing the connection, in which case the 
connection state is CLOSED.

So (as far as I can tell) it is never possible to get two watcher events when 
the connection is closing. There will be only a single event, and its state 
should be either DISCONNECTED or CLOSED, depending on who initiated the closing 
of the connection. This makes the proposed patch logical. Handling this watcher 
event definitely makes sense (at least to log it).

On the other hand, I am not sure what the expected behaviour of the HDFS 
failover controller is when HDFS is closing the ZooKeeper connection. When do 
we call ZooKeeper.close() on the connection in the HDFS code? I guess HDFS 
might do this during a graceful shutdown of the failover controller process. 
Are we sure we want to go to neutral mode and rejoin the election during 
shutdown? I really don't know the background, so I will let you decide.
