[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541820#comment-14541820
 ] 

Rakesh R commented on ZOOKEEPER-2175:
-------------------------------------

Thanks [~cnauroth] for your interest in contributing a new feature. 
Yes, {{checksum}} validation would help detect transmission errors. 
I think there is room to discuss this feature in detail and implement it 
here. At a very high level, the main concern I can think of is performance: 
checksums add CPU overhead for computing them and, because packet size 
increases, extra I/O in the network layer and the disk persistence layer.
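
To make the trade-off concrete, here is a minimal sketch of per-packet checksum framing. It is an illustration only, not ZooKeeper code: the {{ChecksumFrame}} class and its {{wrap}}/{{isIntact}} methods are hypothetical names, and CRC32 from the JDK is just one possible choice of checksum. Each packet grows by 4 bytes and needs one CRC pass on each side.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical helper (not part of ZooKeeper): frames a payload with a trailing CRC32. */
final class ChecksumFrame {

    /** Returns the payload followed by its CRC32 as a 4-byte big-endian trailer. */
    static byte[] wrap(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 4)
                .put(payload)
                .putInt((int) crc.getValue())
                .array();
    }

    /** Recomputes the CRC32 over the payload part and compares it with the trailer. */
    static boolean isIntact(byte[] frame) {
        if (frame.length < 4) {
            return false; // too short to even hold the trailer
        }
        int payloadLength = frame.length - 4;
        CRC32 crc = new CRC32();
        crc.update(frame, 0, payloadLength);
        int stored = ByteBuffer.wrap(frame, payloadLength, 4).getInt();
        return stored == (int) crc.getValue();
    }

    public static void main(String[] args) {
        // Use the session id from the server log as an example 8-byte payload.
        byte[] payload = ByteBuffer.allocate(8).putLong(0x164cb2b3e4b36ae4L).array();
        byte[] frame = wrap(payload);
        System.out.println("intact frame accepted: " + isIntact(frame));

        frame[0] ^= 0x02; // simulate a single flipped bit in transit
        System.out.println("corrupted frame accepted: " + isIntact(frame));
    }
}
```

With framing like this, the receiver can reject a packet with a flipped bit instead of acting on it. Where the checksum would live in the actual wire format is open for discussion.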

In this particular case, [~brahmareddy] has mentioned that the client {{sessionId}} 
was corrupted by the time the request reached the server, and not the {{user data}}. 
Even so, I still don't understand why both NNs transitioned to 
standby; I couldn't find out what happened to the ZooKeeper 
client as a result of the {{sessionId}} corruption. This question is not really 
relevant to the ZK project; I'm posting it here because the discussion 
revolves around ZK transmission errors. I'll probably also look at the 
{{ActiveStandbyElector}} code and the ZK code again.
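
For reference, the one-bit difference between the two session ids in the quoted logs can be checked by XOR-ing them. This is a small standalone sketch; {{SessionIdDiff}} and {{flippedBits}} are hypothetical names, and the two constants are copied from the server and client log lines in the issue description.

```java
/** Hypothetical helper: compares two ZooKeeper session ids bit by bit. */
final class SessionIdDiff {

    /** Returns a mask with a 1 in every bit position where the two ids differ. */
    static long flippedBits(long a, long b) {
        return a ^ b;
    }

    public static void main(String[] args) {
        long serverSide = 0x164cb2b3e4b36ae4L; // session id logged by the ZK server
        long clientSide = 0x144cb2b3e4b36ae4L; // session id logged by the ZK client

        long diff = flippedBits(serverSide, clientSide);
        System.out.println("differing bits: " + Long.bitCount(diff));
        if (diff != 0) {
            // Position of the lowest differing bit, counted from the least significant bit.
            System.out.println("lowest differing bit: " + Long.numberOfTrailingZeros(diff));
        }
    }
}
```

For these two values it reports a single differing bit, at position 57 counting from the least significant bit, which agrees with the binary dump in the description.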

> Checksum validation for malformed packets needs to handle.
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-2175
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2175
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>
>  *Session Id from ZK :* 
> 2015-04-15 21:24:54,257 | INFO  | CommitProcessor:22 | Established session 
> 0x164cb2b3e4b36ae4 with negotiated timeout 45000 for client 
> /160.149.0.117:44586 | 
> org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:623)
> 2015-04-15 21:24:54,261 | INFO  | 
> NIOServerCxn.Factory:160-149-0-114/160.149.0.114:24002 | Successfully 
> authenticated client: authenticationID=hdfs/[email protected];  
> authorizationID=hdfs/[email protected]. | 
> org.apache.zookeeper.server.auth.SaslServerCallbackHandler.handleAuthorizeCallback(SaslServerCallbackHandler.java:118)
> 2015-04-15 21:24:54,261 | INFO  | 
> NIOServerCxn.Factory:160-149-0-114/160.149.0.114:24002 | Setting 
> authorizedID: hdfs/[email protected] | 
> org.apache.zookeeper.server.auth.SaslServerCallbackHandler.handleAuthorizeCallback(SaslServerCallbackHandler.java:134)
> 2015-04-15 21:24:54,261 | INFO  | 
> NIOServerCxn.Factory:160-149-0-114/160.149.0.114:24002 | adding SASL 
> authorization for authorizationID: hdfs/[email protected] | 
> org.apache.zookeeper.server.ZooKeeperServer.processSasl(ZooKeeperServer.java:1009)
> 2015-04-15 21:24:54,262 | INFO  | ProcessThread(sid:22 cport:-1): | Got 
> user-level KeeperException when processing  
> *{color:red}sessionid:0x164cb2b3e4b36ae4{color}*  type:create cxid:0x3 
> zxid:0x20009fafc txntype:-1 reqpath:n/a Error 
> Path:/hadoop-ha/hacluster/ActiveStandbyElectorLock Error:KeeperErrorCode = 
> NodeExists for /hadoop-ha/hacluster/ActiveStandbyElectorLock | 
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:648)
>  *ZKFC Received :*  ZK client
> 2015-04-15 21:24:54,237 | INFO  | main-SendThread(160-149-0-114:24002) | 
> Socket connection established to 160-149-0-114/160.149.0.114:24002, 
> initiating session | 
> org.apache.zookeeper.ClientCnxn$SendThread.primeConnection(ClientCnxn.java:854)
> 2015-04-15 21:24:54,257 | INFO  | main-SendThread(160-149-0-114:24002) | 
> Session establishment complete on server 160-149-0-114/160.149.0.114:24002,  
> *{color:blue}sessionid = 0x144cb2b3e4b36ae4 {color}* , negotiated timeout = 
> 45000 | 
> org.apache.zookeeper.ClientCnxn$SendThread.onConnected(ClientCnxn.java:1259)
> 2015-04-15 21:24:54,260 | INFO  | main-EventThread | EventThread shut down | 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:512)
> 2015-04-15 21:24:54,262 | INFO  | main-EventThread | Session connected. | 
> org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:547)
> 2015-04-15 21:24:54,264 | INFO  | main-EventThread | Successfully 
> authenticated to ZooKeeper using SASL. | 
> org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:573)
> One bit is corrupted; please compare the binary forms of the two session ids below:
> 144cb2b3e4b36ae4=1010001001100101100101011001111100100101100110110101011100100
> 164cb2b3e4b36ae4=1011001001100101100101011001111100100101100110110101011100100



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
