[ 
https://issues.apache.org/jira/browse/KAFKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702992#comment-14702992
 ] 

James Lent commented on KAFKA-1387:
-----------------------------------

Your approach sounds much simpler than mine (which I like).  It is similar to 
what I proposed doing only at startup (the ensureNodeDoesNotExist method).  I 
am, however, not sure I understand the exact change you propose.  As I remember, 
createEphemeralPathExpectConflictHandleZKBug is called from three code paths:

- Register Broker
- Register Consumer
- Leadership election  

In my change I specifically tried to avoid changing the Leadership election logic.

Is your change basically to implement your new logic (delete if already exists) 
instead of calling createEphemeralPathExpectConflictHandleZKBug in the first 
two cases?  If so I agree it sounds reasonable.  I suppose in a 
misconfiguration case two nodes might get into a registration war over the 
Broker node, but that could (perhaps) be handled at startup (the second one 
fails to start up).
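
A minimal sketch of the "delete if already exists" idea for the first two 
cases, under loud assumptions: an in-memory map stands in for ZooKeeper's 
ephemeral nodes, and the class and method names (EphemeralRegistrySketch, 
registerDeleteIfExists) are hypothetical. This is not Kafka's ZkUtils code or 
the zkclient API, and it ignores concurrency and the Leadership election path 
entirely.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for ZooKeeper ephemeral nodes: path -> owning session id.
// All names here are hypothetical and exist only for illustration.
class EphemeralRegistrySketch {
    private final Map<String, Long> nodes = new HashMap<>();

    // Plain create: returns false if the path already exists
    // (the analogue of a "node exists" error from ZooKeeper).
    boolean create(String path, long sessionId) {
        return nodes.putIfAbsent(path, sessionId) == null;
    }

    void delete(String path) {
        nodes.remove(path);
    }

    Long owner(String path) {
        return nodes.get(path);
    }

    // "Delete if already exists" registration: on a conflict, remove the
    // existing node and create it again instead of waiting for it to expire.
    // Returns the number of create attempts that were needed.
    int registerDeleteIfExists(String path, long sessionId) {
        int attempts = 1;
        while (!create(path, sessionId)) {
            delete(path);
            attempts++;
        }
        return attempts;
    }

    public static void main(String[] args) {
        EphemeralRegistrySketch zk = new EphemeralRegistrySketch();
        String path = "/broker/1";

        // First queued handleNewSession() callback registers the broker.
        System.out.println("first callback attempts:  " + zk.registerDeleteIfExists(path, 100L));
        // Second queued callback finds the node it already created; it
        // deletes and recreates the node rather than retrying forever.
        System.out.println("second callback attempts: " + zk.registerDeleteIfExists(path, 100L));
        // A later session replaces the node left behind by the old session.
        System.out.println("new session attempts:     " + zk.registerDeleteIfExists(path, 200L));
        System.out.println("owner: " + zk.owner(path));
    }
}
```

The sketch also shows the registration war mentioned above: two brokers 
sharing an id would simply keep deleting each other's node, which is why a 
startup check may still be needed.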

If you propose replacing createEphemeralPathExpectConflictHandleZKBug for 
the Leadership election case too, then I am less comfortable making (and 
testing) that change.  I have never really dug into that logic much.

One other factor to consider is that I am a bit backed up at work right now and 
this issue will not be my highest priority.


> Kafka getting stuck creating ephemeral node it has already created when two 
> zookeeper sessions are established in a very short period of time
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1387
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1387
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1
>            Reporter: Fedor Korotkiy
>            Priority: Blocker
>              Labels: newbie, patch, zkclient-problems
>         Attachments: kafka-1387.patch
>
>
> Kafka broker re-registers itself in zookeeper every time handleNewSession() 
> callback is invoked.
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/server/KafkaHealthcheck.scala
>  
> Now imagine the following sequence of events.
> 1) Zookeeper session reestablishes. handleNewSession() callback is queued by 
> the zkClient, but not invoked yet.
> 2) Zookeeper session reestablishes again, queueing the callback a second time.
> 3) First callback is invoked, creating /broker/[id] ephemeral path.
> 4) Second callback is invoked and it tries to create /broker/[id] path using 
> createEphemeralPathExpectConflictHandleZKBug() function. But the path 
> already exists, so createEphemeralPathExpectConflictHandleZKBug() gets 
> stuck in an infinite loop.
> Seems like the controller election code has the same issue.
> I am able to reproduce this issue on the 0.8.1 branch from github using the 
> following configs.
> # zookeeper
> tickTime=10
> dataDir=/tmp/zk/
> clientPort=2101
> maxClientCnxns=0
> # kafka
> broker.id=1
> log.dir=/tmp/kafka
> zookeeper.connect=localhost:2101
> zookeeper.connection.timeout.ms=100
> zookeeper.sessiontimeout.ms=100
> Just start kafka and zookeeper and then pause zookeeper several times using 
> Ctrl-Z.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
