[
https://issues.apache.org/jira/browse/BOOKKEEPER-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791714#comment-13791714
]
Sijie Guo commented on BOOKKEEPER-638:
--------------------------------------
> a bookie could register with zk,
A bookie only registers itself after it has started, so in general no write
requests would be sent during startup.
> a client sees it, tries to use it, and hit the RejectedRequestHandler.
This is exactly what the patch intends to do. If a bookie is not ready (it can't
serve any request), we should fail any read/write request fast. Otherwise, a
client can't know when a channel becomes readable on the server side, so it has
to wait for a timeout and retry on another bookie, which adds latency.
> Two bookies could start at the same time to access bookie data.
> ---------------------------------------------------------------
>
> Key: BOOKKEEPER-638
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-638
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-server
> Affects Versions: 4.3.0
> Reporter: Sijie Guo
> Assignee: Sijie Guo
> Priority: Blocker
> Fix For: 4.3.0
>
> Attachments: BOOKKEEPER-638.diff
>
>
> This issue was introduced when the netty server was added for the bookie.
> In BOOKKEEPER-294, we agreed on the start sequence of a bookie:
> 1) bind the bookie port first (to avoid two processes running on the same host).
> 2) start the bookie (e.g. initialize bookie storage and replay journals)
> 3) start the nio server to accept incoming requests.
> But after the refactoring for the netty server, step 1) was merged into
> step 3), so two processes could run at the same time, both replaying
> journals. This is pretty bad.
> We need to change the code to stick to the sequence described above.
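
For illustration only, here is a minimal sketch of the bind-first start sequence described above, using a plain java.net.ServerSocket. The class BindFirstStartup and its methods are hypothetical names and are not BookKeeper's actual classes or the contents of the attached patch. The point it tries to show is that a second process fails at step 1, before it can touch storage or replay journals.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch of the three-step start sequence; not BookKeeper code.
class BindFirstStartup {
    private final ServerSocket socket;
    private volatile boolean ready = false;

    BindFirstStartup() throws IOException {
        // Create the socket unbound so that binding is an explicit first step.
        this.socket = new ServerSocket();
    }

    // Step 1: bind the port. A second process on the same host fails here
    // with an IOException (typically java.net.BindException).
    int bindPort(int port) throws IOException {
        socket.bind(new InetSocketAddress(port));
        return socket.getLocalPort();
    }

    // Step 2: placeholder for initializing storage and replaying journals.
    // It only runs if step 1 succeeded, so two processes never replay together.
    void startStorage() {
        if (!socket.isBound()) {
            throw new IllegalStateException("port must be bound before starting storage");
        }
        ready = true;
    }

    // Step 3 would begin accepting requests; until then requests should fail fast.
    boolean isReady() {
        return ready;
    }

    void close() throws IOException {
        socket.close();
    }
}
```

With this ordering, a process that loses the race for the port exits before step 2, which is the property the sequence agreed on in BOOKKEEPER-294 relies on.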