[
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530222#comment-13530222
]
Jay Shrauner commented on ZOOKEEPER-1505:
-----------------------------------------
Alex - The race condition is within FinalRequestProcessor on any node (Leader,
Follower, or Observer). It has nothing to do with the serialization order of
the leader: setting and firing a watch writes only locally maintained state on
the local node. Suppose client A is toggling the value of node /X from 1 to 2,
and client B is reading /X and setting a watch on it. Client B will always see
a consistent view; it may, however, not receive a watch firing, so it may never
know to read value 2. If client B is relying on timely watch firing to keep its
data fresh, this is a problem.
1. It is possible for thread C1 to process client B reading value 1 and setting
the watch; for thread C2 to process client A writing 2 to /X, firing the watch,
and writing the watch event out to client B's network stack; and finally for
thread C1 to push the read of value 1 onto client B's network stack. Because
the return value of the getData-and-set-watch call arrived after the watch
fired, the client may ignore the watch firing. For example, suppose client B
had originally responded to a watch firing on /X. From its point of view, it
sees the /X watch fire, it sends a getData request, it sees the /X watch fire
again (which it ignores, because it already has a getData outstanding), and
finally it receives the response to its getData request, carrying the stale
value 1.
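The interleaving in case (1) can be replayed deterministically. This is a minimal sketch, not ZooKeeper code; the class, fields, and method names are all illustrative. The point it shows is that the watch event lands on client B's outgoing queue before the read response does, so B's last word from the server is the stale value:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical replay of case (1): two request-processor threads (C1, C2)
// interleave on the same client's outgoing network queue.
class WatchRaceSketch {
    static int data = 1;                 // value of node /X
    static boolean watchSet = false;
    static final List<String> clientBQueue = new ArrayList<>();

    // Thread C1, step 1: read /X and register client B's watch.
    static int readAndSetWatch() {
        int v = data;
        watchSet = true;
        return v;
    }

    // Thread C2: client A writes v; the registered watch fires into
    // client B's queue before B's read response has been sent.
    static void writeAndFireWatch(int v) {
        data = v;
        if (watchSet) {
            watchSet = false;
            clientBQueue.add("WATCH_FIRED");
        }
    }

    // Thread C1, step 2: only now is B's read response pushed out.
    static void sendReadResponse(int v) {
        clientBQueue.add("READ:" + v);
    }

    static List<String> raceInterleaving() {
        clientBQueue.clear();
        data = 1;
        int v = readAndSetWatch();   // C1: B reads 1, sets watch
        writeAndFireWatch(2);        // C2: A writes 2, watch fires first...
        sendReadResponse(v);         // ...then the stale read of 1 arrives
        return clientBQueue;
    }
}
```

Client B, seeing the watch event arrive before the getData reply, treats it as already handled and is left holding value 1.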
2. It is also possible for client B to read value 1, for client A to write
value 2 and check for watches to fire (finding none), and then for client B to
set the watch. There is no locking guarding the atomicity of client B reading
/X and setting the watch on /X, so the 1-to-2 transition fires no watch at all.
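A minimal sketch of the kind of per-node locking that would close this window; the names are illustrative, not the actual fix. Holding one monitor around both {read, register watch} and {write, fire watches} means no write can slip between B's read and its watch registration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fix for case (2): make read+register-watch and
// write+fire-watches atomic with respect to each other.
class AtomicWatchSketch {
    private int data = 1;                    // node /X
    private boolean watchSet = false;
    final List<String> firedWatches = new ArrayList<>();

    // Client B: read the value and set the watch under one lock.
    synchronized int getDataAndWatch() {
        watchSet = true;
        return data;
    }

    // Client A: write the value and fire any registered watch,
    // under the same lock.
    synchronized void setData(int v) {
        data = v;
        if (watchSet) {
            watchSet = false;
            firedWatches.add("/X changed to " + v);
        }
    }

    // With the lock, a write that follows B's read always sees the watch.
    static boolean watchFiresAfterAtomicRead() {
        AtomicWatchSketch node = new AtomicWatchSketch();
        int seen = node.getDataAndWatch();   // B reads 1, watch registered
        node.setData(2);                     // A's write must fire the watch
        return seen == 1 && node.firedWatches.size() == 1;
    }
}
```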
It is relatively straightforward to add locking preventing case (2), but for
case (1) I think we need to restrict parallelism in FinalRequestProcessor.
We can improve the parallelism here, but it reached the point where I wanted to
leave that for a future Jira. If we could identify which read requests set
watches, and treat those as a third request type, we could allow pure read
requests from client B to be processed concurrently with write requests from
client A. The current code only fully parses getData and the other read
requests inside FinalRequestProcessor, so we would need to move that parsing
earlier, which might have performance implications.
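The three-way classification suggested above might look like the following sketch. The opcode constants and names here are illustrative placeholders, not ZooKeeper's real ZooDefs values; the point is the scheduling rule, where only watch-free reads may overlap an in-flight write:

```java
// Hypothetical request classifier: pure reads may run concurrently with
// writes from other sessions, but watch-setting reads must serialize with
// writes (otherwise case (1) reappears).
class RequestClassifier {
    // Placeholder opcodes; not ZooKeeper's actual ZooDefs.OpCode values.
    static final int OP_GET_DATA = 1, OP_EXISTS = 2, OP_SET_DATA = 3;

    enum RequestClass { PURE_READ, WATCHING_READ, WRITE }

    static RequestClass classify(int opCode, boolean watchFlag) {
        switch (opCode) {
            case OP_GET_DATA:
            case OP_EXISTS:
                // The watch flag must be parsed before FinalRequestProcessor
                // for this distinction to be usable by the scheduler.
                return watchFlag ? RequestClass.WATCHING_READ
                                 : RequestClass.PURE_READ;
            default:
                return RequestClass.WRITE;
        }
    }

    // Scheduling rule: may this request run concurrently with an
    // in-flight write from another session?
    static boolean mayRunConcurrentlyWithWrite(RequestClass c) {
        return c == RequestClass.PURE_READ;
    }
}
```

The cost noted above is that classifying a request this way requires deserializing the watch flag earlier in the pipeline than the current code does.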
> Multi-thread CommitProcessor
> ----------------------------
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch,
> ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues
> and runs all downstream processors. This is noticeably inefficient for
> read-intensive workloads, which could be run concurrently. The trick is
> handling write transactions. I propose multi-threading this code according to
> the following two constraints
> - each session must see its requests responded to in order
> - all committed transactions must be handled in zxid order, across all
> sessions
> I believe these cover the only constraints we need to honor. In particular, I
> believe we can relax the following:
> - it does not matter if the read request in one session happens before or
> after the write request in another session
> With these constraints, I propose the following threads
> - 1 primary queue servicing/work dispatching thread
> - 0-N assignable worker threads, where a given session is always assigned
> to the same worker thread
> By always assigning a session to the same worker thread (using a simple
> sessionId mod number-of-worker-threads rule), we guarantee the first
> constraint: requests pushed onto a worker's queue are processed in order.
> We guarantee the second constraint by allowing only a single commit
> transaction to be in flight at a time: the queue-servicing thread blocks
> while a commit transaction is in flight, and the flag is cleared when the
> transaction completes.
> On a 32-core machine running Linux 2.6.38, we achieved the best performance
> with 32 worker threads, for a 56% +/- 5% improvement in throughput (this
> improvement was measured on top of that for ZOOKEEPER-1504, not in
> isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): an ExecutorService wrapper that
> makes worker threads daemon threads and names them in an easily debuggable
> manner. It supports assignable threads (as used here) and non-assignable
> threads (as used by NIOServerCnxnFactory).
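The two constraints quoted above (pin each session to one worker; allow only one committed transaction in flight) can be sketched as follows. Class and method names are illustrative, not the patch's actual code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical dispatch rules for a multi-threaded CommitProcessor.
class CommitDispatchSketch {
    // Constraint 1: a session always maps to the same worker, so that
    // worker's FIFO queue preserves the session's request order.
    // floorMod keeps the index non-negative for any session id.
    static int workerFor(long sessionId, int numWorkers) {
        return (int) Math.floorMod(sessionId, (long) numWorkers);
    }

    // Constraint 2: at most one committed transaction in flight at a time;
    // the queue-servicing thread blocks until finishCommit() is called.
    static final AtomicBoolean commitInFlight = new AtomicBoolean(false);

    static boolean tryStartCommit() {
        return commitInFlight.compareAndSet(false, true);
    }

    static void finishCommit() {
        commitInFlight.set(false);
    }
}
```

Because the mapping is a pure function of the session id, no per-request coordination is needed to honor per-session ordering; the only cross-session synchronization is the single commit-in-flight flag.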
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira