My 5 cents. As far as I understand, there are two narrow points in Zookeeper: 1) single-threaded transaction numbering on coordinator 2) writing data. Number 1 IMHO can't be avoided, but it can potentially be very fast as single AtomicLong is needed. After transaction numbering is done, all other work, including data writing, can be distributed and parallelized as any recovery / reordering can use this sequential numbering to ensure that there are no gaps in transactions sent to clients. And actually, I'd use some hashing for workload distribution to ensure that there are no hot spots.
Are there any weak points in my logic? Best regards, Vitalii Tymchyshyn.
