Re: ZooKeeper snapCount Tuning
Workload is a more decisive factor than hardware when tuning zookeeper.snapCount and other config parameters under the current ZK implementation. I am afraid there is no universal value that is applicable to every case, although we can provide recommended settings by benchmarking predictable, common workloads. In any case, a larger snapCount leads to less frequent snapshotting, which should improve system performance, but at the cost of longer recovery time.

For new hardware, I think most of the time ZK just gets the benefit for free. On preallocation, I agree and think it will still be useful, as that is a file system matter and works regardless of the underlying medium. Getting optimal use out of the new hardware would require more thought. To borrow some ideas from the database world that might be applicable to ZK:

* Offloading snapshotting to a dedicated hardware accelerator such as an FPGA.
* SyncRequestProcessor could flush transactions to NVRAM without buffering or group commit.
* A durable ZK data tree on NVRAM that does not require a WAL or snapshots.

I suspect not much is going on here because ZK, unlike databases, never received enough workload (which is by design) to justify the investment.

On Fri, Apr 3, 2020 at 1:34 PM Ted Dunning wrote:

> On Fri, Apr 3, 2020 at 10:01 AM Patrick Hunt wrote:
>
> > ...
> > Makes sense. For eg. SSD characteristics are vastly diff from spinning
> > media.
>
> super true.
>
> > I suspect it would be worth looking into this in even more depth -
> > we pre-allocate certain files, perhaps that's no longer necessary, etc...
>
> The preallocation still makes sense on most file systems since meta-data
> changes (i.e. changing file length) are much more expensive than data
> changes (overwriting previously allocated blocks).
>
> > Makes sense. If we do something it would be great to have a set of tests
> > that could be used/reused to explore the various types even beyond SSD
> > itself.
>
> Indeed. Storage class memory, for example, could make for an amazing ZK
> implementation. So could use of the upcoming SSD devices that implement
> key-value stores.
>
> > Regards,
> >
> > Patrick
> >
> > > My hypothesis is: with a larger snapCount value, ZK can have higher
> > > throughput because it is spending less time creating snapshots.
> > >
> > > Thanks!
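To make the trade-off at the top of this reply a bit more concrete (a larger snapCount means fewer snapshots but more transactions to replay on recovery), here is a rough Python model. It assumes, from my reading of SyncRequestProcessor and not from anything stated in this thread, that a snapshot is triggered once the transactions logged since the last snapshot exceed snapCount/2 plus a random value in [0, snapCount/2). The function name and the transaction counts are made up for illustration, and the model ignores snapshot duration, data tree size, and I/O cost entirely.

```python
import random


def simulate_snapshots(total_txns, snap_count, seed=0):
    """Model how often a server would snapshot while logging total_txns.

    Assumption (not taken from this thread): a snapshot starts once the
    transactions logged since the previous snapshot exceed
    snap_count // 2 + rand_roll, with rand_roll drawn from
    [0, snap_count // 2) and re-drawn after each snapshot.

    Returns (snapshots_taken, worst_case_replay), where worst_case_replay
    is roughly the largest number of transactions that ever sat in the log
    after the most recent snapshot, i.e. what a restart might replay.
    """
    if snap_count < 2:
        raise ValueError("snap_count must be at least 2")
    rng = random.Random(seed)
    half = snap_count // 2
    snapshots = 0
    worst_case_replay = 0
    remaining = total_txns
    while True:
        # Transactions logged before the next snapshot is triggered.
        gap = half + rng.randrange(half) + 1
        if gap > remaining:
            # The run ends before the next snapshot would start.
            worst_case_replay = max(worst_case_replay, remaining)
            break
        snapshots += 1
        worst_case_replay = max(worst_case_replay, gap)
        remaining -= gap
    return snapshots, worst_case_replay


if __name__ == "__main__":
    for snap_count in (100_000, 1_000_000):
        taken, replay = simulate_snapshots(10_000_000, snap_count)
        print(f"snapCount={snap_count}: {taken} snapshots, "
              f"up to {replay} txns to replay")
```

This only counts events. Whether fewer snapshots actually turn into higher throughput depends on the workload, which is why benchmarking is still needed.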
Re: ZooKeeper snapCount Tuning
On Fri, Apr 3, 2020 at 10:01 AM Patrick Hunt wrote:

> ...
> Makes sense. For eg. SSD characteristics are vastly diff from spinning
> media.

super true.

> I suspect it would be worth looking into this in even more depth -
> we pre-allocate certain files, perhaps that's no longer necessary, etc...

The preallocation still makes sense on most file systems since meta-data changes (i.e. changing file length) are much more expensive than data changes (overwriting previously allocated blocks).

> Makes sense. If we do something it would be great to have a set of tests
> that could be used/reused to explore the various types even beyond SSD
> itself.

Indeed. Storage class memory, for example, could make for an amazing ZK implementation. So could use of the upcoming SSD devices that implement key-value stores.

> Regards,
>
> Patrick
>
> > My hypothesis is: with a larger snapCount value, ZK can have higher
> > throughput because it is spending less time creating snapshots.
> >
> > Thanks!
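The preallocation point above can be illustrated with a small Python sketch: reserve the file's length once, then write records into the already-allocated region so that each write changes data but not the file length. This is not ZooKeeper's own code. The helper names, file name, and sizes are hypothetical, and the fallback to `ftruncate` only extends the length (the file may be sparse), so it is weaker than a true block reservation.

```python
import os
import tempfile


def open_preallocated(path, size):
    """Open path for writing and make sure it is at least size bytes long."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    if os.fstat(fd).st_size < size:
        try:
            # Ask the file system to reserve the blocks up front.
            os.posix_fallocate(fd, 0, size)
        except (AttributeError, OSError):
            # Fallback: extend the length only (the file may be sparse).
            os.ftruncate(fd, size)
    return fd


def write_record(fd, offset, payload):
    """Overwrite already-allocated bytes at offset, then flush to disk."""
    written = os.pwrite(fd, payload, offset)
    os.fsync(fd)
    return offset + written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log.demo")  # hypothetical file name
        fd = open_preallocated(path, 64 * 1024)
        try:
            offset = 0
            for record in (b"txn-1\n", b"txn-2\n"):
                offset = write_record(fd, offset, record)
            print(os.fstat(fd).st_size, "bytes on disk,", offset, "bytes used")
        finally:
            os.close(fd)
```

The file length stays fixed while records are written, so the per-write sync does not also have to persist a length change. How much that saves will vary by file system and device.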
Re: ZooKeeper snapCount Tuning
On Fri, Apr 3, 2020 at 6:19 AM David Mollitor wrote:

> Hello Community,
>
> The configuration zookeeper.snapCount defaults to a value of 100,000 and
> has been at this default for 11 years now.
>
> https://github.com/apache/zookeeper/blob/e87bad6774e7269ef21a156aff9dad089ef54794/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L1149
>
> Based on the last ZK meetup, I know there has been some recent attempts to
> re-run the baseline performance benchmarks.
>
> The current value may be a "safe" value. However, I think we can all agree
> that hardware has improved quite a bit in the past 11 years. Does anyone
> have any experience tweaking and testing this number on a production
> system? Are there any recommendations out there for how to set this value?

Makes sense. For example, SSD characteristics are vastly different from spinning media. I suspect it would be worth looking into this in even more depth - we pre-allocate certain files, perhaps that's no longer necessary, etc...

If we do something, it would be great to have a set of tests that could be used and reused to explore the various storage types, even beyond SSD itself.

Regards,

Patrick

> My hypothesis is: with a larger snapCount value, ZK can have higher
> throughput because it is spending less time creating snapshots.
>
> Thanks!
ZooKeeper snapCount Tuning
Hello Community,

The configuration zookeeper.snapCount defaults to a value of 100,000 and has been at this default for 11 years now.

https://github.com/apache/zookeeper/blob/e87bad6774e7269ef21a156aff9dad089ef54794/zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java#L1149

Based on the last ZK meetup, I know there have been some recent attempts to re-run the baseline performance benchmarks.

The current value may be a "safe" value. However, I think we can all agree that hardware has improved quite a bit in the past 11 years. Does anyone have any experience tweaking and testing this number on a production system? Are there any recommendations out there for how to set this value?

My hypothesis is: with a larger snapCount value, ZK can have higher throughput because it is spending less time creating snapshots.

Thanks!
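For anyone who wants to experiment with this, here is a sketch of where the value could be overridden. As far as I know, the server reads the Java system property zookeeper.snapCount, and a snapCount entry in zoo.cfg ends up as that same property. Please treat both as assumptions to verify against the administrator's guide for your version. The value 500000 is an arbitrary example, not a recommendation.

```properties
# zoo.cfg fragment (illustrative value only)
snapCount=500000
```

If the assumption about the system property holds, passing -Dzookeeper.snapCount=500000 to the server JVM should be the equivalent override.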