[
https://issues.apache.org/jira/browse/ZOOKEEPER-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533779#comment-15533779
]
Flavio Junqueira commented on ZOOKEEPER-2605:
---------------------------------------------
We use the txn log for durability and the updates to the zk database are
written synchronously to the log before we respond to the client. Using a txn
log is efficient because we only append to the log and write sequentially to
the file. Snapshots are written to disk asynchronously, and are only used to
speed up recovery. The "source of truth" for the state of a server is the txn
log.
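To make the division of labour concrete, here is a minimal sketch of the pattern, not ZooKeeper's actual code. The class {{TinyStore}}, its file names and its JSON record format are all made up for illustration, and the snapshot is taken synchronously to keep the example short (ZooKeeper writes snapshots asynchronously). Each update is appended and fsynced to the log before it is acknowledged, and recovery loads the latest snapshot, if any, and then replays the newer log entries.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store (hypothetical): append-only txn log plus snapshots."""

    def __init__(self, data_dir):
        self.log_path = os.path.join(data_dir, "txn.log")
        self.snap_path = os.path.join(data_dir, "snapshot.json")
        self.data = {}
        self.last_zxid = 0
        self._recover()
        # Append mode: we only ever add to the end of the log.
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # A snapshot, if present, only shortens the replay below.
        if os.path.exists(self.snap_path):
            with open(self.snap_path, encoding="utf-8") as f:
                snap = json.load(f)
            self.data, self.last_zxid = snap["data"], snap["zxid"]
        # The log is the source of truth: replay every txn newer than the snapshot.
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    txn = json.loads(line)
                    if txn["zxid"] > self.last_zxid:
                        self.data[txn["key"]] = txn["value"]
                        self.last_zxid = txn["zxid"]

    def set(self, key, value):
        zxid = self.last_zxid + 1
        record = {"zxid": zxid, "key": key, "value": value}
        self.log.write(json.dumps(record) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # on disk before we acknowledge the caller
        self.data[key] = value
        self.last_zxid = zxid
        return zxid

    def snapshot(self):
        # Write to a temp file, then rename, so a crash never leaves a torn snapshot.
        tmp = self.snap_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump({"zxid": self.last_zxid, "data": self.data}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snap_path)

    def close(self):
        self.log.close()


def demo():
    """Write, snapshot, write again, then 'restart' and return the recovered state."""
    with tempfile.TemporaryDirectory() as data_dir:
        store = TinyStore(data_dir)
        store.set("a", 1)
        store.snapshot()
        store.set("b", 2)  # only in the log, not in the snapshot
        store.close()
        recovered = TinyStore(data_dir)
        state = (dict(recovered.data), recovered.last_zxid)
        recovered.close()
        return state
```

In {{demo()}}, the second update exists only in the log, yet it survives the restart. Deleting the snapshot would make recovery slower but would not lose data, which is why the log, not the snapshot, is what has to be durable.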
In your case, you just need to provision and configure your system accordingly.
You could, for example, reduce the frequency of snapshotting with the
{{snapCount}} parameter. You also need to make sure you have enough free space.
If the device is shared, it is possible that other applications are filling it
up.
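As a rough sizing aid, the arithmetic behind {{snapCount}} can be sketched as below. This is a back-of-the-envelope estimate, not a measurement: the function names, the 2000 txn/s rate and the 50 MB snapshot size are hypothetical, and it assumes the documented behaviour that a snapshot is triggered after a randomized number of transactions between {{snapCount}}/2 and {{snapCount}}, with a default {{snapCount}} of 100,000.

```python
def snapshot_interval_seconds(txn_rate_per_sec, snap_count=100_000):
    """Return (shortest, longest) expected seconds between two snapshots."""
    if txn_rate_per_sec <= 0 or snap_count <= 0:
        raise ValueError("rate and snap_count must be positive")
    # Assumption: the trigger is randomized between snap_count/2 and snap_count txns.
    shortest = (snap_count / 2) / txn_rate_per_sec
    longest = snap_count / txn_rate_per_sec
    return shortest, longest


def worst_case_mb_per_hour(txn_rate_per_sec, snapshot_size_mb, snap_count=100_000):
    """Upper bound on snapshot data written per hour if nothing is cleaned up."""
    shortest, _ = snapshot_interval_seconds(txn_rate_per_sec, snap_count)
    snapshots_per_hour = 3600 / shortest
    return snapshots_per_hour * snapshot_size_mb


# Hypothetical workload: 2000 txn/s, 50 MB per snapshot.
print(snapshot_interval_seconds(2000))             # (25.0, 50.0)
print(worst_case_mb_per_hour(2000, 50))            # 7200.0
print(snapshot_interval_seconds(2000, 1_000_000))  # (250.0, 500.0)
```

With these made-up numbers, the default setting could produce a snapshot every 25 to 50 seconds, while raising {{snapCount}} tenfold stretches that to roughly 4 to 8 minutes. Compare the resulting growth rate against how often your cleanup job runs and how much free space the device has.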
> Snapshot generation fills up disk space due to high volume of requests.
> -----------------------------------------------------------------------
>
> Key: ZOOKEEPER-2605
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2605
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.5
> Reporter: Joe Wang
> Priority: Minor
>
> Not sure if it's a bug, or just a consequence of a design decision.
> Recently we had an issue where faulty clients were issuing create requests at
> an abnormally high rate, which caused zookeeper to generate more snapshots
> than our cron job could clean up. This filled up the disk on our zookeeper
> hosts and brought the cluster down.
> Is there a reason why ZooKeeper uses a write-ahead log instead of only flushing
> successful transactions to disk? If only successful transactions are flushed
> and counted towards snapCount, then even if a client is spamming requests to
> create a node that already exists, it wouldn't cause a flood of snapshots to
> be persisted to disk.