[
https://issues.apache.org/jira/browse/ZOOKEEPER-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193494#comment-13193494
]
Ioan Eugen Stan commented on ZOOKEEPER-580:
-------------------------------------------
+1
> Document reasonable limits on the size and shape of data for a zookeeper
> ensemble.
> ----------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-580
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-580
> Project: ZooKeeper
> Issue Type: Improvement
> Components: documentation
> Reporter: bryan thompson
>
> I would like to have documentation which clarifies the reasonable limits on
> the size and shape of data in a zookeeper ensemble. Since all znodes and
> their data are replicated on each peer in an ensemble, there will be a
> machine limit on the amount of data in a zookeeper instance, but I have not
> seen any guidance on estimating that machine limit. Presumably the machine
> limits are primarily determined by the amount of heap available to the JVM
> before swapping sets in; however, there may well be other, less obvious
> limits on the #of children per node and the depth of the node hierarchy (in
> addition to the already documented limit on the amount of data in a node).
> There may also be interactions between hierarchy depth and performance,
> which I have not seen detailed anywhere.
> Guidance regarding pragmatic and machine limits would be helpful in choosing
> designs using zookeeper which can scale. For example, if metadata about each
> shard of a partitioned database architecture is mapped onto a distinct znode
> in zookeeper, then there could be a very large number of znodes for a large
> database deployment. While this would make it easy to reassign shards to
> services dynamically, the design might impose an unforeseen limit on the #of
> shards in the database. A similar concern would apply to an attempt to
> maintain metadata about each file in a distributed file system.
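> For concreteness, a minimal sketch of such a znode-per-shard layout,
> assuming the standard ZooKeeper Java client and a hypothetical /shards
> parent path (the connect string, shard count, and metadata encoding are
> just placeholders):
> {code}
> import java.util.concurrent.CountDownLatch;
> import org.apache.zookeeper.CreateMode;
> import org.apache.zookeeper.WatchedEvent;
> import org.apache.zookeeper.Watcher;
> import org.apache.zookeeper.ZooDefs;
> import org.apache.zookeeper.ZooKeeper;
>
> public class ShardLayoutSketch {
>     public static void main(String[] args) throws Exception {
>         // Block until the session reaches SyncConnected before issuing requests.
>         final CountDownLatch connected = new CountDownLatch(1);
>         ZooKeeper zk = new ZooKeeper("host1:2181,host2:2181,host3:2181", 30000,
>                 new Watcher() {
>                     public void process(WatchedEvent event) {
>                         if (event.getState() == Event.KeeperState.SyncConnected) {
>                             connected.countDown();
>                         }
>                     }
>                 });
>         connected.await();
>
>         // Parent znode holding all shard metadata.
>         if (zk.exists("/shards", false) == null) {
>             zk.create("/shards", new byte[0],
>                       ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
>         }
>
>         // One persistent znode per shard; each shard's metadata lives in the
>         // znode data (subject to the documented per-znode data size limit).
>         int shardCount = 10000;
>         for (int i = 0; i < shardCount; i++) {
>             byte[] metadata = ("owner=unassigned;shard=" + i).getBytes("UTF-8");
>             zk.create("/shards/shard-" + i, metadata,
>                       ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
>         }
>         zk.close();
>     }
> }
> {code}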
> Issue [ZOOKEEPER-272] described some problems when nodes have a large number
> of children. However, it did not elaborate on whether the change to an
> Iterator model would break the atomic semantics of the List<String> of
> children, or whether the Iterator would be backed by a snapshot of the
> children as they existed at the time the iterator was requested, which would
> put a memory burden on the ensemble. This raises the related question of
> when designs which work around scaling limits in zookeeper might break
> desirable semantics, primarily the ability to have a consistent view of the
> distributed state.
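> For reference, the existing getChildren() call returns the complete child
> list in one response (a sketch reusing the hypothetical /shards layout
> above, with zk an already-connected ZooKeeper handle):
> {code}
> // The server materializes, and the client receives, the entire child list
> // at once; for a parent with a huge #of children that is one large response
> // and one large in-memory List<String>.
> java.util.List<String> shards = zk.getChildren("/shards", false);
> for (String shard : shards) {
>     byte[] metadata = zk.getData("/shards/" + shard, false, null);
>     // ... process this shard's metadata ...
> }
> {code}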
> Put another way, are there anti-patterns for zookeeper relating to
> scalability? Too many children per node? Too much depth? Failing to
> decompose large numbers of children into hierarchies? Etc.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira