[ https://issues.apache.org/jira/browse/OAK-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150231#comment-17150231 ]
Thomas Mueller commented on OAK-8523:
-------------------------------------

Target: the node that is referenced.
Source: the node where the target is referenced.

I understand that someone might want to use paths instead of UUIDs. It can still be done using an index.

Storing the *path* of the target node in the source node:
* Easy to read for humans.
* Simple lookup of the target from the source: Session.getNode(absPath).
* Disadvantage: needs to be updated if the target is moved.
* Disadvantage: need to run a query to get the list of sources.

Storing the *UUID* of the target node in the source node:
* Disadvantage: hard to read for humans.
* Simple lookup of the target from the source: Session.getNodeByIdentifier(uuid).
* No need to update anything if the target is moved.
* Simple to get the list of sources: Node.getReferences().
* Disadvantage: it is best to add a UUID only if the node is actually referenced, to avoid an unnecessary entry in the UUID index.

Storing the *list of sources* in the target node:
* Potentially a huge number of entries in the target (for example, if a "header" or "footer" is referenced in millions of nodes). Needs to be updated whenever a reference is added or removed, which can lead to an exponential time algorithm and storage requirement.
* Potentially can result in out-of-memory during indexing or other operations.

> Best Practices - Property Value Length Limit
> --------------------------------------------
>
>                 Key: OAK-8523
>                 URL: https://issues.apache.org/jira/browse/OAK-8523
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, jcr
>            Reporter: Thomas Mueller
>            Priority: Major
>
> Right now, Oak supports very large properties (e.g. String). But 1 MB (or
> larger) properties are problematic in multiple areas like indexing. It is
> more important for software-as-a-service, where we need to guarantee SLOs,
> but it also helps other cases. So we should:
> * (1) Document best practices, e.g. "Property values should be smaller than
> 100 KB".
> * (2) Introduce "softLimit" and "hardLimit", where softLimit is e.g. 100 KB
> and hardLimit is configurable, and (initially) by default Integer.MAX_VALUE.
> Setting the hard limit to a lower value by default is problematic, because
> it can break existing applications. With default value infinity, customers
> can set lower limits e.g. in tests first, and once they are happy, in
> production as well.
> * (3) Log a warning if a property is larger than "softLimit". To avoid
> logging many warnings (if there are many such properties) we then set
> softLimit = softLimit * 1.1 (reset to 100 KB at the next repository start).
> Logging is needed to know what _exactly_ is broken (path, stack trace of the
> actual usage...)
> * (4) Add a metric (monitoring) for detected large properties. Just logging
> warnings might not be enough.
> * (5) Throttling: we could add flow control (pauses; Thread.sleep) after
> violations, to improve isolation (to prevent affecting other threads that
> don't violate the contract).
> * (6) We could expose the violation info in the session, so a framework could
> check that data after executing custom code, and add more info (e.g. log).
> * (7) If larger than the configurable hardLimit, fail the commit or reject
> setProperty (throw an exception).
> * (8) At some point, in a new Oak version, change the default value for
> hardLimit to some reasonable number, e.g. 1 MB.
>
> The "property length" is just one case. There are multiple candidates:
>
> * Number of properties for a node
> * Number of elements for multi-valued properties
> * Total size of a node (including inlined properties)
> * Number of direct child nodes for orderable child nodes
> * Number of direct child nodes for non-orderable child nodes
> * Size of transaction
> * Adding observation listeners that listen for all changes (global listeners)
>
> For those cases, new Jira issues should be created.
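
To make items (2), (3), (4) and (7) of the quoted description more concrete, here is a minimal sketch of how the soft and hard limit check might behave. It is an illustration only, not Oak code: the class name PropertyLimitChecker, its methods, and the use of java.util.logging are all assumptions, and a plain counter stands in for a real monitoring metric.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; none of these names exist in Oak.
final class PropertyLimitChecker {

    // 100 KB starting point for the soft limit, as suggested in item (2).
    static final long INITIAL_SOFT_LIMIT = 100 * 1024;

    private static final Logger LOG =
            Logger.getLogger(PropertyLimitChecker.class.getName());

    private final long hardLimit;
    // Creating a new instance stands in for a repository restart,
    // which resets the soft limit to its initial value.
    private long softLimit = INITIAL_SOFT_LIMIT;
    // Simple counter standing in for the monitoring metric of item (4).
    private long violations;

    PropertyLimitChecker(long hardLimit) {
        this.hardLimit = hardLimit;
    }

    /**
     * Checks the size of one property value.
     *
     * @return true if a soft-limit warning was logged
     * @throws IllegalArgumentException if the size exceeds the hard limit (item 7)
     */
    synchronized boolean check(String path, long sizeInBytes) {
        if (sizeInBytes > hardLimit) {
            violations++;
            throw new IllegalArgumentException(
                    "Property " + path + " has " + sizeInBytes
                    + " bytes, hard limit is " + hardLimit);
        }
        if (sizeInBytes > softLimit) {
            violations++;
            // The Throwable is only used to record the caller's stack trace (item 3).
            LOG.log(Level.WARNING,
                    "Property " + path + " has " + sizeInBytes
                    + " bytes, soft limit is " + softLimit,
                    new Throwable("call site"));
            // Raise the threshold by 10% to avoid flooding the log.
            softLimit = Math.round(softLimit * 1.1);
            return true;
        }
        return false;
    }

    synchronized long getSoftLimit() {
        return softLimit;
    }

    synchronized long getViolations() {
        return violations;
    }
}
```

With the default proposed in item (2), new PropertyLimitChecker(Integer.MAX_VALUE) never rejects a value and only logs; a test setup could pass a lower hard limit (for example 1024 * 1024) to surface offending writes as exceptions before changing production settings.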