Thomas Mueller created OAK-8523:
-----------------------------------

             Summary: Best Practices - Property Value Length Limit
                 Key: OAK-8523
                 URL: https://issues.apache.org/jira/browse/OAK-8523
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: core, jcr
            Reporter: Thomas Mueller


Right now, Oak supports very large property values (e.g. Strings). But 
properties of 1 MB or larger are problematic in multiple areas, such as 
indexing. Limiting property size matters most for software-as-a-service, where 
we need to guarantee SLOs, but it also helps in other cases. So we should:

* (1) Document best practices, e.g. "Property values should be smaller than 100 
KB".
* (2) Introduce "softLimit" and "hardLimit", where softLimit is e.g. 100 KB and 
hardLimit is configurable and (initially) defaults to Integer.MAX_VALUE. 
Setting the hard limit to a lower value by default is problematic, because it 
can break existing applications. With an effectively unlimited default, 
customers can set lower limits first in tests and, once they are happy, in 
production as well.
* (3) Log a warning if a property is larger than "softLimit". To avoid logging 
many warnings (if there are many such properties), we then set softLimit = 
softLimit * 1.1 (reset to 100 KB on the next repository start). Logging is 
needed to know what _exactly_ is broken (path, stack trace of the actual 
usage...).
* (4) Add a metric (monitoring) for detected large properties. Just logging 
warnings might not be enough.
* (5) Throttling: we could add flow control (pauses; Thread.sleep) after 
violations, to improve isolation (to prevent affecting other threads that don't 
violate the contract).
* (6) We could expose the violation info in the session, so a framework could 
check that data after executing custom code, and add more info (e.g. log).
* (7) If larger than the configurable hardLimit, fail the commit or reject 
setProperty (throw an exception).
* (8) At some point, in a new Oak version, change the default value for 
hardLimit to some reasonable number, e.g. 1 MB.
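
As a rough illustration of how (2), (3), (4) and (7) could fit together, here 
is a minimal sketch. It is not existing Oak code: the class 
PropertyLengthLimiter, its method names, the use of java.util.logging and the 
choice of IllegalArgumentException are all assumptions made for the example. A 
real implementation would plug into Oak's commit handling and use Oak's own 
logging, metrics and exception types.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper, for illustration only; not part of the Oak API. */
final class PropertyLengthLimiter {

    enum Result { OK, WARNED }

    private static final Logger LOG =
            Logger.getLogger(PropertyLengthLimiter.class.getName());

    private final long initialSoftLimit; // e.g. 100 KB
    private final long hardLimit;        // e.g. Integer.MAX_VALUE by default
    private long softLimit;              // grows after each warning
    private long violationCount;         // stand-in for a monitoring metric

    PropertyLengthLimiter(long softLimit, long hardLimit) {
        this.initialSoftLimit = softLimit;
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
    }

    /** Checks one property value length (in bytes) for the given path. */
    synchronized Result check(String path, long length) {
        if (length > hardLimit) {
            // (7) reject the change; the exception type is an assumption
            throw new IllegalArgumentException("Property " + path + " has "
                    + length + " bytes, hard limit is " + hardLimit);
        }
        if (length > initialSoftLimit) {
            // (4) count every large property, even if no warning is logged
            violationCount++;
        }
        if (length > softLimit) {
            // (3) log the path and the caller's stack trace, then raise the
            // soft limit by 10% (integer arithmetic) to limit log volume
            LOG.log(Level.WARNING, "Large property " + path + ": " + length
                    + " bytes, soft limit " + softLimit,
                    new Exception("call stack"));
            softLimit += softLimit / 10;
            return Result.WARNED;
        }
        return Result.OK;
    }

    synchronized long getViolationCount() {
        return violationCount;
    }
}
```

Creating a new PropertyLengthLimiter on repository start gives the "reset to 
100 KB" behaviour from (3). Counting against the initial soft limit rather than 
the raised one keeps the metric accurate even while warnings are suppressed.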

The "property length" is just one case. There are multiple candidates:
        
* Number of properties for a node
* Number of elements for multi-valued properties
* Total size of a node (including inlined properties)
* Number of direct child nodes for orderable child nodes
* Number of direct child nodes for non-orderable child nodes
* Size of transaction
* Adding observation listeners that listen for all changes (global listeners)

For those cases, new Jira issues should be created.



