Hello,

I've been investigating Oak performance and found a couple of cases where 
MongoMK makes use of stringified versions of the 
org.apache.jackrabbit.mongomk.Revision type. One example of such a problem was 
reported here [1].

I have a couple of things that I'd like to talk about, concerning 
Revision.toString() usage:

 1.  toString() should hardly ever be used for anything other than debugging. 
It is very hard to find the relevant references to Object.toString() in a 
Java code base. In other words, toString() is almost not "refactorable", i.e. 
it is hard to predict what side-effects a change to the toString() behaviour 
will have. Ideally, toString() should delegate to another method, such as 
format(), where the predictable logic is really implemented.
 2.  Revision has a much lower memory footprint than its string representation, 
so it is actually better suited for use as a key in maps, caches, etc.
 3.  MongoMK.isCommitted() is an example that shows how revisions are 
unnecessarily transformed back and forth to strings.
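
To illustrate points 1 and 2, here is a minimal sketch of what I have in mind. 
SimpleRevision and CommitCache are hypothetical names, and the three fields and 
the "r<timestamp>-<counter>-<clusterId>" hex layout are assumptions made for 
the example. This is not the actual MongoMK code, just the pattern: format() 
and parse() are explicit and easy to search for, toString() only delegates, 
and the object itself is used as the map key.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for Revision (not the real MongoMK class). */
final class SimpleRevision {
    // Assumed fields; all values are expected to be non-negative.
    private final long timestamp;
    private final int counter;
    private final int clusterId;

    SimpleRevision(long timestamp, int counter, int clusterId) {
        this.timestamp = timestamp;
        this.counter = counter;
        this.clusterId = clusterId;
    }

    /** The single place where the string form is defined. */
    String format() {
        return "r" + Long.toHexString(timestamp)
                + "-" + Integer.toHexString(counter)
                + "-" + Integer.toHexString(clusterId);
    }

    /** Inverse of format(); only needed at the storage boundary. */
    static SimpleRevision parse(String s) {
        if (!s.startsWith("r")) {
            throw new IllegalArgumentException(s);
        }
        String[] parts = s.substring(1).split("-");
        if (parts.length != 3) {
            throw new IllegalArgumentException(s);
        }
        return new SimpleRevision(Long.parseLong(parts[0], 16),
                Integer.parseInt(parts[1], 16),
                Integer.parseInt(parts[2], 16));
    }

    /** For debugging and logging only; delegates to format(). */
    @Override
    public String toString() {
        return format();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SimpleRevision)) {
            return false;
        }
        SimpleRevision other = (SimpleRevision) o;
        return timestamp == other.timestamp
                && counter == other.counter
                && clusterId == other.clusterId;
    }

    @Override
    public int hashCode() {
        int h = (int) (timestamp ^ (timestamp >>> 32));
        h = 31 * h + counter;
        return 31 * h + clusterId;
    }
}

/** Hypothetical cache keyed by the revision object instead of its string. */
final class CommitCache {
    private final Map<SimpleRevision, Boolean> committed = new HashMap<>();

    void markCommitted(SimpleRevision r) {
        committed.put(r, Boolean.TRUE);
    }

    boolean isCommitted(SimpleRevision r) {
        return committed.containsKey(r);
    }
}
```

With something like this, a lookup needs no string conversion at all, and 
a search for callers of format() or parse() shows exactly where the string 
form is still relied upon.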

In a larger profiling session, Revision.fromString() and Revision.toString() 
accounted for a total of around 1.5% of all CPU time on my machine.
This may seem like micro-optimisation to some, but I think that we should take 
these things seriously, as they can add up to a significant amount of wasted 
CPU and memory if practised across a large code base.

Please, let me know what you think.

Cheers
Lukas

[1]: https://issues.apache.org/jira/browse/OAK-825
