I just came across the
org.apache.jackrabbit.oak.segment.RecordUsageAnalyser class in Oak,
which I had completely forgotten about. I think you can use it to
parse nodes and have it list some statistics about them. Alternatively,
you should be able to come up with your own tooling relatively easily,
based on org.apache.jackrabbit.oak.segment.SegmentParser (which is also
the base class of RecordUsageAnalyser). Please take care though: these
tools are not very thoroughly tested, and any results obtained from
them should be scrutinized.


On 04.03.18 15:22, Roy Teeuwen wrote:
> Hey guys,
> I am using Oak 1.6.6 with an authoring system and a few publish systems. We 
> are using the latest TarMK that is available on the 1.6.6 branch and also 
> using the separate file datastore instead of embedded in the segment store.
> What I have noticed so far is that the segment store of the author is 16GB 
> with 165GB datastore while the publishes are 1.5GB with only 50GB datastore. 
> I would like to investigate where the big difference is between those two 
> systems, seeing as all the content nodes are as good as all published. The 
> offline compaction happens daily so that can't be the problem, also the 
> online compaction is enabled. Are there any tools / methods available to list 
> out what the disk usage is of every node? This being both in the segmentstore 
> and the related datastore files? I can make wild guesses as to it being for 
> example sling event / job nodes and stuff like that but I would like some 
> real numbers.
> Thanks!
> Roy
