Re: Finding out the diskspace used of specific nodes

Michael Dürig Wed, 07 Mar 2018 02:30:08 -0800

Hi,

I think you could do the same via a Groovy script. Depending on how
deep you want to dig into the lower layers, you would need to hack your
way through, though. The tooling I started building aims to simplify
this (but hasn't fully succeeded at it yet).


Michael

On 6 March 2018 at 22:06, Roy Teeuwen <[email protected]> wrote:
> Hey Michael,
>
> Thanks for the info! I will check whether I can still run the script on
> Oak 1.6.6, who knows :).
> Can you tell me what the difference would be if I wrote a Groovy script and
> ran it in oak-run? Are there things you can't do there that you can in
> the Scala Ammonite shell you use?
>
> Greets,
> Roy
>
>> On 5 Mar 2018, at 13:59, Michael Dürig <[email protected]> wrote:
>>
>> Hi,
>>
>> Unfortunately there is no good tooling at this point in time.
>>
>> In the past I hacked something together, which might serve as a
>> starting point: https://github.com/mduerig/script-oak. This tooling
>> allows you to fire arbitrary queries at the segment store from the
>> Ammonite shell (a Scala REPL). Since this relies on a lot of
>> implementation details that keep changing, the tooling is usually out
>> of sync with Oak. There are plans to improve this (see
>> https://issues.apache.org/jira/browse/OAK-6584), but so far not much
>> commitment to making this happen. Patches welcome though!
>>
>> Michael
>>
>> On 4 March 2018 at 15:22, Roy Teeuwen <[email protected]> wrote:
>>> Hey guys,
>>>
>>> I am using Oak 1.6.6 with an authoring system and a few publish systems. We 
>>> are using the latest TarMK that is available on the 1.6.6 branch and also 
>>> using the separate file datastore instead of embedded in the segment store.
>>>
>>> What I have noticed so far is that the segment store of the author is 16GB 
>>> with 165GB datastore while the publishes are 1.5GB with only 50GB 
>>> datastore. I would like to investigate where the big difference is between 
>>> those two systems, seeing as all the content nodes are as good as all 
>>> published. Offline compaction runs daily, so that can't be the
>>> problem, and online compaction is also enabled. Are there any tools /
>>> methods available to list the disk usage of every node, both in the
>>> segment store and in the related datastore files? I can make
>>> wild guesses as to it being for example sling event / job nodes and stuff 
>>> like that but I would like some real numbers.
>>>
>>> Thanks!
>>> Roy
>
