On Tuesday, 21 June 2016 13:54:42 UTC+1, Chris Davies wrote:
>
> I've been having a little dig around in the schema with a view to seeing 
> how I can perform a variation of du -ms for various parts of the tree, 
> comparing the ratio of stored to used data blocks. I'm curious to determine 
> the "most expensive" parts of my backups. I've had some interesting times 
> trying to drill down, though.
>

Just for the record, here is the query that I think represents the raw data 
I wanted to obtain. If I've got it right, it lists every file in the S3QL 
filesystem along with its apparent size and the actual storage used. Each 
block of storage is divided equally amongst the items referencing that 
block (e.g. two items sharing the same 1MB block will be apportioned 512KB 
each).
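The apportioning rule is just size divided by reference count. A minimal
sketch with a toy SQLite table (the table here is hypothetical, not the
real S3QL schema):

```python
import sqlite3

# Toy demonstration of the apportioning rule: each block's size is
# divided by its refcount, so two items sharing a 1 MiB block are
# charged 512 KiB each. (Hypothetical table, not the S3QL schema.)
con = sqlite3.connect(":memory:")
con.execute("create table blocks (id integer primary key, size integer, refcount integer)")
con.execute("insert into blocks values (1, 1048576, 2)")  # 1 MiB block shared by 2 items
share, = con.execute("select size / refcount from blocks where id = 1").fetchone()
print(share)  # 524288 bytes, i.e. 512 KiB
```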

-- Iterate across the entire S3QL filesystem
--
-- Example output
--      inode    uid  gid  mode   mtime                size       path
--      4687924  902  902  33277  2014-01-07 22:03:55  18880591   /archive/hosting/Jan2014 Audit Log Archives/AUDLOG_AU_201401.LOG.gz
--      4687929  902  902  33277  2014-01-07 12:07:03  207445907  /archive/hosting/Jan2014 Audit Log Archives/AUDLOG_CH_201401.LOG.gz
--      4688566  902  902  33277  2011-04-28 15:16:06  1024       /archive/hosting/Old Docs/FBNOTE.LOG
--
with recursive

    -- Recursive CTE that walks the entire filesystem tree
    --
    Filesystem(inode, parent_inode, level, path)
    as (
        -- Anchor: children of the root directory (inode 1)
        --
        select  c.inode, c.parent_inode, 1 as level, '/' || c.name as path
        from    contents_v c
        where   c.parent_inode = 1

        union

        -- Recursive step: children of rows already in Filesystem
        --
        select  c.inode, c.parent_inode, f.level + 1 as level, f.path || '/' || c.name as path
        from    Filesystem f
        join    contents_v c on c.parent_inode = f.inode
    )

    -- Aggregate storage per inode across the resulting set of rows
    --
    select      f.inode, i.size, sum(b.size/b.refcount) as storage,
                count(b.id) as copies,
                100 - (100*sum(b.size/b.refcount))/i.size as pct_saving,
                datetime(i.mtime_ns/1000/1000/1000, 'unixepoch') as mtime,
                f.path
    from        Filesystem f
    join        inode_blocks ib on ib.inode = f.inode
    join        inodes i on i.id = f.inode
    join        blocks b on b.id = ib.block_id
    group by    f.inode
    order by    f.path
;
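For anyone who wants to try the query without mounting a filesystem, here is a
sketch that runs it against a mocked-up copy of the metadata in Python's
sqlite3. The table shapes are inferred from the query itself, not taken from
the authoritative S3QL schema, so treat them as an approximation:

```python
import sqlite3

# Mock just enough of the S3QL metadata tables (contents_v, inodes,
# inode_blocks, blocks) to exercise the recursive query. Column sets
# are guessed from the query above, not from the real S3QL schema.
con = sqlite3.connect(":memory:")
con.executescript("""
    create table contents_v (name text, inode integer, parent_inode integer);
    create table inodes (id integer primary key, size integer, mtime_ns integer);
    create table inode_blocks (inode integer, block_id integer);
    create table blocks (id integer primary key, size integer, refcount integer);

    insert into contents_v values ('archive', 2, 1);  -- directory under root (inode 1)
    insert into contents_v values ('a.log', 3, 2);
    insert into contents_v values ('b.log', 4, 2);
    insert into inodes values (2, 4096, 0), (3, 1048576, 0), (4, 1048576, 0);
    -- one 1 MiB block shared by both files
    insert into blocks values (10, 1048576, 2);
    insert into inode_blocks values (3, 10), (4, 10);
""")

query = """
with recursive
    Filesystem(inode, parent_inode, level, path) as (
        select c.inode, c.parent_inode, 1, '/' || c.name
        from   contents_v c where c.parent_inode = 1
        union
        select c.inode, c.parent_inode, f.level + 1, f.path || '/' || c.name
        from   Filesystem f join contents_v c on c.parent_inode = f.inode
    )
select f.inode, i.size, sum(b.size/b.refcount) as storage,
       count(b.id) as copies,
       100 - (100*sum(b.size/b.refcount))/i.size as pct_saving,
       datetime(i.mtime_ns/1000/1000/1000, 'unixepoch') as mtime,
       f.path
from   Filesystem f
join   inode_blocks ib on ib.inode = f.inode
join   inodes i on i.id = f.inode
join   blocks b on b.id = ib.block_id
group by f.inode
order by f.path
"""
rows = list(con.execute(query))
for row in rows:
    print(row)
```

Each file is charged half of the shared 1 MiB block (storage 524288, pct_saving
50), and the directory drops out because it has no rows in inode_blocks.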


I'm hoping this may be useful for someone out there.
Chris

-- 
You received this message because you are subscribed to the Google Groups 
"s3ql" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.