Re: \o/ compsize

2017-09-15 Thread David Sterba
On Mon, Sep 04, 2017 at 08:42:29PM +0200, Adam Borowski wrote:
> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
> > 2017-09-04 18:11 GMT+03:00 Adam Borowski :
> > > Here's a utility to measure used compression type + ratio on a set of
> > > files or directories: https://github.com/kilobyte/compsize
> > >
> > > It should be of great help for users, and also if you:
> > > * muck with compression levels
> > > * add new compression types
> > > * add heuristics that could err on withholding compression too much
> > 
> > Packaged to AUR:
> > https://aur.archlinux.org/packages/compsize-git/
> 
> Cool!  I'd wait until people say the code is sane (I don't really know these
> ioctls) but if you want to make poor AUR folks our beta testers, that's ok.
> 
> However, one issue: I did not set a license; your packaging says GPL3. 
> It would be better to have something compatible with btrfs-progs which are
> GPL2-only.  What about GPL2-or-higher?
> 
> After adding some related info (like wasted space in pinned extents, reuse
> of extents), it'd be nice to have this tool inside btrfs-progs, either as a
> part of "fi du" or another command.

I've now implemented a prototype that calculates the compressed size of
extents per file. As 'fi du' knows which extents are shared, the
compressed size can also be split into shared and exclusive.

There's no summary like the one compsize prints; that would need more
precise tracking of the extents: not just the compressed size but also
which algorithm was used. I can imagine all sorts of output enhancements,
like summarizing the inline compressed extents or printing the algorithm
summary per file. This should be easy once the calculation code is there.
I haven't reused compsize.c, as I only needed the ioctl part wired into
'fi du', but the search ioctl is the same.
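
A rough sketch of the per-file accounting described above. This is not the
btrfs-progs prototype; the `Extent` record, its fields, and the function name
are made up for illustration, and the shared flag is assumed to be known
already (as 'fi du' would know it).

```python
from collections import namedtuple

# Hypothetical record: compressed (on-disk) size of one extent and whether
# the extent is shared with another file or snapshot.
Extent = namedtuple("Extent", "compressed_bytes shared")

def per_file_compressed(files):
    """Map each file name to (exclusive, shared) compressed byte totals."""
    totals = {}
    for name, extents in files.items():
        exclusive = sum(e.compressed_bytes for e in extents if not e.shared)
        shared = sum(e.compressed_bytes for e in extents if e.shared)
        totals[name] = (exclusive, shared)
    return totals

files = {
    "a.txt": [Extent(4096, False), Extent(8192, True)],
    "b.txt": [Extent(8192, True)],
}
result = per_file_compressed(files)
for name, (exclusive, shared) in result.items():
    print(f"{name}: exclusive={exclusive} shared={shared}")
```

Tracking which algorithm was used would need one more field per extent and a
second level of grouping.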
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: \o/ compsize

2017-09-05 Thread Adam Borowski
On Mon, Sep 04, 2017 at 10:33:40PM +0200, A L wrote:
> On 9/4/2017 5:11 PM, Adam Borowski wrote:
> > Hi!
> > Here's a utility to measure used compression type + ratio on a set of files
> > or directories: https://github.com/kilobyte/compsize
> 
> Great tool. Just tried it on some of my backup snapshots.
> 
>    # compsize portage.20170904T2200
>    142432 files.
>    all   78%  329M/ 422M
>    none 100%  227M/ 227M
>    zlib  52%  102M/ 195M
> 
>    # du -sh  portage.20170904T2200
>    787M    portage.20170904T2200
> 
>    # btrfs fi du -s  portage.20170904T2200
>     Total   Exclusive  Set shared  Filename
>     271.61MiB 6.34MiB   245.51MiB portage.20170904T2200
> 
> Interesting results. How do I interpret them?

I've added some documentation, mainly in the man page.

(Sorry for not pushing this earlier, Timofey went wild on this tool and I
wanted to avoid conflicts.)
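
A small worked example of the arithmetic in the compsize output quoted above.
It assumes (this is a reading of the output, not something stated in the
thread) that the left number is the on-disk size, the right number is the
uncompressed size, and the percentage is their rounded ratio.

```python
# (on-disk, uncompressed) sizes in M, copied from the quoted compsize output.
rows = {"none": (227, 227), "zlib": (102, 195)}

def percent(disk, uncompressed):
    """On-disk size as a rounded percentage of the uncompressed size."""
    return round(100 * disk / uncompressed)

total_disk = sum(disk for disk, _ in rows.values())
total_uncompressed = sum(unc for _, unc in rows.values())

print(f"all  {percent(total_disk, total_uncompressed):3d}%  "
      f"{total_disk}M/ {total_uncompressed}M")
for name, (disk, unc) in rows.items():
    print(f"{name:4} {percent(disk, unc):3d}%  {disk}M/ {unc}M")
```

Under that reading the per-type rows add up to the "all" row (227 + 102 = 329
and 227 + 195 = 422), and 329/422 rounds to 78%.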

> Compsize also doesn't seem to like some non-standard files and throws an
> error (even though they should be ignored?):
> 
> # compsize usb-backup/volumes/root/root.20170727T2321/
> open("usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350"):
> No such device or address
> 
> # dir
> usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350
> srwx-- 1 root root 0 Dec 31  2015 
> usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350=

Fixed.
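
The thread does not say how this was fixed in compsize. One possible
approach, sketched here with a made-up helper name, is to check the file type
before opening and skip anything that is not a regular file (sockets, FIFOs,
device nodes, symlinks).

```python
import os
import stat

def regular_files(root):
    """Yield paths of regular files under root, skipping everything else."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable: skip it
            if stat.S_ISREG(mode):
                yield path

if __name__ == "__main__":
    for path in regular_files("."):
        print(path)
```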


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄ 


Re: \o/ compsize

2017-09-05 Thread Nick Terrell
On 9/4/17, 8:12 AM, "Adam Borowski"  wrote:
> Hi!
> Here's a utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize
> 
> It should be of great help for users, and also if you:
> * muck with compression levels
> * add new compression types
> * add heuristics that could err on withholding compression too much

Thanks for writing this tool, Adam. I'll try it out with zstd! It looks very
useful for benchmarking compression algorithms, much better than measuring
the filesystem size with du/df.

> (Thanks to Knorrie and his python-btrfs project that made figuring out the
> ioctls much easier.)
> 
> Meow!
> -- 
> ⢀⣴⠾⠻⢶⣦⠀ 
> ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
> ⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
> ⠈⠳⣄ 
> 
 


Re: \o/ compsize

2017-09-05 Thread Qu Wenruo



On 2017-09-05 22:21, Hans van Kranenburg wrote:
> On 09/05/2017 04:02 PM, Qu Wenruo wrote:
>> On 2017-09-05 03:52, Timofey Titovets wrote:
>>> 2017-09-04 21:42 GMT+03:00 Adam Borowski :
>>>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>>>>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>>>>>> Here's a utility to measure used compression type + ratio on a set
>>>>>> of files or directories: https://github.com/kilobyte/compsize
>>>>>>
>>>>>> It should be of great help for users, and also if you:
>>>>>> * muck with compression levels
>>>>>> * add new compression types
>>>>>> * add heuristics that could err on withholding compression too much
>>
>> Did a brief review, and the result looks quite good.
>> In particular, the same disk bytenr is handled well, so file extents
>> referring to different parts of one large extent won't get counted twice.
>>
>> Nice job.
>>
>> But some smaller improvements can still be made:
>> (Please keep in mind I may be totally wrong, since I'm not doing a
>> comprehensive review.)
>>
>> Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
>> which should filter out unrelated results.
>
> No, it does not.
>
> https://patchwork.kernel.org/patch/9767619/


Why not?

Min key = ino, EXTENT_DATA, 0
Max key = ino, EXTENT_DATA, -1

With that min_key and max_key, the result is just what we want.

This also filters out any item that does not belong to this ino, as well
as other things like XATTR or whatever.
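
A toy illustration of the two positions above. It assumes the search range is
compared as one compound key (objectid, type, offset), so the min and max
fields are not independent per-field filters; Python tuples happen to compare
the same way. The item list and inode numbers are invented, and "-1" in the
message is taken to mean the largest u64 value.

```python
# Key type values from the btrfs on-disk format.
BTRFS_INODE_ITEM_KEY = 1
BTRFS_XATTR_ITEM_KEY = 24
BTRFS_EXTENT_DATA_KEY = 108
U64_MAX = 2**64 - 1

def in_range(key, min_key, max_key):
    """Compound-key range check: lexicographic on (objectid, type, offset)."""
    return min_key <= key <= max_key

# Invented sample of items as they might appear in a subvolume tree.
items = [
    (257, BTRFS_INODE_ITEM_KEY, 0),
    (257, BTRFS_XATTR_ITEM_KEY, 12345),
    (257, BTRFS_EXTENT_DATA_KEY, 0),
    (257, BTRFS_EXTENT_DATA_KEY, 131072),
    (258, BTRFS_INODE_ITEM_KEY, 0),
    (258, BTRFS_EXTENT_DATA_KEY, 0),
]

# Only the type pinned, objectid and offset left wide open: nothing is filtered.
loose = [k for k in items
         if in_range(k, (0, BTRFS_EXTENT_DATA_KEY, 0),
                     (U64_MAX, BTRFS_EXTENT_DATA_KEY, U64_MAX))]

# Objectid and type both pinned, as in the min/max keys given above.
ino = 257
tight = [k for k in items
         if in_range(k, (ino, BTRFS_EXTENT_DATA_KEY, 0),
                     (ino, BTRFS_EXTENT_DATA_KEY, U64_MAX))]

print(len(loose), "items match the loose range")
print(tight)
```

With the objectid left open, the type field only matters at the two ends of
the range; with the objectid pinned to a single inode, only that inode's
EXTENT_DATA items fall inside it.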


Thanks,
Qu



>> And to improve readability, using the functions defined by
>> BTRFS_SETGET_STACK_FUNCS() would be a big improvement for reviewers.
>> (So I can check whether the magic numbers are right, since I'm a lazy
>> bone and don't want to calculate the offsets manually.)
>>
>>>>> Packaged to AUR:
>>>>> https://aur.archlinux.org/packages/compsize-git/
>>
>> Nice, I don't even need to build it myself!
>> (Well, not many dependencies anyway.)
>>
>>>> Cool!  I'd wait until people say the code is sane (I don't really
>>>> know these ioctls) but if you want to make poor AUR folks our beta
>>>> testers, that's ok.
>>
>> The code is sane!
>> And it even considers inline extents! (Which I didn't consider, BTW, as
>> inline extents count as metadata, not data, so my first thought was to
>> just ignore them.)
>>
>>> This is just too handy =)
>>>
>>>> However, one issue: I did not set a license; your packaging says GPL3.
>>>> It would be better to have something compatible with btrfs-progs
>>>> which are GPL2-only.  What about GPL2-or-higher?
>>>
>>> Sorry about the license, just a copy-paste error, fixed.
>>>
>>>> After adding some related info (like wasted space in pinned extents,
>>>> reuse of extents), it'd be nice to have this tool inside btrfs-progs,
>>>> either as a part of "fi du" or another command.
>>>
>>> That will be useful =)
>>
>> If improved, I think there is a chance to get it into btrfs-progs.
>>
>> Thanks,
>> Qu
>>
>>> P.S.
>>> Your code works amazingly fast on my SSD and data %)
>>> 150Gb data
>>> -O0 2.12s
>>> -O2 0.51s



Re: \o/ compsize

2017-09-05 Thread Hans van Kranenburg
On 09/05/2017 04:02 PM, Qu Wenruo wrote:
> 
> 
> On 2017-09-05 03:52, Timofey Titovets wrote:
>> 2017-09-04 21:42 GMT+03:00 Adam Borowski :
>>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>>>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>>>>> Here's a utility to measure used compression type + ratio on a set
>>>>> of files or directories: https://github.com/kilobyte/compsize
>>>>>
>>>>> It should be of great help for users, and also if you:
>>>>> * muck with compression levels
>>>>> * add new compression types
>>>>> * add heuristics that could err on withholding compression too much
> 
> Did a brief review, and the result looks quite good.
> In particular, the same disk bytenr is handled well, so file extents
> referring to different parts of one large extent won't get counted twice.
> 
> Nice job.
> 
> But some smaller improvements can still be made:
> (Please keep in mind I may be totally wrong, since I'm not doing a
> comprehensive review.)
> 
> Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
> which should filter out unrelated results.

No, it does not.

https://patchwork.kernel.org/patch/9767619/

> And to improve readability, using the functions defined by
> BTRFS_SETGET_STACK_FUNCS() would be a big improvement for reviewers.
> (So I can check whether the magic numbers are right, since I'm a lazy
> bone and don't want to calculate the offsets manually.)
> 
>>>> Packaged to AUR:
>>>> https://aur.archlinux.org/packages/compsize-git/
> 
> Nice, I don't even need to build it myself!
> (Well, not many dependencies anyway.)
> 
>>>
>>> Cool!  I'd wait until people say the code is sane (I don't really
>>> know these ioctls) but if you want to make poor AUR folks our beta
>>> testers, that's ok.
> 
> The code is sane!
> And it even considers inline extents! (Which I didn't consider, BTW, as
> inline extents count as metadata, not data, so my first thought was to
> just ignore them.)
> 
>>
>> This is just too handy =)
>>
>>> However, one issue: I did not set a license; your packaging says GPL3.
>>> It would be better to have something compatible with btrfs-progs
>>> which are GPL2-only.  What about GPL2-or-higher?
>>
>> Sorry about the license, just a copy-paste error, fixed.
>>
>>> After adding some related info (like wasted space in pinned extents,
>>> reuse of extents), it'd be nice to have this tool inside btrfs-progs,
>>> either as a part of "fi du" or another command.
>>
>> That will be useful =)
> 
> If improved, I think there is a chance to get it into btrfs-progs.
> 
> Thanks,
> Qu
> 
>>
>> P.S.
>> Your code works amazingly fast on my SSD and data %)
>> 150Gb data
>> -O0 2.12s
>> -O2 0.51s
>>


-- 
Hans van Kranenburg


Re: \o/ compsize

2017-09-05 Thread Qu Wenruo



On 2017-09-05 03:52, Timofey Titovets wrote:
> 2017-09-04 21:42 GMT+03:00 Adam Borowski :
>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>>>> Here's a utility to measure used compression type + ratio on a set of
>>>> files or directories: https://github.com/kilobyte/compsize
>>>>
>>>> It should be of great help for users, and also if you:
>>>> * muck with compression levels
>>>> * add new compression types
>>>> * add heuristics that could err on withholding compression too much

Did a brief review, and the result looks quite good.
In particular, the same disk bytenr is handled well, so file extents
referring to different parts of one large extent won't get counted twice.
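
To illustrate the point about the disk bytenr, here is a simplified sketch of
counting each on-disk extent once. It is not taken from compsize.c: the
`FileExtent` record and the `tally` function are made up, with field names
borrowed from the btrfs file extent item.

```python
from collections import defaultdict, namedtuple

# Hypothetical, simplified view of one file extent item.
FileExtent = namedtuple(
    "FileExtent", "disk_bytenr disk_num_bytes ram_bytes compression")

def tally(file_extents):
    """Sum (on-disk, uncompressed) bytes per compression type.

    Several file extents may point into the same on-disk extent; the
    on-disk extent is counted only the first time its bytenr is seen.
    """
    seen = set()
    totals = defaultdict(lambda: [0, 0])
    for fe in file_extents:
        if fe.disk_bytenr in seen:
            continue  # already counted this on-disk extent
        seen.add(fe.disk_bytenr)
        totals[fe.compression][0] += fe.disk_num_bytes
        totals[fe.compression][1] += fe.ram_bytes
    return {name: tuple(sizes) for name, sizes in totals.items()}

extents = [
    # Two file extents referring to different parts of one large extent.
    FileExtent(1048576, 40960, 131072, "zlib"),
    FileExtent(1048576, 40960, 131072, "zlib"),
    FileExtent(2097152, 4096, 4096, "none"),
]
print(tally(extents))
```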


Nice job.

But some smaller improvements can still be made:
(Please keep in mind I may be totally wrong, since I'm not doing a
comprehensive review.)

Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
which should filter out unrelated results.

And to improve readability, using the functions defined by
BTRFS_SETGET_STACK_FUNCS() would be a big improvement for reviewers.
(So I can check whether the magic numbers are right, since I'm a lazy
bone and don't want to calculate the offsets manually.)

>>> Packaged to AUR:
>>> https://aur.archlinux.org/packages/compsize-git/

Nice, I don't even need to build it myself!
(Well, not many dependencies anyway.)

>> Cool!  I'd wait until people say the code is sane (I don't really know these
>> ioctls) but if you want to make poor AUR folks our beta testers, that's ok.

The code is sane!
And it even considers inline extents! (Which I didn't consider, BTW, as
inline extents count as metadata, not data, so my first thought was to
just ignore them.)

> This is just too handy =)
>
>> However, one issue: I did not set a license; your packaging says GPL3.
>> It would be better to have something compatible with btrfs-progs which are
>> GPL2-only.  What about GPL2-or-higher?
>
> Sorry about the license, just a copy-paste error, fixed.
>
>> After adding some related info (like wasted space in pinned extents, reuse
>> of extents), it'd be nice to have this tool inside btrfs-progs, either as a
>> part of "fi du" or another command.
>
> That will be useful =)

If improved, I think there is a chance to get it into btrfs-progs.

Thanks,
Qu

> P.S.
> Your code works amazingly fast on my SSD and data %)
> 150Gb data
> -O0 2.12s
> -O2 0.51s




Re: \o/ compsize

2017-09-04 Thread A L

On 9/4/2017 5:11 PM, Adam Borowski wrote:
> Hi!
> Here's a utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize


Great tool. Just tried it on some of my backup snapshots.

   # compsize portage.20170904T2200
   142432 files.
   all   78%  329M/ 422M
   none 100%  227M/ 227M
   zlib  52%  102M/ 195M

   # du -sh  portage.20170904T2200
   787M    portage.20170904T2200

   # btrfs fi du -s  portage.20170904T2200
    Total   Exclusive  Set shared  Filename
    271.61MiB 6.34MiB   245.51MiB portage.20170904T2200

Interesting results. How do I interpret them?


Compsize also doesn't seem to like some non-standard files and throws an 
error (even though they should be ignored?):


# compsize usb-backup/volumes/root/root.20170727T2321/
open("usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350"): 
No such device or address


# dir 
usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350
srwx-- 1 root root 0 Dec 31  2015 
usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350=





Re: \o/ compsize

2017-09-04 Thread Timofey Titovets
2017-09-04 21:42 GMT+03:00 Adam Borowski :
> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>> > Here's a utility to measure used compression type + ratio on a set of
>> > files or directories: https://github.com/kilobyte/compsize
>> >
>> > It should be of great help for users, and also if you:
>> > * muck with compression levels
>> > * add new compression types
>> > * add heuristics that could err on withholding compression too much
>>
>> Packaged to AUR:
>> https://aur.archlinux.org/packages/compsize-git/
>
> Cool!  I'd wait until people say the code is sane (I don't really know these
> ioctls) but if you want to make poor AUR folks our beta testers, that's ok.

This is just too handy =)

> However, one issue: I did not set a license; your packaging says GPL3.
> It would be better to have something compatible with btrfs-progs which are
> GPL2-only.  What about GPL2-or-higher?

Sorry about the license, just a copy-paste error, fixed.

> After adding some related info (like wasted space in pinned extents, reuse
> of extents), it'd be nice to have this tool inside btrfs-progs, either as a
> part of "fi du" or another command.

That will be useful =)

P.S.
Your code works amazingly fast on my SSD and data %)
150Gb data
-O0 2.12s
-O2 0.51s

-- 
Have a nice day,
Timofey.


Re: \o/ compsize

2017-09-04 Thread Adam Borowski
On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
> > Here's a utility to measure used compression type + ratio on a set of files
> > or directories: https://github.com/kilobyte/compsize
> >
> > It should be of great help for users, and also if you:
> > * muck with compression levels
> > * add new compression types
> > * add heuristics that could err on withholding compression too much
> 
> Packaged to AUR:
> https://aur.archlinux.org/packages/compsize-git/

Cool!  I'd wait until people say the code is sane (I don't really know these
ioctls) but if you want to make poor AUR folks our beta testers, that's ok.

However, one issue: I did not set a license; your packaging says GPL3. 
It would be better to have something compatible with btrfs-progs which are
GPL2-only.  What about GPL2-or-higher?

After adding some related info (like wasted space in pinned extents, reuse
of extents), it'd be nice to have this tool inside btrfs-progs, either as a
part of "fi du" or another command.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄ 


Re: \o/ compsize

2017-09-04 Thread Timofey Titovets
2017-09-04 18:11 GMT+03:00 Adam Borowski :
> Hi!
> Here's a utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize
>
> It should be of great help for users, and also if you:
> * muck with compression levels
> * add new compression types
> * add heuristics that could err on withholding compression too much
>
> (Thanks to Knorrie and his python-btrfs project that made figuring out the
> ioctls much easier.)
>
> Meow!
> --
> ⢀⣴⠾⠻⢶⣦⠀
> ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
> ⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
> ⠈⠳⣄

Packaged to AUR:
https://aur.archlinux.org/packages/compsize-git/

-- 
Have a nice day,
Timofey.


\o/ compsize

2017-09-04 Thread Adam Borowski
Hi!
Here's a utility to measure used compression type + ratio on a set of files
or directories: https://github.com/kilobyte/compsize

It should be of great help for users, and also if you:
* muck with compression levels
* add new compression types
* add heuristics that could err on withholding compression too much

(Thanks to Knorrie and his python-btrfs project that made figuring out the
ioctls much easier.)

Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄ 