Re: [RESEND PATCH v3] crypto: add zBeWalgo compression for zram

2018-03-11 Thread Eric Biggers
Hi Benjamin,

On Wed, Mar 07, 2018 at 12:50:08PM +0100, Benjamin Warnke wrote:
> Hi Eric,
> 
> 
> On 06.03.2018 at 23:13, Eric Biggers wrote:
> > 
> > Hi Benjamin,
> > 
> > On Tue, Mar 06, 2018 at 09:23:08PM +0100, Benjamin Warnke wrote:
> >> Currently ZRAM uses compression-algorithms from the crypto-api. ZRAM
> >> compresses each page individually. As a result the compression algorithm
> >> is forced to use a very small sliding window. None of the available
> >> compression algorithms is designed to achieve high compression ratios
> >> with small inputs.
> >> 
> >> This patch adds a new compression algorithm 'zBeWalgo' to the crypto api.
> >> This algorithm focuses on increasing the capacity of the compressed
> >> block-device created by ZRAM. The choice of compression algorithms is
> >> always a tradeoff between speed and compression ratio.
> >> 
> > [...]
> >> 
> >> Zstd is not in the list of compared algorithms, because it is not available
> >> for Zram in the currently available kernel trees.
> >> 
> > 
> > This still isn't a valid excuse for not comparing it to Zstandard.
> > Zstandard is already in the kernel in lib/.  It would only need glue code
> > to wire it up as an scomp_alg to the crypto API.
> 
> I'll add benchmarks with Zstandard.
> 
> > In contrast you're proposing an entirely new algorithm.  Ultimately, by
> > proposing a new algorithm you're on the hook to demonstrate that existing
> > algorithms aren't good enough.  Note that many of the existing algorithms
> > including Zstandard also have multiple compression levels available to
> > choose from.
> 
> The crypto api exposes each algorithm only once with its default
> compression level.
> Currently there is no switch/option/mechanism/code or something to switch
> those levels.
> Of course I could benchmark the algorithms with multiple compression levels.
> Which compression levels, and how many, would you like to see compared with
> each other?
> 

However many are needed to demonstrate that the new algorithm is needed.  E.g.
if your primary argument is that the new algorithm provides a better compression
ratio, then you'd need to show that simply choosing a higher compression level
with Zstandard or DEFLATE isn't good enough.
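
For illustration, a quick userspace sweep over a few zstd levels on 4 KiB
blocks is enough to get those numbers.  The sketch below uses the userspace
libzstd API (zstd.h) rather than the kernel's lib/zstd, and the 4096-byte
block size and the particular levels are just assumptions:

/* zstd_levels.c -- compress a file in 4 KiB blocks at several zstd levels
 * and print the resulting ratio.  Build with: cc zstd_levels.c -lzstd */
#include <stdio.h>
#include <string.h>
#include <zstd.h>

#define BLOCK 4096	/* assumption: zram-like page granularity */

int main(int argc, char **argv)
{
	static const int levels[] = { 1, 3, 9, 19 };
	unsigned char in[BLOCK], out[2 * BLOCK];
	size_t i, n;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	for (i = 0; i < sizeof(levels) / sizeof(levels[0]); i++) {
		unsigned long long in_bytes = 0, out_bytes = 0;
		FILE *f = fopen(argv[1], "rb");

		if (!f)
			return 1;
		while ((n = fread(in, 1, BLOCK, f)) == BLOCK) {
			size_t c = ZSTD_compress(out, sizeof(out), in, BLOCK,
						 levels[i]);
			if (ZSTD_isError(c))
				return 1;
			in_bytes += BLOCK;
			/* count blocks that don't shrink as stored raw,
			 * the same way zram would */
			out_bytes += c < BLOCK ? c : BLOCK;
		}
		fclose(f);
		if (out_bytes)
			printf("level %2d: ratio %.2f\n", levels[i],
			       (double)in_bytes / out_bytes);
	}
	return 0;
}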

> > It's also rare to add a compression or encryption algorithm to the Linux
> > kernel that isn't already used somewhere else.
> 
> The kernel is the only place where so many small pieces of data need to be
> compressed.
> In user space nobody cares that the data is split into pages, and therefore
> compression is usually applied to large datasets.
> 

Not really true.  There are other situations where people need near-random
access to data so not too much can be compressed at once.  Also there are
applications where compression needs to be applied at the level of individual
packets or messages because e.g. the messages need to be sent over the network
right away to meet a time constraint.

> >  Have you at least written a formal
> > specification of the 'zBeWalgo' data format?
> 
> Not yet.
> 
> >  Zstandard, for example, had a well-tested and optimized userspace
> > library and a specification of the data format before it was added to the
> > kernel.  And even before that it took a couple years of development to
> > stabilize the Zstandard data format.
> > 
> > Now, it's true that you don't strictly need a stable data format if the
> > algorithm will only ever be used for RAM compression.  But, if you want
> > to take that shortcut, then you're on the hook to justify it, and perhaps
> > to enlighten the crypto API about algorithms that don't have a stable
> > data format (e.g. using a new CRYPTO_ALG_* flag) so that users have to
> > more explicitly opt-in to using them.
> 
> I could add CRYPTO_ALG_TYPE_COMPRESS_UNSTABLE as an alias for 
> CRYPTO_ALG_TYPE_COMPRESS
> 

A real flag would probably make more sense than an algorithm type.  Note that it
could also be meaningful for other algorithm types besides compression, e.g.
encryption algorithms.
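
Roughly along these lines -- note that the flag name and bit below are
invented for illustration only, and the zbewalgo_* callbacks are placeholders;
nothing like this exists in the crypto API today:

/* Hypothetical sketch: an "unstable data format" algorithm flag. */
#define CRYPTO_ALG_UNSTABLE_FORMAT	0x00010000

static struct scomp_alg zbewalgo_scomp = {
	/* alloc_ctx/free_ctx omitted; zbewalgo_scompress and
	 * zbewalgo_sdecompress are placeholder names */
	.compress	= zbewalgo_scompress,
	.decompress	= zbewalgo_sdecompress,
	.base		= {
		.cra_name	= "zbewalgo",
		.cra_flags	= CRYPTO_ALG_UNSTABLE_FORMAT,
		.cra_module	= THIS_MODULE,
	},
};

/* With the usual type/mask matching, a user that passes
 * CRYPTO_ALG_UNSTABLE_FORMAT in the mask (and not in the type) would never
 * be handed such an algorithm, so zram would have to opt in explicitly. */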

> > You cannot just ignore fuzz-safety in the decompressor either.  The
> > existing decompressors exposed through the crypto API are fuzz-safe, so
> > it's not valid to compare an unsafe decompressor to them.  If you really
> > do want to add an unsafe decompressor then you'd likely need to look into
> > adding crypto API support for users to request unsafe decompression --
> > which could also be used to expose the unsafe version of the LZ4
> > decompressor (LZ4_decompress_fast() instead of LZ4_decompress_safe()).
> > Or if you do make the decompressor safe, then of course you'd actually
> > need to fuzz it; I suggest porting the code to userspace and running
> > american fuzzy lop (afl-fuzz) on it: http://lcamtuf.coredump.cx/afl/
> 
> The current version of the decompressor is fuzz-safe. I tested it with lots
> of data and made sure that each possible branch in 

Re: [RESEND PATCH v3] crypto: add zBeWalgo compression for zram

2018-03-07 Thread Benjamin Warnke
Hi Eric,


On 06.03.2018 at 23:13, Eric Biggers wrote:
> 
> Hi Benjamin,
> 
> On Tue, Mar 06, 2018 at 09:23:08PM +0100, Benjamin Warnke wrote:
>> Currently ZRAM uses compression-algorithms from the crypto-api. ZRAM
>> compresses each page individually. As a result the compression algorithm is
>> forced to use a very small sliding window. None of the available compression
>> algorithms is designed to achieve high compression ratios with small inputs.
>> 
>> This patch adds a new compression algorithm 'zBeWalgo' to the crypto api.
>> This algorithm focuses on increasing the capacity of the compressed
>> block-device created by ZRAM. The choice of compression algorithms is
>> always a tradeoff between speed and compression ratio.
>> 
> [...]
>> 
>> Zstd is not in the list of compared algorithms, because it is not available
>> for Zram in the currently available kernel trees.
>> 
> 
> This still isn't a valid excuse for not comparing it to Zstandard.
> Zstandard is already in the kernel in lib/.  It would only need glue code
> to wire it up as an scomp_alg to the crypto API.

I'll add benchmarks with Zstandard.

> In contrast you're proposing an entirely new algorithm.  Ultimately, by
> proposing a new algorithm you're on the hook to demonstrate that existing
> algorithms aren't good enough.  Note that many of the existing algorithms
> including Zstandard also have multiple compression levels available to
> choose from.

The crypto api exposes each algorithm only once with its default compression
level.
Currently there is no switch/option/mechanism/code or something to switch
those levels.
Of course I could benchmark the algorithms with multiple compression levels.
Which compression levels, and how many, would you like to see compared with
each other?

> It's also rare to add a compression or encryption algorithm to the Linux
> kernel that isn't already used somewhere else.

The kernel is the only place where so many small pieces of data need to be
compressed.
In user space nobody cares that the data is split into pages, and therefore
compression is usually applied to large datasets.

>  Have you at least written a formal
> specification of the 'zBeWalgo' data format?

Not yet.

>  Zstandard, for example, had a well-tested and optimized userspace library
> and a specification of the data format before it was added to the kernel.
> And even before that it took a couple years of development to stabilize the
> Zstandard data format.
> 
> Now, it's true that you don't strictly need a stable data format if the
> algorithm will only ever be used for RAM compression.  But, if you want to
> take that shortcut, then you're on the hook to justify it, and perhaps to
> enlighten the crypto API about algorithms that don't have a stable data
> format (e.g. using a new CRYPTO_ALG_* flag) so that users have to more
> explicitly opt-in to using them.

I could add CRYPTO_ALG_TYPE_COMPRESS_UNSTABLE as an alias for 
CRYPTO_ALG_TYPE_COMPRESS

> You cannot just ignore fuzz-safety in the decompressor either.  The
> existing decompressors exposed through the crypto API are fuzz-safe, so
> it's not valid to compare an unsafe decompressor to them.  If you really do
> want to add an unsafe decompressor then you'd likely need to look into
> adding crypto API support for users to request unsafe decompression --
> which could also be used to expose the unsafe version of the LZ4
> decompressor (LZ4_decompress_fast() instead of LZ4_decompress_safe()).
> Or if you do make the decompressor safe, then of course you'd actually need
> to fuzz it; I suggest porting the code to userspace and running american
> fuzzy lop (afl-fuzz) on it: http://lcamtuf.coredump.cx/afl/

The current version of the decompressor is fuzz-safe. I tested it with lots of
data and made sure that each possible branch in the code is executed multiple
times in different combinations.

> Separately, given that this is a brand new algorithm and it seems to have
> various corner cases, it would be a good idea to demonstrate a higher
> guarantee of the correctness of the compression-decompression round trip.
> afl-fuzz can be good for that too: you could port the code to userspace and
> write a simple program which compresses and decompresses a file and aborts
> if it was corrupted.
> Then by passing the program to afl-fuzz it will eventually cover most of the
> compressor and decompressor.  I've done that when testing compression code
> in the past.

I will port my code to user space and use afl-fuzz to test it.

Benjamin



Re: [RESEND PATCH v3] crypto: add zBeWalgo compression for zram

2018-03-07 Thread Benjamin Warnke
Hello,

On (07/03/2018 03:12), Sergey Senozhatsky wrote:
> 
> Hello,
> 
> On (03/06/18 20:59), Benjamin Warnke wrote:
>>   Currently ZRAM uses compression-algorithms from the crypto-api. ZRAM
>>   compresses each page individually. As a result the compression algorithm
>>   is forced to use a very small sliding window. None of the available
>>   compression algorithms is designed to achieve high compression ratios
>>   with small inputs.
> 
> I think you first need to merge zBeWalgo (looks like a long way to go)
> And then add ZRAM support, as a separate patch.

I'll split my patch into 2 parts

1st: add zBeWalgo compression algorithm
2nd: enable zBeWalgo to be used by ZRAM

> 
>>   - 'ecoham' (100 MiB) This dataset is one of the input files for the
>>   scientific application ECOHAM which runs an ocean simulation. This
>>   dataset contains a lot of zeros. Where the data is not zero there are
>>   arrays of floating point values, adjacent float values are likely to be
>>   similar to each other, allowing for high compression ratios.
>> 
>>   algorithm | ratio   | compression| decompression
>>   zbewalgo  |   12.94 |  294.10 MBit/s | 1242.59 MBit/s
>>   deflate   |   12.54 |   75.51 MBit/s |  736.39 MBit/s
>>   842   |   12.26 |  182.59 MBit/s |  683.61 MBit/s
>>   lz4hc |   12.00 |   51.23 MBit/s | 1524.73 MBit/s
>>   lz4   |   10.68 | 1334.37 MBit/s | 1603.54 MBit/s
>>   lzo   |9.79 | 1333.76 MBit/s | 1534.63 MBit/s
>> 
>>   - 'source-code' (800 MiB) This dataset is a tarball of the source-code
>>   from a linux-kernel.
>> 
>>   algorithm | ratio   | compression| decompression
>>   deflate   |3.27 |   42.48 MBit/s |  250.36 MBit/s
>>   lz4hc |2.40 |  104.14 MBit/s | 1150.53 MBit/s
>>   lzo   |2.27 |  444.77 MBit/s |  886.97 MBit/s
>>   lz4   |2.18 |  453.08 MBit/s | 1101.45 MBit/s
>>   842   |1.65 |   64.10 MBit/s |  158.40 MBit/s
>>   zbewalgo  |1.19 |   52.89 MBit/s |  197.58 MBit/s
>> 
>>   - 'hpcg' (8 GiB) This dataset is a (partial) memory-snapshot of the
>>   running hpcg-benchmark. At the time of the snapshot, that application
>>   performed a sparse matrix - vector multiplication.
>> 
>>   algorithm | ratio   | compression| decompression
>>   zbewalgo  |   16.16 |  179.97 MBit/s |  468.36 MBit/s
>>   deflate   |9.52 |   65.11 MBit/s |  632.69 MBit/s
>>   lz4hc |4.96 |  193.33 MBit/s | 1607.12 MBit/s
>>   842   |4.20 |  150.99 MBit/s |  316.22 MBit/s
>>   lzo   |4.14 |  922.74 MBit/s |  865.32 MBit/s
>>   lz4   |3.79 |  908.39 MBit/s | 1375.33 MBit/s
>> 
>>   - 'partdiff' (8 GiB) Array of double values. Adjacent doubles are
>>   similar, but not equal. This array is produced by a partial differential
>>   equation solver using a Jakobi-implementation.
>> 
>>   algorithm | ratio   | compression| decompression
>>   zbewalgo  |1.30 |  203.30 MBit/s |  530.87 MBit/s
>>   deflate   |1.02 |   37.06 MBit/s | 1131.88 MBit/s
>>   lzo   |1.00 | 1741.46 MBit/s | 2012.78 MBit/s
>>   lz4   |1.00 | 1458.08 MBit/s | 2013.88 MBit/s
>>   lz4hc |1.00 |  173.19 MBit/s | 2012.37 MBit/s
>>   842   |1.00 |   64.10 MBit/s | 2013.64 MBit/s
> 
> Hm, mixed feelings.

As Eric Biggers suggested, I'll add Zstandard to the set of algorithms being
compared. What else should I add to the benchmarks?

Benjamin



Re: [RESEND PATCH v3] crypto: add zBeWalgo compression for zram

2018-03-06 Thread Sergey Senozhatsky
Hello,

On (03/06/18 20:59), Benjamin Warnke wrote:
> Currently ZRAM uses compression-algorithms from the crypto-api. ZRAM
> compresses each page individually. As a result the compression algorithm is
> forced to use a very small sliding window. None of the available compression
> algorithms is designed to achieve high compression ratios with small inputs.

I think you first need to merge zBeWalgo (looks like a long way to go)
And then add ZRAM support, as a separate patch.

> - 'ecoham' (100 MiB) This dataset is one of the input files for the
> scientific application ECOHAM which runs an ocean simulation. This dataset
> contains a lot of zeros. Where the data is not zero there are arrays of
> floating point values, adjacent float values are likely to be similar to
> each other, allowing for high compression ratios.
> 
> algorithm | ratio   | compression    | decompression
> zbewalgo  |   12.94 |  294.10 MBit/s | 1242.59 MBit/s
> deflate   |   12.54 |   75.51 MBit/s |  736.39 MBit/s
> 842       |   12.26 |  182.59 MBit/s |  683.61 MBit/s
> lz4hc     |   12.00 |   51.23 MBit/s | 1524.73 MBit/s
> lz4       |   10.68 | 1334.37 MBit/s | 1603.54 MBit/s
> lzo       |    9.79 | 1333.76 MBit/s | 1534.63 MBit/s
> 
> - 'source-code' (800 MiB) This dataset is a tarball of the source-code from
> a linux-kernel.
> 
> algorithm | ratio   | compression    | decompression
> deflate   |    3.27 |   42.48 MBit/s |  250.36 MBit/s
> lz4hc     |    2.40 |  104.14 MBit/s | 1150.53 MBit/s
> lzo       |    2.27 |  444.77 MBit/s |  886.97 MBit/s
> lz4       |    2.18 |  453.08 MBit/s | 1101.45 MBit/s
> 842       |    1.65 |   64.10 MBit/s |  158.40 MBit/s
> zbewalgo  |    1.19 |   52.89 MBit/s |  197.58 MBit/s
> 
> - 'hpcg' (8 GiB) This dataset is a (partial) memory-snapshot of the running
> hpcg-benchmark. At the time of the snapshot, that application performed a
> sparse matrix - vector multiplication.
> 
> algorithm | ratio   | compression    | decompression
> zbewalgo  |   16.16 |  179.97 MBit/s |  468.36 MBit/s
> deflate   |    9.52 |   65.11 MBit/s |  632.69 MBit/s
> lz4hc     |    4.96 |  193.33 MBit/s | 1607.12 MBit/s
> 842       |    4.20 |  150.99 MBit/s |  316.22 MBit/s
> lzo       |    4.14 |  922.74 MBit/s |  865.32 MBit/s
> lz4       |    3.79 |  908.39 MBit/s | 1375.33 MBit/s
> 
> - 'partdiff' (8 GiB) Array of double values. Adjacent doubles are similar,
> but not equal. This array is produced by a partial differential equation
> solver using a Jakobi-implementation.
> 
> algorithm | ratio   | compression    | decompression
> zbewalgo  |    1.30 |  203.30 MBit/s |  530.87 MBit/s
> deflate   |    1.02 |   37.06 MBit/s | 1131.88 MBit/s
> lzo       |    1.00 | 1741.46 MBit/s | 2012.78 MBit/s
> lz4       |    1.00 | 1458.08 MBit/s | 2013.88 MBit/s
> lz4hc     |    1.00 |  173.19 MBit/s | 2012.37 MBit/s
> 842       |    1.00 |   64.10 MBit/s | 2013.64 MBit/s

Hm, mixed feelings.

-ss


Re: [RESEND PATCH v3] crypto: add zBeWalgo compression for zram

2018-03-06 Thread Eric Biggers
Hi Benjamin,

On Tue, Mar 06, 2018 at 09:23:08PM +0100, Benjamin Warnke wrote:
> Currently ZRAM uses compression-algorithms from the crypto-api. ZRAM
> compresses each page individually. As a result the compression algorithm is
> forced to use a very small sliding window. None of the available compression
> algorithms is designed to achieve high compression ratios with small inputs.
> 
> This patch adds a new compression algorithm 'zBeWalgo' to the crypto api. This
> algorithm focuses on increasing the capacity of the compressed block-device
> created by ZRAM. The choice of compression algorithms is always a tradeoff
> between speed and compression ratio.
> 
[...]
> 
> Zstd is not in the list of compared algorithms, because it is not available
> for Zram in the currently available kernel trees.
>

This still isn't a valid excuse for not comparing it to Zstandard.  Zstandard is
already in the kernel in lib/.  It would only need glue code to wire it up as an
scomp_alg to the crypto API.  In contrast you're proposing an entirely new
algorithm.  Ultimately, by proposing a new algorithm you're on the hook to
demonstrate that existing algorithms aren't good enough.  Note that many of the
existing algorithms including Zstandard also have multiple compression levels
available to choose from.
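
For reference, the missing glue is roughly the following.  This is only a
sketch modelled on the other scomp wrappers in crypto/, using the lib/zstd
function names from linux/zstd.h as of these kernels; the level choice,
workspace sizing and error handling are simplifications, not a finished
driver:

#include <crypto/internal/scompress.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/zstd.h>

#define ZSTD_SCOMP_LEVEL 3	/* assumption: default level */

struct zstd_ctx {
	ZSTD_CCtx *cctx;
	ZSTD_DCtx *dctx;
	void *cwksp;
	void *dwksp;
	ZSTD_parameters params;
};

static void *zstd_alloc_ctx(struct crypto_scomp *tfm)
{
	struct zstd_ctx *ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
	size_t csize, dsize;

	if (!ctx)
		return ERR_PTR(-ENOMEM);
	ctx->params = ZSTD_getParams(ZSTD_SCOMP_LEVEL, PAGE_SIZE, 0);
	csize = ZSTD_CCtxWorkspaceBound(ctx->params.cParams);
	dsize = ZSTD_DCtxWorkspaceBound();
	ctx->cwksp = vzalloc(csize);
	ctx->dwksp = vzalloc(dsize);
	if (ctx->cwksp && ctx->dwksp) {
		ctx->cctx = ZSTD_initCCtx(ctx->cwksp, csize);
		ctx->dctx = ZSTD_initDCtx(ctx->dwksp, dsize);
	}
	if (!ctx->cctx || !ctx->dctx) {
		vfree(ctx->cwksp);
		vfree(ctx->dwksp);
		kfree(ctx);
		return ERR_PTR(-ENOMEM);
	}
	return ctx;
}

static void zstd_free_ctx(struct crypto_scomp *tfm, void *ctx)
{
	struct zstd_ctx *zctx = ctx;

	vfree(zctx->cwksp);
	vfree(zctx->dwksp);
	kfree(zctx);
}

static int zstd_scompress(struct crypto_scomp *tfm, const u8 *src,
			  unsigned int slen, u8 *dst, unsigned int *dlen,
			  void *ctx)
{
	struct zstd_ctx *zctx = ctx;
	size_t ret = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen,
				       zctx->params);

	if (ZSTD_isError(ret))
		return -EINVAL;
	*dlen = ret;
	return 0;
}

static int zstd_sdecompress(struct crypto_scomp *tfm, const u8 *src,
			    unsigned int slen, u8 *dst, unsigned int *dlen,
			    void *ctx)
{
	struct zstd_ctx *zctx = ctx;
	size_t ret = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen);

	if (ZSTD_isError(ret))
		return -EINVAL;
	*dlen = ret;
	return 0;
}

static struct scomp_alg zstd_scomp_alg = {
	.alloc_ctx	= zstd_alloc_ctx,
	.free_ctx	= zstd_free_ctx,
	.compress	= zstd_scompress,
	.decompress	= zstd_sdecompress,
	.base		= {
		.cra_name	= "zstd",
		.cra_module	= THIS_MODULE,
	},
};

static int __init zstd_scomp_init(void)
{
	return crypto_register_scomp(&zstd_scomp_alg);
}

static void __exit zstd_scomp_exit(void)
{
	crypto_unregister_scomp(&zstd_scomp_alg);
}

module_init(zstd_scomp_init);
module_exit(zstd_scomp_exit);
MODULE_LICENSE("GPL");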

It's also rare to add a compression or encryption algorithm to the Linux kernel
that isn't already used somewhere else.  Have you at least written a formal
specification of the 'zBeWalgo' data format?  Zstandard, for example, had a
well-tested and optimized userspace library and a specification of the data
format before it was added to the kernel.  And even before that it took a couple
years of development to stabilize the Zstandard data format.

Now, it's true that you don't strictly need a stable data format if the
algorithm will only ever be used for RAM compression.  But, if you want to take
that shortcut, then you're on the hook to justify it, and perhaps to enlighten
the crypto API about algorithms that don't have a stable data format (e.g. using
a new CRYPTO_ALG_* flag) so that users have to more explicitly opt-in to using
them.

You cannot just ignore fuzz-safety in the decompressor either.  The existing
decompressors exposed through the crypto API are fuzz-safe, so it's not valid to
compare an unsafe decompressor to them.  If you really do want to add an unsafe
decompressor then you'd likely need to look into adding crypto API support for
users to request unsafe decompression -- which could also be used to expose the
unsafe version of the LZ4 decompressor (LZ4_decompress_fast() instead of
LZ4_decompress_safe()).  Or if you do make the decompressor safe, then of course
you'd actually need to fuzz it; I suggest porting the code to userspace and
running american fuzzy lop (afl-fuzz) on it: http://lcamtuf.coredump.cx/afl/
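
To make the distinction concrete: the two LZ4 entry points differ only in
which side of the length contract gets trusted.  The toy round trip below
uses the userspace liblz4 API for brevity; the kernel's lib/lz4 exposes the
same two decompress entry points:

/* Build with: cc lz4_safety.c -llz4 */
#include <lz4.h>
#include <stdio.h>

int main(void)
{
	const char msg[] = "an example page of data, an example page of data";
	char comp[LZ4_COMPRESSBOUND(sizeof(msg))];
	char out[sizeof(msg)];
	int clen = LZ4_compress_default(msg, comp, sizeof(msg), sizeof(comp));

	/* Bounds-checked: never writes past out + sizeof(out), however the
	 * compressed buffer might have been corrupted.  This is the variant
	 * the existing crypto API wrapper uses. */
	int safe = LZ4_decompress_safe(comp, out, clen, sizeof(out));

	/* Unchecked legacy entry point: trusts that 'comp' really decodes to
	 * exactly sizeof(msg) bytes; corrupted input can read or write out
	 * of bounds. */
	int fast = LZ4_decompress_fast(comp, out, sizeof(msg));

	printf("safe=%d fast=%d\n", safe, fast);
	return 0;
}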

Separately, given that this is a brand new algorithm and it seems to have
various corner cases, it would be a good idea to demonstrate a higher guarantee
of the correctness of the compression-decompression round trip.  afl-fuzz can be
good for that too: you could port the code to userspace and write a simple
program which compresses and decompresses a file and aborts if it was corrupted.
Then by passing the program to afl-fuzz it will eventually cover most of the
compressor and decompressor.  I've done that when testing compression code in
the past.
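
A minimal harness for that could look like the sketch below.  The
zbewalgo_compress()/zbewalgo_decompress() prototypes are placeholders for
whatever entry points a userspace port of the algorithm ends up exposing, so
their names and signatures are assumptions:

/* Round-trip check, suitable as an afl-fuzz target: afl feeds arbitrary
 * input files, and abort() turns any mismatch into a recorded crash. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

/* placeholder prototypes for the userspace port */
int zbewalgo_compress(const unsigned char *src, int slen,
		      unsigned char *dst, int dlen);
int zbewalgo_decompress(const unsigned char *src, int slen,
			unsigned char *dst, int dlen);

int main(int argc, char **argv)
{
	unsigned char page[PAGE_SIZE], comp[2 * PAGE_SIZE], back[PAGE_SIZE];
	FILE *f = argc > 1 ? fopen(argv[1], "rb") : stdin;
	size_t n;
	int clen, dlen;

	if (!f)
		return 1;
	memset(page, 0, sizeof(page));
	n = fread(page, 1, sizeof(page), f);	/* one page per input file */
	(void)n;

	clen = zbewalgo_compress(page, PAGE_SIZE, comp, sizeof(comp));
	if (clen < 0)
		return 0;	/* "incompressible" is a valid outcome */

	dlen = zbewalgo_decompress(comp, clen, back, sizeof(back));
	if (dlen != PAGE_SIZE || memcmp(page, back, PAGE_SIZE) != 0)
		abort();	/* corruption: afl-fuzz records this input */
	return 0;
}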

Hope this is helpful!

Eric


[RESEND PATCH v3] crypto: add zBeWalgo compression for zram

2018-03-06 Thread Benjamin Warnke
Currently ZRAM uses compression-algorithms from the crypto-api. ZRAM
compresses each page individually. As a result the compression algorithm is
forced to use a very small sliding window. None of the available compression
algorithms is designed to achieve high compression ratios with small inputs.

This patch adds a new compression algorithm 'zBeWalgo' to the crypto api. This
algorithm focuses on increasing the capacity of the compressed block-device
created by ZRAM. The choice of compression algorithms is always a tradeoff
between speed and compression ratio.

If faster algorithms like 'lz4' are chosen the compression ratio is often
lower than the ratio of zBeWalgo as shown in the following benchmarks. Due to
the lower compression ratio, ZRAM needs to fall back to backing_devices
earlier. If backing_devices are required, the effective speed of ZRAM is a
weighted average of de/compression time and writing/reading from the
backing_device. This should be considered when comparing the speeds in the
benchmarks.

There are different kinds of backing_devices, each with its own drawbacks.
1. HDDs: This kind of backing device is very slow. If the compression ratio
of an algorithm is much lower than the ratio of zBeWalgo, it might be faster
to use zBeWalgo instead.
2. SSDs: I tested a swap partition on my NVME-SSD. The speed is even higher
than zram with lz4, but after about 5 minutes the SSD is blocking all
read/write requests due to overheating. This is definitely not an option.



zBeWalgo is a completely new algorithm - it is not currently published
anywhere else, and googling it will not show any results. The following
section describes how the algorithm works.

zBeWalgo itself is a container compression algorithm which can execute
multiple different compression and transformation algorithms after each other.
The execution of different compression algorithms after each other is called a
'combination' in this description and in the code. In addition to executing
combinations of algorithms, zBeWalgo can try different combinations on the
same input. This allows high compression ratios on completely different
datasets, each of which would otherwise require its own algorithm. Executing
all known combinations on each input page would be very slow. Therefore the
data is first compressed with the combination that was already successful on
the last input page. If the compressed data size of the current page is
similar to that of the last page, the compressed data is returned immediately
without even trying the other combinations. Even though there is no guarantee
that consecutive calls to the algorithm belong to each other, the speed
improvement is obvious.

ZRAM uses zsmalloc for the management of the compressed pages. The largest
size-class in zsmalloc is 3264 bytes. If the compressed data is larger than
that threshold, ZRAM ignores the compression and writes the uncompressed page
instead. As a consequence it is useless to continue compressing once the
algorithm detects that the data cannot be compressed using the current
combination. The threshold for aborting compression can be changed via sysfs
at any time, even if the algorithm is currently in use. If a combination fails
to compress the data, zBeWalgo tries the next combination. If no combination
is able to reduce the data in size, zBeWalgo returns a negative value.
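
In pseudo-C the control flow described above is roughly the following.  This
is an illustration only: the identifiers (do_combination(), similar(),
zbewalgo_combinations[], num_combinations) are invented for this sketch and do
not refer to the actual code in the patch:

static int zbewalgo_try_page(const unsigned char *src, unsigned char *dst,
			     int threshold)
{
	static int last_good;	/* combination that worked on the last page */
	static int last_len;	/* and the compressed size it achieved */
	int i, len;

	/* Try the last winner first; if the result is in the same ballpark
	 * as last time, accept it without trying anything else. */
	len = do_combination(&zbewalgo_combinations[last_good], src, dst,
			     threshold);
	if (len >= 0 && similar(len, last_len)) {
		last_len = len;
		return len;
	}

	for (i = 0; i < num_combinations; i++) {
		/* do_combination() gives up (returns < 0) as soon as the
		 * intermediate data grows past 'threshold', the largest
		 * useful zsmalloc size class. */
		len = do_combination(&zbewalgo_combinations[i], src, dst,
				     threshold);
		if (len < 0)
			continue;
		last_good = i;
		last_len = len;
		return len;	/* compressed size */
	}

	return -1;	/* nothing shrank the page; zram stores it raw */
}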

Each combination consists of up to 7 compression and transformation steps.
Combinations can be added and removed at any time via sysfs. Already-compressed
data can always be decompressed, even if the combination used to produce it
does not exist anymore. Technically the user could add up to 256 combinations
concurrently, but that would be very time consuming if the data cannot be
compressed.

To be able to build combinations and call different algorithms, all those
algorithms implement the same interface. This enables the user to specify
additional combinations while ZRAM is running.
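
Concretely, such an interface could look like the sketch below; the struct
and field names here are invented purely to illustrate the idea of one shared
per-step interface, with a combination being an ordered list of at most 7
such steps:

/* Illustrative only -- not the patch's actual definitions. */
struct zbewalgo_step {
	const char *name;	/* e.g. "bwt", "mtf", "jbe2", "rle" */
	/* forward transform/compression of one buffer into another;
	 * returns the output length or a negative value on failure */
	int (*apply)(const unsigned char *src, unsigned char *dst,
		     int slen, int dlen_max);
	/* exact inverse of apply(), used on the decompression path */
	int (*revert)(const unsigned char *src, unsigned char *dst,
		      int slen, int dlen_max);
};

#define ZBEWALGO_MAX_STEPS 7

struct zbewalgo_combination {
	unsigned int nr_steps;
	const struct zbewalgo_step *steps[ZBEWALGO_MAX_STEPS];
};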

Within the combinations many different algorithms can be used. Some of those
algorithms are published. This patch adds the following algorithms to be used
within the combinations:
- bwt: The Burrows-Wheeler-Transformation was published by 'M. Burrows' and
 'D. J. Wheeler' in 1994. This implementation uses counting sort for
 sorting the data. Their original paper is available online at:
 http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.pdf
- mtf: The Move-To-Front algorithm as described by 'M. Burrows' and
 'D. J. Wheeler' in the same paper as bwt.
- jbe: j-bit-encoding as proposed by 'I Made Agus Dwi Suarjaya' in 2012.
 https://arxiv.org/pdf/1209.1045.pdf
- jbe2: A minor modification of jbe. Swapping groups of 4 bits in consecutive
 bytes can increase the compression ratio, if for example the first
 4 bits of each byte are zero. If jbe2 is called after mtf, this
 happens often.
- rle: Run Length Encoding
- huffman: Huffman encoding
- bewalgo: I 
