Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-08-18 Thread David Sterba
On Wed, Jul 26, 2017 at 12:19:29AM +0100, Giovanni Cabiddu wrote:
> Hi Nick,
> 
> On Thu, Jul 20, 2017 at 10:27:42PM +0100, Nick Terrell wrote:
> > Add zstd compression and decompression support to BtrFS. zstd at its
> > fastest level compresses almost as well as zlib, while offering much
> > faster compression and decompression, approaching lzo speeds.
> Can we look at integrating the zstd implementation below the acomp API
> available in the crypto subsystem?
> (https://github.com/torvalds/linux/blob/master/crypto/acompress.c)
> Acomp was designed to provide a generic and uniform API for compression
> in the kernel which hides algorithm specific details to frameworks.
> In future it would be nice to see btrfs using exclusively acomp
> for compression. This way when a new compression algorithm is exposed
> through acomp, it will be available immediately in btrfs.

The compression algorithm is considered part of the filesystem format
and must be covered by an incompatibility bit. Regarding the acomp API,
I don't see much point adding the extra abstraction layer for the two
currently supported algorithms. We only call 2 functions, for
compression and decompression, no need to allocate the acomp helper
structures and call the functions indirectly.


Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-08-18 Thread David Sterba
On Wed, Jul 26, 2017 at 12:19:29AM +0100, Giovanni Cabiddu wrote:
> Hi Nick,
> 
> On Thu, Jul 20, 2017 at 10:27:42PM +0100, Nick Terrell wrote:
> > Add zstd compression and decompression support to BtrFS. zstd at its
> > fastest level compresses almost as well as zlib, while offering much
> > faster compression and decompression, approaching lzo speeds.
> Can we look at integrating the zstd implementation below the acomp API
> available in the crypto subsystem?
> (https://github.com/torvalds/linux/blob/master/crypto/acompress.c)
> Acomp was designed to provide a generic and uniform API for compression
> in the kernel which hides algorithm specific details to frameworks.
> In future it would be nice to see btrfs using exclusively acomp
> for compression. This way when a new compression algorithm is exposed
> through acomp, it will be available immediately in btrfs.

The compression algorithm is considered part of the filesystem format
and must be covered by an incompatibility bit. Regarding the acomp API,
I don't see much point adding the extra abstraction layer for the two
currently supported algorithms. We only call 2 functions, for
compression and decompression, no need to allocate the acomp helper
structures and call the functions indirectly.


Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-07-25 Thread Giovanni Cabiddu
Hi Nick,

On Thu, Jul 20, 2017 at 10:27:42PM +0100, Nick Terrell wrote:
> Add zstd compression and decompression support to BtrFS. zstd at its
> fastest level compresses almost as well as zlib, while offering much
> faster compression and decompression, approaching lzo speeds.
Can we look at integrating the zstd implementation below the acomp API
available in the crypto subsystem?
(https://github.com/torvalds/linux/blob/master/crypto/acompress.c)
Acomp was designed to provide a generic and uniform API for compression
in the kernel which hides algorithm specific details to frameworks.
In future it would be nice to see btrfs using exclusively acomp
for compression. This way when a new compression algorithm is exposed
through acomp, it will be available immediately in btrfs.
Furthermore, any framework in the kernel that will use acomp will be
automatically enabled to use zstd.
What do you think?

Here is a prototype that shows how btrfs can be integrated with
acomp: https://patchwork.kernel.org/patch/9201741/

Regards,

-- 
Giovanni


Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-07-25 Thread Giovanni Cabiddu
Hi Nick,

On Thu, Jul 20, 2017 at 10:27:42PM +0100, Nick Terrell wrote:
> Add zstd compression and decompression support to BtrFS. zstd at its
> fastest level compresses almost as well as zlib, while offering much
> faster compression and decompression, approaching lzo speeds.
Can we look at integrating the zstd implementation below the acomp API
available in the crypto subsystem?
(https://github.com/torvalds/linux/blob/master/crypto/acompress.c)
Acomp was designed to provide a generic and uniform API for compression
in the kernel which hides algorithm specific details to frameworks.
In future it would be nice to see btrfs using exclusively acomp
for compression. This way when a new compression algorithm is exposed
through acomp, it will be available immediately in btrfs.
Furthermore, any framework in the kernel that will use acomp will be
automatically enabled to use zstd.
What do you think?

Here is a prototype that shows how btrfs can be integrated with
acomp: https://patchwork.kernel.org/patch/9201741/

Regards,

-- 
Giovanni


Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-07-24 Thread kbuild test robot
Hi Nick,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc2 next-20170724]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Nick-Terrell/Add-xxhash-and-zstd-modules/20170723-092845
config: x86_64-randconfig-a0-07242221 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   lib/zstd/decompress.c: In function 'ZSTD_decodeSequence':
>> lib/zstd/decompress.c:1001: warning: 'seq.match' is used uninitialized in 
>> this function

vim +1001 lib/zstd/decompress.c

dc22844b Nick Terrell 2017-07-20   930  
dc22844b Nick Terrell 2017-07-20   931  static seq_t 
ZSTD_decodeSequence(seqState_t *seqState)
dc22844b Nick Terrell 2017-07-20   932  {
dc22844b Nick Terrell 2017-07-20   933  seq_t seq;
dc22844b Nick Terrell 2017-07-20   934  
dc22844b Nick Terrell 2017-07-20   935  U32 const llCode = 
FSE_peekSymbol(>stateLL);
dc22844b Nick Terrell 2017-07-20   936  U32 const mlCode = 
FSE_peekSymbol(>stateML);
dc22844b Nick Terrell 2017-07-20   937  U32 const ofCode = 
FSE_peekSymbol(>stateOffb); /* <= maxOff, by table construction */
dc22844b Nick Terrell 2017-07-20   938  
dc22844b Nick Terrell 2017-07-20   939  U32 const llBits = 
LL_bits[llCode];
dc22844b Nick Terrell 2017-07-20   940  U32 const mlBits = 
ML_bits[mlCode];
dc22844b Nick Terrell 2017-07-20   941  U32 const ofBits = ofCode;
dc22844b Nick Terrell 2017-07-20   942  U32 const totalBits = llBits + 
mlBits + ofBits;
dc22844b Nick Terrell 2017-07-20   943  
dc22844b Nick Terrell 2017-07-20   944  static const U32 LL_base[MaxLL 
+ 1] = {0,  1,  2,  3,  4,  5,  6,  7,  8,9, 10,11,12,13,   
  14, 15, 16, 18,
dc22844b Nick Terrell 2017-07-20   945  
   20, 22, 24, 28, 32, 40, 48, 64, 0x80, 0x100, 0x200, 0x400, 0x800, 
0x1000, 0x2000, 0x4000, 0x8000, 0x1};
dc22844b Nick Terrell 2017-07-20   946  
dc22844b Nick Terrell 2017-07-20   947  static const U32 ML_base[MaxML 
+ 1] = {3,  4,  5,  6,  7,  8,  9,  10,   11,12,13,14,15, 
16, 17, 18, 19, 20,
dc22844b Nick Terrell 2017-07-20   948  
   21, 22, 23, 24, 25, 26, 27, 28,   29,30,31,32,33, 
34, 35, 37, 39, 41,
dc22844b Nick Terrell 2017-07-20   949  
   43, 47, 51, 59, 67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803, 0x1003, 
0x2003, 0x4003, 0x8003, 0x10003};
dc22844b Nick Terrell 2017-07-20   950  
dc22844b Nick Terrell 2017-07-20   951  static const U32 OF_base[MaxOff 
+ 1] = {0,   1, 1,  5,  0xD,  0x1D,  0x3D,  0x7D,   
   0xFD, 0x1FD,
dc22844b Nick Terrell 2017-07-20   952  
0x3FD,   0x7FD,0xFFD,0x1FFD,   0x3FFD,   0x7FFD,0xFFFD,
0x1FFFD,   0x3FFFD,  0x7FFFD,
dc22844b Nick Terrell 2017-07-20   953  
0xD, 0x1D, 0x3D, 0x7D, 0xFD, 0x1FD, 0x3FD, 
0x7FD, 0xFFD};
dc22844b Nick Terrell 2017-07-20   954  
dc22844b Nick Terrell 2017-07-20   955  /* sequence */
dc22844b Nick Terrell 2017-07-20   956  {
dc22844b Nick Terrell 2017-07-20   957  size_t offset;
dc22844b Nick Terrell 2017-07-20   958  if (!ofCode)
dc22844b Nick Terrell 2017-07-20   959  offset = 0;
dc22844b Nick Terrell 2017-07-20   960  else {
dc22844b Nick Terrell 2017-07-20   961  offset = 
OF_base[ofCode] + BIT_readBitsFast(>DStream, ofBits); /* <=  
(ZSTD_WINDOWLOG_MAX-1) bits */
dc22844b Nick Terrell 2017-07-20   962  if 
(ZSTD_32bits())
dc22844b Nick Terrell 2017-07-20   963  
BIT_reloadDStream(>DStream);
dc22844b Nick Terrell 2017-07-20   964  }
dc22844b Nick Terrell 2017-07-20   965  
dc22844b Nick Terrell 2017-07-20   966  if (ofCode <= 1) {
dc22844b Nick Terrell 2017-07-20   967  offset += 
(llCode == 0);
dc22844b Nick Terrell 2017-07-20   968  if (offset) {
dc22844b Nick Terrell 2017-07-20   969  size_t 
temp = (offset == 3) ? seqState->prevOffset[0] - 1 : 
seqState->prevOffset[offset];
dc22844b Nick Terrell 2017-07-20   970  temp += 
!temp; /* 0 is not valid; input is corrupted; force offset to 1 */
dc22844b Nick Terrell 2017-07-20   971  if 
(offset != 1)
dc22844b Nick Terrell 2017-07-20   972  

Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-07-24 Thread kbuild test robot
Hi Nick,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc2 next-20170724]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Nick-Terrell/Add-xxhash-and-zstd-modules/20170723-092845
config: x86_64-randconfig-a0-07242221 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   lib/zstd/decompress.c: In function 'ZSTD_decodeSequence':
>> lib/zstd/decompress.c:1001: warning: 'seq.match' is used uninitialized in 
>> this function

vim +1001 lib/zstd/decompress.c

dc22844b Nick Terrell 2017-07-20   930  
dc22844b Nick Terrell 2017-07-20   931  static seq_t 
ZSTD_decodeSequence(seqState_t *seqState)
dc22844b Nick Terrell 2017-07-20   932  {
dc22844b Nick Terrell 2017-07-20   933  seq_t seq;
dc22844b Nick Terrell 2017-07-20   934  
dc22844b Nick Terrell 2017-07-20   935  U32 const llCode = 
FSE_peekSymbol(>stateLL);
dc22844b Nick Terrell 2017-07-20   936  U32 const mlCode = 
FSE_peekSymbol(>stateML);
dc22844b Nick Terrell 2017-07-20   937  U32 const ofCode = 
FSE_peekSymbol(>stateOffb); /* <= maxOff, by table construction */
dc22844b Nick Terrell 2017-07-20   938  
dc22844b Nick Terrell 2017-07-20   939  U32 const llBits = 
LL_bits[llCode];
dc22844b Nick Terrell 2017-07-20   940  U32 const mlBits = 
ML_bits[mlCode];
dc22844b Nick Terrell 2017-07-20   941  U32 const ofBits = ofCode;
dc22844b Nick Terrell 2017-07-20   942  U32 const totalBits = llBits + 
mlBits + ofBits;
dc22844b Nick Terrell 2017-07-20   943  
dc22844b Nick Terrell 2017-07-20   944  static const U32 LL_base[MaxLL 
+ 1] = {0,  1,  2,  3,  4,  5,  6,  7,  8,9, 10,11,12,13,   
  14, 15, 16, 18,
dc22844b Nick Terrell 2017-07-20   945  
   20, 22, 24, 28, 32, 40, 48, 64, 0x80, 0x100, 0x200, 0x400, 0x800, 
0x1000, 0x2000, 0x4000, 0x8000, 0x1};
dc22844b Nick Terrell 2017-07-20   946  
dc22844b Nick Terrell 2017-07-20   947  static const U32 ML_base[MaxML 
+ 1] = {3,  4,  5,  6,  7,  8,  9,  10,   11,12,13,14,15, 
16, 17, 18, 19, 20,
dc22844b Nick Terrell 2017-07-20   948  
   21, 22, 23, 24, 25, 26, 27, 28,   29,30,31,32,33, 
34, 35, 37, 39, 41,
dc22844b Nick Terrell 2017-07-20   949  
   43, 47, 51, 59, 67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803, 0x1003, 
0x2003, 0x4003, 0x8003, 0x10003};
dc22844b Nick Terrell 2017-07-20   950  
dc22844b Nick Terrell 2017-07-20   951  static const U32 OF_base[MaxOff 
+ 1] = {0,   1, 1,  5,  0xD,  0x1D,  0x3D,  0x7D,   
   0xFD, 0x1FD,
dc22844b Nick Terrell 2017-07-20   952  
0x3FD,   0x7FD,0xFFD,0x1FFD,   0x3FFD,   0x7FFD,0xFFFD,
0x1FFFD,   0x3FFFD,  0x7FFFD,
dc22844b Nick Terrell 2017-07-20   953  
0xD, 0x1D, 0x3D, 0x7D, 0xFD, 0x1FD, 0x3FD, 
0x7FD, 0xFFD};
dc22844b Nick Terrell 2017-07-20   954  
dc22844b Nick Terrell 2017-07-20   955  /* sequence */
dc22844b Nick Terrell 2017-07-20   956  {
dc22844b Nick Terrell 2017-07-20   957  size_t offset;
dc22844b Nick Terrell 2017-07-20   958  if (!ofCode)
dc22844b Nick Terrell 2017-07-20   959  offset = 0;
dc22844b Nick Terrell 2017-07-20   960  else {
dc22844b Nick Terrell 2017-07-20   961  offset = 
OF_base[ofCode] + BIT_readBitsFast(>DStream, ofBits); /* <=  
(ZSTD_WINDOWLOG_MAX-1) bits */
dc22844b Nick Terrell 2017-07-20   962  if 
(ZSTD_32bits())
dc22844b Nick Terrell 2017-07-20   963  
BIT_reloadDStream(>DStream);
dc22844b Nick Terrell 2017-07-20   964  }
dc22844b Nick Terrell 2017-07-20   965  
dc22844b Nick Terrell 2017-07-20   966  if (ofCode <= 1) {
dc22844b Nick Terrell 2017-07-20   967  offset += 
(llCode == 0);
dc22844b Nick Terrell 2017-07-20   968  if (offset) {
dc22844b Nick Terrell 2017-07-20   969  size_t 
temp = (offset == 3) ? seqState->prevOffset[0] - 1 : 
seqState->prevOffset[offset];
dc22844b Nick Terrell 2017-07-20   970  temp += 
!temp; /* 0 is not valid; input is corrupted; force offset to 1 */
dc22844b Nick Terrell 2017-07-20   971  if 
(offset != 1)
dc22844b Nick Terrell 2017-07-20   972  

Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-07-23 Thread kbuild test robot
Hi Nick,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc1 next-20170721]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Nick-Terrell/Add-xxhash-and-zstd-modules/20170723-092845
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=parisc 

All warnings (new ones prefixed by >>):

   lib/xxhash.c: In function 'xxh64':
>> lib/xxhash.c:236:1: warning: the frame size of 1688 bytes is larger than 
>> 1024 bytes [-Wframe-larger-than=]
}
^
   lib/xxhash.c: In function 'xxh64_update':
   lib/xxhash.c:441:1: warning: the frame size of 1072 bytes is larger than 
1024 bytes [-Wframe-larger-than=]
}
^

vim +236 lib/xxhash.c

09c83807 Nick Terrell 2017-07-20  171  
09c83807 Nick Terrell 2017-07-20  172  uint64_t xxh64(const void *input, const 
size_t len, const uint64_t seed)
09c83807 Nick Terrell 2017-07-20  173  {
09c83807 Nick Terrell 2017-07-20  174   const uint8_t *p = (const uint8_t 
*)input;
09c83807 Nick Terrell 2017-07-20  175   const uint8_t *const b_end = p + len;
09c83807 Nick Terrell 2017-07-20  176   uint64_t h64;
09c83807 Nick Terrell 2017-07-20  177  
09c83807 Nick Terrell 2017-07-20  178   if (len >= 32) {
09c83807 Nick Terrell 2017-07-20  179   const uint8_t *const limit = 
b_end - 32;
09c83807 Nick Terrell 2017-07-20  180   uint64_t v1 = seed + PRIME64_1 
+ PRIME64_2;
09c83807 Nick Terrell 2017-07-20  181   uint64_t v2 = seed + PRIME64_2;
09c83807 Nick Terrell 2017-07-20  182   uint64_t v3 = seed + 0;
09c83807 Nick Terrell 2017-07-20  183   uint64_t v4 = seed - PRIME64_1;
09c83807 Nick Terrell 2017-07-20  184  
09c83807 Nick Terrell 2017-07-20  185   do {
09c83807 Nick Terrell 2017-07-20  186   v1 = xxh64_round(v1, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  187   p += 8;
09c83807 Nick Terrell 2017-07-20  188   v2 = xxh64_round(v2, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  189   p += 8;
09c83807 Nick Terrell 2017-07-20  190   v3 = xxh64_round(v3, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  191   p += 8;
09c83807 Nick Terrell 2017-07-20  192   v4 = xxh64_round(v4, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  193   p += 8;
09c83807 Nick Terrell 2017-07-20  194   } while (p <= limit);
09c83807 Nick Terrell 2017-07-20  195  
09c83807 Nick Terrell 2017-07-20  196   h64 = xxh_rotl64(v1, 1) + 
xxh_rotl64(v2, 7) +
09c83807 Nick Terrell 2017-07-20  197   xxh_rotl64(v3, 12) + 
xxh_rotl64(v4, 18);
09c83807 Nick Terrell 2017-07-20  198   h64 = xxh64_merge_round(h64, 
v1);
09c83807 Nick Terrell 2017-07-20  199   h64 = xxh64_merge_round(h64, 
v2);
09c83807 Nick Terrell 2017-07-20  200   h64 = xxh64_merge_round(h64, 
v3);
09c83807 Nick Terrell 2017-07-20  201   h64 = xxh64_merge_round(h64, 
v4);
09c83807 Nick Terrell 2017-07-20  202  
09c83807 Nick Terrell 2017-07-20  203   } else {
09c83807 Nick Terrell 2017-07-20  204   h64  = seed + PRIME64_5;
09c83807 Nick Terrell 2017-07-20  205   }
09c83807 Nick Terrell 2017-07-20  206  
09c83807 Nick Terrell 2017-07-20  207   h64 += (uint64_t)len;
09c83807 Nick Terrell 2017-07-20  208  
09c83807 Nick Terrell 2017-07-20  209   while (p + 8 <= b_end) {
09c83807 Nick Terrell 2017-07-20  210   const uint64_t k1 = 
xxh64_round(0, get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  211  
09c83807 Nick Terrell 2017-07-20  212   h64 ^= k1;
09c83807 Nick Terrell 2017-07-20  213   h64 = xxh_rotl64(h64, 27) * 
PRIME64_1 + PRIME64_4;
09c83807 Nick Terrell 2017-07-20  214   p += 8;
09c83807 Nick Terrell 2017-07-20  215   }
09c83807 Nick Terrell 2017-07-20  216  
09c83807 Nick Terrell 2017-07-20  217   if (p + 4 <= b_end) {
09c83807 Nick Terrell 2017-07-20  218   h64 ^= 
(uint64_t)(get_unaligned_le32(p)) * PRIME64_1;
09c83807 Nick Terrell 2017-07-20  219   h64 = xxh_rotl64(h64, 23) * 
PRIME64_2 + PRIME64_3;
09c83807 Nick Terrell 2017-07-20  220   p += 4;
09c83807 Nick Terrell 2017-07-20  221   }
09c83807 Nick Terrell 2017-07-20  222  
09c83807 Nick Terrell 2017-07-20  223   while (p < b_end) {
09c83807 Nick Terrell 2017-07-20  224   h64 ^= (*p) * PRIME64_5;
09c83807 Nick Terrell 2017-07-20  225   h64 = xxh_rotl64(h64, 11) * 
PRIME64_1;
09c83807 Nick Terrell 2017-07-20  226   p++;
09c83807 Nick Terrell 2017-07-20  227   }
09c83807 Nick Terrell 2017-07-20  228  

Re: [PATCH v3 3/4] btrfs: Add zstd support

2017-07-23 Thread kbuild test robot
Hi Nick,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc1 next-20170721]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Nick-Terrell/Add-xxhash-and-zstd-modules/20170723-092845
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=parisc 

All warnings (new ones prefixed by >>):

   lib/xxhash.c: In function 'xxh64':
>> lib/xxhash.c:236:1: warning: the frame size of 1688 bytes is larger than 
>> 1024 bytes [-Wframe-larger-than=]
}
^
   lib/xxhash.c: In function 'xxh64_update':
   lib/xxhash.c:441:1: warning: the frame size of 1072 bytes is larger than 
1024 bytes [-Wframe-larger-than=]
}
^

vim +236 lib/xxhash.c

09c83807 Nick Terrell 2017-07-20  171  
09c83807 Nick Terrell 2017-07-20  172  uint64_t xxh64(const void *input, const 
size_t len, const uint64_t seed)
09c83807 Nick Terrell 2017-07-20  173  {
09c83807 Nick Terrell 2017-07-20  174   const uint8_t *p = (const uint8_t 
*)input;
09c83807 Nick Terrell 2017-07-20  175   const uint8_t *const b_end = p + len;
09c83807 Nick Terrell 2017-07-20  176   uint64_t h64;
09c83807 Nick Terrell 2017-07-20  177  
09c83807 Nick Terrell 2017-07-20  178   if (len >= 32) {
09c83807 Nick Terrell 2017-07-20  179   const uint8_t *const limit = 
b_end - 32;
09c83807 Nick Terrell 2017-07-20  180   uint64_t v1 = seed + PRIME64_1 
+ PRIME64_2;
09c83807 Nick Terrell 2017-07-20  181   uint64_t v2 = seed + PRIME64_2;
09c83807 Nick Terrell 2017-07-20  182   uint64_t v3 = seed + 0;
09c83807 Nick Terrell 2017-07-20  183   uint64_t v4 = seed - PRIME64_1;
09c83807 Nick Terrell 2017-07-20  184  
09c83807 Nick Terrell 2017-07-20  185   do {
09c83807 Nick Terrell 2017-07-20  186   v1 = xxh64_round(v1, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  187   p += 8;
09c83807 Nick Terrell 2017-07-20  188   v2 = xxh64_round(v2, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  189   p += 8;
09c83807 Nick Terrell 2017-07-20  190   v3 = xxh64_round(v3, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  191   p += 8;
09c83807 Nick Terrell 2017-07-20  192   v4 = xxh64_round(v4, 
get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  193   p += 8;
09c83807 Nick Terrell 2017-07-20  194   } while (p <= limit);
09c83807 Nick Terrell 2017-07-20  195  
09c83807 Nick Terrell 2017-07-20  196   h64 = xxh_rotl64(v1, 1) + 
xxh_rotl64(v2, 7) +
09c83807 Nick Terrell 2017-07-20  197   xxh_rotl64(v3, 12) + 
xxh_rotl64(v4, 18);
09c83807 Nick Terrell 2017-07-20  198   h64 = xxh64_merge_round(h64, 
v1);
09c83807 Nick Terrell 2017-07-20  199   h64 = xxh64_merge_round(h64, 
v2);
09c83807 Nick Terrell 2017-07-20  200   h64 = xxh64_merge_round(h64, 
v3);
09c83807 Nick Terrell 2017-07-20  201   h64 = xxh64_merge_round(h64, 
v4);
09c83807 Nick Terrell 2017-07-20  202  
09c83807 Nick Terrell 2017-07-20  203   } else {
09c83807 Nick Terrell 2017-07-20  204   h64  = seed + PRIME64_5;
09c83807 Nick Terrell 2017-07-20  205   }
09c83807 Nick Terrell 2017-07-20  206  
09c83807 Nick Terrell 2017-07-20  207   h64 += (uint64_t)len;
09c83807 Nick Terrell 2017-07-20  208  
09c83807 Nick Terrell 2017-07-20  209   while (p + 8 <= b_end) {
09c83807 Nick Terrell 2017-07-20  210   const uint64_t k1 = 
xxh64_round(0, get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  211  
09c83807 Nick Terrell 2017-07-20  212   h64 ^= k1;
09c83807 Nick Terrell 2017-07-20  213   h64 = xxh_rotl64(h64, 27) * 
PRIME64_1 + PRIME64_4;
09c83807 Nick Terrell 2017-07-20  214   p += 8;
09c83807 Nick Terrell 2017-07-20  215   }
09c83807 Nick Terrell 2017-07-20  216  
09c83807 Nick Terrell 2017-07-20  217   if (p + 4 <= b_end) {
09c83807 Nick Terrell 2017-07-20  218   h64 ^= 
(uint64_t)(get_unaligned_le32(p)) * PRIME64_1;
09c83807 Nick Terrell 2017-07-20  219   h64 = xxh_rotl64(h64, 23) * 
PRIME64_2 + PRIME64_3;
09c83807 Nick Terrell 2017-07-20  220   p += 4;
09c83807 Nick Terrell 2017-07-20  221   }
09c83807 Nick Terrell 2017-07-20  222  
09c83807 Nick Terrell 2017-07-20  223   while (p < b_end) {
09c83807 Nick Terrell 2017-07-20  224   h64 ^= (*p) * PRIME64_5;
09c83807 Nick Terrell 2017-07-20  225   h64 = xxh_rotl64(h64, 11) * 
PRIME64_1;
09c83807 Nick Terrell 2017-07-20  226   p++;
09c83807 Nick Terrell 2017-07-20  227   }
09c83807 Nick Terrell 2017-07-20  228  

[PATCH v3 3/4] btrfs: Add zstd support

2017-07-20 Thread Nick Terrell
Add zstd compression and decompression support to BtrFS. zstd at its
fastest level compresses almost as well as zlib, while offering much
faster compression and decompression, approaching lzo speeds.

I benchmarked btrfs with zstd compression against no compression, lzo
compression, and zlib compression. I benchmarked two scenarios. Copying
a set of files to btrfs, and then reading the files. Copying a tarball
to btrfs, extracting it to btrfs, and then reading the extracted files.
After every operation, I call `sync` and include the sync time.
Between every pair of operations I unmount and remount the filesystem
to avoid caching. The benchmark files can be found in the upstream
zstd source repository under
`contrib/linux-kernel/{btrfs-benchmark.sh,btrfs-extract-benchmark.sh}`
[1] [2].

I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD.

The first compression benchmark is copying 10 copies of the unzipped
Silesia corpus [3] into a BtrFS filesystem mounted with
`-o compress-force=Method`. The decompression benchmark times how long
it takes to `tar` all 10 copies into `/dev/null`. The compression ratio is
measured by comparing the output of `df` and `du`. See the benchmark file
[1] for details. I benchmarked multiple zstd compression levels, although
the patch uses zstd level 1.

| Method  | Ratio | Compression MB/s | Decompression speed |
|-|---|--|-|
| None|  0.99 |  504 | 686 |
| lzo |  1.66 |  398 | 442 |
| zlib|  2.58 |   65 | 241 |
| zstd 1  |  2.57 |  260 | 383 |
| zstd 3  |  2.71 |  174 | 408 |
| zstd 6  |  2.87 |   70 | 398 |
| zstd 9  |  2.92 |   43 | 406 |
| zstd 12 |  2.93 |   21 | 408 |
| zstd 15 |  3.01 |   11 | 354 |

The next benchmark first copies `linux-4.11.6.tar` [4] to btrfs. Then it
measures the compression ratio, extracts the tar, and deletes the tar.
Then it measures the compression ratio again, and `tar`s the extracted
files into `/dev/null`. See the benchmark file [2] for details.

| Method | Tar Ratio | Extract Ratio | Copy (s) | Extract (s)| Read (s) |
||---|---|--||--|
| None   |  0.97 |  0.78 |0.981 |  5.501 |8.807 |
| lzo|  2.06 |  1.38 |1.631 |  8.458 |8.585 |
| zlib   |  3.40 |  1.86 |7.750 | 21.544 |   11.744 |
| zstd 1 |  3.57 |  1.85 |2.579 | 11.479 |9.389 |

[1] 
https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-benchmark.sh
[2] 
https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-extract-benchmark.sh
[3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
[4] https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.11.6.tar.xz

zstd source repository: https://github.com/facebook/zstd

Signed-off-by: Nick Terrell 
---
v2 -> v3:
- Port upstream BtrFS commits e1ddce71d6, 389a6cfc2a, and 6acafd1eff
- Change default compression level for BtrFS to 3

 fs/btrfs/Kconfig   |   2 +
 fs/btrfs/Makefile  |   2 +-
 fs/btrfs/compression.c |   1 +
 fs/btrfs/compression.h |   6 +-
 fs/btrfs/ctree.h   |   1 +
 fs/btrfs/disk-io.c |   2 +
 fs/btrfs/ioctl.c   |   6 +-
 fs/btrfs/props.c   |   6 +
 fs/btrfs/super.c   |  12 +-
 fs/btrfs/sysfs.c   |   2 +
 fs/btrfs/zstd.c| 435 +
 include/uapi/linux/btrfs.h |   8 +-
 12 files changed, 471 insertions(+), 12 deletions(-)
 create mode 100644 fs/btrfs/zstd.c

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index 80e9c18..a26c63b 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -6,6 +6,8 @@ config BTRFS_FS
select ZLIB_DEFLATE
select LZO_COMPRESS
select LZO_DECOMPRESS
+   select ZSTD_COMPRESS
+   select ZSTD_DECOMPRESS
select RAID6_PQ
select XOR_BLOCKS
select SRCU
diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 128ce17..962a95a 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -6,7 +6,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   transaction.o inode.o file.o tree-defrag.o \
   extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
-  export.o tree-log.o free-space-cache.o zlib.o lzo.o \
+  export.o tree-log.o free-space-cache.o zlib.o lzo.o zstd.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
   reada.o backref.o ulist.o 

[PATCH v3 3/4] btrfs: Add zstd support

2017-07-20 Thread Nick Terrell
Add zstd compression and decompression support to BtrFS. zstd at its
fastest level compresses almost as well as zlib, while offering much
faster compression and decompression, approaching lzo speeds.

I benchmarked btrfs with zstd compression against no compression, lzo
compression, and zlib compression. I benchmarked two scenarios. Copying
a set of files to btrfs, and then reading the files. Copying a tarball
to btrfs, extracting it to btrfs, and then reading the extracted files.
After every operation, I call `sync` and include the sync time.
Between every pair of operations I unmount and remount the filesystem
to avoid caching. The benchmark files can be found in the upstream
zstd source repository under
`contrib/linux-kernel/{btrfs-benchmark.sh,btrfs-extract-benchmark.sh}`
[1] [2].

I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD.

The first compression benchmark is copying 10 copies of the unzipped
Silesia corpus [3] into a BtrFS filesystem mounted with
`-o compress-force=Method`. The decompression benchmark times how long
it takes to `tar` all 10 copies into `/dev/null`. The compression ratio is
measured by comparing the output of `df` and `du`. See the benchmark file
[1] for details. I benchmarked multiple zstd compression levels, although
the patch uses zstd level 1.

| Method  | Ratio | Compression MB/s | Decompression speed |
|-|---|--|-|
| None|  0.99 |  504 | 686 |
| lzo |  1.66 |  398 | 442 |
| zlib|  2.58 |   65 | 241 |
| zstd 1  |  2.57 |  260 | 383 |
| zstd 3  |  2.71 |  174 | 408 |
| zstd 6  |  2.87 |   70 | 398 |
| zstd 9  |  2.92 |   43 | 406 |
| zstd 12 |  2.93 |   21 | 408 |
| zstd 15 |  3.01 |   11 | 354 |

The next benchmark first copies `linux-4.11.6.tar` [4] to btrfs. Then it
measures the compression ratio, extracts the tar, and deletes the tar.
Then it measures the compression ratio again, and `tar`s the extracted
files into `/dev/null`. See the benchmark file [2] for details.

| Method | Tar Ratio | Extract Ratio | Copy (s) | Extract (s)| Read (s) |
||---|---|--||--|
| None   |  0.97 |  0.78 |0.981 |  5.501 |8.807 |
| lzo|  2.06 |  1.38 |1.631 |  8.458 |8.585 |
| zlib   |  3.40 |  1.86 |7.750 | 21.544 |   11.744 |
| zstd 1 |  3.57 |  1.85 |2.579 | 11.479 |9.389 |

[1] 
https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-benchmark.sh
[2] 
https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-extract-benchmark.sh
[3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
[4] https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.11.6.tar.xz

zstd source repository: https://github.com/facebook/zstd

Signed-off-by: Nick Terrell 
---
v2 -> v3:
- Port upstream BtrFS commits e1ddce71d6, 389a6cfc2a, and 6acafd1eff
- Change default compression level for BtrFS to 3

 fs/btrfs/Kconfig   |   2 +
 fs/btrfs/Makefile  |   2 +-
 fs/btrfs/compression.c |   1 +
 fs/btrfs/compression.h |   6 +-
 fs/btrfs/ctree.h   |   1 +
 fs/btrfs/disk-io.c |   2 +
 fs/btrfs/ioctl.c   |   6 +-
 fs/btrfs/props.c   |   6 +
 fs/btrfs/super.c   |  12 +-
 fs/btrfs/sysfs.c   |   2 +
 fs/btrfs/zstd.c| 435 +
 include/uapi/linux/btrfs.h |   8 +-
 12 files changed, 471 insertions(+), 12 deletions(-)
 create mode 100644 fs/btrfs/zstd.c

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index 80e9c18..a26c63b 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -6,6 +6,8 @@ config BTRFS_FS
select ZLIB_DEFLATE
select LZO_COMPRESS
select LZO_DECOMPRESS
+   select ZSTD_COMPRESS
+   select ZSTD_DECOMPRESS
select RAID6_PQ
select XOR_BLOCKS
select SRCU
diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 128ce17..962a95a 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -6,7 +6,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   transaction.o inode.o file.o tree-defrag.o \
   extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
-  export.o tree-log.o free-space-cache.o zlib.o lzo.o \
+  export.o tree-log.o free-space-cache.o zlib.o lzo.o zstd.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
   reada.o backref.o ulist.o qgroup.o send.o