Re: [PATCH v8 0/4] crypto: add algif_akcipher user space API
Hi Mat,

>> This patch set adds the AF_ALG user space API to externalize the
>> asymmetric cipher API recently added to the kernel crypto API.
>
> ...
>
>> Changes v8:
>> * port to kernel 4.13
>> * port to consolidated AF_ALG code
>>
>> Stephan Mueller (4):
>>   crypto: AF_ALG -- add sign/verify API
>>   crypto: AF_ALG -- add setpubkey setsockopt call
>>   crypto: AF_ALG -- add asymmetric cipher
>>   crypto: algif_akcipher - enable compilation
>>
>>  crypto/Kconfig              |   9 +
>>  crypto/Makefile             |   1 +
>>  crypto/af_alg.c             |  28 ++-
>>  crypto/algif_aead.c         |  36 ++--
>>  crypto/algif_akcipher.c     | 466
>>  crypto/algif_skcipher.c     |  26 ++-
>>  include/crypto/if_alg.h     |   7 +-
>>  include/uapi/linux/if_alg.h |   3 +
>>  8 files changed, 543 insertions(+), 33 deletions(-)
>>  create mode 100644 crypto/algif_akcipher.c
>>
>> --
>> 2.13.4
>
> The last round of reviews for AF_ALG akcipher left off at an impasse around
> a year ago: the consensus was that hardware key support was needed, but
> that requirement was in conflict with the "always have a software fallback"
> rule for the crypto subsystem. For example, a private key securely
> generated by and stored in a TPM could not be copied out for use by a
> software algorithm. Has anything come about to resolve this impasse?
>
> There were some patches around to add keyring support by associating a key
> ID with an akcipher socket, but that approach ran into a mismatch between
> the proposed keyring API for the verify operation and the semantics of
> AF_ALG verify.
>
> AF_ALG is best suited for crypto use cases where a socket is set up once
> and there are lots of reads and writes to justify the setup cost. With
> asymmetric crypto, the setup cost is high when you might only use the
> socket for a brief time to do one verify or encrypt operation.
>
> Given the efficiency and hardware key issues, AF_ALG seems to be mismatched
> with asymmetric crypto. Have you looked at the proposed keyctl() support
> for crypto operations?
We have also seen hardware where the private key will never leave the crypto hardware. The public and private keys are generated only for key exchange purposes and later discarded again. Asymmetric ciphers are really not a good fit for AF_ALG, and they should be supported solely via keyctl.

Regards

Marcel
Re: [PATCH v8 0/4] crypto: add algif_akcipher user space API
Hi Stephan,

On Thu, 10 Aug 2017, Stephan Müller wrote:

> Hi,
>
> This patch set adds the AF_ALG user space API to externalize the
> asymmetric cipher API recently added to the kernel crypto API.
>
> ...
>
> Changes v8:
> * port to kernel 4.13
> * port to consolidated AF_ALG code
>
> Stephan Mueller (4):
>   crypto: AF_ALG -- add sign/verify API
>   crypto: AF_ALG -- add setpubkey setsockopt call
>   crypto: AF_ALG -- add asymmetric cipher
>   crypto: algif_akcipher - enable compilation
>
>  crypto/Kconfig              |   9 +
>  crypto/Makefile             |   1 +
>  crypto/af_alg.c             |  28 ++-
>  crypto/algif_aead.c         |  36 ++--
>  crypto/algif_akcipher.c     | 466
>  crypto/algif_skcipher.c     |  26 ++-
>  include/crypto/if_alg.h     |   7 +-
>  include/uapi/linux/if_alg.h |   3 +
>  8 files changed, 543 insertions(+), 33 deletions(-)
>  create mode 100644 crypto/algif_akcipher.c
>
> --
> 2.13.4

The last round of reviews for AF_ALG akcipher left off at an impasse around a year ago: the consensus was that hardware key support was needed, but that requirement was in conflict with the "always have a software fallback" rule for the crypto subsystem. For example, a private key securely generated by and stored in a TPM could not be copied out for use by a software algorithm. Has anything come about to resolve this impasse?

There were some patches around to add keyring support by associating a key ID with an akcipher socket, but that approach ran into a mismatch between the proposed keyring API for the verify operation and the semantics of AF_ALG verify.

AF_ALG is best suited for crypto use cases where a socket is set up once and there are lots of reads and writes to justify the setup cost. With asymmetric crypto, the setup cost is high when you might only use the socket for a brief time to do one verify or encrypt operation.

Given the efficiency and hardware key issues, AF_ALG seems to be mismatched with asymmetric crypto. Have you looked at the proposed keyctl() support for crypto operations?

Thanks,

--
Mat Martineau
Intel OTC
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 15:25, Hugo Mills wrote:
> On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
>> On 08/10/2017 04:30 AM, Eric Biggers wrote:
>>> These benchmarks are misleading because they compress the whole file as
>>> a single stream without resetting the dictionary, which isn't how data
>>> will typically be compressed in kernel mode. With filesystem compression
>>> the data has to be divided into small chunks that can each be
>>> decompressed independently. That eliminates one of the primary
>>> advantages of Zstandard (support for large dictionary sizes).
>>
>> I did btrfs benchmarks of kernel trees and other normal data sets as
>> well. The numbers were in line with what Nick is posting here. zstd is a
>> big win over both lzo and zlib from a btrfs point of view.
>>
>> It's true Nick's patches only support a single compression level in
>> btrfs, but that's because btrfs doesn't have a way to pass in the
>> compression ratio. It could easily be a mount option, it was just outside
>> the scope of Nick's initial work.
>
> Could we please not add more mount options? I get that they're easy to
> implement, but it's a very blunt instrument. What we tend to see (with
> both nodatacow and compress) is people using the mount options, then
> asking for exceptions, discovering that they can't do that, and then
> falling back to doing it with attributes or btrfs properties. Could we
> just start with btrfs properties this time round, and cut out the mount
> option part of this cycle?

AFAIUI, the intent is to extend the compression type specification for both the mount options and the property, not to add a new mount option. I think we all agree that `mount -o compress=zstd3` is a lot better than `mount -o compress=zstd,compresslevel=3`.

> In the long run, it'd be great to see most of the btrfs-specific mount
> options get deprecated and ultimately removed entirely, in favour of
> attributes/properties, where feasible.

Are properties set on the root subvolume inherited properly? Because unless they are, we can't get the same semantics.
Two other counter-arguments on completely removing BTRFS-specific mount options:

1. It's a lot easier and a lot more clearly defined to change things that affect global behavior of the FS by a remount than having to iterate everything in the FS to update properties. If I'm disabling autodefrag, I'd much rather just `mount -o remount,noautodefrag` than `find / -xdev -exec btrfs property set \{\} autodefrag false`, as the first will take effect for everything simultaneously and run far quicker.

2. There are some things that don't make sense as per-object settings or are otherwise nonsensical on objects. Many, but not all, of the BTRFS-specific mount options fall into this category IMO, with the notable exceptions of compress[-force], [no]autodefrag, [no]datacow, and [no]datasum. Some other options do make sense as properties of the filesystem (commit, flushoncommit, {inode,space}_cache, max_inline, metadata_ratio, [no]ssd, and [no]treelog are such options), but many are one-off options that affect behavior on mount (like skip_balance, clear_cache, nologreplay, norecovery, usebackuproot, and subvol).
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> On 08/10/2017 04:30 AM, Eric Biggers wrote:
> >
> > These benchmarks are misleading because they compress the whole file as
> > a single stream without resetting the dictionary, which isn't how data
> > will typically be compressed in kernel mode. With filesystem compression
> > the data has to be divided into small chunks that can each be
> > decompressed independently. That eliminates one of the primary
> > advantages of Zstandard (support for large dictionary sizes).
>
> I did btrfs benchmarks of kernel trees and other normal data sets as
> well. The numbers were in line with what Nick is posting here.
> zstd is a big win over both lzo and zlib from a btrfs point of view.
>
> It's true Nick's patches only support a single compression level in
> btrfs, but that's because btrfs doesn't have a way to pass in the
> compression ratio. It could easily be a mount option, it was just
> outside the scope of Nick's initial work.

Could we please not add more mount options? I get that they're easy to implement, but it's a very blunt instrument. What we tend to see (with both nodatacow and compress) is people using the mount options, then asking for exceptions, discovering that they can't do that, and then falling back to doing it with attributes or btrfs properties. Could we just start with btrfs properties this time round, and cut out the mount option part of this cycle?

In the long run, it'd be great to see most of the btrfs-specific mount options get deprecated and ultimately removed entirely, in favour of attributes/properties, where feasible.

Hugo.

--
Hugo Mills             | Klytus! Are your men on the right pills? Maybe you
hugo@... carfax.org.uk | should execute their trainer!
http://carfax.org.uk/  |
PGP: E2AB1DE4          | Ming the Merciless, Flash Gordon
Re: [PATCH v5 2/5] lib: Add zstd modules
On 8/10/17, 10:48 AM, "Austin S. Hemmelgarn" wrote:
> On 2017-08-10 13:24, Eric Biggers wrote:
>> On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote:
>>> On 2017-08-10 04:30, Eric Biggers wrote:
>>>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>>>>
>>>>> It can compress at speeds approaching lz4, and quality approaching
>>>>> lzma.
>>>>
>>>> Well, for a very loose definition of "approaching", and certainly not
>>>> at the same time. I doubt there's a use case for using the highest
>>>> compression levels in kernel mode --- especially the ones using
>>>> zstd_opt.h.
>>>
>>> Large data-sets with WORM access patterns and infrequent writes
>>> immediately come to mind as a use case for the highest compression
>>> level.
>>>
>>> As a more specific example, the company I work for has a very large
>>> amount of documentation, and we keep all old versions. This is all
>>> stored on a file server which is currently using BTRFS. Once a document
>>> is written, it's almost never rewritten, so write performance only
>>> matters for the first write. However, they're read back pretty
>>> frequently, so we need good read performance. As of right now, the
>>> system is set to use LZO compression by default, and then when a new
>>> document is added, the previous version of that document gets
>>> re-compressed using zlib compression, which actually results in pretty
>>> significant space savings most of the time. I would absolutely love to
>>> use zstd compression with this system with the highest compression
>>> level, because most people don't care how long it takes to write the
>>> file out, but they do care how long it takes to read a file (even if
>>> it's an older version).
>>
>> This may be a reasonable use case, but note this cannot just be the
>> regular "zstd" compression setting, since filesystem compression by
>> default must provide reasonable performance for many different access
>> patterns. See the patch in this series which actually adds zstd
>> compression to btrfs; it only uses level 1. I do not see a patch which
>> adds a higher compression mode. It would need to be a special setting
>> like "zstdhc" that users could opt-in to on specific directories. It also
>> would need to be compared to simply compressing in userspace. In many
>> cases compressing in userspace is probably the better solution for the
>> use case in question because it works on any filesystem, allows using any
>> compression algorithm, and if random access is not needed it is possible
>> to compress each file as a single stream (like a .xz file), which
>> produces a much better compression ratio than the block-by-block
>> compression that filesystems have to use.
>
> There has been discussion as well as (I think) initial patches merged for
> support of specifying the compression level for algorithms which support
> multiple compression levels in BTRFS. I was actually under the impression
> that we had decided to use level 3 as the default for zstd, but that
> apparently isn't the case, and with the benchmark issues, it may not be
> once proper benchmarks are run.

There are some initial patches to add compression levels to BtrFS [1]. Once it's ready, we can add compression levels to zstd. The default compression level in the current patch is 3.

[1] https://lkml.kernel.org/r/20170724172939.24527-1-dste...@suse.com
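The level question above comes down to a speed/ratio dial. The in-kernel zstd wrapper isn't callable from userspace, but stdlib zlib shows the same shape of tradeoff a per-level knob would expose; the sample data and levels below are illustrative only, not a benchmark:

```python
# What a per-level compression knob buys: higher levels trade time for
# ratio. zlib (stdlib) stands in for zstd here, and the sample data is
# deliberately redundant so every level compresses it.
import zlib

SAMPLE = b"btrfs compresses each extent independently; " * 2048

def compressed_size(level: int) -> int:
    return len(zlib.compress(SAMPLE, level))

sizes = {level: compressed_size(level) for level in (1, 3, 6, 9)}
# Every level beats "no compression" on this sample, and the highest
# level should not lose to the lowest.
assert sizes[9] <= sizes[1] < len(SAMPLE)
```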
Re: [PATCH v5 2/5] lib: Add zstd modules
On 8/10/17, 1:30 AM, "Eric Biggers" wrote:
> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>
>> It can compress at speeds approaching lz4, and quality approaching lzma.
>
> Well, for a very loose definition of "approaching", and certainly not at
> the same time. I doubt there's a use case for using the highest
> compression levels in kernel mode --- especially the ones using
> zstd_opt.h.
>
>>
>> The code was ported from the upstream zstd source repository.
>
> What version?

zstd-1.1.4 with patches applied from upstream. I'll include it in the next patch version.

>> `linux/zstd.h` header was modified to match linux kernel style. The
>> cross-platform and allocation code was stripped out. Instead zstd
>> requires the caller to pass a preallocated workspace. The source files
>> were clang-formatted [1] to match the Linux Kernel style as much as
>> possible.
>
> It would be easier to compare to the upstream version if it was not all
> reformatted. There is a chance that bugs were introduced by
> Linux-specific changes, and it would be nice if they could be easily
> reviewed. (Also I don't know what clang-format settings you used, but
> there are still a lot of differences from the Linux coding style.)

The clang-format settings I used are available in the zstd repo [1]. I left the line length long, since it looked terrible otherwise. I set up a branch in my zstd GitHub fork called "original-formatted" [2]. I've taken the source I based the kernel patches off of [3] and ran clang-format without any other changes. If you have any suggestions to improve the clang-formatting please let me know.

>> I benchmarked zstd compression as a special character device. I ran zstd
>> and zlib compression at several levels, as well as performing no
>> compression, which measures the time spent copying the data to kernel
>> space. Data is passed to the compressor 4096 B at a time. The benchmark
>> file is located in the upstream zstd source repository under
>> `contrib/linux-kernel/zstd_compress_test.c` [2].
>>
>> I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
>> The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7
>> processor, 16 GB of RAM, and a SSD. I benchmarked using `silesia.tar`
>> [3], which is 211,988,480 B large. Run the following commands for the
>> benchmark:
>>
>>     sudo modprobe zstd_compress_test
>>     sudo mknod zstd_compress_test c 245 0
>>     sudo cp silesia.tar zstd_compress_test
>>
>> The time is reported by the time of the userland `cp`.
>> The MB/s is computed with
>>
>>     1,536,217,008 B / time(buffer size, hash)
>>
>> which includes the time to copy from userland.
>> The Adjusted MB/s is computed with
>>
>>     1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
>>
>> The memory reported is the amount of memory the compressor requests.
>>
>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>> |----------|-----------|----------|-------|---------|----------|----------|
>> | none     | 211988480 |    0.100 |     1 | 2119.88 |        - |        - |
>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>
> These benchmarks are misleading because they compress the whole file as a
> single stream without resetting the dictionary, which isn't how data will
> typically be compressed in kernel mode. With filesystem compression the
> data has to be divided into small chunks that can each be decompressed
> independently. That eliminates one of the primary advantages of Zstandard
> (support for large dictionary sizes).

This benchmark isn't meant to be representative of a filesystem scenario. I wanted to show off zstd without anything else going on. Even in filesystems where the data is chunked, zstd uses the whole chunk as the window (128 KB in BtrFS and SquashFS by default), where zlib uses 32 KB. I have benchmarks for BtrFS and SquashFS in their respective patches [4][5], and I've copied the BtrFS table below (which was run with 2 threads).

| Method | Ratio | Compression MB/s | Decompression speed |
|--------|-------|------------------|---------------------|
| None   | 0.99  | 504              | 686                 |
| lzo    | 1.66  |
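A quick check of the arithmetic in the quoted writeup: plugging silesia.tar's stated size (211,988,480 B) into the two throughput formulas reproduces the table's MB/s and Adjusted MB/s columns exactly, while the 1,536,217,0xx B figures in the text do not (they look carried over from a different benchmark writeup). A sketch:

```python
# Reproducing the table's throughput columns from the stated file size.
# time(buffer size, none) = 0.100 s is the copy-to-kernel-only baseline.
def mb_per_s(nbytes: int, seconds: float) -> float:
    return nbytes / seconds / 1e6

def adjusted_mb_per_s(nbytes: int, t_level: float, t_none: float) -> float:
    # subtract the userland -> kernel copy overhead measured by "none"
    return nbytes / (t_level - t_none) / 1e6

SILESIA = 211_988_480  # bytes, as stated for silesia.tar in the post

# zstd -1 row: 1.044 s total; table says 203.05 MB/s, 224.56 adjusted
assert round(mb_per_s(SILESIA, 1.044), 2) == 203.05
assert round(adjusted_mb_per_s(SILESIA, 1.044, 0.100), 2) == 224.56
```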
Re: [PATCH v5 2/5] lib: Add zstd modules
On 08/10/2017 03:00 PM, Eric Biggers wrote:
> On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
>> On 08/10/2017 04:30 AM, Eric Biggers wrote:
>>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>>> The memory reported is the amount of memory the compressor requests.
>>>>
>>>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>>>> |----------|-----------|----------|-------|---------|----------|----------|
>>>> | none     | 211988480 |    0.100 |     1 | 2119.88 |        - |        - |
>>>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>>>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>>>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>>>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>>>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>>>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>>>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>>>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>>>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>>>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>>>
>>> These benchmarks are misleading because they compress the whole file as
>>> a single stream without resetting the dictionary, which isn't how data
>>> will typically be compressed in kernel mode. With filesystem compression
>>> the data has to be divided into small chunks that can each be
>>> decompressed independently. That eliminates one of the primary
>>> advantages of Zstandard (support for large dictionary sizes).
>>
>> I did btrfs benchmarks of kernel trees and other normal data sets as
>> well. The numbers were in line with what Nick is posting here. zstd is a
>> big win over both lzo and zlib from a btrfs point of view.
>>
>> It's true Nick's patches only support a single compression level in
>> btrfs, but that's because btrfs doesn't have a way to pass in the
>> compression ratio. It could easily be a mount option, it was just
>> outside the scope of Nick's initial work.
>
> I am not surprised --- Zstandard is closer to the state of the art, both
> format-wise and implementation-wise, than the other choices in BTRFS. My
> point is that benchmarks need to account for how much data is compressed
> at a time. This is a common mistake when comparing different compression
> algorithms; the algorithm name and compression level do not tell the whole
> story. The dictionary size is extremely significant. No one is going to
> compress or decompress a 200 MB file as a single stream in kernel mode, so
> it does not make sense to justify adding Zstandard *to the kernel* based
> on such a benchmark. It is going to be divided into chunks. How big are
> the chunks in BTRFS? I thought that it compressed only one page (4 KiB)
> at a time, but I hope that has been, or is being, improved; 32 KiB -
> 128 KiB should be a better amount. (And if the amount of data compressed
> at a time happens to be different between the different algorithms, note
> that BTRFS benchmarks are likely to be measuring that as much as the
> algorithms themselves.)

Btrfs hooks the compression code into the delayed allocation mechanism we use to gather large extents for COW. So if you write 100MB to a file, we'll have 100MB to compress at a time (within the limits of the amount of pages we allow to collect before forcing it down). But we want to balance how much memory you might need to uncompress during random reads. So we have an artificial limit of 128KB that we send at a time to the compression code. It's easy to change this, it's just a tradeoff made to limit the cost of reading small bits. It's the same for zlib, lzo and the new zstd patch.

-chris
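The 128KB limit Chris describes is exactly the chunking Eric is talking about, and its ratio cost can be demonstrated from userspace. A sketch with stdlib zlib standing in for zstd; the data layout is contrived (redundancy with a 16 KiB period) so the numbers are illustrative only:

```python
# Eric's point in miniature: compressing fixed-size chunks independently
# resets the dictionary at every chunk boundary, so redundancy spanning
# chunks is lost. zlib (stdlib) stands in for zstd.
import os
import zlib

CHUNK = 128 * 1024  # btrfs's per-extent compression unit, per Chris

def stream_size(data: bytes) -> int:
    # one continuous stream: every repeat within the window is found
    return len(zlib.compress(data, 6))

def chunked_size(data: bytes, chunk: int = CHUNK) -> int:
    # each chunk is its own stream, as a filesystem must do so that
    # random reads only decompress one chunk
    return sum(len(zlib.compress(data[i:i + chunk], 6))
               for i in range(0, len(data), chunk))

# 1 MiB whose redundancy has a 16 KiB period: the single stream sees every
# repeat, while independent 128 KiB chunks only see repeats inside
# themselves, so the chunked total comes out larger.
data = os.urandom(16 * 1024) * 64
assert chunked_size(data) > stream_size(data)
```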
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> On 08/10/2017 04:30 AM, Eric Biggers wrote:
>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>> The memory reported is the amount of memory the compressor requests.
>>>
>>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>>> |----------|-----------|----------|-------|---------|----------|----------|
>>> | none     | 211988480 |    0.100 |     1 | 2119.88 |        - |        - |
>>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>>
>> These benchmarks are misleading because they compress the whole file as a
>> single stream without resetting the dictionary, which isn't how data
>> will typically be compressed in kernel mode. With filesystem compression
>> the data has to be divided into small chunks that can each be
>> decompressed independently. That eliminates one of the primary
>> advantages of Zstandard (support for large dictionary sizes).
>
> I did btrfs benchmarks of kernel trees and other normal data sets as
> well. The numbers were in line with what Nick is posting here.
> zstd is a big win over both lzo and zlib from a btrfs point of view.
>
> It's true Nick's patches only support a single compression level in
> btrfs, but that's because btrfs doesn't have a way to pass in the
> compression ratio. It could easily be a mount option, it was just
> outside the scope of Nick's initial work.

I am not surprised --- Zstandard is closer to the state of the art, both format-wise and implementation-wise, than the other choices in BTRFS. My point is that benchmarks need to account for how much data is compressed at a time. This is a common mistake when comparing different compression algorithms; the algorithm name and compression level do not tell the whole story. The dictionary size is extremely significant.

No one is going to compress or decompress a 200 MB file as a single stream in kernel mode, so it does not make sense to justify adding Zstandard *to the kernel* based on such a benchmark. It is going to be divided into chunks. How big are the chunks in BTRFS? I thought that it compressed only one page (4 KiB) at a time, but I hope that has been, or is being, improved; 32 KiB - 128 KiB should be a better amount. (And if the amount of data compressed at a time happens to be different between the different algorithms, note that BTRFS benchmarks are likely to be measuring that as much as the algorithms themselves.)

Eric
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 13:24, Eric Biggers wrote:
> On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote:
>> On 2017-08-10 04:30, Eric Biggers wrote:
>>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>>> It can compress at speeds approaching lz4, and quality approaching
>>>> lzma.
>>>
>>> Well, for a very loose definition of "approaching", and certainly not
>>> at the same time. I doubt there's a use case for using the highest
>>> compression levels in kernel mode --- especially the ones using
>>> zstd_opt.h.
>>
>> Large data-sets with WORM access patterns and infrequent writes
>> immediately come to mind as a use case for the highest compression
>> level.
>>
>> As a more specific example, the company I work for has a very large
>> amount of documentation, and we keep all old versions. This is all
>> stored on a file server which is currently using BTRFS. Once a document
>> is written, it's almost never rewritten, so write performance only
>> matters for the first write. However, they're read back pretty
>> frequently, so we need good read performance. As of right now, the
>> system is set to use LZO compression by default, and then when a new
>> document is added, the previous version of that document gets
>> re-compressed using zlib compression, which actually results in pretty
>> significant space savings most of the time. I would absolutely love to
>> use zstd compression with this system with the highest compression
>> level, because most people don't care how long it takes to write the
>> file out, but they do care how long it takes to read a file (even if
>> it's an older version).
>
> This may be a reasonable use case, but note this cannot just be the
> regular "zstd" compression setting, since filesystem compression by
> default must provide reasonable performance for many different access
> patterns. See the patch in this series which actually adds zstd
> compression to btrfs; it only uses level 1. I do not see a patch which
> adds a higher compression mode. It would need to be a special setting
> like "zstdhc" that users could opt-in to on specific directories. It
> also would need to be compared to simply compressing in userspace. In
> many cases compressing in userspace is probably the better solution for
> the use case in question because it works on any filesystem, allows
> using any compression algorithm, and if random access is not needed it
> is possible to compress each file as a single stream (like a .xz file),
> which produces a much better compression ratio than the block-by-block
> compression that filesystems have to use.

There has been discussion as well as (I think) initial patches merged for support of specifying the compression level for algorithms which support multiple compression levels in BTRFS. I was actually under the impression that we had decided to use level 3 as the default for zstd, but that apparently isn't the case, and with the benchmark issues, it may not be once proper benchmarks are run.

Also, on the note of compressing in userspace, the use case I quoted at least can't do that because we have to deal with Windows clients and users have to be able to open files directly on said Windows clients. I entirely agree that real archival storage is better off using userspace compression, but sometimes real archival storage isn't an option.

> Note also that LZ4HC is in the kernel source tree currently but no one is
> using it vs. the regular LZ4. I think it is the kind of thing that
> sounded useful originally, but at the end of the day no one really wants
> to use it in kernel mode. I'd certainly be interested in actual patches,
> though.

Part of that is the fact that BTRFS is one of the only consumers (AFAIK) of this API that can freely choose all aspects of their usage, and the consensus here (which I don't agree with I might add) amounts to the argument that 'we already have compression with a compression ratio, we don't need more things like that'. I would personally love to see LZ4HC support in BTRFS (based on testing my own use cases, LZ4 is more deterministic than LZO for both compression and decompression, and most of the non-archival usage I have of BTRFS benefits from determinism), but there's not any point in me writing up such a patch because it's almost certain to get rejected because BTRFS already has LZO. The main reason that zstd is getting considered at all is that the quoted benchmarks show clear benefits in decompression speed relative to zlib and far better compression ratios than LZO.
[RFC PATCH 03/10] staging: fsl-mc: dpio: add order preservation support
From: Radu Alexe

Order preservation is a feature that will be supported in dpni, dpseci and dpci devices. This is a preliminary patch for the changes to be introduced in the corresponding drivers.

Signed-off-by: Radu Alexe
Signed-off-by: Horia Geantă
---
 drivers/staging/fsl-mc/include/dpopr.h | 110 +
 1 file changed, 110 insertions(+)
 create mode 100644 drivers/staging/fsl-mc/include/dpopr.h

diff --git a/drivers/staging/fsl-mc/include/dpopr.h b/drivers/staging/fsl-mc/include/dpopr.h
new file mode 100644
index ..e1110af2fe54
--- /dev/null
+++ b/drivers/staging/fsl-mc/include/dpopr.h
@@ -0,0 +1,110 @@
+/*
+ * Copyright 2017 NXP
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in the
+ *       documentation and/or other materials provided with the distribution.
+ *     * Neither the name of the above-listed copyright holders nor the
+ *       names of any contributors may be used to endorse or promote products
+ *       derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+#ifndef __FSL_DPOPR_H_
+#define __FSL_DPOPR_H_
+
+/* Data Path Order Restoration API
+ * Contains initialization APIs and runtime APIs for the Order Restoration
+ */
+
+/** Order Restoration properties */
+
+/**
+ * Create a new Order Point Record option
+ */
+#define OPR_OPT_CREATE 0x1
+/**
+ * Retire an existing Order Point Record option
+ */
+#define OPR_OPT_RETIRE 0x2
+
+/**
+ * struct opr_cfg - Structure representing OPR configuration
+ * @oprrws: Order point record (OPR) restoration window size (0 to 5)
+ *          0 - Window size is 32 frames.
+ *          1 - Window size is 64 frames.
+ *          2 - Window size is 128 frames.
+ *          3 - Window size is 256 frames.
+ *          4 - Window size is 512 frames.
+ *          5 - Window size is 1024 frames.
+ * @oa: OPR auto advance NESN window size (0 disabled, 1 enabled)
+ * @olws: OPR acceptable late arrival window size (0 to 3)
+ *        0 - Disabled. Late arrivals are always rejected.
+ *        1 - Window size is 32 frames.
+ *        2 - Window size is the same as the OPR restoration
+ *            window size configured in the OPRRWS field.
+ *        3 - Window size is 8192 frames. Late arrivals are
+ *            always accepted.
+ * @oeane: Order restoration list (ORL) resource exhaustion
+ *         advance NESN enable (0 disabled, 1 enabled)
+ * @oloe: OPR loose ordering enable (0 disabled, 1 enabled)
+ */
+struct opr_cfg {
+	u8 oprrws;
+	u8 oa;
+	u8 olws;
+	u8 oeane;
+	u8 oloe;
+};
+
+/**
+ * struct opr_qry - Structure representing OPR configuration
+ * @enable: Enabled state
+ * @rip: Retirement In Progress
+ * @ndsn: Next dispensed sequence number
+ * @nesn: Next expected sequence number
+ * @ea_hseq: Early arrival head sequence number
+ * @hseq_nlis: HSEQ not last in sequence
+ * @ea_tseq: Early arrival tail sequence number
+ * @tseq_nlis: TSEQ not last in sequence
+ * @ea_tptr: Early arrival tail pointer
+ * @ea_hptr: Early arrival head pointer
+ * @opr_id: Order Point Record ID
+ * @opr_vid: Order Point Record Virtual ID
+ */
+struct opr_qry {
+	char enable;
+	char rip;
+	u16 ndsn;
+	u16 nesn;
+	u16 ea_hseq;
+	char hseq_nlis;
+	u16 ea_tseq;
+	char tseq_nlis;
+	u16 ea_tptr;
+	u16 ea_hptr;
+
[RFC PATCH 04/10] staging: fsl-dpaa2/eth: move generic FD defines to DPIO
Previous commits: 6e2387e8f19e ("staging: fsl-dpaa2/eth: Add Freescale DPAA2 Ethernet driver") 39163c0ce0f4 ("staging: fsl-dpaa2/eth: Errors checking update") have added bits that are not specific to the WRIOP accelerator. Move these where they belong (in DPIO) such that other accelerators can make use of them. While here, fix the values of FD_CTRL_FSE and FD_CTRL_FAERR, which were shifted off by one bit. Signed-off-by: Horia Geantă--- drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 8 +++- drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 19 +-- drivers/staging/fsl-mc/include/dpaa2-fd.h | 12 3 files changed, 20 insertions(+), 19 deletions(-) diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c index b9a0a315e6fb..a1d5c371e1c4 100644 --- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c +++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c @@ -410,8 +410,7 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv, dpaa2_fd_set_format(fd, dpaa2_fd_sg); dpaa2_fd_set_addr(fd, addr); dpaa2_fd_set_len(fd, skb->len); - dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_ASAL | DPAA2_FD_CTRL_PTA | - DPAA2_FD_CTRL_PTV1); + dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_ASAL | FD_CTRL_PTA | FD_CTRL_PTV1); return 0; @@ -464,8 +463,7 @@ static int build_single_fd(struct dpaa2_eth_priv *priv, dpaa2_fd_set_offset(fd, (u16)(skb->data - buffer_start)); dpaa2_fd_set_len(fd, skb->len); dpaa2_fd_set_format(fd, dpaa2_fd_single); - dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_ASAL | DPAA2_FD_CTRL_PTA | - DPAA2_FD_CTRL_PTV1); + dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_ASAL | FD_CTRL_PTA | FD_CTRL_PTV1); return 0; } @@ -653,7 +651,7 @@ static void dpaa2_eth_tx_conf(struct dpaa2_eth_priv *priv, /* We only check error bits in the FAS field if corresponding * FAERR bit is set in FD and the FAS field is marked as valid */ - has_fas_errors = (fd_errors & DPAA2_FD_CTRL_FAERR) && + has_fas_errors = (fd_errors & FD_CTRL_FAERR) && !!(dpaa2_fd_get_frc(fd) & DPAA2_FD_FRC_FASV); if 
(net_ratelimit()) netdev_dbg(priv->net_dev, "TX frame FD error: %x08\n", diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h index e6d28a249fc1..dfbb60b1 100644 --- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h +++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h @@ -120,23 +120,14 @@ struct dpaa2_eth_swa { #define DPAA2_FD_FRC_FASWOV0x0800 #define DPAA2_FD_FRC_FAICFDV 0x0400 -/* Error bits in FD CTRL */ -#define DPAA2_FD_CTRL_UFD 0x0004 -#define DPAA2_FD_CTRL_SBE 0x0008 -#define DPAA2_FD_CTRL_FSE 0x0010 -#define DPAA2_FD_CTRL_FAERR0x0020 - -#define DPAA2_FD_RX_ERR_MASK (DPAA2_FD_CTRL_SBE | \ -DPAA2_FD_CTRL_FAERR) -#define DPAA2_FD_TX_ERR_MASK (DPAA2_FD_CTRL_UFD | \ -DPAA2_FD_CTRL_SBE | \ -DPAA2_FD_CTRL_FSE | \ -DPAA2_FD_CTRL_FAERR) +#define DPAA2_FD_RX_ERR_MASK (FD_CTRL_SBE | FD_CTRL_FAERR) +#define DPAA2_FD_TX_ERR_MASK (FD_CTRL_UFD| \ +FD_CTRL_SBE| \ +FD_CTRL_FSE| \ +FD_CTRL_FAERR) /* Annotation bits in FD CTRL */ #define DPAA2_FD_CTRL_ASAL 0x0002 /* ASAL = 128 */ -#define DPAA2_FD_CTRL_PTA 0x0080 -#define DPAA2_FD_CTRL_PTV1 0x0040 /* Frame annotation status */ struct dpaa2_fas { diff --git a/drivers/staging/fsl-mc/include/dpaa2-fd.h b/drivers/staging/fsl-mc/include/dpaa2-fd.h index 992fdc7ba5b8..72328415c26d 100644 --- a/drivers/staging/fsl-mc/include/dpaa2-fd.h +++ b/drivers/staging/fsl-mc/include/dpaa2-fd.h @@ -101,6 +101,18 @@ struct dpaa2_fd { #define FL_FINAL_FLAG_MASK 0x1 #define FL_FINAL_FLAG_SHIFT15 +/* Error bits in FD CTRL */ +#define FD_CTRL_ERR_MASK 0x00FF +#define FD_CTRL_UFD0x0004 +#define FD_CTRL_SBE0x0008 +#define FD_CTRL_FLC0x0010 +#define FD_CTRL_FSE0x0020 +#define FD_CTRL_FAERR 0x0040 + +/* Annotation bits in FD CTRL */ +#define FD_CTRL_PTA0x0080 +#define FD_CTRL_PTV1 0x0040 + enum dpaa2_fd_format { dpaa2_fd_single = 0, dpaa2_fd_list, -- 2.12.0.264.gd6db3f216544
[RFC PATCH 01/10] staging: fsl-mc: dpio: add frame list format support
Add support for dpaa2_fd_list format, i.e. dpaa2_fl_entry structure and accessors. Frame list entries (FLEs) are similar, but not identical to frame descriptors (FDs): + "F" (final) bit - FMT[b'01] is reserved - DD, SC, DROPP bits (covered by "FD compatibility" field in FLE case) - FLC[5:0] not used for stashing Signed-off-by: Horia Geantă--- drivers/staging/fsl-mc/include/dpaa2-fd.h | 243 ++ 1 file changed, 243 insertions(+) diff --git a/drivers/staging/fsl-mc/include/dpaa2-fd.h b/drivers/staging/fsl-mc/include/dpaa2-fd.h index cf7857f00a5c..992fdc7ba5b8 100644 --- a/drivers/staging/fsl-mc/include/dpaa2-fd.h +++ b/drivers/staging/fsl-mc/include/dpaa2-fd.h @@ -91,6 +91,15 @@ struct dpaa2_fd { #define SG_BPID_MASK 0x3FFF #define SG_FINAL_FLAG_MASK 0x1 #define SG_FINAL_FLAG_SHIFT15 +#define FL_SHORT_LEN_FLAG_MASK 0x1 +#define FL_SHORT_LEN_FLAG_SHIFT14 +#define FL_SHORT_LEN_MASK 0x3 +#define FL_OFFSET_MASK 0x0FFF +#define FL_FORMAT_MASK 0x3 +#define FL_FORMAT_SHIFT12 +#define FL_BPID_MASK 0x3FFF +#define FL_FINAL_FLAG_MASK 0x1 +#define FL_FINAL_FLAG_SHIFT15 enum dpaa2_fd_format { dpaa2_fd_single = 0, @@ -448,4 +457,238 @@ static inline void dpaa2_sg_set_final(struct dpaa2_sg_entry *sg, bool final) sg->format_offset |= cpu_to_le16(final << SG_FINAL_FLAG_SHIFT); } +/** + * struct dpaa2_fl_entry - structure for frame list entry. + * @addr: address in the FLE + * @len: length in the FLE + * @bpid: buffer pool ID + * @format_offset: format, offset, and short-length fields + * @frc: frame context + * @ctrl: control bits...including pta, pvt1, pvt2, err, etc + * @flc: flow context address + */ +struct dpaa2_fl_entry { + __le64 addr; + __le32 len; + __le16 bpid; + __le16 format_offset; + __le32 frc; + __le32 ctrl; + __le64 flc; +}; + +enum dpaa2_fl_format { + dpaa2_fl_single = 0, + dpaa2_fl_res, + dpaa2_fl_sg +}; + +/** + * dpaa2_fl_get_addr() - get the addr field of FLE + * @fle: the given frame list entry + * + * Return the address in the frame list entry. 
+ */
+static inline dma_addr_t dpaa2_fl_get_addr(const struct dpaa2_fl_entry *fle)
+{
+	return (dma_addr_t)le64_to_cpu(fle->addr);
+}
+
+/**
+ * dpaa2_fl_set_addr() - Set the addr field of FLE
+ * @fle: the given frame list entry
+ * @addr: the address needs to be set in frame list entry
+ */
+static inline void dpaa2_fl_set_addr(struct dpaa2_fl_entry *fle,
+				     dma_addr_t addr)
+{
+	fle->addr = cpu_to_le64(addr);
+}
+
+/**
+ * dpaa2_fl_get_frc() - Get the frame context in the FLE
+ * @fle: the given frame list entry
+ *
+ * Return the frame context field in the frame list entry.
+ */
+static inline u32 dpaa2_fl_get_frc(const struct dpaa2_fl_entry *fle)
+{
+	return le32_to_cpu(fle->frc);
+}
+
+/**
+ * dpaa2_fl_set_frc() - Set the frame context in the FLE
+ * @fle: the given frame list entry
+ * @frc: the frame context needs to be set in frame list entry
+ */
+static inline void dpaa2_fl_set_frc(struct dpaa2_fl_entry *fle, u32 frc)
+{
+	fle->frc = cpu_to_le32(frc);
+}
+
+/**
+ * dpaa2_fl_get_ctrl() - Get the control bits in the FLE
+ * @fle: the given frame list entry
+ *
+ * Return the control bits field in the frame list entry.
+ */
+static inline u32 dpaa2_fl_get_ctrl(const struct dpaa2_fl_entry *fle)
+{
+	return le32_to_cpu(fle->ctrl);
+}
+
+/**
+ * dpaa2_fl_set_ctrl() - Set the control bits in the FLE
+ * @fle: the given frame list entry
+ * @ctrl: the control bits to be set in the frame list entry
+ */
+static inline void dpaa2_fl_set_ctrl(struct dpaa2_fl_entry *fle, u32 ctrl)
+{
+	fle->ctrl = cpu_to_le32(ctrl);
+}
+
+/**
+ * dpaa2_fl_get_flc() - Get the flow context in the FLE
+ * @fle: the given frame list entry
+ *
+ * Return the flow context in the frame list entry.
+ */ +static inline dma_addr_t dpaa2_fl_get_flc(const struct dpaa2_fl_entry *fle) +{ + return (dma_addr_t)le64_to_cpu(fle->flc); +} + +/** + * dpaa2_fl_set_flc() - Set the flow context field of FLE + * @fle: the given frame list entry + * @flc_addr: the flow context needs to be set in frame list entry + */ +static inline void dpaa2_fl_set_flc(struct dpaa2_fl_entry *fle, + dma_addr_t flc_addr) +{ + fle->flc = cpu_to_le64(flc_addr); +} + +static inline bool dpaa2_fl_short_len(const struct dpaa2_fl_entry *fle) +{ + return !!((le16_to_cpu(fle->format_offset) >> + FL_SHORT_LEN_FLAG_SHIFT) & FL_SHORT_LEN_FLAG_MASK); +} + +/** + * dpaa2_fl_get_len() - Get the length in the FLE + * @fle: the given frame list entry + * + * Return the length field in the frame list entry. + */ +static inline u32 dpaa2_fl_get_len(const struct dpaa2_fl_entry *fle) +{ + if (dpaa2_fl_short_len(fle)) +
[RFC PATCH 00/10] crypto: caam - add DPAA2 (DPSECI) driver
Hi, This patch set adds the CAAM crypto engine driver for DPAA2 (Data Path Acceleration Architecture v2) found on ARMv8-based SoCs like LS1088A, LS2088A. Driver consists of: -DPSECI (Data Path SEC Interface) backend - low-level API that allows to manage DPSECI devices (DPAA2 objects) that sit on the Management Complex (MC) fsl-mc bus -algorithms frontend - AEAD and ablkcipher algorithms implementation Patches 1-4 include DPIO object dependencies. I am aware that DPIO is currently in staging, however I don't consider these to be a large feature set. Anyhow, please let me know if going with the patches through staging is acceptable. Patches 5-9 are the core of the patch set, adding the driver. For symmetric encryption the legacy ablkcipher interface is used; the plan is to convert to skcipher all CAAM frontends at once at a certain point in time. Patch 10 enables driver on arm64. It will be built only if dependency on DPIO (CONFIG_FSL_MC_DPIO) is satisfied. Thanks, Horia Horia Geantă (9): staging: fsl-mc: dpio: add frame list format support staging: fsl-mc: dpio: add congestion notification support staging: fsl-dpaa2/eth: move generic FD defines to DPIO crypto: caam/qi - prepare for gcm(aes) support crypto: caam - add DPAA2-CAAM (DPSECI) backend API crypto: caam - add Queue Interface v2 error codes crypto: caam/qi2 - add DPAA2-CAAM driver crypto: caam/qi2 - add ablkcipher algorithms arm64: defconfig: enable CAAM crypto engine on QorIQ DPAA2 SoCs Radu Alexe (1): staging: fsl-mc: dpio: add order preservation support arch/arm64/configs/defconfig |1 + drivers/crypto/Makefile|2 +- drivers/crypto/caam/Kconfig| 57 +- drivers/crypto/caam/Makefile |9 +- drivers/crypto/caam/caamalg.c | 19 +- drivers/crypto/caam/caamalg_desc.c | 165 +- drivers/crypto/caam/caamalg_desc.h | 24 +- drivers/crypto/caam/caamalg_qi2.c | 3949 drivers/crypto/caam/caamalg_qi2.h | 243 ++ drivers/crypto/caam/compat.h |1 + drivers/crypto/caam/dpseci.c | 858 + drivers/crypto/caam/dpseci.h | 395 +++ 
drivers/crypto/caam/dpseci_cmd.h | 261 ++ drivers/crypto/caam/error.c| 75 +- drivers/crypto/caam/error.h|6 +- drivers/crypto/caam/key_gen.c | 30 - drivers/crypto/caam/key_gen.h | 30 + drivers/crypto/caam/regs.h |2 + drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c |8 +- drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 19 +- drivers/staging/fsl-mc/include/dpaa2-fd.h | 255 ++ drivers/staging/fsl-mc/include/dpaa2-io.h | 43 + drivers/staging/fsl-mc/include/dpopr.h | 110 + 23 files changed, 6463 insertions(+), 99 deletions(-) create mode 100644 drivers/crypto/caam/caamalg_qi2.c create mode 100644 drivers/crypto/caam/caamalg_qi2.h create mode 100644 drivers/crypto/caam/dpseci.c create mode 100644 drivers/crypto/caam/dpseci.h create mode 100644 drivers/crypto/caam/dpseci_cmd.h create mode 100644 drivers/staging/fsl-mc/include/dpopr.h -- 2.12.0.264.gd6db3f216544
[RFC PATCH 09/10] crypto: caam/qi2 - add ablkcipher algorithms
Add support to submit the following ablkcipher algorithms via the DPSECI backend: cbc({aes,des,des3_ede}) ctr(aes), rfc3686(ctr(aes)) xts(aes) Signed-off-by: Horia Geantă--- drivers/crypto/caam/Kconfig | 1 + drivers/crypto/caam/caamalg_qi2.c | 816 ++ drivers/crypto/caam/caamalg_qi2.h | 23 +- 3 files changed, 839 insertions(+), 1 deletion(-) diff --git a/drivers/crypto/caam/Kconfig b/drivers/crypto/caam/Kconfig index e45d39d9007e..eb202e59c4fa 100644 --- a/drivers/crypto/caam/Kconfig +++ b/drivers/crypto/caam/Kconfig @@ -159,6 +159,7 @@ config CRYPTO_DEV_FSL_DPAA2_CAAM tristate "QorIQ DPAA2 CAAM (DPSECI) driver" depends on FSL_MC_DPIO select CRYPTO_DEV_FSL_CAAM_COMMON + select CRYPTO_BLKCIPHER select CRYPTO_AUTHENC select CRYPTO_AEAD ---help--- diff --git a/drivers/crypto/caam/caamalg_qi2.c b/drivers/crypto/caam/caamalg_qi2.c index 9dc5e1184e80..f32c518bc680 100644 --- a/drivers/crypto/caam/caamalg_qi2.c +++ b/drivers/crypto/caam/caamalg_qi2.c @@ -1047,6 +1047,457 @@ static int rfc4543_setkey(struct crypto_aead *aead, return ret; } +static int ablkcipher_setkey(struct crypto_ablkcipher *ablkcipher, +const u8 *key, unsigned int keylen) +{ + struct caam_ctx *ctx = crypto_ablkcipher_ctx(ablkcipher); + struct crypto_tfm *tfm = crypto_ablkcipher_tfm(ablkcipher); + const char *alg_name = crypto_tfm_alg_name(tfm); + struct device *dev = ctx->dev; + struct caam_flc *flc; + unsigned int ivsize = crypto_ablkcipher_ivsize(ablkcipher); + u32 *desc; + u32 ctx1_iv_off = 0; + const bool ctr_mode = ((ctx->cdata.algtype & OP_ALG_AAI_MASK) == + OP_ALG_AAI_CTR_MOD128); + const bool is_rfc3686 = (ctr_mode && strstr(alg_name, "rfc3686")); + + memcpy(ctx->key, key, keylen); +#ifdef DEBUG + print_hex_dump(KERN_ERR, "key in @" __stringify(__LINE__)": ", + DUMP_PREFIX_ADDRESS, 16, 4, key, keylen, 1); +#endif + /* +* AES-CTR needs to load IV in CONTEXT1 reg +* at an offset of 128bits (16bytes) +* CONTEXT1[255:128] = IV +*/ + if (ctr_mode) + ctx1_iv_off = 16; + + /* +* RFC3686 specific: +* | 
CONTEXT1[255:128] = {NONCE, IV, COUNTER} +* | *key = {KEY, NONCE} +*/ + if (is_rfc3686) { + ctx1_iv_off = 16 + CTR_RFC3686_NONCE_SIZE; + keylen -= CTR_RFC3686_NONCE_SIZE; + } + + ctx->key_dma = dma_map_single(dev, ctx->key, keylen, DMA_TO_DEVICE); + if (dma_mapping_error(dev, ctx->key_dma)) { + dev_err(dev, "unable to map key i/o memory\n"); + return -ENOMEM; + } + ctx->cdata.keylen = keylen; + ctx->cdata.key_virt = ctx->key; + ctx->cdata.key_inline = true; + + /* ablkcipher_encrypt shared descriptor */ + flc = >flc[ENCRYPT]; + desc = flc->sh_desc; + + cnstr_shdsc_ablkcipher_encap(desc, >cdata, ivsize, +is_rfc3686, ctx1_iv_off); + + flc->flc[1] = desc_len(desc); /* SDL */ + flc->flc_dma = dma_map_single(dev, flc, sizeof(flc->flc) + + desc_bytes(desc), DMA_TO_DEVICE); + if (dma_mapping_error(dev, flc->flc_dma)) { + dev_err(dev, "unable to map shared descriptor\n"); + return -ENOMEM; + } + + /* ablkcipher_decrypt shared descriptor */ + flc = >flc[DECRYPT]; + desc = flc->sh_desc; + + cnstr_shdsc_ablkcipher_decap(desc, >cdata, ivsize, +is_rfc3686, ctx1_iv_off); + + flc->flc[1] = desc_len(desc); /* SDL */ + flc->flc_dma = dma_map_single(dev, flc, sizeof(flc->flc) + + desc_bytes(desc), DMA_TO_DEVICE); + if (dma_mapping_error(dev, flc->flc_dma)) { + dev_err(dev, "unable to map shared descriptor\n"); + return -ENOMEM; + } + + /* ablkcipher_givencrypt shared descriptor */ + flc = >flc[GIVENCRYPT]; + desc = flc->sh_desc; + + cnstr_shdsc_ablkcipher_givencap(desc, >cdata, + ivsize, is_rfc3686, ctx1_iv_off); + + flc->flc[1] = desc_len(desc); /* SDL */ + flc->flc_dma = dma_map_single(dev, flc, sizeof(flc->flc) + + desc_bytes(desc), DMA_TO_DEVICE); + if (dma_mapping_error(dev, flc->flc_dma)) { + dev_err(dev, "unable to map shared descriptor\n"); + return -ENOMEM; + } + + return 0; +} + +static int xts_ablkcipher_setkey(struct crypto_ablkcipher *ablkcipher, +const u8 *key, unsigned int keylen) +{ + struct caam_ctx *ctx = crypto_ablkcipher_ctx(ablkcipher); + struct device *dev =
[RFC PATCH 07/10] crypto: caam - add Queue Interface v2 error codes
Add support to translate error codes returned by QI v2, i.e. Queue Interface present on DataPath Acceleration Architecture v2 (DPAA2).

Signed-off-by: Horia Geantă
---
 drivers/crypto/caam/error.c | 75 +++--
 drivers/crypto/caam/error.h |  6 +++-
 drivers/crypto/caam/regs.h  |  2 ++
 3 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/caam/error.c b/drivers/crypto/caam/error.c
index 3d639f3b45aa..65756bab800f 100644
--- a/drivers/crypto/caam/error.c
+++ b/drivers/crypto/caam/error.c
@@ -107,6 +107,54 @@ static const struct {
 	{ 0xF1, "3GPP HFN matches or exceeds the Threshold" },
 };

+static const struct {
+	u8 value;
+	const char *error_text;
+} qi_error_list[] = {
+	{ 0x1F, "Job terminated by FQ or ICID flush" },
+	{ 0x20, "FD format error"},
+	{ 0x21, "FD command format error"},
+	{ 0x23, "FL format error"},
+	{ 0x25, "CRJD specified in FD, but not enabled in FLC"},
+	{ 0x30, "Max. buffer size too small"},
+	{ 0x31, "DHR exceeds max. buffer size (allocate mode, S/G format)"},
+	{ 0x32, "SGT exceeds max. buffer size (allocate mode, S/G format)"},
+	{ 0x33, "Size over/underflow (allocate mode)"},
+	{ 0x34, "Size over/underflow (reuse mode)"},
+	{ 0x35, "Length exceeds max. short length (allocate mode, S/G format)"},
+	{ 0x36, "Memory footprint exceeds max. value (allocate mode, S/G format)"},
+	{ 0x41, "SBC frame format not supported (allocate mode)"},
+	{ 0x42, "Pool 0 invalid / pool 1 size < pool 0 size (allocate mode)"},
+	{ 0x43, "Annotation output enabled but ASAR = 0 (allocate mode)"},
+	{ 0x44, "Unsupported or reserved frame format or SGHR = 1 (reuse mode)"},
+	{ 0x45, "DHR correction underflow (reuse mode, single buffer format)"},
+	{ 0x46, "Annotation length exceeds offset (reuse mode)"},
+	{ 0x48, "Annotation output enabled but ASA limited by ASAR (reuse mode)"},
+	{ 0x49, "Data offset correction exceeds input frame data length (reuse mode)"},
+	{ 0x4B, "Annotation output enabled but ASA cannot be expanded (frame list)"},
+	{ 0x51, "Unsupported IF reuse mode"},
+	{ 0x52, "Unsupported FL use mode"},
+	{ 0x53, "Unsupported RJD use mode"},
+	{ 0x54, "Unsupported inline descriptor use mode"},
+	{ 0xC0, "Table buffer pool 0 depletion"},
+	{ 0xC1, "Table buffer pool 1 depletion"},
+	{ 0xC2, "Data buffer pool 0 depletion, no OF allocated"},
+	{ 0xC3, "Data buffer pool 1 depletion, no OF allocated"},
+	{ 0xC4, "Data buffer pool 0 depletion, partial OF allocated"},
+	{ 0xC5, "Data buffer pool 1 depletion, partial OF allocated"},
+	{ 0xD0, "FLC read error"},
+	{ 0xD1, "FL read error"},
+	{ 0xD2, "FL write error"},
+	{ 0xD3, "OF SGT write error"},
+	{ 0xD4, "PTA read error"},
+	{ 0xD5, "PTA write error"},
+	{ 0xD6, "OF SGT F-bit write error"},
+	{ 0xD7, "ASA write error"},
+	{ 0xE1, "FLC[ICR]=0 ICID error"},
+	{ 0xE2, "FLC[ICR]=1 ICID error"},
+	{ 0xE4, "source of ICID flush not trusted (BDI = 0)"},
+};
+
 static const char * const cha_id_list[] = {
 	"",
 	"AES",
@@ -235,6 +283,27 @@ static void report_deco_status(struct device *jrdev, const u32 status,
 		status, error, idx_str, idx, err_str, err_err_code);
 }

+static void report_qi_status(struct device *qidev, const u32 status,
+			     const char *error)
+{
+	u8 err_id = status & JRSTA_QIERR_ERROR_MASK;
+	const char *err_str = "unidentified error value 0x";
+	char err_err_code[3] = { 0 };
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(qi_error_list); i++)
+		if (qi_error_list[i].value == err_id)
+			break;
+
+	if (i != ARRAY_SIZE(qi_error_list) && qi_error_list[i].error_text)
+		err_str = qi_error_list[i].error_text;
+	else
+		snprintf(err_err_code, sizeof(err_err_code), "%02x", err_id);
+
+	dev_err(qidev, "%08x: %s: %s%s\n",
+		status, error, err_str, err_err_code);
+}
+
 static void report_jr_status(struct device *jrdev, const u32 status,
 			     const char *error)
 {
@@ -249,7 +318,7 @@ static void report_cond_code_status(struct device *jrdev, const u32 status,
 		status, error, __func__);
 }

-void caam_jr_strstatus(struct device *jrdev, u32 status)
+void caam_strstatus(struct device *jrdev, u32 status, bool qi_v2)
 {
 	static const struct stat_src {
 		void (*report_ssed)(struct device *jrdev, const u32 status,
@@ -261,7 +330,7 @@ void caam_jr_strstatus(struct device *jrdev, u32 status)
 	{ report_ccb_status, "CCB" },
 	{ report_jump_status, "Jump" },
 	{ report_deco_status, "DECO" },
-	{ NULL, "Queue Manager
[RFC PATCH 10/10] arm64: defconfig: enable CAAM crypto engine on QorIQ DPAA2 SoCs
Enable CAAM (Cryptographic Accelerator and Assurance Module) driver for QorIQ Data Path Acceleration Architecture (DPAA) v2. It handles DPSECI (Data Path SEC Interface) DPAA2 objects that sit on the Management Complex (MC) fsl-mc bus. Signed-off-by: Horia Geantă--- arch/arm64/configs/defconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index 6c7d147eed54..43455ad6fff5 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -561,6 +561,7 @@ CONFIG_MEMTEST=y CONFIG_SECURITY=y CONFIG_CRYPTO_ECHAINIV=y CONFIG_CRYPTO_ANSI_CPRNG=y +CONFIG_CRYPTO_DEV_FSL_DPAA2_CAAM=y CONFIG_ARM64_CRYPTO=y CONFIG_CRYPTO_SHA1_ARM64_CE=y CONFIG_CRYPTO_SHA2_ARM64_CE=y -- 2.12.0.264.gd6db3f216544
[RFC PATCH 06/10] crypto: caam - add DPAA2-CAAM (DPSECI) backend API
Add the low-level API that allows to manage DPSECI DPAA2 objects that sit on the Management Complex (MC) fsl-mc bus. The API is compatible with MC firmware 10.2.0+. Signed-off-by: Horia Geantă--- drivers/crypto/caam/dpseci.c | 858 +++ drivers/crypto/caam/dpseci.h | 395 ++ drivers/crypto/caam/dpseci_cmd.h | 261 3 files changed, 1514 insertions(+) create mode 100644 drivers/crypto/caam/dpseci.c create mode 100644 drivers/crypto/caam/dpseci.h create mode 100644 drivers/crypto/caam/dpseci_cmd.h diff --git a/drivers/crypto/caam/dpseci.c b/drivers/crypto/caam/dpseci.c new file mode 100644 index ..dec05ecbeab1 --- /dev/null +++ b/drivers/crypto/caam/dpseci.c @@ -0,0 +1,858 @@ +/* + * Copyright 2013-2016 Freescale Semiconductor Inc. + * Copyright 2017 NXP + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the names of the above-listed copyright holders nor the + * names of any contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +#include "../../../drivers/staging/fsl-mc/include/mc.h" +#include "../../../drivers/staging/fsl-mc/include/dpopr.h" +#include "dpseci.h" +#include "dpseci_cmd.h" + +/** + * dpseci_open() - Open a control session for the specified object + * @mc_io: Pointer to MC portal's I/O object + * @cmd_flags: Command flags; one or more of 'MC_CMD_FLAG_' + * @dpseci_id: DPSECI unique ID + * @token: Returned token; use in subsequent API calls + * + * This function can be used to open a control session for an already created + * object; an object may have been declared in the DPL or by calling the + * dpseci_create() function. + * This function returns a unique authentication token, associated with the + * specific object ID and the specific MC portal; this token must be used in all + * subsequent commands for this specific object. 
+ * + * Return: '0' on success, error code otherwise + */ +int dpseci_open(struct fsl_mc_io *mc_io, u32 cmd_flags, int dpseci_id, + u16 *token) +{ + struct mc_command cmd = { 0 }; + struct dpseci_cmd_open *cmd_params; + int err; + + cmd.header = mc_encode_cmd_header(DPSECI_CMDID_OPEN, + cmd_flags, + 0); + cmd_params = (struct dpseci_cmd_open *)cmd.params; + cmd_params->dpseci_id = cpu_to_le32(dpseci_id); + err = mc_send_command(mc_io, ); + if (err) + return err; + + *token = mc_cmd_hdr_read_token(); + + return 0; +} + +/** + * dpseci_close() - Close the control session of the object + * @mc_io: Pointer to MC portal's I/O object + * @cmd_flags: Command flags; one or more of 'MC_CMD_FLAG_' + * @token: Token of DPSECI object + * + * After this function is called, no further operations are allowed on the + * object without opening a new control session. + * + * Return: '0' on success, error code otherwise + */ +int dpseci_close(struct fsl_mc_io *mc_io, u32 cmd_flags, u16 token) +{ + struct mc_command cmd = { 0 }; + + cmd.header = mc_encode_cmd_header(DPSECI_CMDID_CLOSE, + cmd_flags, + token); + return mc_send_command(mc_io, ); +} + +/** + * dpseci_create() - Create
[RFC PATCH 08/10] crypto: caam/qi2 - add DPAA2-CAAM driver
Add CAAM driver that works using the DPSECI backend, i.e. manages DPSECI DPAA2 objects sitting on the Management Complex (MC) fsl-mc bus. Data transfers (crypto requests) are sent/received to/from CAAM crypto engine via Queue Interface (v2), this being similar to existing caam/qi. OTOH, configuration/setup (obtaining virtual queue IDs, authorization etc.) is done by sending commands to the MC f/w. Note that the CAAM accelerator included in DPAA2 platforms still has Job Rings. However, the driver being added does not handle access via this backend. Kconfig & Makefile are updated such that DPAA2-CAAM (a.k.a. "caam/qi2") driver does not depend on caam/jr or caam/qi backends - which rely on platform bus support (ctrl.c). Support for the following aead and authenc algorithms is also added in this patch: -aead: gcm(aes) rfc4106(gcm(aes)) rfc4543(gcm(aes)) -authenc: authenc(hmac({md5,sha*}),cbc({aes,des,des3_ede})) echainiv(authenc(hmac({md5,sha*}),cbc({aes,des,des3_ede}))) authenc(hmac({md5,sha*}),rfc3686(ctr(aes)) seqiv(authenc(hmac({md5,sha*}),rfc3686(ctr(aes))) Signed-off-by: Horia Geantă--- drivers/crypto/Makefile |2 +- drivers/crypto/caam/Kconfig | 56 +- drivers/crypto/caam/Makefile |9 +- drivers/crypto/caam/caamalg_qi2.c | 3133 + drivers/crypto/caam/caamalg_qi2.h | 222 +++ drivers/crypto/caam/compat.h |1 + drivers/crypto/caam/key_gen.c | 30 - drivers/crypto/caam/key_gen.h | 30 + 8 files changed, 3432 insertions(+), 51 deletions(-) create mode 100644 drivers/crypto/caam/caamalg_qi2.c create mode 100644 drivers/crypto/caam/caamalg_qi2.h diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index b12eb3c99430..50c3436611f1 100644 --- a/drivers/crypto/Makefile +++ b/drivers/crypto/Makefile @@ -9,7 +9,7 @@ obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/ obj-$(CONFIG_CRYPTO_DEV_CPT) += cavium/cpt/ obj-$(CONFIG_CRYPTO_DEV_NITROX) += cavium/nitrox/ obj-$(CONFIG_CRYPTO_DEV_EXYNOS_RNG) += exynos-rng.o -obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/ 
+obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_COMMON) += caam/ obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o obj-$(CONFIG_CRYPTO_DEV_IMGTEC_HASH) += img-hash.o diff --git a/drivers/crypto/caam/Kconfig b/drivers/crypto/caam/Kconfig index e36aeacd7635..e45d39d9007e 100644 --- a/drivers/crypto/caam/Kconfig +++ b/drivers/crypto/caam/Kconfig @@ -1,6 +1,10 @@ +config CRYPTO_DEV_FSL_CAAM_COMMON + tristate + config CRYPTO_DEV_FSL_CAAM - tristate "Freescale CAAM-Multicore driver backend" + tristate "Freescale CAAM-Multicore platform driver backend" depends on FSL_SOC || ARCH_MXC || ARCH_LAYERSCAPE + select CRYPTO_DEV_FSL_CAAM_COMMON help Enables the driver module for Freescale's Cryptographic Accelerator and Assurance Module (CAAM), also known as the SEC version 4 (SEC4). @@ -11,9 +15,19 @@ config CRYPTO_DEV_FSL_CAAM To compile this driver as a module, choose M here: the module will be called caam. +if CRYPTO_DEV_FSL_CAAM + +config CRYPTO_DEV_FSL_CAAM_IMX + def_bool SOC_IMX6 || SOC_IMX7D + +config CRYPTO_DEV_FSL_CAAM_DEBUG + bool "Enable debug output in CAAM driver" + help + Selecting this will enable printing of various debug + information in the CAAM driver. + config CRYPTO_DEV_FSL_CAAM_JR tristate "Freescale CAAM Job Ring driver backend" - depends on CRYPTO_DEV_FSL_CAAM default y help Enables the driver module for Job Rings which are part of @@ -24,9 +38,10 @@ config CRYPTO_DEV_FSL_CAAM_JR To compile this driver as a module, choose M here: the module will be called caam_jr. +if CRYPTO_DEV_FSL_CAAM_JR + config CRYPTO_DEV_FSL_CAAM_RINGSIZE int "Job Ring size" - depends on CRYPTO_DEV_FSL_CAAM_JR range 2 9 default "9" help @@ -44,7 +59,6 @@ config CRYPTO_DEV_FSL_CAAM_RINGSIZE config CRYPTO_DEV_FSL_CAAM_INTC bool "Job Ring interrupt coalescing" - depends on CRYPTO_DEV_FSL_CAAM_JR help Enable the Job Ring's interrupt coalescing feature. 
@@ -74,7 +88,6 @@ config CRYPTO_DEV_FSL_CAAM_INTC_TIME_THLD config CRYPTO_DEV_FSL_CAAM_CRYPTO_API tristate "Register algorithm implementations with the Crypto API" - depends on CRYPTO_DEV_FSL_CAAM_JR default y select CRYPTO_AEAD select CRYPTO_AUTHENC @@ -89,7 +102,7 @@ config CRYPTO_DEV_FSL_CAAM_CRYPTO_API config CRYPTO_DEV_FSL_CAAM_CRYPTO_API_QI tristate "Queue Interface as Crypto API backend" - depends on CRYPTO_DEV_FSL_CAAM_JR && FSL_DPAA && NET + depends on FSL_DPAA && NET default y select CRYPTO_AUTHENC select CRYPTO_BLKCIPHER @@ -106,7 +119,6 @@ config CRYPTO_DEV_FSL_CAAM_CRYPTO_API_QI config CRYPTO_DEV_FSL_CAAM_AHASH_API
[RFC PATCH 05/10] crypto: caam/qi - prepare for gcm(aes) support
Update gcm(aes) descriptors (generic, rfc4106 and rfc4543) such that they would also work when submitted via the QI interface.

Signed-off-by: Horia Geantă
---
 drivers/crypto/caam/caamalg.c      |  19 +++--
 drivers/crypto/caam/caamalg_desc.c | 165 ++---
 drivers/crypto/caam/caamalg_desc.h |  24 --
 3 files changed, 183 insertions(+), 25 deletions(-)

diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index e0d1b5c3c1ba..94e12ec8141c 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -290,6 +290,7 @@ static int gcm_set_sh_desc(struct crypto_aead *aead)
 {
 	struct caam_ctx *ctx = crypto_aead_ctx(aead);
 	struct device *jrdev = ctx->jrdev;
+	unsigned int ivsize = crypto_aead_ivsize(aead);
 	u32 *desc;
 	int rem_bytes = CAAM_DESC_BYTES_MAX - GCM_DESC_JOB_IO_LEN -
 			ctx->cdata.keylen;
@@ -311,7 +312,7 @@ static int gcm_set_sh_desc(struct crypto_aead *aead)
 	}

 	desc = ctx->sh_desc_enc;
-	cnstr_shdsc_gcm_encap(desc, &ctx->cdata, ctx->authsize);
+	cnstr_shdsc_gcm_encap(desc, &ctx->cdata, ivsize, ctx->authsize, false);
 	dma_sync_single_for_device(jrdev, ctx->sh_desc_enc_dma,
 				   desc_bytes(desc), DMA_TO_DEVICE);
@@ -328,7 +329,7 @@ static int gcm_set_sh_desc(struct crypto_aead *aead)
 	}

 	desc = ctx->sh_desc_dec;
-	cnstr_shdsc_gcm_decap(desc, &ctx->cdata, ctx->authsize);
+	cnstr_shdsc_gcm_decap(desc, &ctx->cdata, ivsize, ctx->authsize, false);
 	dma_sync_single_for_device(jrdev, ctx->sh_desc_dec_dma,
 				   desc_bytes(desc), DMA_TO_DEVICE);
@@ -349,6 +350,7 @@ static int rfc4106_set_sh_desc(struct crypto_aead *aead)
 {
 	struct caam_ctx *ctx = crypto_aead_ctx(aead);
 	struct device *jrdev = ctx->jrdev;
+	unsigned int ivsize = crypto_aead_ivsize(aead);
 	u32 *desc;
 	int rem_bytes = CAAM_DESC_BYTES_MAX - GCM_DESC_JOB_IO_LEN -
 			ctx->cdata.keylen;
@@ -370,7 +372,8 @@ static int rfc4106_set_sh_desc(struct crypto_aead *aead)
 	}

 	desc = ctx->sh_desc_enc;
-	cnstr_shdsc_rfc4106_encap(desc, &ctx->cdata, ctx->authsize);
+	cnstr_shdsc_rfc4106_encap(desc, &ctx->cdata, ivsize, ctx->authsize,
+				  false);
 	dma_sync_single_for_device(jrdev, ctx->sh_desc_enc_dma,
 				   desc_bytes(desc), DMA_TO_DEVICE);
@@ -387,7 +390,8 @@ static int rfc4106_set_sh_desc(struct crypto_aead *aead)
 	}

 	desc = ctx->sh_desc_dec;
-	cnstr_shdsc_rfc4106_decap(desc, &ctx->cdata, ctx->authsize);
+	cnstr_shdsc_rfc4106_decap(desc, &ctx->cdata, ivsize, ctx->authsize,
+				  false);
 	dma_sync_single_for_device(jrdev, ctx->sh_desc_dec_dma,
 				   desc_bytes(desc), DMA_TO_DEVICE);
@@ -409,6 +413,7 @@ static int rfc4543_set_sh_desc(struct crypto_aead *aead)
 {
 	struct caam_ctx *ctx = crypto_aead_ctx(aead);
 	struct device *jrdev = ctx->jrdev;
+	unsigned int ivsize = crypto_aead_ivsize(aead);
 	u32 *desc;
 	int rem_bytes = CAAM_DESC_BYTES_MAX - GCM_DESC_JOB_IO_LEN -
 			ctx->cdata.keylen;
@@ -430,7 +435,8 @@ static int rfc4543_set_sh_desc(struct crypto_aead *aead)
 	}

 	desc = ctx->sh_desc_enc;
-	cnstr_shdsc_rfc4543_encap(desc, &ctx->cdata, ctx->authsize);
+	cnstr_shdsc_rfc4543_encap(desc, &ctx->cdata, ivsize, ctx->authsize,
+				  false);
 	dma_sync_single_for_device(jrdev, ctx->sh_desc_enc_dma,
 				   desc_bytes(desc), DMA_TO_DEVICE);
@@ -447,7 +453,8 @@ static int rfc4543_set_sh_desc(struct crypto_aead *aead)
 	}

 	desc = ctx->sh_desc_dec;
-	cnstr_shdsc_rfc4543_decap(desc, &ctx->cdata, ctx->authsize);
+	cnstr_shdsc_rfc4543_decap(desc, &ctx->cdata, ivsize, ctx->authsize,
+				  false);
 	dma_sync_single_for_device(jrdev, ctx->sh_desc_dec_dma,
 				   desc_bytes(desc), DMA_TO_DEVICE);
diff --git a/drivers/crypto/caam/caamalg_desc.c b/drivers/crypto/caam/caamalg_desc.c
index 530c14ee32de..54c6ff2ff975 100644
--- a/drivers/crypto/caam/caamalg_desc.c
+++ b/drivers/crypto/caam/caamalg_desc.c
@@ -587,10 +587,13 @@ EXPORT_SYMBOL(cnstr_shdsc_aead_givencap);
  * @desc: pointer to buffer used for descriptor construction
  * @cdata: pointer to block cipher transform definitions
  *         Valid algorithm values - OP_ALG_ALGSEL_AES ANDed with OP_ALG_AAI_GCM.
+ * @ivsize: initialization vector size
  * @icvsize: integrity check value (ICV) size (truncated or full)
+ * @is_qi: true when called from caam/qi
  */
 void cnstr_shdsc_gcm_encap(u32 * const desc, struct alginfo *cdata,
-			   unsigned int icvsize)
+			   unsigned int ivsize,
[RFC PATCH 02/10] staging: fsl-mc: dpio: add congestion notification support
Add support for Congestion State Change Notifications (CSCN), which allow DPIO users to be notified when a congestion group changes its state (due to hitting the entrance / exit threshold).

Signed-off-by: Ioana Radulescu
Signed-off-by: Radu Alexe
Signed-off-by: Horia Geantă
---
 drivers/staging/fsl-mc/include/dpaa2-io.h | 43 +++
 1 file changed, 43 insertions(+)

diff --git a/drivers/staging/fsl-mc/include/dpaa2-io.h b/drivers/staging/fsl-mc/include/dpaa2-io.h
index 002829cecd75..e7af7d647ab1 100644
--- a/drivers/staging/fsl-mc/include/dpaa2-io.h
+++ b/drivers/staging/fsl-mc/include/dpaa2-io.h
@@ -136,4 +136,47 @@ struct dpaa2_io_store *dpaa2_io_store_create(unsigned int max_frames,
 void dpaa2_io_store_destroy(struct dpaa2_io_store *s);
 struct dpaa2_dq *dpaa2_io_store_next(struct dpaa2_io_store *s, int *is_last);

+/***/
+/* CSCN */
+/***/
+
+/**
+ * struct dpaa2_cscn - The CSCN message format
+ * @verb: identifies the type of message (should be 0x27).
+ * @stat: status bits related to dequeuing response (not used)
+ * @state: bit 0 = 0/1 if CG is no/is congested
+ * @reserved: reserved byte
+ * @cgid: congest grp ID - the first 16 bits
+ * @ctx: context data
+ *
+ * Congestion management can be implemented in software through
+ * the use of Congestion State Change Notifications (CSCN). These
+ * are messages written by DPAA2 hardware to memory whenever the
+ * instantaneous count (I_CNT field in the CG) exceeds the
+ * Congestion State (CS) entrance threshold, signifying congestion
+ * entrance, or when the instantaneous count returns below exit
+ * threshold, signifying congestion exit. The format of the message
+ * is given by the dpaa2_cscn structure. Bit 0 of the state field
+ * represents congestion state written by the hardware.
+ */
+struct dpaa2_cscn {
+	u8 verb;
+	u8 stat;
+	u8 state;
+	u8 reserved;
+	__le32 cgid;
+	__le64 ctx;
+};
+
+#define DPAA2_CSCN_SIZE		64
+#define DPAA2_CSCN_ALIGN	16
+
+#define DPAA2_CSCN_STATE_MASK	0x1
+#define DPAA2_CSCN_CONGESTED	1
+
+static inline bool dpaa2_cscn_state_congested(struct dpaa2_cscn *cscn)
+{
+	return ((cscn->state & DPAA2_CSCN_STATE_MASK) == DPAA2_CSCN_CONGESTED);
+}
+
 #endif /* __FSL_DPAA2_IO_H */
--
2.12.0.264.gd6db3f216544
Re: [PATCH v5 2/5] lib: Add zstd modules
On 08/10/2017 04:30 AM, Eric Biggers wrote:
> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>> The memory reported is the amount of memory the compressor requests.
>>
>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>> |----------|-----------|----------|-------|---------|----------|----------|
>> | none     | 211988480 |    0.100 |     1 | 2119.88 |        - |        - |
>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>
> These benchmarks are misleading because they compress the whole file as a
> single stream without resetting the dictionary, which isn't how data will
> typically be compressed in kernel mode. With filesystem compression the
> data has to be divided into small chunks that can each be decompressed
> independently. That eliminates one of the primary advantages of Zstandard
> (support for large dictionary sizes).

I did btrfs benchmarks of kernel trees and other normal data sets as well. The numbers were in line with what Nick is posting here. zstd is a big win over both lzo and zlib from a btrfs point of view.

It's true Nick's patches only support a single compression level in btrfs, but that's because btrfs doesn't have a way to pass in the compression ratio. It could easily be a mount option; it was just outside the scope of Nick's initial work.

-chris
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 10:57:01AM -0400, Austin S. Hemmelgarn wrote: > Also didn't think to mention this, but I could see the max level > being very popular for use with SquashFS root filesystems used in > LiveCD's. Currently, they have to decide between read performance > and image size, while zstd would provide both. The high compression levels of Zstandard are indeed a great fit for SquashFS, but SquashFS images are created in userspace by squashfs-tools. The kernel only needs to be able to decompress them. (Also, while Zstandard provides very good tradeoffs and will probably become the preferred algorithm for SquashFS, it's misleading to imply that users won't have to make decisions anymore. It does not compress as well as XZ or decompress as fast as LZ4, except maybe in very carefully crafted benchmarks.) Eric
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote: > On 2017-08-10 04:30, Eric Biggers wrote: > >On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote: > >> > >>It can compress at speeds approaching lz4, and quality approaching lzma. > > > >Well, for a very loose definition of "approaching", and certainly not at the > >same time. I doubt there's a use case for using the highest compression > >levels > >in kernel mode --- especially the ones using zstd_opt.h. > Large data-sets with WORM access patterns and infrequent writes > immediately come to mind as a use case for the highest compression > level. > > As a more specific example, the company I work for has a very large > amount of documentation, and we keep all old versions. This is all > stored on a file server which is currently using BTRFS. Once a > document is written, it's almost never rewritten, so write > performance only matters for the first write. However, they're read > back pretty frequently, so we need good read performance. As of > right now, the system is set to use LZO compression by default, and > then when a new document is added, the previous version of that > document gets re-compressed using zlib compression, which actually > results in pretty significant space savings most of the time. I > would absolutely love to use zstd compression with this system with > the highest compression level, because most people don't care how > long it takes to write the file out, but they do care how long it > takes to read a file (even if it's an older version). This may be a reasonable use case, but note this cannot just be the regular "zstd" compression setting, since filesystem compression by default must provide reasonable performance for many different access patterns. See the patch in this series which actually adds zstd compression to btrfs; it only uses level 1. I do not see a patch which adds a higher compression mode. 
It would need to be a special setting like "zstdhc" that users could opt-in to on specific directories. It also would need to be compared to simply compressing in userspace. In many cases compressing in userspace is probably the better solution for the use case in question because it works on any filesystem, allows using any compression algorithm, and if random access is not needed it is possible to compress each file as a single stream (like a .xz file), which produces a much better compression ratio than the block-by-block compression that filesystems have to use.

Note also that LZ4HC is in the kernel source tree currently but no one is using it vs. the regular LZ4. I think it is the kind of thing that sounded useful originally, but at the end of the day no one really wants to use it in kernel mode. I'd certainly be interested in actual patches, though.

Eric
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 07:32, Austin S. Hemmelgarn wrote:
> On 2017-08-10 04:30, Eric Biggers wrote:
>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>> It can compress at speeds approaching lz4, and quality approaching lzma.
>>
>> Well, for a very loose definition of "approaching", and certainly not at
>> the same time. I doubt there's a use case for using the highest
>> compression levels in kernel mode --- especially the ones using zstd_opt.h.
>
> Large data-sets with WORM access patterns and infrequent writes
> immediately come to mind as a use case for the highest compression level.
>
> As a more specific example, the company I work for has a very large amount
> of documentation, and we keep all old versions. This is all stored on a
> file server which is currently using BTRFS. Once a document is written,
> it's almost never rewritten, so write performance only matters for the
> first write. However, they're read back pretty frequently, so we need good
> read performance. As of right now, the system is set to use LZO
> compression by default, and then when a new document is added, the
> previous version of that document gets re-compressed using zlib
> compression, which actually results in pretty significant space savings
> most of the time. I would absolutely love to use zstd compression with
> this system with the highest compression level, because most people don't
> care how long it takes to write the file out, but they do care how long it
> takes to read a file (even if it's an older version).

Also didn't think to mention this, but I could see the max level being very popular for use with SquashFS root filesystems used in LiveCDs. Currently, they have to decide between read performance and image size, while zstd would provide both.
Re: [PATCH v4 2/4] crypto: add crypto_(un)register_ahashes()
On 08/10/2017 02:53 PM, Lars Persson wrote:
> From: Rabin Vincent
>
> There are already helpers to (un)register multiple normal and AEAD algos.
> Add one for ahashes too.
>
> Signed-off-by: Lars Persson
> Signed-off-by: Rabin Vincent
> ---
> v4: crypto_register_skciphers was used where crypto_unregister_skciphers
> was intended.

The v4 change comment above in fact belongs to patch 3/4 of this series. Sorry for the confusion.

BR,
Lars
[PATCH] crypto: AF_ALG - get_page upon reassignment to TX SGL
Hi Herbert,

The error can be triggered with the following test. Invoking that test in a while [ 1 ] loop shows that no memory is leaked.

#include <sys/uio.h>
#include <kcapi.h>

int main(int argc, char *argv[])
{
	char buf[8192];
	struct kcapi_handle *handle;
	struct iovec iov;
	int ret;

	(void)argc;
	(void)argv;

	iov.iov_base = buf;

	ret = kcapi_cipher_init(&handle, "ctr(aes)", 0);
	if (ret)
		return ret;

	ret = kcapi_cipher_setkey(handle, (unsigned char *)"0123456789abcdef", 16);
	if (ret)
		return ret;

	ret = kcapi_cipher_stream_init_enc(handle, (unsigned char *)"0123456789abcdef",
					   NULL, 0);
	if (ret < 0)
		return ret;

	iov.iov_len = 4152;
	ret = kcapi_cipher_stream_update(handle, &iov, 1);
	if (ret < 0)
		return ret;

	iov.iov_len = 4096;
	ret = kcapi_cipher_stream_op(handle, &iov, 1);
	if (ret < 0)
		return ret;

	kcapi_cipher_destroy(handle);
	return 0;
}

---8<---

When a page is assigned to a TX SGL, call get_page to increment the reference counter. It is possible that one page is referenced in multiple SGLs:

- in the global TX SGL in case a previous af_alg_pull_tsgl only reassigned parts of a page to a per-request TX SGL

- in the per-request TX SGL as assigned by af_alg_pull_tsgl

Note, multiple requests can be active at the same time whose TX SGLs all point to different parts of the same page.
Signed-off-by: Stephan Mueller
---
 crypto/af_alg.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index d6936c0e08d9..ffa9f4ccd9b4 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -641,9 +641,9 @@ void af_alg_pull_tsgl(struct sock *sk, size_t used, struct scatterlist *dst,
 			if (dst_offset >= plen) {
 				/* discard page before offset */
 				dst_offset -= plen;
-				put_page(page);
 			} else {
 				/* reassign page to dst after offset */
+				get_page(page);
 				sg_set_page(dst + j, page,
 					    plen - dst_offset,
 					    sg[i].offset + dst_offset);
@@ -661,9 +661,7 @@ void af_alg_pull_tsgl(struct sock *sk, size_t used, struct scatterlist *dst,
 		if (sg[i].length)
 			return;

-		if (!dst)
-			put_page(page);
-
+		put_page(page);
 		sg_assign_page(sg + i, NULL);
 	}
--
2.13.4
Re: [Freedombox-discuss] Hardware Crypto
To me it seems obvious that if the hardware provides a real RNG, that should be used to feed random(4). This solves a genuine problem and, even if calls to the hardware are expensive, overall overhead will not be high because random(4) does not need huge amounts of input. I'm much less certain hardware acceleration is worthwhile for ciphers & hashes, except where the CPU itself includes instructions to speed them up.
Re: [PATCH v8 1/4] crypto: AF_ALG -- add sign/verify API
Am Donnerstag, 10. August 2017, 15:59:33 CEST schrieb Tudor Ambarus: Hi Tudor, > On 08/10/2017 04:03 PM, Stephan Mueller wrote: > > Is there a style requirement for that? checkpatch.pl does not complain. I > > thought that one liners in a conditional should not have braces? > > Linux coding style requires braces in both branches when you have a > branch with a statement and the other with multiple statements. > > Checkpatch complains about this when you run it with --strict option. Ok, then I will add it. Thanks Ciao Stephan
Re: [PATCH v8 1/4] crypto: AF_ALG -- add sign/verify API
On 08/10/2017 04:03 PM, Stephan Mueller wrote:
> Is there a style requirement for that? checkpatch.pl does not complain. I
> thought that one liners in a conditional should not have braces?

Linux coding style requires braces in both branches when you have a branch with a statement and the other with multiple statements.

Checkpatch complains about this when you run it with --strict option.

Cheers,
ta
Re: [PATCH v8 1/4] crypto: AF_ALG -- add sign/verify API
Am Donnerstag, 10. August 2017, 14:49:39 CEST schrieb Tudor Ambarus:

Hi Tudor,

thanks for reviewing.

> > -	err = ctx->enc ? crypto_aead_encrypt(&areq->cra_u.aead_req) :
> > -			 crypto_aead_decrypt(&areq->cra_u.aead_req);
> > -	} else {
> > +	} else
>
> Unbalanced braces around else statement.

Is there a style requirement for that? checkpatch.pl does not complain. I thought that one liners in a conditional should not have braces?

> > -	ctx->enc = 0;
> > +	ctx->op = 0;
>
> This implies decryption. Should we change the value of ALG_OP_DECRYPT?

ALG_OP_DECRYPT is a user space interface, so we cannot change it. Do you see harm in leaving it as is?

Note, I did not want to introduce functional changes that have no bearing on the addition of the sign/verify API. If you think this is problematic, I would like to add another patch that is dedicated to fix this.

> > -	err = ctx->enc ?
> > -		crypto_skcipher_encrypt(&areq->cra_u.skcipher_req) :
> > -		crypto_skcipher_decrypt(&areq->cra_u.skcipher_req);
> > -	} else {
> > +	} else
>
> Unbalanced braces around else statement.

Same as above.

Thanks a lot!

Ciao
Stephan
[PATCH v4 4/4] MAINTAINERS: Add ARTPEC crypto maintainer
Assign the Axis kernel team as maintainer for crypto drivers under drivers/crypto/axis.

Signed-off-by: Lars Persson
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d5b6c71e783e..72186cf9820d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1129,6 +1129,7 @@ L:	linux-arm-ker...@axis.com
 F:	arch/arm/mach-artpec
 F:	arch/arm/boot/dts/artpec6*
 F:	drivers/clk/axis
+F:	drivers/crypto/axis
 F:	drivers/pinctrl/pinctrl-artpec*
 F:	Documentation/devicetree/bindings/pinctrl/axis,artpec6-pinctrl.txt
--
2.11.0
[PATCH v4 2/4] crypto: add crypto_(un)register_ahashes()
From: Rabin Vincent

There are already helpers to (un)register multiple normal and AEAD algos. Add one for ahashes too.

Signed-off-by: Lars Persson
Signed-off-by: Rabin Vincent
---
v4: crypto_register_skciphers was used where crypto_unregister_skciphers was intended.

 crypto/ahash.c                 | 29 +
 include/crypto/internal/hash.h |  2 ++
 2 files changed, 31 insertions(+)

diff --git a/crypto/ahash.c b/crypto/ahash.c
index 826cd7ab4d4a..5e8666e6ccae 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -588,6 +588,35 @@ int crypto_unregister_ahash(struct ahash_alg *alg)
 }
 EXPORT_SYMBOL_GPL(crypto_unregister_ahash);

+int crypto_register_ahashes(struct ahash_alg *algs, int count)
+{
+	int i, ret;
+
+	for (i = 0; i < count; i++) {
+		ret = crypto_register_ahash(&algs[i]);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	for (--i; i >= 0; --i)
+		crypto_unregister_ahash(&algs[i]);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(crypto_register_ahashes);
+
+void crypto_unregister_ahashes(struct ahash_alg *algs, int count)
+{
+	int i;
+
+	for (i = count - 1; i >= 0; --i)
+		crypto_unregister_ahash(&algs[i]);
+}
+EXPORT_SYMBOL_GPL(crypto_unregister_ahashes);
+
 int ahash_register_instance(struct crypto_template *tmpl,
 			    struct ahash_instance *inst)
 {
diff --git a/include/crypto/internal/hash.h b/include/crypto/internal/hash.h
index f6d9af3efa45..f0b44c16e88f 100644
--- a/include/crypto/internal/hash.h
+++ b/include/crypto/internal/hash.h
@@ -76,6 +76,8 @@ static inline int crypto_ahash_walk_last(struct crypto_hash_walk *walk)

 int crypto_register_ahash(struct ahash_alg *alg);
 int crypto_unregister_ahash(struct ahash_alg *alg);
+int crypto_register_ahashes(struct ahash_alg *algs, int count);
+void crypto_unregister_ahashes(struct ahash_alg *algs, int count);
 int ahash_register_instance(struct crypto_template *tmpl,
 			    struct ahash_instance *inst);
 void ahash_free_instance(struct crypto_instance *inst);
--
2.11.0
[PATCH v4 0/4] crypto: add driver for Axis ARTPEC crypto accelerator
This series adds a driver for the crypto accelerator in the ARTPEC series of SoCs from Axis Communications AB.

Changelog v4:
- The skcipher conversion had a mistake where the algos were registered instead of unregistered at module unloading.

Changelog v3:
- The patch author added his Signed-off-by on patch 2.

Changelog v2:
- Use xts_check_key() for xts keys.
- Use CRYPTO_ALG_TYPE_SKCIPHER instead of CRYPTO_ALG_TYPE_ABLKCIPHER in cra_flags.

Lars Persson (3):
  dt-bindings: crypto: add ARTPEC crypto
  crypto: axis: add ARTPEC-6/7 crypto accelerator driver
  MAINTAINERS: Add ARTPEC crypto maintainer

Rabin Vincent (1):
  crypto: add crypto_(un)register_ahashes()

 .../devicetree/bindings/crypto/artpec6-crypto.txt |   16 +
 MAINTAINERS                                       |    1 +
 crypto/ahash.c                                    |   29 +
 drivers/crypto/Kconfig                            |   21 +
 drivers/crypto/Makefile                           |    1 +
 drivers/crypto/axis/Makefile                      |    1 +
 drivers/crypto/axis/artpec6_crypto.c              | 3192
 include/crypto/internal/hash.h                    |    2 +
 8 files changed, 3263 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/artpec6-crypto.txt
 create mode 100644 drivers/crypto/axis/Makefile
 create mode 100644 drivers/crypto/axis/artpec6_crypto.c

--
2.11.0
[PATCH v4 1/4] dt-bindings: crypto: add ARTPEC crypto
Document the device tree bindings for the ARTPEC crypto accelerator on ARTPEC-6 and ARTPEC-7 SoCs.

Acked-by: Rob Herring
Signed-off-by: Lars Persson
---
 .../devicetree/bindings/crypto/artpec6-crypto.txt | 16
 1 file changed, 16 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/artpec6-crypto.txt

diff --git a/Documentation/devicetree/bindings/crypto/artpec6-crypto.txt b/Documentation/devicetree/bindings/crypto/artpec6-crypto.txt
new file mode 100644
index ..d9cca4875bd6
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/artpec6-crypto.txt
@@ -0,0 +1,16 @@
+Axis crypto engine with PDMA interface.
+
+Required properties:
+- compatible : Should be one of the following strings:
+	"axis,artpec6-crypto" for the version in the Axis ARTPEC-6 SoC
+	"axis,artpec7-crypto" for the version in the Axis ARTPEC-7 SoC.
+- reg: Base address and size for the PDMA register area.
+- interrupts: Interrupt handle for the PDMA interrupt line.
+
+Example:
+
+crypto@f4264000 {
+	compatible = "axis,artpec6-crypto";
+	reg = <0xf4264000 0x1000>;
+	interrupts = ;
+};
--
2.11.0
[PATCH v4 3/4] crypto: axis: add ARTPEC-6/7 crypto accelerator driver
This is an asynchronous crypto API driver for the accelerator present in the ARTPEC-6 and -7 SoCs from Axis Communications AB. The driver supports AES in ECB/CTR/CBC/XTS/GCM modes and SHA1/2 hash standards.

Signed-off-by: Lars Persson
---
 drivers/crypto/Kconfig               |   21 +
 drivers/crypto/Makefile              |    1 +
 drivers/crypto/axis/Makefile         |    1 +
 drivers/crypto/axis/artpec6_crypto.c | 3192 ++
 4 files changed, 3215 insertions(+)
 create mode 100644 drivers/crypto/axis/Makefile
 create mode 100644 drivers/crypto/axis/artpec6_crypto.c

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 5b5393f1b87a..fe33c199fc1a 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -708,4 +708,25 @@ config CRYPTO_DEV_SAFEXCEL
 	  chain mode, AES cipher mode and SHA1/SHA224/SHA256/SHA512 hash
 	  algorithms.

+config CRYPTO_DEV_ARTPEC6
+	tristate "Support for Axis ARTPEC-6/7 hardware crypto acceleration."
+	depends on ARM && (ARCH_ARTPEC || COMPILE_TEST)
+	depends on HAS_DMA
+	depends on OF
+	select CRYPTO_AEAD
+	select CRYPTO_AES
+	select CRYPTO_ALGAPI
+	select CRYPTO_BLKCIPHER
+	select CRYPTO_CTR
+	select CRYPTO_HASH
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	select CRYPTO_SHA384
+	select CRYPTO_SHA512
+	help
+	  Enables the driver for the on-chip crypto accelerator
+	  of Axis ARTPEC SoCs.
+
+	  To compile this driver as a module, choose M here.
+ endif # CRYPTO_HW diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index de629165fde7..7bf0997eae25 100644 --- a/drivers/crypto/Makefile +++ b/drivers/crypto/Makefile @@ -44,3 +44,4 @@ obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio/ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/ obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) += bcm/ obj-$(CONFIG_CRYPTO_DEV_SAFEXCEL) += inside-secure/ +obj-$(CONFIG_CRYPTO_DEV_ARTPEC6) += axis/ diff --git a/drivers/crypto/axis/Makefile b/drivers/crypto/axis/Makefile new file mode 100644 index ..be9a84a4b667 --- /dev/null +++ b/drivers/crypto/axis/Makefile @@ -0,0 +1 @@ +obj-$(CONFIG_CRYPTO_DEV_ARTPEC6) := artpec6_crypto.o diff --git a/drivers/crypto/axis/artpec6_crypto.c b/drivers/crypto/axis/artpec6_crypto.c new file mode 100644 index ..d9fbbf01062b --- /dev/null +++ b/drivers/crypto/axis/artpec6_crypto.c @@ -0,0 +1,3192 @@ +/* + * Driver for ARTPEC-6 crypto block using the kernel asynchronous crypto api. + * + *Copyright (C) 2014-2017 Axis Communications AB + */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +/* Max length of a line in all cache levels for Artpec SoCs. 
*/ +#define ARTPEC_CACHE_LINE_MAX 32 + +#define PDMA_OUT_CFG 0x +#define PDMA_OUT_BUF_CFG 0x0004 +#define PDMA_OUT_CMD 0x0008 +#define PDMA_OUT_DESCRQ_PUSH 0x0010 +#define PDMA_OUT_DESCRQ_STAT 0x0014 + +#define A6_PDMA_IN_CFG 0x0028 +#define A6_PDMA_IN_BUF_CFG 0x002c +#define A6_PDMA_IN_CMD 0x0030 +#define A6_PDMA_IN_STATQ_PUSH 0x0038 +#define A6_PDMA_IN_DESCRQ_PUSH 0x0044 +#define A6_PDMA_IN_DESCRQ_STAT 0x0048 +#define A6_PDMA_INTR_MASK 0x0068 +#define A6_PDMA_ACK_INTR 0x006c +#define A6_PDMA_MASKED_INTR0x0074 + +#define A7_PDMA_IN_CFG 0x002c +#define A7_PDMA_IN_BUF_CFG 0x0030 +#define A7_PDMA_IN_CMD 0x0034 +#define A7_PDMA_IN_STATQ_PUSH 0x003c +#define A7_PDMA_IN_DESCRQ_PUSH 0x0048 +#define A7_PDMA_IN_DESCRQ_STAT 0x004C +#define A7_PDMA_INTR_MASK 0x006c +#define A7_PDMA_ACK_INTR 0x0070 +#define A7_PDMA_MASKED_INTR0x0078 + +#define PDMA_OUT_CFG_ENBIT(0) + +#define PDMA_OUT_BUF_CFG_DATA_BUF_SIZE GENMASK(4, 0) +#define PDMA_OUT_BUF_CFG_DESCR_BUF_SIZEGENMASK(9, 5) + +#define PDMA_OUT_CMD_START BIT(0) +#define A6_PDMA_OUT_CMD_STOP BIT(3) +#define A7_PDMA_OUT_CMD_STOP BIT(2) + +#define PDMA_OUT_DESCRQ_PUSH_LEN GENMASK(5, 0) +#define PDMA_OUT_DESCRQ_PUSH_ADDR GENMASK(31, 6) + +#define PDMA_OUT_DESCRQ_STAT_LEVEL GENMASK(3, 0) +#define PDMA_OUT_DESCRQ_STAT_SIZE GENMASK(7, 4) + +#define PDMA_IN_CFG_EN BIT(0) + +#define PDMA_IN_BUF_CFG_DATA_BUF_SIZE GENMASK(4, 0) +#define PDMA_IN_BUF_CFG_DESCR_BUF_SIZE GENMASK(9, 5) +#define PDMA_IN_BUF_CFG_STAT_BUF_SIZE GENMASK(14, 10) + +#define PDMA_IN_CMD_START BIT(0) +#define A6_PDMA_IN_CMD_FLUSH_STAT BIT(2) +#define A6_PDMA_IN_CMD_STOPBIT(3) +#define A7_PDMA_IN_CMD_FLUSH_STAT
Re: [PATCH v8 1/4] crypto: AF_ALG -- add sign/verify API
Hi, Stephan,

On 08/10/2017 09:39 AM, Stephan Müller wrote:
> Add the flags for handling signature generation and signature
> verification. The af_alg helper code as well as the algif_skcipher and
> algif_aead code must be changed from a boolean indicating the cipher
> operation to an integer because there are now 4 different cipher
> operations that are defined. Yet, the algif_aead and algif_skcipher code
> still only allows encryption and decryption cipher operations.
>
> Signed-off-by: Stephan Mueller
> Signed-off-by: Tadeusz Struk
> ---
>  crypto/af_alg.c             | 10 +-
>  crypto/algif_aead.c         | 36
>  crypto/algif_skcipher.c     | 26 +-
>  include/crypto/if_alg.h     |  4 ++--
>  include/uapi/linux/if_alg.h |  2 ++
>  5 files changed, 50 insertions(+), 28 deletions(-)
>
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index d6936c0e08d9..a35a9f854a04 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -859,7 +859,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  	struct af_alg_tsgl *sgl;
>  	struct af_alg_control con = {};
>  	long copied = 0;
> -	bool enc = 0;
> +	int op = 0;
>  	bool init = 0;
>  	int err = 0;
> @@ -870,11 +870,11 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  		init = 1;
>  		switch (con.op) {
> +		case ALG_OP_VERIFY:
> +		case ALG_OP_SIGN:
>  		case ALG_OP_ENCRYPT:
> -			enc = 1;
> -			break;
>  		case ALG_OP_DECRYPT:
> -			enc = 0;
> +			op = con.op;
>  			break;
>  		default:
>  			return -EINVAL;
> @@ -891,7 +891,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  	}
>
>  	if (init) {
> -		ctx->enc = enc;
> +		ctx->op = op;
>  		if (con.iv)
>  			memcpy(ctx->iv, con.iv->iv, ivsize);
> diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
> index 516b38c3a169..77abc04cf942 100644
> --- a/crypto/algif_aead.c
> +++ b/crypto/algif_aead.c
> @@ -60,7 +60,7 @@ static inline bool aead_sufficient_data(struct sock *sk)
>  	 * The minimum amount of memory needed for an AEAD cipher is
>  	 * the AAD and in case of decryption the tag.
>  	 */
> -	return ctx->used >= ctx->aead_assoclen + (ctx->enc ? 0 : as);
> +	return ctx->used >= ctx->aead_assoclen + (ctx->op ? 0 : as);
>  }
>
>  static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> @@ -137,7 +137,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  	 * buffer provides the tag which is consumed resulting in only the
>  	 * plaintext without a buffer for the tag returned to the caller.
>  	 */
> -	if (ctx->enc)
> +	if (ctx->op)
>  		outlen = used + as;
>  	else
>  		outlen = used - as;
> @@ -196,7 +196,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  	/* Use the RX SGL as source (and destination) for crypto op. */
>  	src = areq->first_rsgl.sgl.sg;
>
> -	if (ctx->enc) {
> +	if (ctx->op == ALG_OP_ENCRYPT) {
>  		/*
>  		 * Encryption operation - The in-place cipher operation is
>  		 * achieved by the following operation:
> @@ -212,7 +212,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  		if (err)
>  			goto free;
>  		af_alg_pull_tsgl(sk, processed, NULL, 0);
> -	} else {
> +	} else if (ctx->op == ALG_OP_DECRYPT) {
>  		/*
>  		 * Decryption operation - To achieve an in-place cipher
>  		 * operation, the following SGL structure is used:
> @@ -258,6 +258,9 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  		} else
>  			/* no RX SGL present (e.g. authentication only) */
>  			src = areq->tsgl;
> +	} else {
> +		err = -EOPNOTSUPP;
> +		goto free;
>  	}
>
>  	/* Initialize the crypto operation */
> @@ -272,19 +275,28 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
>  		aead_request_set_callback(&areq->cra_u.aead_req,
>  					  CRYPTO_TFM_REQ_MAY_BACKLOG,
>  					  af_alg_async_cb, areq);
> -		err = ctx->enc ? crypto_aead_encrypt(&areq->cra_u.aead_req) :
> -				 crypto_aead_decrypt(&areq->cra_u.aead_req);
> -	} else {
> +	} else

Unbalanced braces around else statement.

>  		/* Synchronous operation */
>  		aead_request_set_callback(&areq->cra_u.aead_req,
>  					  CRYPTO_TFM_REQ_MAY_BACKLOG,
>  					  af_alg_complete,
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 04:30, Eric Biggers wrote:
> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>> It can compress at speeds approaching lz4, and quality approaching lzma.
>
> Well, for a very loose definition of "approaching", and certainly not at
> the same time. I doubt there's a use case for using the highest
> compression levels in kernel mode --- especially the ones using
> zstd_opt.h.

Large data sets with WORM access patterns and infrequent writes immediately come to mind as a use case for the highest compression level. As a more specific example, the company I work for has a very large amount of documentation, and we keep all old versions. This is all stored on a file server which is currently using BTRFS. Once a document is written, it's almost never rewritten, so write performance only matters for the first write. However, they're read back pretty frequently, so we need good read performance. As of right now, the system is set to use LZO compression by default, and then when a new document is added, the previous version of that document gets re-compressed using zlib compression, which actually results in pretty significant space savings most of the time. I would absolutely love to use zstd compression with this system at the highest compression level, because most people don't care how long it takes to write the file out, but they do care how long it takes to read a file (even if it's an older version).

>> The code was ported from the upstream zstd source repository.
>
> What version?
>
>> `linux/zstd.h` header was modified to match linux kernel style. The
>> cross-platform and allocation code was stripped out. Instead zstd
>> requires the caller to pass a preallocated workspace. The source files
>> were clang-formatted [1] to match the Linux Kernel style as much as
>> possible.
>
> It would be easier to compare to the upstream version if it was not all
> reformatted. There is a chance that bugs were introduced by
> Linux-specific changes, and it would be nice if they could be easily
> reviewed. (Also I don't know what clang-format settings you used, but
> there are still a lot of differences from the Linux coding style.)
>
>> I benchmarked zstd compression as a special character device. I ran zstd
>> and zlib compression at several levels, as well as performing no
>> compression, which measures the time spent copying the data to kernel
>> space. Data is passed to the compressor 4096 B at a time. The benchmark
>> file is located in the upstream zstd source repository under
>> `contrib/linux-kernel/zstd_compress_test.c` [2].
>>
>> I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of
>> RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7
>> processor, 16 GB of RAM, and an SSD. I benchmarked using `silesia.tar`
>> [3], which is 211,988,480 B large. Run the following commands for the
>> benchmark:
>>
>>     sudo modprobe zstd_compress_test
>>     sudo mknod zstd_compress_test c 245 0
>>     sudo cp silesia.tar zstd_compress_test
>>
>> The time is reported by the time of the userland `cp`. The MB/s is
>> computed with
>>
>>     211,988,480 B / time(buffer size, level)
>>
>> which includes the time to copy from userland. The Adjusted MB/s is
>> computed with
>>
>>     211,988,480 B / (time(buffer size, level) - time(buffer size, none)).
>>
>> The memory reported is the amount of memory the compressor requests.
>>
>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>> |----------|-----------|----------|-------|---------|----------|----------|
>> | none     | 211988480 | 0.100    | 1     | 2119.88 | -        | -        |
>> | zstd -1  | 73645762  | 1.044    | 2.878 | 203.05  | 224.56   | 1.23     |
>> | zstd -3  | 66988878  | 1.761    | 3.165 | 120.38  | 127.63   | 2.47     |
>> | zstd -5  | 65001259  | 2.563    | 3.261 | 82.71   | 86.07    | 2.86     |
>> | zstd -10 | 60165346  | 13.242   | 3.523 | 16.01   | 16.13    | 13.22    |
>> | zstd -15 | 58009756  | 47.601   | 3.654 | 4.45    | 4.46     | 21.61    |
>> | zstd -19 | 54014593  | 102.835  | 3.925 | 2.06    | 2.06     | 60.15    |
>> | zlib -1  | 77260026  | 2.895    | 2.744 | 73.23   | 75.85    | 0.27     |
>> | zlib -3  | 72972206  | 4.116    | 2.905 | 51.50   | 52.79    | 0.27     |
>> | zlib -6  | 68190360  | 9.633    | 3.109 | 22.01   | 22.24    | 0.27     |
>> | zlib -9  | 67613382  | 22.554   | 3.135 | 9.40    | 9.44     | 0.27     |
>
> These benchmarks are misleading because they compress the whole file as a
> single stream without resetting the dictionary, which isn't how data will
> typically be compressed in kernel mode. With filesystem compression the
> data has to be divided into small chunks that can each be decompressed
> independently. That eliminates one of the primary advantages of Zstandard
> (support for large dictionary sizes).
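As a sanity check on the table, the Ratio, MB/s, and Adjusted MB/s columns can be reproduced from the reported sizes and times, assuming the uncompressed input is the full 211,988,480 B `silesia.tar` (a sketch; the zstd -1 row is used as the example):

```python
size_in = 211_988_480   # silesia.tar, bytes
size_out = 73_645_762   # zstd -1 compressed size, bytes
t = 1.044               # total time for zstd -1, seconds
t_copy = 0.100          # time for the "none" (copy-only) run, seconds

ratio = size_in / size_out            # compression ratio
mbps = size_in / t / 1e6              # includes userland copy time
adj_mbps = size_in / (t - t_copy) / 1e6  # copy time subtracted out

print(round(ratio, 3), round(mbps, 2), round(adj_mbps, 2))
```

The printed values match the zstd -1 table row (2.878, 203.05, 224.56), which also confirms that the input size used in the formulas is the 211,988,480 B tarball.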
Re: [PATCH v5 2/5] lib: Add zstd modules
On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>
> It can compress at speeds approaching lz4, and quality approaching lzma.

Well, for a very loose definition of "approaching", and certainly not at the same time. I doubt there's a use case for using the highest compression levels in kernel mode --- especially the ones using zstd_opt.h.

> The code was ported from the upstream zstd source repository.

What version?

> `linux/zstd.h` header was modified to match linux kernel style. The
> cross-platform and allocation code was stripped out. Instead zstd
> requires the caller to pass a preallocated workspace. The source files
> were clang-formatted [1] to match the Linux Kernel style as much as
> possible.

It would be easier to compare to the upstream version if it was not all reformatted. There is a chance that bugs were introduced by Linux-specific changes, and it would be nice if they could be easily reviewed. (Also I don't know what clang-format settings you used, but there are still a lot of differences from the Linux coding style.)

> I benchmarked zstd compression as a special character device. I ran zstd
> and zlib compression at several levels, as well as performing no
> compression, which measures the time spent copying the data to kernel
> space. Data is passed to the compressor 4096 B at a time. The benchmark
> file is located in the upstream zstd source repository under
> `contrib/linux-kernel/zstd_compress_test.c` [2].
>
> I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of
> RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7
> processor, 16 GB of RAM, and an SSD. I benchmarked using `silesia.tar`
> [3], which is 211,988,480 B large. Run the following commands for the
> benchmark:
>
>     sudo modprobe zstd_compress_test
>     sudo mknod zstd_compress_test c 245 0
>     sudo cp silesia.tar zstd_compress_test
>
> The time is reported by the time of the userland `cp`. The MB/s is
> computed with
>
>     211,988,480 B / time(buffer size, level)
>
> which includes the time to copy from userland. The Adjusted MB/s is
> computed with
>
>     211,988,480 B / (time(buffer size, level) - time(buffer size, none)).
>
> The memory reported is the amount of memory the compressor requests.
>
> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
> |----------|-----------|----------|-------|---------|----------|----------|
> | none     | 211988480 | 0.100    | 1     | 2119.88 | -        | -        |
> | zstd -1  | 73645762  | 1.044    | 2.878 | 203.05  | 224.56   | 1.23     |
> | zstd -3  | 66988878  | 1.761    | 3.165 | 120.38  | 127.63   | 2.47     |
> | zstd -5  | 65001259  | 2.563    | 3.261 | 82.71   | 86.07    | 2.86     |
> | zstd -10 | 60165346  | 13.242   | 3.523 | 16.01   | 16.13    | 13.22    |
> | zstd -15 | 58009756  | 47.601   | 3.654 | 4.45    | 4.46     | 21.61    |
> | zstd -19 | 54014593  | 102.835  | 3.925 | 2.06    | 2.06     | 60.15    |
> | zlib -1  | 77260026  | 2.895    | 2.744 | 73.23   | 75.85    | 0.27     |
> | zlib -3  | 72972206  | 4.116    | 2.905 | 51.50   | 52.79    | 0.27     |
> | zlib -6  | 68190360  | 9.633    | 3.109 | 22.01   | 22.24    | 0.27     |
> | zlib -9  | 67613382  | 22.554   | 3.135 | 9.40    | 9.44     | 0.27     |

These benchmarks are misleading because they compress the whole file as a single stream without resetting the dictionary, which isn't how data will typically be compressed in kernel mode. With filesystem compression the data has to be divided into small chunks that can each be decompressed independently. That eliminates one of the primary advantages of Zstandard (support for large dictionary sizes).

Eric
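The chunking effect Eric describes is easy to demonstrate. zstd itself is not in the Python standard library, so the following sketch uses stdlib `zlib` as a stand-in; the effect (resetting the compressor's history at every chunk boundary hurts the compression ratio on data with cross-chunk redundancy) is the same in kind for any LZ-family compressor:

```python
import zlib

# Repetitive input: later chunks repeat data seen in earlier chunks,
# which only helps compression if the history survives chunk boundaries.
data = b"The quick brown fox jumps over the lazy dog. " * 8192  # ~360 KiB

# One stream: the compressor's history spans the whole input.
stream_size = len(zlib.compress(data, 6))

# Filesystem-style: each 4 KiB chunk compressed independently, so the
# history is reset at every chunk boundary (each chunk must be
# decompressible on its own).
chunk_size = 4096
chunked_size = sum(
    len(zlib.compress(data[i:i + chunk_size], 6))
    for i in range(0, len(data), chunk_size)
)

# Independent chunks cost compression ratio on redundant data.
print(stream_size < chunked_size)
```

On this input the single-stream size is far smaller than the sum of the independently compressed chunks, which is why whole-file benchmarks overstate what a filesystem will see.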
Re: [PATCH v2] crypto: AF_ALG - consolidation of duplicate code
On Thursday, 10 August 2017 at 10:21:53 CEST, Herbert Xu wrote:

Hi Herbert,

> On Thu, Aug 10, 2017 at 10:16:48AM +0200, Stephan Mueller wrote:
>> As now the AIO code path is updated, the bug that I was reporting last
>> September allowing to crash the kernel via AF_ALG is fixed.
>>
>> As the patch is very invasive, I am not sure that patch set should be
>> sent to stable. How do you propose we fix the crash bug in older
>> kernels that are due to memory management problems in the AIO code
>> path?
>
> Is it possible to create a minimal fix for the stable kernels?

I think there is such a patch already, see [1]. Your comment on that patch triggered my rewrite of the memory management code.

[1] https://www.spinics.net/lists/linux-crypto/msg21618.html

Ciao
Stephan
Re: [PATCH v2] crypto: AF_ALG - consolidation of duplicate code
On Thu, Aug 10, 2017 at 10:16:48AM +0200, Stephan Mueller wrote:
>
> As now the AIO code path is updated, the bug that I was reporting last
> September allowing to crash the kernel via AF_ALG is fixed.
>
> As the patch is very invasive, I am not sure that patch set should be
> sent to stable. How do you propose we fix the crash bug in older kernels
> that are due to memory management problems in the AIO code path?

Is it possible to create a minimal fix for the stable kernels?

Thanks,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2] crypto: AF_ALG - consolidation of duplicate code
On Wednesday, 9 August 2017 at 15:57:34 CEST, Herbert Xu wrote:

Hi Herbert,

> Patch applied. Thanks.

Thanks. Now that the AIO code path is updated, the bug that I was reporting last September, which allowed crashing the kernel via AF_ALG, is fixed.

As the patch is very invasive, I am not sure this patch set should be sent to stable. How do you propose we fix the crash bug in older kernels that is due to memory management problems in the AIO code path?

Ciao
Stephan
[PATCH v8 3/4] crypto: AF_ALG -- add asymmetric cipher
This patch adds the user space interface for asymmetric ciphers. The
interface allows the use of sendmsg as well as vmsplice to provide data.

The akcipher interface implementation uses the common AF_ALG interface
code regarding TX and RX SGL handling.

Signed-off-by: Stephan Mueller
---
 crypto/algif_akcipher.c | 466
 include/crypto/if_alg.h |   2 +
 2 files changed, 468 insertions(+)
 create mode 100644 crypto/algif_akcipher.c

diff --git a/crypto/algif_akcipher.c b/crypto/algif_akcipher.c
new file mode 100644
index ..1b36eb0b6e8f
--- /dev/null
+++ b/crypto/algif_akcipher.c
@@ -0,0 +1,466 @@
+/*
+ * algif_akcipher: User-space interface for asymmetric cipher algorithms
+ *
+ * Copyright (C) 2017, Stephan Mueller
+ *
+ * This file provides the user-space API for asymmetric ciphers.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * The following concept of the memory management is used:
+ *
+ * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
+ * filled by user space with the data submitted via sendpage/sendmsg. Filling
+ * up the TX SGL does not cause a crypto operation -- the data will only be
+ * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
+ * provide a buffer which is tracked with the RX SGL.
+ *
+ * During the processing of the recvmsg operation, the cipher request is
+ * allocated and prepared. As part of the recvmsg operation, the processed
+ * TX buffers are extracted from the TX SGL into a separate SGL.
+ *
+ * After the completion of the crypto operation, the RX SGL and the cipher
+ * request is released. The extracted TX SGL parts are released together with
+ * the RX SGL release.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+struct akcipher_tfm {
+	struct crypto_akcipher *akcipher;
+	bool has_key;
+};
+
+static int akcipher_sendmsg(struct socket *sock, struct msghdr *msg,
+			    size_t size)
+{
+	return af_alg_sendmsg(sock, msg, size, 0);
+}
+
+static int _akcipher_recvmsg(struct socket *sock, struct msghdr *msg,
+			     size_t ignored, int flags)
+{
+	struct sock *sk = sock->sk;
+	struct alg_sock *ask = alg_sk(sk);
+	struct sock *psk = ask->parent;
+	struct alg_sock *pask = alg_sk(psk);
+	struct af_alg_ctx *ctx = ask->private;
+	struct akcipher_tfm *akc = pask->private;
+	struct crypto_akcipher *tfm = akc->akcipher;
+	struct af_alg_async_req *areq;
+	int err = 0;
+	int maxsize;
+	size_t len = 0;
+	size_t used = 0;
+
+	maxsize = crypto_akcipher_maxsize(tfm);
+	if (maxsize < 0)
+		return maxsize;
+
+	/* Allocate cipher request for current operation. */
+	areq = af_alg_alloc_areq(sk, sizeof(struct af_alg_async_req) +
+				     crypto_akcipher_reqsize(tfm));
+	if (IS_ERR(areq))
+		return PTR_ERR(areq);
+
+	/* convert iovecs of output buffers into RX SGL */
+	err = af_alg_get_rsgl(sk, msg, flags, areq, maxsize, &len);
+	if (err)
+		goto free;
+
+	/* ensure output buffer is sufficiently large */
+	if (len < maxsize) {
+		err = -EMSGSIZE;
+		goto free;
+	}
+
+	/*
+	 * Create a per request TX SGL for this request which tracks the
+	 * SG entries from the global TX SGL.
+	 */
+	used = ctx->used;
+	areq->tsgl_entries = af_alg_count_tsgl(sk, used, 0);
+	if (!areq->tsgl_entries)
+		areq->tsgl_entries = 1;
+	areq->tsgl = sock_kmalloc(sk, sizeof(*areq->tsgl) * areq->tsgl_entries,
+				  GFP_KERNEL);
+	if (!areq->tsgl) {
+		err = -ENOMEM;
+		goto free;
+	}
+	sg_init_table(areq->tsgl, areq->tsgl_entries);
+	af_alg_pull_tsgl(sk, used, areq->tsgl, 0);
+
+	/* Initialize the crypto operation */
+	akcipher_request_set_tfm(&areq->cra_u.akcipher_req, tfm);
+	akcipher_request_set_crypt(&areq->cra_u.akcipher_req, areq->tsgl,
+				   areq->first_rsgl.sgl.sg, used, len);
+
+	if (msg->msg_iocb && !is_sync_kiocb(msg->msg_iocb)) {
+		/* AIO operation */
+		areq->iocb = msg->msg_iocb;
+		akcipher_request_set_callback(&areq->cra_u.akcipher_req,
+					      CRYPTO_TFM_REQ_MAY_SLEEP,
+					      af_alg_async_cb, areq);
+	} else
+		/* Synchronous operation */
[PATCH v8 2/4] crypto: AF_ALG -- add setpubkey setsockopt call
For supporting asymmetric ciphers, user space must be able to set the
public key. The patch adds a new setsockopt call for setting the public
key.

Signed-off-by: Stephan Mueller
---
 crypto/af_alg.c             | 18 +-
 include/crypto/if_alg.h     |  1 +
 include/uapi/linux/if_alg.h |  1 +
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index a35a9f854a04..176921d7593a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -203,13 +203,17 @@ static int alg_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 }
 
 static int alg_setkey(struct sock *sk, char __user *ukey,
-		      unsigned int keylen)
+		      unsigned int keylen,
+		      int (*setkey)(void *private, const u8 *key,
+				    unsigned int keylen))
 {
 	struct alg_sock *ask = alg_sk(sk);
-	const struct af_alg_type *type = ask->type;
 	u8 *key;
 	int err;
 
+	if (!setkey)
+		return -ENOPROTOOPT;
+
 	key = sock_kmalloc(sk, keylen, GFP_KERNEL);
 	if (!key)
 		return -ENOMEM;
@@ -218,7 +222,7 @@ static int alg_setkey(struct sock *sk, char __user *ukey,
 	if (copy_from_user(key, ukey, keylen))
 		goto out;
 
-	err = type->setkey(ask->private, key, keylen);
+	err = setkey(ask->private, key, keylen);
 
 out:
 	sock_kzfree_s(sk, key, keylen);
@@ -248,10 +252,14 @@ static int alg_setsockopt(struct socket *sock, int level, int optname,
 	case ALG_SET_KEY:
 		if (sock->state == SS_CONNECTED)
 			goto unlock;
-		if (!type->setkey)
+
+		err = alg_setkey(sk, optval, optlen, type->setkey);
+		break;
+	case ALG_SET_PUBKEY:
+		if (sock->state == SS_CONNECTED)
 			goto unlock;
 
-		err = alg_setkey(sk, optval, optlen);
+		err = alg_setkey(sk, optval, optlen, type->setpubkey);
 		break;
 	case ALG_SET_AEAD_AUTHSIZE:
 		if (sock->state == SS_CONNECTED)
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 50a21488f3ba..d1de8ed3e77b 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -55,6 +55,7 @@ struct af_alg_type {
 	void *(*bind)(const char *name, u32 type, u32 mask);
 	void (*release)(void *private);
 	int (*setkey)(void *private, const u8 *key, unsigned int keylen);
+	int (*setpubkey)(void *private, const u8 *key, unsigned int keylen);
 	int (*accept)(void *private, struct sock *sk);
 	int (*accept_nokey)(void *private, struct sock *sk);
 	int (*setauthsize)(void *private, unsigned int authsize);
diff --git a/include/uapi/linux/if_alg.h b/include/uapi/linux/if_alg.h
index d81dcca5bdd7..02e61627e089 100644
--- a/include/uapi/linux/if_alg.h
+++ b/include/uapi/linux/if_alg.h
@@ -34,6 +34,7 @@ struct af_alg_iv {
 #define ALG_SET_OP			3
 #define ALG_SET_AEAD_ASSOCLEN		4
 #define ALG_SET_AEAD_AUTHSIZE		5
+#define ALG_SET_PUBKEY			6
 
 /* Operations */
 #define ALG_OP_DECRYPT			0
-- 
2.13.4
[PATCH v8 4/4] crypto: algif_akcipher - enable compilation
Add the Makefile and Kconfig updates to allow algif_akcipher to be
compiled.

Signed-off-by: Stephan Mueller
---
 crypto/Kconfig  | 9 +
 crypto/Makefile | 1 +
 2 files changed, 10 insertions(+)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 0a121f9ddf8e..fdcec68545f3 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1760,6 +1760,15 @@ config CRYPTO_USER_API_AEAD
 	  This option enables the user-spaces interface for AEAD cipher
 	  algorithms.
 
+config CRYPTO_USER_API_AKCIPHER
+	tristate "User-space interface for asymmetric key cipher algorithms"
+	depends on NET
+	select CRYPTO_AKCIPHER2
+	select CRYPTO_USER_API
+	help
+	  This option enables the user-spaces interface for asymmetric
+	  key cipher algorithms.
+
 config CRYPTO_HASH_INFO
 	bool
 
diff --git a/crypto/Makefile b/crypto/Makefile
index d41f0331b085..12dbf2c5fe7c 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -133,6 +133,7 @@ obj-$(CONFIG_CRYPTO_USER_API_HASH) += algif_hash.o
 obj-$(CONFIG_CRYPTO_USER_API_SKCIPHER) += algif_skcipher.o
 obj-$(CONFIG_CRYPTO_USER_API_RNG) += algif_rng.o
 obj-$(CONFIG_CRYPTO_USER_API_AEAD) += algif_aead.o
+obj-$(CONFIG_CRYPTO_USER_API_AKCIPHER) += algif_akcipher.o
 
 ecdh_generic-y := ecc.o
 ecdh_generic-y += ecdh.o
-- 
2.13.4
[PATCH v8 0/4] crypto: add algif_akcipher user space API
Hi,

This patch set adds the AF_ALG user space API to externalize the
asymmetric cipher API recently added to the kernel crypto API.

The patch set is tested with the user space library of libkcapi [1]. Use
[1] test/test.sh for a full test run. The test covers the following
scenarios:

* sendmsg of one IOVEC
* sendmsg of 16 IOVECs with non-linear buffer
* vmsplice of one IOVEC
* vmsplice of 15 IOVECs with non-linear buffer
* invoking multiple separate cipher operations with one open cipher handle
* encryption with private key (using vector from testmgr.h)
* encryption with public key (using vector from testmgr.h)
* decryption with private key (using vector from testmgr.h)

Note, to enable the test, edit line [2] from "4 99" to "4 13".

[1] http://www.chronox.de/libkcapi.html
[2] https://github.com/smuellerDD/libkcapi/blob/master/test/test.sh#L1452

Changes v8:
* port to kernel 4.13
* port to consolidated AF_ALG code

Stephan Mueller (4):
  crypto: AF_ALG -- add sign/verify API
  crypto: AF_ALG -- add setpubkey setsockopt call
  crypto: AF_ALG -- add asymmetric cipher
  crypto: algif_akcipher - enable compilation

 crypto/Kconfig              |   9 +
 crypto/Makefile             |   1 +
 crypto/af_alg.c             |  28 ++-
 crypto/algif_aead.c         |  36 ++--
 crypto/algif_akcipher.c     | 466
 crypto/algif_skcipher.c     |  26 ++-
 include/crypto/if_alg.h     |   7 +-
 include/uapi/linux/if_alg.h |   3 +
 8 files changed, 543 insertions(+), 33 deletions(-)
 create mode 100644 crypto/algif_akcipher.c

-- 
2.13.4
[PATCH v8 1/4] crypto: AF_ALG -- add sign/verify API
Add the flags for handling signature generation and signature
verification.

The af_alg helper code as well as the algif_skcipher and algif_aead code
must be changed from a boolean indicating the cipher operation to an
integer because there are now 4 different cipher operations that are
defined. Yet, the algif_aead and algif_skcipher code still only allows
encryption and decryption cipher operations.

Signed-off-by: Stephan Mueller
Signed-off-by: Tadeusz Struk
---
 crypto/af_alg.c             | 10 +-
 crypto/algif_aead.c         | 36
 crypto/algif_skcipher.c     | 26 +-
 include/crypto/if_alg.h     |  4 ++--
 include/uapi/linux/if_alg.h |  2 ++
 5 files changed, 50 insertions(+), 28 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index d6936c0e08d9..a35a9f854a04 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -859,7 +859,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	struct af_alg_tsgl *sgl;
 	struct af_alg_control con = {};
 	long copied = 0;
-	bool enc = 0;
+	int op = 0;
 	bool init = 0;
 	int err = 0;
 
@@ -870,11 +870,11 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		init = 1;
 		switch (con.op) {
+		case ALG_OP_VERIFY:
+		case ALG_OP_SIGN:
 		case ALG_OP_ENCRYPT:
-			enc = 1;
-			break;
 		case ALG_OP_DECRYPT:
-			enc = 0;
+			op = con.op;
 			break;
 		default:
 			return -EINVAL;
@@ -891,7 +891,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	}
 
 	if (init) {
-		ctx->enc = enc;
+		ctx->op = op;
 		if (con.iv)
 			memcpy(ctx->iv, con.iv->iv, ivsize);
 
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 516b38c3a169..77abc04cf942 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -60,7 +60,7 @@ static inline bool aead_sufficient_data(struct sock *sk)
 	 * The minimum amount of memory needed for an AEAD cipher is
 	 * the AAD and in case of decryption the tag.
 	 */
-	return ctx->used >= ctx->aead_assoclen + (ctx->enc ? 0 : as);
+	return ctx->used >= ctx->aead_assoclen + (ctx->op ? 0 : as);
 }
 
 static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
@@ -137,7 +137,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 	 * buffer provides the tag which is consumed resulting in only the
 	 * plaintext without a buffer for the tag returned to the caller.
 	 */
-	if (ctx->enc)
+	if (ctx->op)
 		outlen = used + as;
 	else
 		outlen = used - as;
@@ -196,7 +196,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 
 	/* Use the RX SGL as source (and destination) for crypto op. */
 	src = areq->first_rsgl.sgl.sg;
 
-	if (ctx->enc) {
+	if (ctx->op == ALG_OP_ENCRYPT) {
 		/*
 		 * Encryption operation - The in-place cipher operation is
 		 * achieved by the following operation:
@@ -212,7 +212,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 		if (err)
 			goto free;
 		af_alg_pull_tsgl(sk, processed, NULL, 0);
-	} else {
+	} else if (ctx->op == ALG_OP_DECRYPT) {
 		/*
 		 * Decryption operation - To achieve an in-place cipher
		 * operation, the following SGL structure is used:
@@ -258,6 +258,9 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 		} else
 			/* no RX SGL present (e.g. authentication only) */
 			src = areq->tsgl;
+	} else {
+		err = -EOPNOTSUPP;
+		goto free;
 	}
 
 	/* Initialize the crypto operation */
@@ -272,19 +275,28 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 		aead_request_set_callback(&areq->cra_u.aead_req,
 					  CRYPTO_TFM_REQ_MAY_BACKLOG,
 					  af_alg_async_cb, areq);
-		err = ctx->enc ? crypto_aead_encrypt(&areq->cra_u.aead_req) :
-				 crypto_aead_decrypt(&areq->cra_u.aead_req);
-	} else {
+	} else
 		/* Synchronous operation */
 		aead_request_set_callback(&areq->cra_u.aead_req,
 					  CRYPTO_TFM_REQ_MAY_BACKLOG,
 					  af_alg_complete, &ctx->completion);
-		err = af_alg_wait_for_completion(ctx->enc ?
[PATCH] crypto: MPI - kunmap after finishing accessing buffer
Hi Herbert,

I found that issue while playing around with edge conditions in my
algif_akcipher implementation. This issue only manifests in a
segmentation violation on 32 bit machines and with an SGL where each SG
points to one byte. SGLs with larger buffers seem to be unaffected by
this issue.

Yet this access-after-unmap should be a candidate for stable, IMHO.

---8<---

Using sg_miter_start and sg_miter_next, the buffer of an SG is kmap'ed
to *buff. The current code calls sg_miter_stop (and thus kunmap) on the
SG entry before the last access of *buff. The patch moves the
sg_miter_stop call after the last access to *buff to ensure that the
memory pointed to by *buff is still mapped.

Signed-off-by: Stephan Mueller
---
 lib/mpi/mpicoder.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 5a0f75a3bf01..eead4b339466 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -364,11 +364,11 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
 	}
 
 	miter.consumed = lzeros;
-	sg_miter_stop(&miter);
 
 	nbytes -= lzeros;
 	nbits = nbytes * 8;
 	if (nbits > MAX_EXTERN_MPI_BITS) {
+		sg_miter_stop(&miter);
 		pr_info("MPI: mpi too large (%u bits)\n", nbits);
 		return NULL;
 	}
@@ -376,6 +376,8 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
 	if (nbytes > 0)
 		nbits -= count_leading_zeros(*buff) - (BITS_PER_LONG - 8);
 
+	sg_miter_stop(&miter);
+
 	nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
 	val = mpi_alloc(nlimbs);
 	if (!val)
-- 
2.13.4