Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Wed, Feb 08, 2017 at 08:31:26PM -0700, Alden Tondettar wrote:
> In short, the situation is:
>
> A) No usable hardware RNG or arch_get_random() (or we don't trust it...)

Wait, why would you not trust arch_get_random()? Is it broken somehow on
some arches? If so, why not fix that as well?

thanks,

greg k-h
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Wed, Feb 08, 2017 at 08:31:26PM -0700, Alden Tondettar wrote:
> The new non-blocking system introduced in commit e192be9d9a30 ("random:
> replace non-blocking pool with a Chacha20-based CRNG") can under
> some circumstances report itself initialized while it still contains
> dangerously little entropy, as follows:
>
> Approximately every 64th call to add_interrupt_randomness(), the "fast"
> pool of interrupt-timing-based entropy is fed into one of two places. At
> calls numbered <= 256, the fast pool is XORed into the primary CRNG state.
> At call 256, the CRNG is deemed initialized, getrandom(2) is unblocked,
> and reading from /dev/urandom no longer gives warnings.
>
> At calls > 256, the fast pool is fed into the input pool, leaving the CRNG
> untouched.
>
> The problem arises between call number 256 and 320. If crng_initialize()
> is called at this time, it will overwrite the _entire_ CRNG state with
> 48 bytes generated from the input pool.

So in practice this isn't a problem, because crng_initialize() is called
in early init. For reference, the ordering of init calls is:

"early",     <--- crng_initialize() is here
"core",      <--- ftrace is initialized here
"postcore",
"arch",
"subsys",    <--- acpi_init() is here
"fs",
"device",    <--- device probing is here
"late",

So in practice, call 256 typically happens **well** after crng_initialize().
You can see where it is in the boot messages, which is 2.5 seconds into
the boot:

[2.570733] rtc_cmos 00:02: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
[2.570863] usbcore: registered new interface driver i2c-tiny-usb
[2.571035] device-mapper: uevent: version 1.0.3
[2.571215] random: fast init done             <-
[2.571316] device-mapper: ioctl: 4.35.0-ioctl (2016-06-23) initialised: dm-de...@redhat.com
[2.571678] device-mapper: multipath round-robin: version 1.1.0 loaded
[2.571728] intel_pstate: Intel P-state driver initializing
[2.572331] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input3
[2.572462] intel_pstate: HWP enabled
[2.572464] sdhci: Secure Digital Host Controller Interface driver

When is crng_initialize() called? Sometime *before* 0.05 seconds into the
boot on my laptop:

[0.054529] ftrace: allocating 29140 entries in 114 pages

> In short, the situation is:
>
> A) No usable hardware RNG or arch_get_random() (or we don't trust it...)
> B) add_interrupt_randomness() called 256-320 times but other
>    add_*_randomness() functions aren't adding much entropy.
> C) then crng_initialize() is called
> D) not enough calls to add_*_randomness() to push the entropy
>    estimate over 128 (yet)
> E) getrandom(2) or /dev/urandom used for something important
>
> Based on a few experiments with VMs, A) through D) can occur easily in
> practice. And with no HDD we have a window of about a minute or two for
> E) to happen before add_interrupt_randomness() finally pushes the
> estimate over 128 on its own.

How did you determine when crng_initialize() was being called?

On a VM generally there are fewer interrupts than on real hardware. On
KVM, I see the "random: fast init done" message being printed 3.6 seconds
into the boot. On Google Compute Engine, the fast init message happens 52
seconds into the boot.

So what VM were you using? I'm trying to figure out whether this is a
hypothetical or a real problem, and on what systems.

						- Ted
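The crediting behavior the thread is arguing about can be captured in a toy
model (illustrative only: the names below are not kernel identifiers, and the
real logic lives in add_interrupt_randomness()/crng_fast_load() in
drivers/char/random.c):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the fast-pool crediting described in the patch:
 * roughly every 64th interrupt the fast pool is drained; drains at calls
 * <= 256 feed the primary CRNG (with the CRNG declared initialized at
 * call 256), and later drains feed the input pool instead. */
enum sink { SINK_NONE, SINK_CRNG, SINK_INPUT_POOL };

static bool crng_ready;	/* models crng_init reaching 2 */

static enum sink model_interrupt(unsigned int call)
{
	if (call % 64 != 0)
		return SINK_NONE;	/* fast pool still accumulating */
	if (call <= 256) {
		if (call == 256)
			crng_ready = true;	/* getrandom(2) unblocks here */
		return SINK_CRNG;	/* fast pool XORed into CRNG state */
	}
	return SINK_INPUT_POOL;		/* CRNG left untouched */
}
```

In this model the 256-to-320 window discussed above is exactly the stretch
after `crng_ready` flips but before the next drain at call 320.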
Re: [PATCH] Revert "hwrng: core - zeroize buffers with random data"
Stephan, Herbert?

The zeroes in /dev/hwrng output are obviously complete crap, so there's
something badly wrong somewhere.

The locking, for example, is completely buggered. There's even a comment
about it, but that comment makes the correct observation of "but y'know:
randomness". But the memset() also being outside the lock makes a complete
joke of the whole thing.

Is the hwrng thing even worth maintaining? Compared to something like
/dev/urandom, it clearly does not do a very good job.

So I'm inclined to take the revert, but I'm also somewhat inclined to
simply mark this crud broken when we have other things that clearly do a
lot better.

                Linus

On Tue, Feb 7, 2017 at 4:23 PM, David Daney wrote:
> This reverts commit 2cc751545854d7bd7eedf4d7e377bb52e176cd07.
Re: [PATCH v7 0/5] Update LZ4 compressor module
On Thu, Feb 09, 2017 at 08:31:21AM +0900, Minchan Kim wrote:
>
> Today, I did zram-lz4 performance test with fio in current mmotm and
> found it makes regression about 20%.
>

This may or may not be the cause of the specific regression you're
observing, but I just noticed that the proposed patch drops a lot of
FORCEINLINE annotations from upstream LZ4. The FORCEINLINEs are there for
a reason, especially for the main decompression and compression functions,
which are basically "templates" that take in different sets of constant
parameters, and they should be left in. We should #define FORCEINLINE to
__always_inline somewhere, or just do a s/FORCEINLINE/__always_inline/g.

Note that the upstream LZ4 code is very carefully optimized, so we should
not, in general, be changing things like when functions are force-inlined,
what the hash table size is, etc.

[Also, for some reason linux-crypto is apparently still not receiving
patch 1/5 in the series. It's missing from the linux-crypto archive at
http://www.spinics.net/lists/linux-crypto/, so it's not just me.]

Thanks!

Eric
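For reference, a userspace sketch of the suggested mapping. In the kernel
tree this would simply be `#define FORCEINLINE __always_inline` using
`<linux/compiler.h>`; outside the kernel the GCC attribute below spells the
same request (the helper name is made up for illustration):

```c
#include <assert.h>

/* Sketch: map upstream LZ4's FORCEINLINE onto a forced-inline attribute.
 * In-kernel, __always_inline from <linux/compiler.h> would be used. */
#define FORCEINLINE static inline __attribute__((always_inline))

/* Stand-in for one of the templated (de)compression helpers that relies
 * on constant propagation of its parameters after forced inlining. */
FORCEINLINE int lz4_demo_shift(int value, int constant_shift)
{
	return value << constant_shift;
}
```

The point of forcing the inline is that each call site with constant
parameters compiles to a specialized body, which is what upstream LZ4's
"template" functions depend on for speed.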
Re: [PATCH v7 0/5] Update LZ4 compressor module
Hello Sven,

On Sun, Feb 05, 2017 at 08:09:03PM +0100, Sven Schmidt wrote:
>
> This patchset is for updating the LZ4 compression module to a version based
> on LZ4 v1.7.3 allowing to use the fast compression algorithm aka LZ4 fast
> which provides an "acceleration" parameter as a tradeoff between
> high compression ratio and high compression speed.
>
> We want to use LZ4 fast in order to support compression in lustre
> and (mostly, based on that) investigate data reduction techniques on behalf
> of storage systems.
>
> Also, it will be useful for other users of LZ4 compression, as with LZ4 fast
> it is possible to enable applications to use fast and/or high compression
> depending on the usecase.
> For instance, ZRAM is offering a LZ4 backend and could benefit from an
> updated LZ4 in the kernel.
>
> LZ4 homepage: http://www.lz4.org/
> LZ4 source repository: https://github.com/lz4/lz4
> Source version: 1.7.3
>
> Benchmark (taken from [1], Core i5-4300U @1.9GHz):
> ------------|-------------|---------------|------
> Compressor  | Compression | Decompression | Ratio
> ------------|-------------|---------------|------
> memcpy      | 4200 MB/s   | 4200 MB/s     | 1.000
> LZ4 fast 50 | 1080 MB/s   | 2650 MB/s     | 1.375
> LZ4 fast 17 |  680 MB/s   | 2220 MB/s     | 1.607
> LZ4 fast 5  |  475 MB/s   | 1920 MB/s     | 1.886
> LZ4 default |  385 MB/s   | 1850 MB/s     | 2.101
>
> [1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html
>
> [PATCH 1/5] lib: Update LZ4 compressor module
> [PATCH 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module
>             version
> [PATCH 3/5] crypto: Change LZ4 modules to work with new LZ4 module version
> [PATCH 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4
>             version
> [PATCH 5/5] lib/lz4: Remove back-compat wrappers

Today, I did a zram-lz4 performance test with fio in current mmotm and
found it makes a regression of about 20%.

"lz4-update" means current mmots (git://git.cmpxchg.org/linux-mmots.git),
so with your 5 patches applied.
(But I'm not sure current mmots has the most recent up-to-date patches.)
"revert" means I reverted your 5 patches in current mmots.

               revert  lz4-update
seq-write        1547        1339   86.55%
rand-write      22775       19381   85.10%
seq-read         7035        5589   79.45%
rand-read       78556       68479   87.17%
mixed-seq(R)     1305        1066   81.69%
mixed-seq(W)     1205         984   81.66%
mixed-rand(R)   17421       14993   86.06%
mixed-rand(W)   17391       14968   86.07%

My fio description file:

[global]
bs=4k
ioengine=sync
size=100m
numjobs=1
group_reporting
buffer_compress_percentage=30
scramble_buffers=0
filename=/dev/zram0
loops=10
fsync_on_close=1

[seq-write]
bs=64k
rw=write
stonewall

[rand-write]
rw=randwrite
stonewall

[seq-read]
bs=64k
rw=read
stonewall

[rand-read]
rw=randread
stonewall

[mixed-seq]
bs=64k
rw=rw
stonewall

[mixed-rand]
rw=randrw
stonewall
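The third column in the table above is the lz4-update result as a
percentage of the revert baseline. A quick standalone check of that
arithmetic, with the values copied from the table (the helper name is
illustrative):

```c
#include <assert.h>

/* ratio = lz4_update / revert, rounded to two decimal places and
 * expressed in hundredths of a percent (e.g. 86.55% -> 8655). */
static unsigned int pct_x100(unsigned int upd, unsigned int base)
{
	return (upd * 10000u + base / 2) / base;	/* rounded division */
}
```

Plugging in the seq-write pair, 1339/1547 comes out at 86.55%, matching the
table, and the other rows check out the same way.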
[PATCH v2 0/2] crypto: AF_ALG memory management fix
Hi Herbert,

Changes v2:

* import fix from Harsh Jain to remove SG from list before freeing

* fix return code used for ki_complete to match AIO behavior with sync
  behavior

* rename variable list -> tsgl_list

* update the algif_aead patch to include a dynamic TX SGL allocation
  similar to what algif_skcipher does. This allows concurrent continuous
  read/write operations to the extent you requested. Although I have not
  implemented "pairs of TX/RX SGLs" as I think that is even more overhead,
  the implementation conceptually defines such pairs. The recvmsg call
  defines how much of the input data is processed. The caller can have an
  arbitrary number of sendmsg calls where the data is added to the TX SGL
  before a recvmsg asks the kernel to process a given amount (or all) of
  the TX SGL.

With the changes, you will see a lot of code duplication now as I
deliberately tried to use the same struct and variable names, the same
function names and even the same order of functions. If you agree to this
patch, I volunteer to provide a followup patch that will extract the code
duplication into common functions.

Please find attached memory management updates to:

- simplify the code: the old AIO memory management is very complex and
  seemingly very fragile -- the update now eliminates all reported bugs in
  the skcipher and AEAD interfaces which allowed the kernel to be crashed
  by an unprivileged user

- streamline the code: there is one code path for AIO and sync operation;
  the code between algif_skcipher and algif_aead is very similar (if this
  patch set is accepted, I volunteer to reduce code duplication by moving
  service operations into af_alg.c and to further unify the TX SGL
  handling)

- unify the AIO and sync operation which only differ in the kernel crypto
  API callback and whether to wait for the crypto operation or not

- fix all reported bugs regarding the handling of multiple IOCBs.
The following testing was performed:

- stress testing to verify that no memleaks exist

- testing using Tadeusz Struk's AIO test tool (see
  https://github.com/tstruk/afalg_async_test) -- the AEAD test is not
  applicable any more due to the changed user space interface; the
  skcipher test works once the user space interface change is honored in
  the test code

- using the libkcapi test suite, all tests including the originally
  failing ones (AIO with multiple IOCBs) work now -- the current libkcapi
  code artificially limits the AEAD operation to one IOCB. After altering
  the libkcapi code to allow multiple IOCBs, the testing works flawlessly.

Stephan Mueller (2):
  crypto: skcipher AF_ALG - overhaul memory management
  crypto: aead AF_ALG - overhaul memory management

 crypto/algif_aead.c     | 668 +---
 crypto/algif_skcipher.c | 472 ++
 2 files changed, 544 insertions(+), 596 deletions(-)

--
2.9.3
[PATCH v2 2/2] crypto: aead AF_ALG - overhaul memory management
The updated memory management is described in the top part of the code.
As one benefit of the changed memory management, the AIO and synchronous
operation is now implemented in one common function. The AF_ALG operation
uses the async kernel crypto API interface for each cipher operation.
Thus, the only difference between the AIO and sync operation types
visible from user space is:

1. the callback function to be invoked when the asynchronous operation
   is completed

2. whether to wait for the completion of the kernel crypto API operation
   or not

The change includes the overhaul of the TX and RX SGL handling. The TX
SGL holding the data sent from user space to the kernel is now dynamic
similar to algif_skcipher. This dynamic nature allows a continuous
operation of a thread sending data and a second thread receiving the
data. These threads do not need to synchronize as the kernel processes
as much data from the TX SGL to fill the RX SGL.

The caller reading the data from the kernel defines the amount of data
to be processed. Considering that the interface covers AEAD
authenticating ciphers, the reader must provide the buffer in the
correct size. Thus the reader defines the encryption size.

Signed-off-by: Stephan Mueller
---
 crypto/algif_aead.c | 668 +++-
 1 file changed, 351 insertions(+), 317 deletions(-)

diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 533265f..6c24c81 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -11,6 +11,26 @@
  * under the terms of the GNU General Public License as published by the Free
  * Software Foundation; either version 2 of the License, or (at your option)
  * any later version.
+ *
+ * The following concept of the memory management is used:
+ *
+ * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
+ * filled by user space with the data submitted via sendpage/sendmsg. Filling
+ * up the TX SGL does not cause a crypto operation -- the data will only be
+ * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
+ * provide a buffer which is tracked with the RX SGL.
+ *
+ * During the processing of the recvmsg operation, the cipher request is
+ * allocated and prepared. To support multiple recvmsg operations operating
+ * on one TX SGL, an offset pointer into the TX SGL is maintained. The TX SGL
+ * that is used for the crypto request is scatterwalk_ffwd by the offset
+ * pointer to obtain the start address the crypto operation shall use for
+ * the input data.
+ *
+ * After the completion of the crypto operation, the RX SGL and the cipher
+ * request is released. The processed TX SGL parts are released together with
+ * the RX SGL release and the offset pointer is reduced by the released
+ * data.
  */

 #include
@@ -24,45 +44,55 @@
 #include
 #include

-struct aead_sg_list {
-	unsigned int cur;
-	struct scatterlist sg[ALG_MAX_PAGES];
+struct aead_tsgl {
+	struct list_head list;
+	unsigned int cur;		/* Last processed SG entry */
+	struct scatterlist sg[0];	/* Array of SGs forming the SGL */
 };

-struct aead_async_rsgl {
+struct aead_rsgl {
 	struct af_alg_sgl sgl;
 	struct list_head list;
 };

 struct aead_async_req {
-	struct scatterlist *tsgl;
-	struct aead_async_rsgl first_rsgl;
-	struct list_head list;
 	struct kiocb *iocb;
-	unsigned int tsgls;
-	char iv[];
+	struct sock *sk;
+
+	struct aead_rsgl first_rsgl;	/* First RX SG */
+	struct list_head rsgl_list;	/* Track RX SGs */
+
+	unsigned int outlen;		/* Filled output buf length */
+
+	unsigned int areqlen;		/* Length of this data struct */
+	struct aead_request aead_req;	/* req ctx trails this struct */
 };

 struct aead_ctx {
-	struct aead_sg_list tsgl;
-	struct aead_async_rsgl first_rsgl;
-	struct list_head list;
+	struct list_head tsgl_list;	/* Link to TX SGL */

 	void *iv;
+	size_t aead_assoclen;

-	struct af_alg_completion completion;
+	struct af_alg_completion completion;	/* sync work queue */

-	unsigned long used;
+	unsigned int inflight;	/* Outstanding AIO ops */
+	size_t used;		/* TX bytes sent to kernel */
+	size_t processed;	/* Processed TX bytes */

-	unsigned int len;
-	bool more;
-	bool merge;
-	bool enc;
+	bool more;		/* More data to be expected? */
+	bool merge;		/* Merge new data into existing SG */
+	bool enc;		/* Crypto operation: enc, dec */

-	size_t aead_assoclen;
-	struct aead_request aead_req;
+	unsigned int len;	/* Length of allocated memory for this struct */
+	struct crypto_aead *aead_tfm;
 };

+static DECLARE_WAIT_QUEUE_HEAD(aead_aio_finish_wait);
+
+#define MAX_SGL_ENTS ((4096 -
[PATCH v2 1/2] crypto: skcipher AF_ALG - overhaul memory management
The updated memory management is described in the top part of the code.
As one benefit of the changed memory management, the AIO and synchronous
operation is now implemented in one common function. The AF_ALG operation
uses the async kernel crypto API interface for each cipher operation.
Thus, the only difference between the AIO and sync operation types
visible from user space is:

1. the callback function to be invoked when the asynchronous operation
   is completed

2. whether to wait for the completion of the kernel crypto API operation
   or not

In addition, the code structure is adjusted to match the structure of
algif_aead for easier code assessment.

The user space interface changed slightly as follows: the old AIO
operation returned zero upon success and < 0 in case of an error to user
space. As all other AF_ALG interfaces (including the sync skcipher
interface) returned the number of processed bytes upon success and < 0 in
case of an error, the new skcipher interface (regardless of AIO or sync)
returns the number of processed bytes in case of success.

Signed-off-by: Stephan Mueller
---
 crypto/algif_skcipher.c | 472
 1 file changed, 193 insertions(+), 279 deletions(-)

diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index a9e79d8..e65ebc4 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -10,6 +10,25 @@
  * Software Foundation; either version 2 of the License, or (at your option)
  * any later version.
  *
+ * The following concept of the memory management is used:
+ *
+ * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
+ * filled by user space with the data submitted via sendpage/sendmsg. Filling
+ * up the TX SGL does not cause a crypto operation -- the data will only be
+ * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
+ * provide a buffer which is tracked with the RX SGL.
+ *
+ * During the processing of the recvmsg operation, the cipher request is
+ * allocated and prepared. To support multiple recvmsg operations operating
+ * on one TX SGL, an offset pointer into the TX SGL is maintained. The TX SGL
+ * that is used for the crypto request is scatterwalk_ffwd by the offset
+ * pointer to obtain the start address the crypto operation shall use for
+ * the input data.
+ *
+ * After the completion of the crypto operation, the RX SGL and the cipher
+ * request is released. The processed TX SGL parts are released together with
+ * the RX SGL release and the offset pointer is reduced by the released
+ * data.
  */

 #include
@@ -31,78 +50,50 @@ struct skcipher_sg_list {
 	struct scatterlist sg[0];
 };

+struct skcipher_rsgl {
+	struct af_alg_sgl sgl;
+	struct list_head list;
+};
+
+struct skcipher_async_req {
+	struct kiocb *iocb;
+	struct sock *sk;
+
+	struct skcipher_rsgl first_sgl;
+	struct list_head rsgl_list;
+
+	unsigned int areqlen;
+	struct skcipher_request req;
+};
+
 struct skcipher_tfm {
 	struct crypto_skcipher *skcipher;
 	bool has_key;
 };

 struct skcipher_ctx {
-	struct list_head tsgl;
-	struct af_alg_sgl rsgl;
+	struct list_head tsgl_list;

 	void *iv;

 	struct af_alg_completion completion;

-	atomic_t inflight;
+	unsigned int inflight;
 	size_t used;
+	size_t processed;

-	unsigned int len;
 	bool more;
 	bool merge;
 	bool enc;

-	struct skcipher_request req;
-};
-
-struct skcipher_async_rsgl {
-	struct af_alg_sgl sgl;
-	struct list_head list;
+	unsigned int len;
 };

-struct skcipher_async_req {
-	struct kiocb *iocb;
-	struct skcipher_async_rsgl first_sgl;
-	struct list_head list;
-	struct scatterlist *tsg;
-	atomic_t *inflight;
-	struct skcipher_request req;
-};
+static DECLARE_WAIT_QUEUE_HEAD(skcipher_aio_finish_wait);

 #define MAX_SGL_ENTS ((4096 - sizeof(struct skcipher_sg_list)) / \
 		      sizeof(struct scatterlist) - 1)

-static void skcipher_free_async_sgls(struct skcipher_async_req *sreq)
-{
-	struct skcipher_async_rsgl *rsgl, *tmp;
-	struct scatterlist *sgl;
-	struct scatterlist *sg;
-	int i, n;
-
-	list_for_each_entry_safe(rsgl, tmp, &sreq->list, list) {
-		af_alg_free_sg(&rsgl->sgl);
-		if (rsgl != &sreq->first_sgl)
-			kfree(rsgl);
-	}
-	sgl = sreq->tsg;
-	n = sg_nents(sgl);
-	for_each_sg(sgl, sg, n, i)
-		put_page(sg_page(sg));
-
-	kfree(sreq->tsg);
-}
-
-static void skcipher_async_cb(struct crypto_async_request *req, int err)
-{
-	struct skcipher_async_req *sreq = req->data;
-	struct kiocb *iocb = sreq->iocb;
-
-	atomic_dec(sreq->inflight);
-	skcipher_free_async_sgls(sreq);
-	kzfree(sreq);
-	iocb->ki_complete(iocb, err, err);
-}
-
 static inline
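The sendmsg/recvmsg bookkeeping described in the comment block boils down to
two counters. A toy model of that accounting (the names tx_model,
model_sendmsg, etc. are illustrative only, not kernel identifiers):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the TX SGL accounting: sendmsg() appends data (growing
 * "used"), recvmsg() consumes up to the requested amount from the
 * unprocessed region (advancing "processed"), and releasing the consumed
 * parts reduces both counters, mirroring the offset-pointer reduction
 * described in the comment block above. */
struct tx_model {
	size_t used;		/* TX bytes sent to kernel */
	size_t processed;	/* Processed TX bytes (the offset pointer) */
};

static void model_sendmsg(struct tx_model *m, size_t n)
{
	m->used += n;
}

static size_t model_recvmsg(struct tx_model *m, size_t want)
{
	size_t avail = m->used - m->processed;
	size_t take = want < avail ? want : avail;

	m->processed += take;	/* offset advances past consumed data */
	return take;
}

static void model_release_processed(struct tx_model *m)
{
	m->used -= m->processed;	/* processed parts freed ... */
	m->processed = 0;		/* ... and the offset reduced */
}
```

This is why a sender thread and a receiver thread need no explicit
synchronization: recvmsg() can only ever consume what sendmsg() has already
accounted into `used`.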
[PATCH] crypto: ccp - Set the AES size field for all modes
Ensure that the size field is correctly populated for all AES modes.

Signed-off-by: Gary R Hook
---
 drivers/crypto/ccp/ccp-dev-v5.c |    3 +--
 drivers/crypto/ccp/ccp-dev.h    |    1 +
 drivers/crypto/ccp/ccp-ops.c    |    8 ++++++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-dev-v5.c b/drivers/crypto/ccp/ccp-dev-v5.c
index 612898b..9c6ff8b8 100644
--- a/drivers/crypto/ccp/ccp-dev-v5.c
+++ b/drivers/crypto/ccp/ccp-dev-v5.c
@@ -284,8 +284,7 @@ static int ccp5_perform_aes(struct ccp_op *op)
 	CCP_AES_ENCRYPT(&function) = op->u.aes.action;
 	CCP_AES_MODE(&function) = op->u.aes.mode;
 	CCP_AES_TYPE(&function) = op->u.aes.type;
-	if (op->u.aes.mode == CCP_AES_MODE_CFB)
-		CCP_AES_SIZE(&function) = 0x7f;
+	CCP_AES_SIZE(&function) = op->u.aes.size;

 	CCP5_CMD_FUNCTION(&desc) = function.raw;
diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 649e561..2b5c01f 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -467,6 +467,7 @@ struct ccp_aes_op {
 	enum ccp_aes_type type;
 	enum ccp_aes_mode mode;
 	enum ccp_aes_action action;
+	unsigned int size;
 };

 struct ccp_xts_aes_op {
diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c
index 50fae44..6878160 100644
--- a/drivers/crypto/ccp/ccp-ops.c
+++ b/drivers/crypto/ccp/ccp-ops.c
@@ -692,6 +692,14 @@ static int ccp_run_aes_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd)
 			goto e_ctx;
 		}
 	}
+	switch (aes->mode) {
+	case CCP_AES_MODE_CFB: /* CFB128 only */
+	case CCP_AES_MODE_CTR:
+		op.u.aes.size = AES_BLOCK_SIZE * BITS_PER_BYTE - 1;
+		break;
+	default:
+		op.u.aes.size = 0;
+	}

 	/* Prepare the input and output data workareas. For in-place
 	 * operations we need to set the dma direction to BIDIRECTIONAL
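The generic expression in the patch reproduces the magic constant the old
code hard-wired only for CFB: the size field carries the operation's unit
size in bits, minus one. A quick standalone check (the constants mirror the
kernel's AES_BLOCK_SIZE and BITS_PER_BYTE; the helper name is illustrative):

```c
#include <assert.h>

#define AES_BLOCK_SIZE 16	/* bytes, as in <crypto/aes.h> */
#define BITS_PER_BYTE  8

/* For the "counted" modes (CFB128, CTR) the field is unit-size-in-bits
 * minus one; for the other modes the patch leaves it at zero. */
static unsigned int aes_size_field(int counted_mode)
{
	return counted_mode ? AES_BLOCK_SIZE * BITS_PER_BYTE - 1 : 0;
}
```

So 16 * 8 - 1 = 127 = 0x7f, exactly the value the removed
`CCP_AES_SIZE() = 0x7f` special case used to set.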
Re: [PATCH v2 2/5] async_tx: Handle DMA devices having support for fewer PQ coefficients
On Wed, Feb 8, 2017 at 12:57 AM, Anup Patel wrote:
> On Tue, Feb 7, 2017 at 11:46 PM, Dan Williams wrote:
>> On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel wrote:
>>> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams wrote:
>>>> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel wrote:
>>>>> The DMAENGINE framework assumes that if PQ offload is supported by a
>>>>> DMA device then all 256 PQ coefficients are supported. This assumption
>>>>> does not hold anymore because we now have BCM-SBA-RAID offload engine
>>>>> which supports PQ offload with limited number of PQ coefficients.
>>>>>
>>>>> This patch extends async_tx APIs to handle DMA devices with support
>>>>> for fewer PQ coefficients.
>>>>>
>>>>> Signed-off-by: Anup Patel
>>>>> Reviewed-by: Scott Branden
>>>>
>>>> I don't like this approach. Define an interface for md to query the
>>>> offload engine once at the beginning of time. We should not be adding
>>>> any new extensions to async_tx.
>>>
>>> Even if we do capability checks in Linux MD, we still need a way
>>> for DMAENGINE drivers to advertise number of PQ coefficients
>>> handled by the HW.
>>>
>>> I agree capability checks should be done once in Linux MD but I don't
>>> see why this has to be part of BCM-SBA-RAID driver patches. We need
>>> separate patchsets to address limitations of async_tx framework.
>>
>> Right, separate enabling before we pile on new hardware support to a
>> known broken framework.
>
> Linux Async Tx is not a broken framework. The issue is:
> 1. It's not complete enough
> 2. It's not optimized for very high-throughput offload engines

I'm not understanding your point. I'm nak'ing this change to add yet more
per-transaction capability checking to async_tx. I don't like the
DMA_HAS_FEWER_PQ_COEF flag, especially since it is equal to
DMA_HAS_PQ_CONTINUE. I'm not asking for all of async_tx's problems to be
fixed before this new hardware support, I'm simply saying we should start
the process of moving offload-engine capability checking to the raid code.
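One possible shape of the "query once at the beginning of time" interface
being asked for might look like the sketch below. This is entirely
hypothetical: neither `raid_offload_caps` nor `max_pq_coef` exists in
dmaengine, and the SBA limit used in the example is made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability record a DMA offload driver would fill in once
 * at registration time, letting the raid code decide up front whether a
 * given P+Q computation can be offloaded, instead of checking flags on
 * every transaction inside async_tx. */
struct raid_offload_caps {
	unsigned int max_pq_coef;	/* GF(256) coefficients supported */
};

/* Decision made once by the raid code, not per-transaction. */
static bool pq_offloadable(const struct raid_offload_caps *caps,
			   unsigned int ncoef)
{
	return ncoef <= caps->max_pq_coef;
}
```

With a one-time check like this, an engine advertising fewer than 256
coefficients simply never receives the P+Q requests it cannot handle, and
async_tx itself needs no new per-transaction flag.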
[PATCH] crypto: arm/aes-ce: assign err return conditionally
As the err value is not used unless there was an error, it can be
assigned conditionally here.

Signed-off-by: Nicholas Mc Guire <der.h...@hofr.at>
---
Not sure if this is really relevant and worth changing; effectively it is
practically no change, as gcc would move the err = PTR_ERR(simd); below
unregister_simds: anyway (based on inspection of .lst/.s files) - so it is
more of an adjustment of the C level to the object level for readability.

Patch was compile-tested with multi_v7_defconfig (implies
CONFIG_CRYPTO_AES_ARM_CE=m)

Patch is against 4.10-rc7 (localversion-next is next-20170208)

 arch/arm/crypto/aes-ce-glue.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index 883b84d..8f65030 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -437,9 +437,10 @@ static int __init aes_init(void)
 		drvname = aes_algs[i].base.cra_driver_name + 2;
 		basename = aes_algs[i].base.cra_driver_name;
 		simd = simd_skcipher_create_compat(algname, drvname, basename);
-		err = PTR_ERR(simd);
-		if (IS_ERR(simd))
+		if (IS_ERR(simd)) {
+			err = PTR_ERR(simd);
 			goto unregister_simds;
+		}

 		aes_simd_algs[i] = simd;
 	}
--
2.1.4
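The pattern relies on the kernel's pointer-encoded errors. A small
userspace model shows why moving the assignment inside the branch is safe
(the helpers below imitate the kernel's <linux/err.h> but are not the real
ones, and `create_or_err` is a made-up stand-in for the call site):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers: an error is
 * encoded as a pointer value in the top MAX_ERRNO bytes of the address
 * space, so a pointer can carry either a valid object or an errno. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* The post-patch shape: err is written only when there is an error,
 * mirroring err = PTR_ERR(simd) moving inside the IS_ERR() branch. */
static long create_or_err(void *candidate, void **out)
{
	long err = 0;

	if (IS_ERR(candidate))
		err = PTR_ERR(candidate);	/* assigned only on failure */
	else
		*out = candidate;
	return err;
}
```

Because `err` is consulted only on the error path, unconditionally
assigning `PTR_ERR(simd)` first (as the old code did) and conditionally
assigning it (as the patch does) are behaviorally identical, which matches
the .lst/.s observation above.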
Re: [PATCH v2 2/5] async_tx: Handle DMA devices having support for fewer PQ coefficients
On Tue, Feb 7, 2017 at 11:46 PM, Dan Williams wrote:
> On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel wrote:
>> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams wrote:
>>> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel wrote:
>>>> The DMAENGINE framework assumes that if PQ offload is supported by a
>>>> DMA device then all 256 PQ coefficients are supported. This assumption
>>>> does not hold anymore because we now have BCM-SBA-RAID offload engine
>>>> which supports PQ offload with limited number of PQ coefficients.
>>>>
>>>> This patch extends async_tx APIs to handle DMA devices with support
>>>> for fewer PQ coefficients.
>>>>
>>>> Signed-off-by: Anup Patel
>>>> Reviewed-by: Scott Branden
>>>
>>> I don't like this approach. Define an interface for md to query the
>>> offload engine once at the beginning of time. We should not be adding
>>> any new extensions to async_tx.
>>
>> Even if we do capability checks in Linux MD, we still need a way
>> for DMAENGINE drivers to advertise number of PQ coefficients
>> handled by the HW.
>>
>> I agree capability checks should be done once in Linux MD but I don't
>> see why this has to be part of BCM-SBA-RAID driver patches. We need
>> separate patchsets to address limitations of async_tx framework.
>
> Right, separate enabling before we pile on new hardware support to a
> known broken framework.

Linux Async Tx is not a broken framework. The issue is:
1. It's not complete enough
2. It's not optimized for very high-throughput offload engines

Regards,
Anup