Re: [PATCH v9 00/20] simplify crypto wait for async op
On Sun, Oct 15, 2017 at 10:19:45AM +0100, Gilad Ben-Yossef wrote:
> Many users of kernel async. crypto services have a pattern of
> starting an async. crypto op and then using a completion
> to wait for it to end.
>
> This patch set simplifies this common use case in two ways:
>
> First, by separating the return codes of the case where a
> request is queued to a backlog due to the provider being
> busy (-EBUSY) from the case where the request has failed due
> to the provider being busy and backlogging is not enabled
> (-EAGAIN).
>
> Next, this change is then built on to create a generic API
> to wait for an async. crypto operation to complete.
>
> The end result is a smaller code base and an API that is
> easier to use and more difficult to get wrong.
>
> The patch set was boot tested on x86_64 and arm64, which
> at the very least tests the crypto users via testmgr and
> tcrypt, but I do note that I do not have access to some
> of the HW whose drivers are modified, nor do I claim I was
> able to test all of the corner cases.
>
> The patch set is based upon the linux-next release tagged
> next-20171013.

Has there been any performance impact analysis of these changes?

I ended up with patches for one of the crypto drivers, which converted
its interrupt handling to threaded interrupts, being reverted because
they caused a performance degradation.  Moving code to the latest APIs
to simplify it is not always beneficial.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
[PATCH] hw_random: update help description for omap-rng
omap-rng also supports Marvell Armada 7k/8k SoCs, but no mention of
this is made in the help text, despite the dependency being added.
Explicitly mention these SoCs in the help description so people know
that it covers more than just TI SoCs.

Fixes: 383212425c92 ("hwrng: omap - Add device variant for SafeXcel IP-76 found in Armada 8K")
Signed-off-by: Russell King <rmk+ker...@armlinux.org.uk>
---
 drivers/char/hw_random/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/Kconfig b/drivers/char/hw_random/Kconfig
index ceff2fc524b1..0cafe08919c9 100644
--- a/drivers/char/hw_random/Kconfig
+++ b/drivers/char/hw_random/Kconfig
@@ -172,8 +172,8 @@ config HW_RANDOM_OMAP
 	default HW_RANDOM
 	---help---
 	  This driver provides kernel-side support for the Random Number
-	  Generator hardware found on OMAP16xx, OMAP2/3/4/5 and AM33xx/AM43xx
-	  multimedia processors.
+	  Generator hardware found on OMAP16xx, OMAP2/3/4/5, AM33xx/AM43xx
+	  multimedia processors, and Marvell Armada 7k/8k SoCs.

 	  To compile this driver as a module, choose M here: the
 	  module will be called omap-rng.
--
2.7.4
Re: [BISECT] ARM build errors on GCC v6.2 (crypto: arm/aes - replace scalar AES cipher)
On Sat, Jan 14, 2017 at 04:24:35PM +0200, Krzysztof Kozlowski wrote:
> Hi,
>
> allyesconfig and multi_v7_defconfig fail to build on recent linux-next
> on GCC 6.2.0.
>
> Errors:
> ../arch/arm/crypto/aes-cipher-core.S: Assembler messages:
> ../arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
> ../arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movw tt,#:lower16:crypto_ft_tab'
> ../arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movt tt,#:upper16:crypto_ft_tab'
>
> Compiler: arm-linux-gnueabi-gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005

It's worth pointing out that _assembler_ messages come from _binutils_
rather than _gcc_.  Please advise which version of _binutils_ you are
using.

Thanks.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto"
in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/5] ARM: wire up HWCAP2 feature bits to the CPU modalias
On Mon, Jan 02, 2017 at 09:06:04PM +, Ard Biesheuvel wrote:
> On 31 October 2016 at 16:13, Russell King - ARM Linux
> <li...@armlinux.org.uk> wrote:
> > On Sat, Oct 29, 2016 at 11:08:36AM +0100, Ard Biesheuvel wrote:
> >> On 18 October 2016 at 11:52, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote:
> >> > Wire up the generic support for exposing CPU feature bits via the
> >> > modalias in /sys/device/system/cpu. This allows udev to automatically
> >> > load modules for things like crypto algorithms that are implemented
> >> > using optional instructions.
> >> >
> >> > Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
> >> > ---
> >> >  arch/arm/Kconfig                  |  1 +
> >> >  arch/arm/include/asm/cpufeature.h | 32 ++++++++++++++++++++++++++++++++
> >> >  2 files changed, 33 insertions(+)
> >> >
> >>
> >> Russell,
> >>
> >> do you have any concerns regarding this patch? If not, I will drop it
> >> into the patch system.
> >
> > It's still something I need to look at... I've been offline last week,
> > and sort-of offline the previous week, so I'm catching up.
>
> Hi Russell,
>
> Any thoughts yet?

None, and the patch is now so well buried that it'll take me a while to
find... back in mid-October?  Yea, I'll have to drop everything and go
digging through my mailboxes to find it... and I'm just catching up
(again) after a week and a bit's time offline - yep, it's wonderful
timing.

Sorry, no time to look at it right now, you're not the only one wanting
my attention at the moment.  Please try again in about a week's time -
don't leave it a few months, and please include the patch.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH 11/14] Revert "crypto: caam - get rid of tasklet"
Please include Thomas in this.

On Wed, Nov 09, 2016 at 10:46:21AM +0200, Horia Geantă wrote:
> This reverts commit 66d2e2028091a074aa1290d2eeda5ddb1a6c329c.
>
> Quoting from Russell's findings:
> https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg21136.html
>
> [quote]
> Okay, I've re-tested, using a different way of measuring, because using
> openssl speed is impractical for off-loaded engines.  I've decided to
> use this way to measure the performance:
>
> dd if=/dev/zero bs=1048576 count=128 | /usr/bin/time openssl dgst -md5
>
> For the threaded IRQs case this gives:
>
> 0.05user 2.74system 0:05.30elapsed 52%CPU (0avgtext+0avgdata 2400maxresident)k
> 0.06user 2.52system 0:05.18elapsed 49%CPU (0avgtext+0avgdata 2404maxresident)k
> 0.12user 2.60system 0:05.61elapsed 48%CPU (0avgtext+0avgdata 2460maxresident)k
> => 5.36s => 25.0MB/s
>
> and the tasklet case:
>
> 0.08user 2.53system 0:04.83elapsed 54%CPU (0avgtext+0avgdata 2468maxresident)k
> 0.09user 2.47system 0:05.16elapsed 49%CPU (0avgtext+0avgdata 2368maxresident)k
> 0.10user 2.51system 0:04.87elapsed 53%CPU (0avgtext+0avgdata 2460maxresident)k
> => 4.95s => 27.1MB/s
>
> which corresponds to an 8% slowdown for the threaded IRQ case.  So,
> tasklets are indeed faster than threaded IRQs.
>
> [...]
>
> I think I've proven from the above that this patch needs to be reverted
> due to the performance regression, and that there _is_ most definitely
> a detrimental effect of switching from tasklets to threaded IRQs.
> [/quote]
>
> Signed-off-by: Horia Geantă
> ---
>  drivers/crypto/caam/intern.h |  1 +
>  drivers/crypto/caam/jr.c     | 25 ++++++++++++++++---------
>  2 files changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h
> index 5d4c05074a5c..e2bcacc1a921 100644
> --- a/drivers/crypto/caam/intern.h
> +++ b/drivers/crypto/caam/intern.h
> @@ -41,6 +41,7 @@ struct caam_drv_private_jr {
> 	struct device		*dev;
> 	int ridx;
> 	struct caam_job_ring __iomem *rregs;	/* JobR's register space */
> +	struct tasklet_struct irqtask;
> 	int irq;			/* One per queue */
>
> 	/* Number of scatterlist crypt transforms active on the JobR */
> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index 7331ea734f37..c8604dfadbf5 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -73,6 +73,8 @@ static int caam_jr_shutdown(struct device *dev)
>
> 	ret = caam_reset_hw_jr(dev);
>
> +	tasklet_kill(&jrp->irqtask);
> +
> 	/* Release interrupt */
> 	free_irq(jrp->irq, dev);
>
> @@ -128,7 +130,7 @@ static irqreturn_t caam_jr_interrupt(int irq, void *st_dev)
>
> 	/*
> 	 * Check the output ring for ready responses, kick
> -	 * the threaded irq if jobs done.
> +	 * tasklet if jobs done.
> 	 */
> 	irqstate = rd_reg32(&jrp->rregs->jrintstatus);
> 	if (!irqstate)
> @@ -150,13 +152,18 @@ static irqreturn_t caam_jr_interrupt(int irq, void *st_dev)
> 	/* Have valid interrupt at this point, just ACK and trigger */
> 	wr_reg32(&jrp->rregs->jrintstatus, irqstate);
>
> -	return IRQ_WAKE_THREAD;
> +	preempt_disable();
> +	tasklet_schedule(&jrp->irqtask);
> +	preempt_enable();
> +
> +	return IRQ_HANDLED;
> }
>
> -static irqreturn_t caam_jr_threadirq(int irq, void *st_dev)
> +/* Deferred service handler, run as interrupt-fired tasklet */
> +static void caam_jr_dequeue(unsigned long devarg)
> {
> 	int hw_idx, sw_idx, i, head, tail;
> -	struct device *dev = st_dev;
> +	struct device *dev = (struct device *)devarg;
> 	struct caam_drv_private_jr *jrp = dev_get_drvdata(dev);
> 	void (*usercall)(struct device *dev, u32 *desc, u32 status, void *arg);
> 	u32 *userdesc, userstatus;
> @@ -230,8 +237,6 @@ static irqreturn_t caam_jr_threadirq(int irq, void *st_dev)
>
> 	/* reenable / unmask IRQs */
> 	clrsetbits_32(&jrp->rregs->rconfig_lo, JRCFG_IMSK, 0);
> -
> -	return IRQ_HANDLED;
> }
>
> /**
> @@ -389,10 +394,11 @@ static int caam_jr_init(struct device *dev)
>
> 	jrp = dev_get_drvdata(dev);
>
> +	tasklet_init(&jrp->irqtask, caam_jr_dequeue, (unsigned long)dev);
> +
> 	/* Connect job ring interrupt handler. */
> -	error = request_threaded_irq(jrp->irq, caam_jr_interrupt,
> -				     caam_jr_threadirq, IRQF_SHARED,
> -				     dev_name(dev), dev);
> +	error = request_irq(jrp->irq, caam_jr_interrupt, IRQF_SHARED,
> +			    dev_name(dev), dev);
> 	if (error) {
> 		dev_err(dev, "can't connect JobR %d interrupt (%d)\n",
> 			jrp->ridx, jrp->irq);
> @@ -454,6 +460,7 @@ static int caam_jr_init(struct device *dev)
> out_free_irq:
> 	free_irq(jrp->irq, dev);
> out_kill_deq:
> +	tasklet_kill(&jrp->irqtask);
> 	return error;
> }
> --
Re: [PATCH 1/5] ARM: wire up HWCAP2 feature bits to the CPU modalias
On Sat, Oct 29, 2016 at 11:08:36AM +0100, Ard Biesheuvel wrote:
> On 18 October 2016 at 11:52, Ard Biesheuvel wrote:
> > Wire up the generic support for exposing CPU feature bits via the
> > modalias in /sys/device/system/cpu. This allows udev to automatically
> > load modules for things like crypto algorithms that are implemented
> > using optional instructions.
> >
> > Signed-off-by: Ard Biesheuvel
> > ---
> >  arch/arm/Kconfig                  |  1 +
> >  arch/arm/include/asm/cpufeature.h | 32 ++++++++++++++++++++++++++++++++
> >  2 files changed, 33 insertions(+)
> >
>
> Russell,
>
> do you have any concerns regarding this patch? If not, I will drop it
> into the patch system.

It's still something I need to look at... I've been offline last week,
and sort-of offline the previous week, so I'm catching up.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [RESEND][PATCH] crypto: caam: add support for iMX6UL
On Mon, Oct 17, 2016 at 01:28:00PM +0200, Marcus Folkesson wrote:
> i.MX6UL only requires three clocks to enable the CAAM module.
>
> Signed-off-by: Marcus Folkesson
> Acked-by: Rob Herring
> Reviewed-by: Horia Geantă
> ---
>  .../devicetree/bindings/crypto/fsl-sec4.txt | 20 ++++++++++++++++++++
>  drivers/crypto/caam/ctrl.c                  | 35 ++++++++++++----------
>  2 files changed, 40 insertions(+), 15 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
> index adeca34..10a425f 100644
> --- a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
> +++ b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
> @@ -123,6 +123,9 @@ PROPERTIES
>
>
> EXAMPLE
> +
> +iMX6QDL/SX requires four clocks
> +
> 	crypto@300000 {
> 		compatible = "fsl,sec-v4.0";
> 		fsl,sec-era = <2>;
> @@ -139,6 +142,23 @@ EXAMPLE
> 		clock-names = "mem", "aclk", "ipg", "emi_slow";
> 	};
>
> +
> +iMX6UL does only require three clocks
> +
> +	crypto: caam@2140000 {
> +		compatible = "fsl,sec-v4.0";
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		reg = <0x2140000 0x3c000>;
> +		ranges = <0 0x2140000 0x3c000>;
> +		interrupts = ;
> +
> +		clocks = <&clks IMX6UL_CLK_CAAM_MEM>,
> +			 <&clks IMX6UL_CLK_CAAM_ACLK>,
> +			 <&clks IMX6UL_CLK_CAAM_IPG>;
> +		clock-names = "mem", "aclk", "ipg";
> +	};
> +
> =====================================================================
> Job Ring (JR) Node
>
> diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
> index 0ec112e..5abaf37 100644
> --- a/drivers/crypto/caam/ctrl.c
> +++ b/drivers/crypto/caam/ctrl.c
> @@ -329,8 +329,8 @@ static int caam_remove(struct platform_device *pdev)
> 	clk_disable_unprepare(ctrlpriv->caam_ipg);
> 	clk_disable_unprepare(ctrlpriv->caam_mem);
> 	clk_disable_unprepare(ctrlpriv->caam_aclk);
> -	clk_disable_unprepare(ctrlpriv->caam_emi_slow);
> -
> +	if (!of_machine_is_compatible("fsl,imx6ul"))
> +		clk_disable_unprepare(ctrlpriv->caam_emi_slow);

There is no need to re-lookup the platform here.  Can't you check the
validity of ctrlpriv->caam_emi_slow?

> 	return 0;
> }
>
> @@ -481,14 +481,16 @@ static int caam_probe(struct platform_device *pdev)
> 	}
> 	ctrlpriv->caam_aclk = clk;
>
> -	clk = caam_drv_identify_clk(&pdev->dev, "emi_slow");
> -	if (IS_ERR(clk)) {
> -		ret = PTR_ERR(clk);
> -		dev_err(&pdev->dev,
> -			"can't identify CAAM emi_slow clk: %d\n", ret);
> -		return ret;
> +	if (!of_machine_is_compatible("fsl,imx6ul")) {
> +		clk = caam_drv_identify_clk(&pdev->dev, "emi_slow");
> +		if (IS_ERR(clk)) {
> +			ret = PTR_ERR(clk);
> +			dev_err(&pdev->dev,
> +				"can't identify CAAM emi_slow clk: %d\n", ret);
> +			return ret;
> +		}
> +		ctrlpriv->caam_emi_slow = clk;
> 	}
> -	ctrlpriv->caam_emi_slow = clk;
>
> 	ret = clk_prepare_enable(ctrlpriv->caam_ipg);
> 	if (ret < 0) {
> @@ -509,11 +511,13 @@ static int caam_probe(struct platform_device *pdev)
> 		goto disable_caam_mem;
> 	}
>
> -	ret = clk_prepare_enable(ctrlpriv->caam_emi_slow);
> -	if (ret < 0) {
> -		dev_err(&pdev->dev, "can't enable CAAM emi slow clock: %d\n",
> -			ret);
> -		goto disable_caam_aclk;
> +	if (!of_machine_is_compatible("fsl,imx6ul")) {

Same here.

> +		ret = clk_prepare_enable(ctrlpriv->caam_emi_slow);
> +		if (ret < 0) {
> +			dev_err(&pdev->dev, "can't enable CAAM emi slow clock: %d\n",
> +				ret);
> +			goto disable_caam_aclk;
> +		}
> 	}
>
> 	/* Get configuration properties from device tree */
> @@ -829,7 +833,8 @@ caam_remove:
> iounmap_ctrl:
> 	iounmap(ctrl);
> disable_caam_emi_slow:
> -	clk_disable_unprepare(ctrlpriv->caam_emi_slow);
> +	if (!of_machine_is_compatible("fsl,imx6ul"))
> +		clk_disable_unprepare(ctrlpriv->caam_emi_slow);

and here.

> disable_caam_aclk:
> 	clk_disable_unprepare(ctrlpriv->caam_aclk);
> disable_caam_mem:
> --
> 2.8.0
>

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: crypto: caam from tasklet to threadirq
Okay, I've re-tested, using a different way of measuring, because using
openssl speed is impractical for off-loaded engines.  I've decided to
use this way to measure the performance:

dd if=/dev/zero bs=1048576 count=128 | /usr/bin/time openssl dgst -md5

For the threaded IRQs case this gives:

0.05user 2.74system 0:05.30elapsed 52%CPU (0avgtext+0avgdata 2400maxresident)k
0.06user 2.52system 0:05.18elapsed 49%CPU (0avgtext+0avgdata 2404maxresident)k
0.12user 2.60system 0:05.61elapsed 48%CPU (0avgtext+0avgdata 2460maxresident)k
=> 5.36s => 25.0MB/s

and the tasklet case:

0.08user 2.53system 0:04.83elapsed 54%CPU (0avgtext+0avgdata 2468maxresident)k
0.09user 2.47system 0:05.16elapsed 49%CPU (0avgtext+0avgdata 2368maxresident)k
0.10user 2.51system 0:04.87elapsed 53%CPU (0avgtext+0avgdata 2460maxresident)k
=> 4.95s => 27.1MB/s

which corresponds to an 8% slowdown for the threaded IRQ case.  So,
tasklets are indeed faster than threaded IRQs.

I guess the reason is that tasklets are much simpler, being able to run
just before we return to userspace without involving scheduler
overheads, but that's speculation.

I've tried to perf it, but...

Samples: 31K of event 'cycles', Event count (approx.): 3552246846
Overhead  Command          Shared Object     Symbol
+  33.22%  kworker/0:1      [kernel.vmlinux]  [k] __do_softirq
+  15.78%  irq/311-2101000  [kernel.vmlinux]  [k] __do_softirq
+   7.49%  irqbalance       [kernel.vmlinux]  [k] __do_softirq
+   7.26%  openssl          [kernel.vmlinux]  [k] __do_softirq
+   5.71%  ksoftirqd/0      [kernel.vmlinux]  [k] __do_softirq
+   3.64%  kworker/0:2      [kernel.vmlinux]  [k] __do_softirq
+   3.52%  swapper          [kernel.vmlinux]  [k] __do_softirq
+   3.14%  kworker/0:1      [kernel.vmlinux]  [k] _raw_spin_unlock_irq

I was going to try to get the threaded IRQ case, but I've ended up with
perf getting buggered because of the iMX6 SMP perf disfunctionality:

[ 3448.810416] irq 24: nobody cared (try booting with the "irqpoll" option)
...
[ 3448.824528] Disabling IRQ #24

caused by FSL's utterly brain-dead idea of routing all the perf
interrupts to a single non-CPU-local interrupt input, and the refusal
of kernel folk to find an acceptable solution to support this.

So, sorry, I'm not going to bother trying to get any further with this.
If the job were not made harder by stupid hardware design and kernel
politics, then I might be more inclined to do deeper investigation, but
right now I'm finding that I'm not interested in trying to jump through
these stupid hoops.

I think I've proven from the above that this patch needs to be reverted
due to the performance regression, and that there _is_ most definitely
a detrimental effect of switching from tasklets to threaded IRQs.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: crypto: caam from tasklet to threadirq
On Fri, Sep 16, 2016 at 02:01:00PM +, Cata Vasile wrote:
> Hi,
>
> We've tried to test and benchmark your submitted work[1].
>
> Cryptographic offloading is also used in IPsec in the Linux Kernel. In
> heavy traffic scenarios, the NIC driver competes with the crypto device
> driver. Most NICs use the NAPI context, which is one of the most
> prioritized context types. In IPsec scenarios the performance is
> trashed because, although raw data gets in to the device, the data is
> encrypted/decrypted and the dequeue code in the CAAM driver has a hard
> time being scheduled to actually call the callback to notify the
> networking stack it can continue working with that data.

Having received a reply from Thomas Gleixner today, there appears to be
some disagreement with your findings, and a suggestion that the problem
needs proper and more in-depth investigation.

Thomas indicates that the NAPI processing shows an improvement when
moved to the same context that threaded interrupts run in, as opposed
to the current softirq context - which also would run the tasklets.

What I would say is that if threaded IRQs are causing harm, then there
seems to be something very wrong somewhere.

> Being this scenario, at heavy load, the Kernel warns on rcu stalls and
> the forwarding path has a lot of latency. Have you tried benchmarking
> the board you used for testing?

It's way too long ago for me to remember - these patches were created
almost a year ago - October 20th 2015, which is when I'd have tested
them.  So, I'm afraid I can't help very much at this point, apart from
trying to re-run some benchmarks.  I'd suggest testing openssl (with
AF_ALG support), which is probably what I tested and benchmarked.
However, as I say, it's far too long ago for me to really remember at
this point.

> I have run some on our other platforms. The after benchmark fails to
> run at the top level of the before results.

Sorry, that last sentence doesn't make any sense to me.

I don't have the bandwidth to look at this, and IPsec doesn't interest
me one bit - I've never been able to work out how to set up IPsec
locally.  From what I remember when I looked into it many years ago,
you had to have significant knowledge about IPsec to get it up and
running.  Maybe things have changed since then, I don't know.

If you want me to reproduce it, please send me a step-by-step idiot's
guide to setting up a working test scenario which reproduces your
problem.

Thanks.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
AF_ALG zero-size hash fails
Hi,

While testing AF_ALG with openssl af-alg-rr, I've found that:

OPENSSL_CONF=/shared/crypto/openssl-imx.cnf openssl dgst -sha1

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
[PATCH] crypto: caam - avoid kernel warnings on probe failure
rc_cec snd_soc_fsl_spdif imx_pcm_dma videobuf2_vmalloc videobuf2_memops
imx_thermal dw_hdmi_ahb_audio dw_hdmi_cec etnaviv fuse
rc_pinnacle_pctv_hd [last unloaded: caam]
CPU: 0 PID: 2346 Comm: modprobe Tainted: G W 4.8.0-rc1+ #2014
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Backtrace:
[] (dump_backtrace) from [] (show_stack+0x18/0x1c)
[] (show_stack) from [] (dump_stack+0xa4/0xdc)
[] (dump_stack) from [] (__warn+0xdc/0x108)
[] (__warn) from [] (warn_slowpath_null+0x28/0x30)
[] (warn_slowpath_null) from [] (clk_core_unprepare+0x204/0x388)
[] (clk_core_unprepare) from [] (clk_unprepare+0x2c/0x34)
[] (clk_unprepare) from [] (caam_probe+0x404/0x1498 [caam])
[] (caam_probe [caam]) from [] (platform_drv_probe+0x58/0xb8)
[] (platform_drv_probe) from [] (driver_probe_device+0x1fc/0x2b8)
[] (driver_probe_device) from [] (__driver_attach+0xbc/0xc0)
 r10: r8:bf24b000 r7: r6:ef215844 r5:bf2490c4 r4:ef215810
[] (__driver_attach) from [] (bus_for_each_dev+0x5c/0x90)
[] (bus_for_each_dev) from [] (driver_attach+0x24/0x28)
[] (driver_attach) from [] (bus_add_driver+0xf4/0x200)
[] (bus_add_driver) from [] (driver_register+0x80/0xfc)
[] (driver_register) from [] (__platform_driver_register+0x48/0x4c)
[] (__platform_driver_register) from [] (caam_driver_init+0x18/0x24 [caam])
[] (caam_driver_init [caam]) from [] (do_one_initcall+0x44/0x178)
[] (do_one_initcall) from [] (do_init_module+0x68/0x1d8)
[] (do_init_module) from [] (load_module+0x1974/0x20b0)
[] (load_module) from [] (SyS_finit_module+0x94/0xa0)
[] (SyS_finit_module) from [] (ret_fast_syscall+0x0/0x1c)
---[ end trace 34e3370d88bb1788 ]---

Signed-off-by: Russell King <rmk+ker...@armlinux.org.uk>
---
 drivers/crypto/caam/ctrl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 0ec112ee5204..f4c044f5bcb2 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -826,6 +826,8 @@ static int caam_probe(struct platform_device *pdev)
 caam_remove:
 	caam_remove(pdev);
+	return ret;
+
 iounmap_ctrl:
 	iounmap(ctrl);
 disable_caam_emi_slow:
--
2.1.0
Re: AF_ALG broken?
On Tue, Aug 09, 2016 at 03:14:02PM +0800, Herbert Xu wrote:
> On Tue, Aug 09, 2016 at 08:08:59AM +0100, Russell King - ARM Linux wrote:
> >
> > I thought I gave the commands and link to your example code.  The
> > openssl case is md5, though sha* also gives the same result.  Your
> > example code was sha1 iirc.  I guess none of these would be using
> > HMAC - the openssl cases used to give results compatible with the
> > md5sum/sha1sum etc userspace commands.
> >
> > /proc/crypto:
> >
> > name   : md5
> > driver : md5-caam
>
> Right, caam is providing a setkey function for md5, which leads the
> API to think that a key is required.  We should fix it so that setkey
> is only set for the HMAC-variant.

Thanks, that works nicely again, and passes my tests.

8<
From: Russell King <rmk+ker...@armlinux.org.uk>
Subject: [PATCH] crypto: caam - fix non-hmac hashes

Since 6de62f15b581 ("crypto: algif_hash - Require setkey before
accept(2)"), the AF_ALG interface requires userspace to provide a key
to any algorithm that has a setkey method.  However, the non-HMAC
algorithms are not keyed, so setting a key is unnecessary.

Fix this by removing the setkey method from the non-keyed hash
algorithms.

Fixes: 6de62f15b581 ("crypto: algif_hash - Require setkey before accept(2)")
Cc: <sta...@vger.kernel.org>
Signed-off-by: Russell King <rmk+ker...@armlinux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index ea284e3909ef..9d7fc9ec0b7e 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -1950,6 +1950,7 @@ caam_hash_alloc(struct caam_hash_template *template,
 			 template->name);
 		snprintf(alg->cra_driver_name, CRYPTO_MAX_ALG_NAME, "%s",
 			 template->driver_name);
+		t_alg->ahash_alg.setkey = NULL;
 	}
 	alg->cra_module = THIS_MODULE;
 	alg->cra_init = caam_hash_cra_init;
--
2.1.0

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: AF_ALG broken?
On Mon, Aug 08, 2016 at 01:47:33PM -0400, Jeffrey Walton wrote:
> > When trying to use the openssl AF_ALG module with 4.8-rc1 with imx
> > caam, I get this:
> >
> > $ OPENSSL_CONF=/shared/crypto/openssl-imx.cnf strace openssl dgst -md5
> > ...
> > socket(PF_ALG, SOCK_SEQPACKET, 0) = 3
> > close(3) = 0
> > socket(PF_ALG, SOCK_SEQPACKET, 0) = 3
> > bind(3, {sa_family=AF_ALG, sa_data="hash\0\0\0\0\0\0\0\0\0\0"}, 88) = 0
> > accept(3, 0, NULL) = 4
> > fstat64(0, {st_mode=S_IFREG|0755, st_size=666864, ...}) = 0
> > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6fab000
> > read(0, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0(\0\1\0\0\0\21'\2\0004\0\0\0"..., 8192) = 8192
> > send(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0(\0\1\0\0\0\21'\2\0004\0\0\0"..., 8192, MSG_MORE) = -1 ENOKEY (Required key not available)
>
> As far as I know from testing on x86, it has never worked as expected.
> I believe you have to use 'sendto' and 'recvfrom' because 'send' and
> 'recv' use default structures, and they configure the object
> incorrectly.

This used to work, because that's how I've tested my previous CAAM and
Marvell CESA patches.  So, this is not a case of "never worked" but is
definitely a regression caused by some kernel change.

There are also people's presentations that illustrate example code:

https://events.linuxfoundation.org/sites/events/files/slides/lcj-2014-crypto-user.pdf

which can also be found at:

https://lwn.net/Articles/410833/

and that example code comes from Herbert, so one would assume that it's
tested and was working.  Note that it doesn't use any send*, but uses
write().

Testing that code on 4.8-rc (and 4.7 fwiw) gives:

socket(PF_ALG, SOCK_SEQPACKET, 0) = 3
bind(3, {sa_family=AF_ALG, sa_data="hash\0\0\0\0\0\0\0\0\0\0"}, 88) = 0
accept(3, 0, NULL) = 4
write(4, "abc", 3) = -1 ENOKEY (Required key not available)
read(4, 0xbec50508, 20) = -1 ENOKEY (Required key not available)

IOW, the same problem - and it seems not to be a recent regression.
Since the last time I tested CESA or CAAM was back in 4.4 times, it's
got to be something between 4.4 and 4.7.

Looking at the history, my guess would be the setkey changes -

crypto: algif_skcipher - Require setkey before accept(2)
crypto: af_alg - Disallow bind/setkey/... after accept(2)
crypto: af_alg - Add nokey compatibility path
crypto: hash - Add crypto_ahash_has_setkey
crypto: algif_hash - Require setkey before accept(2)

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
[PATCH 09/11] crypto: caam: move job descriptor initialisation to ahash_edesc_alloc()
Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 84 ++++++++++++++----------------------
 1 file changed, 34 insertions(+), 50 deletions(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index 2c2c15b63059..9c3e74e4088e 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -770,7 +770,9 @@ static void ahash_done_ctx_dst(struct device *jrdev, u32 *desc, u32 err,
  * and space for hardware scatter table containing sg_num entries.
  */
 static struct ahash_edesc *ahash_edesc_alloc(struct caam_hash_ctx *ctx,
-					     int sg_num, gfp_t flags)
+					     int sg_num, u32 *sh_desc,
+					     dma_addr_t sh_desc_dma,
+					     gfp_t flags)
 {
 	struct ahash_edesc *edesc;
 	unsigned int sg_size = sg_num * sizeof(struct sec4_sg_entry);
@@ -781,6 +783,9 @@ static struct ahash_edesc *ahash_edesc_alloc(struct caam_hash_ctx *ctx,
 		return NULL;
 	}

+	init_job_desc_shared(edesc->hw_desc, sh_desc_dma, desc_len(sh_desc),
+			     HDR_SHARE_DEFER | HDR_REVERSE);
+
 	return edesc;
 }

@@ -799,12 +804,10 @@ static int ahash_update_ctx(struct ahash_request *req)
 	int *next_buflen = state->current_buf ?
 			   &state->buflen_0 : &state->buflen_1, last_buflen;
 	int in_len = *buflen + req->nbytes, to_hash;
-	u32 *sh_desc = ctx->sh_desc_update, *desc;
-	dma_addr_t ptr = ctx->sh_desc_update_dma;
+	u32 *desc;
 	int src_nents, mapped_nents, sec4_sg_bytes, sec4_sg_src_index;
 	struct ahash_edesc *edesc;
 	int ret = 0;
-	int sh_len;

 	last_buflen = *next_buflen;
 	*next_buflen = in_len & (crypto_tfm_alg_blocksize(&ahash->base) - 1);
@@ -838,7 +841,8 @@ static int ahash_update_ctx(struct ahash_request *req)
 	 * link tables
 	 */
 	edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index + mapped_nents,
-				  flags);
+				  ctx->sh_desc_update,
+				  ctx->sh_desc_update_dma, flags);
 	if (!edesc) {
 		dma_unmap_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE);
 		return -ENOMEM;
 	}
@@ -872,10 +876,7 @@ static int ahash_update_ctx(struct ahash_request *req)

 	state->current_buf = !state->current_buf;

-	sh_len = desc_len(sh_desc);
 	desc = edesc->hw_desc;
-	init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER |
-			     HDR_REVERSE);

 	edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg,
 					    sec4_sg_bytes,
@@ -936,25 +937,23 @@ static int ahash_final_ctx(struct ahash_request *req)
 	int buflen = state->current_buf ? state->buflen_1 : state->buflen_0;
 	int last_buflen = state->current_buf ?
 			  state->buflen_0 : state->buflen_1;
-	u32 *sh_desc = ctx->sh_desc_fin, *desc;
-	dma_addr_t ptr = ctx->sh_desc_fin_dma;
+	u32 *desc;
 	int sec4_sg_bytes, sec4_sg_src_index;
 	int digestsize = crypto_ahash_digestsize(ahash);
 	struct ahash_edesc *edesc;
 	int ret = 0;
-	int sh_len;

 	sec4_sg_src_index = 1 + (buflen ? 1 : 0);
 	sec4_sg_bytes = sec4_sg_src_index * sizeof(struct sec4_sg_entry);

 	/* allocate space for base edesc and hw desc commands, link tables */
-	edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index, flags);
+	edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index,
+				  ctx->sh_desc_fin, ctx->sh_desc_fin_dma,
+				  flags);
 	if (!edesc)
 		return -ENOMEM;

-	sh_len = desc_len(sh_desc);
 	desc = edesc->hw_desc;
-	init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE);

 	edesc->sec4_sg_bytes = sec4_sg_bytes;
 	edesc->src_nents = 0;
@@ -1018,14 +1017,12 @@ static int ahash_finup_ctx(struct ahash_request *req)
 	int buflen = state->current_buf ? state->buflen_1 : state->buflen_0;
 	int last_buflen = state->current_buf ?
 			  state->buflen_0 : state->buflen_1;
-	u32 *sh_desc = ctx->sh_desc_finup, *desc;
-	dma_addr_t ptr = ctx->sh_desc_finup_dma;
+	u32 *desc;
 	int sec4_sg_bytes, sec4_sg_src_index;
 	int src_nents, mapped_nents;
 	int digestsize = crypto_ahash_digestsize(ahash);
 	struct ahash_edesc *edesc;
 	int ret = 0;
-	int sh_len;

 	src_nents = sg_nents_for_len(req
[PATCH 10/11] crypto: caam: add ahash_edesc_add_src()
Add a helper to map the source scatterlist into the descriptor. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 137 + 1 file changed, 57 insertions(+), 80 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 9c3e74e4088e..ea284e3909ef 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -789,6 +789,41 @@ static struct ahash_edesc *ahash_edesc_alloc(struct caam_hash_ctx *ctx, return edesc; } +static int ahash_edesc_add_src(struct caam_hash_ctx *ctx, + struct ahash_edesc *edesc, + struct ahash_request *req, int nents, + unsigned int first_sg, + unsigned int first_bytes, size_t to_hash) +{ + dma_addr_t src_dma; + u32 options; + + if (nents > 1 || first_sg) { + struct sec4_sg_entry *sg = edesc->sec4_sg; + unsigned int sgsize = sizeof(*sg) * (first_sg + nents); + + sg_to_sec4_sg_last(req->src, nents, sg + first_sg, 0); + + src_dma = dma_map_single(ctx->jrdev, sg, sgsize, DMA_TO_DEVICE); + if (dma_mapping_error(ctx->jrdev, src_dma)) { + dev_err(ctx->jrdev, "unable to map S/G table\n"); + return -ENOMEM; + } + + edesc->sec4_sg_bytes = sgsize; + edesc->sec4_sg_dma = src_dma; + options = LDST_SGF; + } else { + src_dma = sg_dma_address(req->src); + options = 0; + } + + append_seq_in_ptr(edesc->hw_desc, src_dma, first_bytes + to_hash, + options); + + return 0; +} + /* submit update job descriptor */ static int ahash_update_ctx(struct ahash_request *req) { @@ -1018,7 +1053,7 @@ static int ahash_finup_ctx(struct ahash_request *req) int last_buflen = state->current_buf ? state->buflen_0 : state->buflen_1; u32 *desc; - int sec4_sg_bytes, sec4_sg_src_index; + int sec4_sg_src_index; int src_nents, mapped_nents; int digestsize = crypto_ahash_digestsize(ahash); struct ahash_edesc *edesc; @@ -1042,8 +1077,6 @@ static int ahash_finup_ctx(struct ahash_request *req) } sec4_sg_src_index = 1 + (buflen ? 
1 : 0); - sec4_sg_bytes = (sec4_sg_src_index + mapped_nents) * -sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index + mapped_nents, @@ -1057,7 +1090,6 @@ static int ahash_finup_ctx(struct ahash_request *req) desc = edesc->hw_desc; edesc->src_nents = src_nents; - edesc->sec4_sg_bytes = sec4_sg_bytes; ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); @@ -1068,19 +1100,11 @@ static int ahash_finup_ctx(struct ahash_request *req) buf, state->buf_dma, buflen, last_buflen); - sg_to_sec4_sg_last(req->src, mapped_nents, - edesc->sec4_sg + sec4_sg_src_index, 0); - - edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg, - sec4_sg_bytes, DMA_TO_DEVICE); - if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { - dev_err(jrdev, "unable to map S/G table\n"); - ret = -ENOMEM; + ret = ahash_edesc_add_src(ctx, edesc, req, mapped_nents, + sec4_sg_src_index, ctx->ctx_len + buflen, + req->nbytes); + if (ret) goto err; - } - - append_seq_in_ptr(desc, edesc->sec4_sg_dma, ctx->ctx_len + - buflen + req->nbytes, LDST_SGF); edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, digestsize); @@ -1116,11 +1140,9 @@ static int ahash_digest(struct ahash_request *req) CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC; u32 *desc; int digestsize = crypto_ahash_digestsize(ahash); - int src_nents, mapped_nents, sec4_sg_bytes; - dma_addr_t src_dma; + int src_nents, mapped_nents; struct ahash_edesc *edesc; int ret = 0; - u32 options; src_nents = sg_nents_for_len(req->src, req->nbytes); if (src_nents < 0) { @@ -1139,11 +1161,6 @@ static int ahash_digest(struct ahash_request *req) mapped_nents = 0; } - if (mapped_ne
[PATCH 11/11] crypto: caam: get rid of tasklet
Threaded interrupts can perform the function of the tasklet, and much more safely too - without races when trying to take the tasklet and interrupt down on device removal. With the old code, there is a window where we call tasklet_kill(). If the interrupt handler happens to be running on a different CPU, and subsequently calls tasklet_schedule(), the tasklet will be re-scheduled for execution. Switching to a hardirq/threadirq combination implementation avoids this, and it also means generic code deals with the teardown sequencing of the threaded and non-threaded parts. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/intern.h | 1 - drivers/crypto/caam/jr.c | 25 + 2 files changed, 9 insertions(+), 17 deletions(-) diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h index e2bcacc1a921..5d4c05074a5c 100644 --- a/drivers/crypto/caam/intern.h +++ b/drivers/crypto/caam/intern.h @@ -41,7 +41,6 @@ struct caam_drv_private_jr { struct device *dev; int ridx; struct caam_job_ring __iomem *rregs;/* JobR's register space */ - struct tasklet_struct irqtask; int irq;/* One per queue */ /* Number of scatterlist crypt transforms active on the JobR */ diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c index a81f551ac222..320228875e9a 100644 --- a/drivers/crypto/caam/jr.c +++ b/drivers/crypto/caam/jr.c @@ -73,8 +73,6 @@ static int caam_jr_shutdown(struct device *dev) ret = caam_reset_hw_jr(dev); - tasklet_kill(&jrp->irqtask); - /* Release interrupt */ free_irq(jrp->irq, dev); @@ -130,7 +128,7 @@ static irqreturn_t caam_jr_interrupt(int irq, void *st_dev) /* * Check the output ring for ready responses, kick -* tasklet if jobs done. 
*/ irqstate = rd_reg32(&jrp->rregs->jrintstatus); if (!irqstate) @@ -152,18 +150,13 @@ static irqreturn_t caam_jr_interrupt(int irq, void *st_dev) /* Have valid interrupt at this point, just ACK and trigger */ wr_reg32(&jrp->rregs->jrintstatus, irqstate); - preempt_disable(); - tasklet_schedule(&jrp->irqtask); - preempt_enable(); - - return IRQ_HANDLED; + return IRQ_WAKE_THREAD; } -/* Deferred service handler, run as interrupt-fired tasklet */ -static void caam_jr_dequeue(unsigned long devarg) +static irqreturn_t caam_jr_threadirq(int irq, void *st_dev) { int hw_idx, sw_idx, i, head, tail; - struct device *dev = (struct device *)devarg; + struct device *dev = st_dev; struct caam_drv_private_jr *jrp = dev_get_drvdata(dev); void (*usercall)(struct device *dev, u32 *desc, u32 status, void *arg); u32 *userdesc, userstatus; @@ -237,6 +230,8 @@ static void caam_jr_dequeue(unsigned long devarg) /* reenable / unmask IRQs */ clrsetbits_32(&jrp->rregs->rconfig_lo, JRCFG_IMSK, 0); + + return IRQ_HANDLED; } /** @@ -394,11 +389,10 @@ static int caam_jr_init(struct device *dev) jrp = dev_get_drvdata(dev); - tasklet_init(&jrp->irqtask, caam_jr_dequeue, (unsigned long)dev); - /* Connect job ring interrupt handler. */ - error = request_irq(jrp->irq, caam_jr_interrupt, IRQF_SHARED, - dev_name(dev), dev); + error = request_threaded_irq(jrp->irq, caam_jr_interrupt, +caam_jr_threadirq, IRQF_SHARED, +dev_name(dev), dev); if (error) { dev_err(dev, "can't connect JobR %d interrupt (%d)\n", jrp->ridx, jrp->irq); @@ -460,7 +454,6 @@ static int caam_jr_init(struct device *dev) out_free_irq: free_irq(jrp->irq, dev); out_kill_deq: - tasklet_kill(&jrp->irqtask); return error; } -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/11] crypto: caam: check and use dma_map_sg() return code
Strictly, dma_map_sg() may coalesce SG entries, but in practice on iMX hardware, this will never happen. However, dma_map_sg() can fail, and we completely fail to check its return value. So, fix this properly. Arrange the code to map the scatterlist early, so we know how many scatter table entries to allocate, and then fill them in. This allows us to keep relatively simple error cleanup paths. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 138 ++--- 1 file changed, 103 insertions(+), 35 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index e1925bf3a7cc..a639183d0115 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -187,15 +187,6 @@ static inline dma_addr_t buf_map_to_sec4_sg(struct device *jrdev, return buf_dma; } -/* Map req->src and put it in link table */ -static inline void src_map_to_sec4_sg(struct device *jrdev, - struct scatterlist *src, int src_nents, - struct sec4_sg_entry *sec4_sg) -{ - dma_map_sg(jrdev, src, src_nents, DMA_TO_DEVICE); - sg_to_sec4_sg_last(src, src_nents, sec4_sg, 0); -} - /* * Only put buffer in link table if it contains data, which is possible, * since a buffer has previously been used, and needs to be unmapped, @@ -791,7 +782,7 @@ static int ahash_update_ctx(struct ahash_request *req) int in_len = *buflen + req->nbytes, to_hash; u32 *sh_desc = ctx->sh_desc_update, *desc; dma_addr_t ptr = ctx->sh_desc_update_dma; - int src_nents, sec4_sg_bytes, sec4_sg_src_index; + int src_nents, mapped_nents, sec4_sg_bytes, sec4_sg_src_index; struct ahash_edesc *edesc; int ret = 0; int sh_len; @@ -807,8 +798,20 @@ static int ahash_update_ctx(struct ahash_request *req) dev_err(jrdev, "Invalid number of src SG.\n"); return src_nents; } + + if (src_nents) { + mapped_nents = dma_map_sg(jrdev, req->src, src_nents, + DMA_TO_DEVICE); + if (!mapped_nents) { + dev_err(jrdev, "unable to DMA map source\n"); + return -ENOMEM; + } + } else { 
+ mapped_nents = 0; + } + sec4_sg_src_index = 1 + (*buflen ? 1 : 0); - sec4_sg_bytes = (sec4_sg_src_index + src_nents) * + sec4_sg_bytes = (sec4_sg_src_index + mapped_nents) * sizeof(struct sec4_sg_entry); /* @@ -820,6 +823,7 @@ static int ahash_update_ctx(struct ahash_request *req) if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); + dma_unmap_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); return -ENOMEM; } @@ -836,9 +840,10 @@ static int ahash_update_ctx(struct ahash_request *req) buf, state->buf_dma, *buflen, last_buflen); - if (src_nents) { - src_map_to_sec4_sg(jrdev, req->src, src_nents, - edesc->sec4_sg + sec4_sg_src_index); + if (mapped_nents) { + sg_to_sec4_sg_last(req->src, mapped_nents, + edesc->sec4_sg + sec4_sg_src_index, + 0); if (*next_buflen) scatterwalk_map_and_copy(next_buf, req->src, to_hash - *buflen, @@ -1001,7 +1006,7 @@ static int ahash_finup_ctx(struct ahash_request *req) u32 *sh_desc = ctx->sh_desc_finup, *desc; dma_addr_t ptr = ctx->sh_desc_finup_dma; int sec4_sg_bytes, sec4_sg_src_index; - int src_nents; + int src_nents, mapped_nents; int digestsize = crypto_ahash_digestsize(ahash); struct ahash_edesc *edesc; int ret = 0; @@ -1012,14 +1017,27 @@ static int ahash_finup_ctx(struct ahash_request *req) dev_err(jrdev, "Invalid number of src SG.\n"); return src_nents; } + + if (src_nents) { + mapped_nents = dma_map_sg(jrdev, req->src, src_nents, + DMA_TO_DEVICE); + if (!mapped_nents) { + dev_err(jrdev, "unable to DMA map source\n"); +
[PATCH 03/11] crypto: caam: incorporate job descriptor into struct ahash_edesc
Rather than giving the descriptor as hw_desc[0], give its real size. All places where we allocate an ahash_edesc incorporate DESC_JOB_IO_LEN bytes of job descriptor. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 49 -- 1 file changed, 19 insertions(+), 30 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 47ea7b428156..ce9c1bc23795 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -604,7 +604,7 @@ struct ahash_edesc { int src_nents; int sec4_sg_bytes; struct sec4_sg_entry *sec4_sg; - u32 hw_desc[0]; + u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)]; }; static inline void ahash_unmap(struct device *dev, @@ -815,8 +815,8 @@ static int ahash_update_ctx(struct ahash_request *req) * allocate space for base edesc and hw desc commands, * link tables */ - edesc = kzalloc(sizeof(*edesc) + DESC_JOB_IO_LEN + - sec4_sg_bytes, GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, + GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); @@ -825,8 +825,7 @@ static int ahash_update_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + -DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_BIDIRECTIONAL); @@ -925,8 +924,7 @@ static int ahash_final_ctx(struct ahash_request *req) sec4_sg_bytes = sec4_sg_src_index * sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + DESC_JOB_IO_LEN + sec4_sg_bytes, - GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; } @@ -937,8 +935,7 @@ static int ahash_final_ctx(struct ahash_request 
*req) init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + -DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); edesc->src_nents = 0; ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, @@ -1016,8 +1013,7 @@ static int ahash_finup_ctx(struct ahash_request *req) sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + DESC_JOB_IO_LEN + sec4_sg_bytes, - GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; @@ -1029,8 +1025,7 @@ static int ahash_finup_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + -DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); @@ -1106,14 +1101,12 @@ static int ahash_digest(struct ahash_request *req) sec4_sg_bytes = 0; /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes + DESC_JOB_IO_LEN, - GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; } - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + - DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); edesc->sec4_sg_bytes = sec4_sg_bytes; edesc->src_nents = src_nents; @@ -1179,7 +1172,7 @@ static int ahash_final_no_ctx(struct ahash_request *req) int sh_len; /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzallo
[PATCH 05/11] crypto: caam: replace sec4_sg pointer with array
Since the extended descriptor includes the hardware descriptor, and the sec4 scatterlist immediately follows this, we can declare it as an array at the very end of the extended descriptor. This allows us to get rid of an initialiser for every site where we allocate an extended descriptor. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index e9c52cbf9a41..d2129be43bf1 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -595,16 +595,16 @@ static int ahash_setkey(struct crypto_ahash *ahash, * @sec4_sg_dma: physical mapped address of h/w link table * @src_nents: number of segments in input scatterlist * @sec4_sg_bytes: length of dma mapped sec4_sg space - * @sec4_sg: pointer to h/w link table * @hw_desc: the h/w job descriptor followed by any referenced link tables + * @sec4_sg: h/w link table */ struct ahash_edesc { dma_addr_t dst_dma; dma_addr_t sec4_sg_dma; int src_nents; int sec4_sg_bytes; - struct sec4_sg_entry *sec4_sg; u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)] cacheline_aligned; + struct sec4_sg_entry sec4_sg[0]; }; static inline void ahash_unmap(struct device *dev, @@ -825,7 +825,6 @@ static int ahash_update_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_BIDIRECTIONAL); @@ -935,7 +934,6 @@ static int ahash_final_ctx(struct ahash_request *req) init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); edesc->src_nents = 0; ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, @@ -1025,7 +1023,6 @@ static int ahash_finup_ctx(struct ahash_request *req) edesc->src_nents = src_nents; 
edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); @@ -1106,7 +1103,7 @@ static int ahash_digest(struct ahash_request *req) dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; } - edesc->sec4_sg = (void *)(edesc + 1); + edesc->sec4_sg_bytes = sec4_sg_bytes; edesc->src_nents = src_nents; @@ -1264,7 +1261,6 @@ static int ahash_update_no_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); edesc->dst_dma = 0; state->buf_dma = buf_map_to_sec4_sg(jrdev, edesc->sec4_sg, @@ -1375,7 +1371,6 @@ static int ahash_finup_no_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg, buf, state->buf_dma, buflen, @@ -1470,7 +1465,6 @@ static int ahash_update_first(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); edesc->dst_dma = 0; if (src_nents > 1) { -- 2.1.0
[PATCH 04/11] crypto: caam: mark the hardware descriptor as cache line aligned
Mark the hardware descriptor as being cache line aligned; on DMA incoherent architectures, the hardware descriptor should sit in a separate cache line from the CPU accessed data to avoid polluting the caches. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index ce9c1bc23795..e9c52cbf9a41 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -604,7 +604,7 @@ struct ahash_edesc { int src_nents; int sec4_sg_bytes; struct sec4_sg_entry *sec4_sg; - u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)]; + u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)] cacheline_aligned; }; static inline void ahash_unmap(struct device *dev, -- 2.1.0
[PATCH 02/11] crypto: caam: ensure descriptor buffers are cacheline aligned
Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 85c8b048bdc1..47ea7b428156 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -99,17 +99,17 @@ static struct list_head hash_list; /* ahash per-session context */ struct caam_hash_ctx { - struct device *jrdev; - u32 sh_desc_update[DESC_HASH_MAX_USED_LEN]; - u32 sh_desc_update_first[DESC_HASH_MAX_USED_LEN]; - u32 sh_desc_fin[DESC_HASH_MAX_USED_LEN]; - u32 sh_desc_digest[DESC_HASH_MAX_USED_LEN]; - u32 sh_desc_finup[DESC_HASH_MAX_USED_LEN]; - dma_addr_t sh_desc_update_dma; + u32 sh_desc_update[DESC_HASH_MAX_USED_LEN] cacheline_aligned; + u32 sh_desc_update_first[DESC_HASH_MAX_USED_LEN] cacheline_aligned; + u32 sh_desc_fin[DESC_HASH_MAX_USED_LEN] cacheline_aligned; + u32 sh_desc_digest[DESC_HASH_MAX_USED_LEN] cacheline_aligned; + u32 sh_desc_finup[DESC_HASH_MAX_USED_LEN] cacheline_aligned; + dma_addr_t sh_desc_update_dma cacheline_aligned; dma_addr_t sh_desc_update_first_dma; dma_addr_t sh_desc_fin_dma; dma_addr_t sh_desc_digest_dma; dma_addr_t sh_desc_finup_dma; + struct device *jrdev; u32 alg_type; u32 alg_op; u8 key[CAAM_MAX_HASH_KEY_SIZE]; -- 2.1.0
[PATCH 06/11] crypto: caam: ensure that we clean up after an error
Ensure that we clean up allocations and DMA mappings after encountering an error rather than just giving up and leaking memory and resources. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 132 - 1 file changed, 79 insertions(+), 53 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index d2129be43bf1..e1925bf3a7cc 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -829,7 +829,7 @@ static int ahash_update_ctx(struct ahash_request *req) ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_BIDIRECTIONAL); if (ret) - return ret; + goto err; state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg + 1, @@ -860,7 +860,8 @@ static int ahash_update_ctx(struct ahash_request *req) DMA_TO_DEVICE); if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { dev_err(jrdev, "unable to map S/G table\n"); - return -ENOMEM; + ret = -ENOMEM; + goto err; } append_seq_in_ptr(desc, edesc->sec4_sg_dma, ctx->ctx_len + @@ -875,13 +876,10 @@ static int ahash_update_ctx(struct ahash_request *req) #endif ret = caam_jr_enqueue(jrdev, desc, ahash_done_bi, req); - if (!ret) { - ret = -EINPROGRESS; - } else { - ahash_unmap_ctx(jrdev, edesc, req, ctx->ctx_len, - DMA_BIDIRECTIONAL); - kfree(edesc); - } + if (ret) + goto err; + + ret = -EINPROGRESS; } else if (*next_buflen) { scatterwalk_map_and_copy(buf + *buflen, req->src, 0, req->nbytes, 0); @@ -897,6 +895,11 @@ static int ahash_update_ctx(struct ahash_request *req) #endif return ret; + + err: + ahash_unmap_ctx(jrdev, edesc, req, ctx->ctx_len, DMA_BIDIRECTIONAL); + kfree(edesc); + return ret; } static int ahash_final_ctx(struct ahash_request *req) @@ -939,7 +942,7 @@ static int ahash_final_ctx(struct ahash_request *req) ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); if (ret) - return ret; + goto err; state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg 
+ 1, buf, state->buf_dma, buflen, @@ -951,7 +954,8 @@ static int ahash_final_ctx(struct ahash_request *req) sec4_sg_bytes, DMA_TO_DEVICE); if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { dev_err(jrdev, "unable to map S/G table\n"); - return -ENOMEM; + ret = -ENOMEM; + goto err; } append_seq_in_ptr(desc, edesc->sec4_sg_dma, ctx->ctx_len + buflen, @@ -961,7 +965,8 @@ static int ahash_final_ctx(struct ahash_request *req) digestsize); if (dma_mapping_error(jrdev, edesc->dst_dma)) { dev_err(jrdev, "unable to map dst\n"); - return -ENOMEM; + ret = -ENOMEM; + goto err; } #ifdef DEBUG @@ -970,13 +975,14 @@ static int ahash_final_ctx(struct ahash_request *req) #endif ret = caam_jr_enqueue(jrdev, desc, ahash_done_ctx_src, req); - if (!ret) { - ret = -EINPROGRESS; - } else { - ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); - kfree(edesc); - } + if (ret) + goto err; + return -EINPROGRESS; + +err: + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + kfree(edesc); return ret; } @@ -1027,7 +1033,7 @@ static int ahash_finup_ctx(struct ahash_request *req) ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); if (ret) - return ret; + goto err; state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg + 1, buf, state->buf_dma, buflen, @@ -1040,7 +1046,8 @@ static int ahash_finup_ctx(struct ahash_request *req) sec4_sg_bytes, DMA_TO_DEVICE); if (dma_mapping_error(jr
[PATCH 01/11] crypto: caam: fix DMA API mapping leak
caamhash contains this weird code: src_nents = sg_count(req->src, req->nbytes); dma_map_sg(jrdev, req->src, src_nents ? : 1, DMA_TO_DEVICE); ... edesc->src_nents = src_nents; sg_count() returns zero when sg_nents_for_len() returns zero or one. This means we don't need to use a hardware scatterlist. However, setting src_nents to zero causes problems when we unmap: if (edesc->src_nents) dma_unmap_sg_chained(dev, req->src, edesc->src_nents, DMA_TO_DEVICE, edesc->chained); as zero here means that we have no entries to unmap. This causes us to leak DMA mappings, where we map one scatterlist entry and then fail to unmap it. This can be fixed in two ways: either by writing the number of entries that were requested of dma_map_sg(), or by reworking the "no SG required" case. We adopt the re-work solution here - we replace sg_count() with sg_nents_for_len(), so src_nents now contains the real number of scatterlist entries, and we then change the test for using the hardware scatterlist to src_nents > 1 rather than just non-zero. This change passes my sshd, openssl tests hashing /bin and tcrypt tests. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 26 +- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index f1ecc8df8d41..85c8b048bdc1 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -1094,13 +1094,16 @@ static int ahash_digest(struct ahash_request *req) u32 options; int sh_len; - src_nents = sg_count(req->src, req->nbytes); + src_nents = sg_nents_for_len(req->src, req->nbytes); if (src_nents < 0) { dev_err(jrdev, "Invalid number of src SG.\n"); return src_nents; } - dma_map_sg(jrdev, req->src, src_nents ? 
: 1, DMA_TO_DEVICE); - sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); + if (src_nents > 1) + sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + else + sec4_sg_bytes = 0; /* allocate space for base edesc and hw desc commands, link tables */ edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes + DESC_JOB_IO_LEN, @@ -1118,7 +1121,7 @@ static int ahash_digest(struct ahash_request *req) desc = edesc->hw_desc; init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); - if (src_nents) { + if (src_nents > 1) { sg_to_sec4_sg_last(req->src, src_nents, edesc->sec4_sg, 0); edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg, sec4_sg_bytes, DMA_TO_DEVICE); @@ -1246,7 +1249,7 @@ static int ahash_update_no_ctx(struct ahash_request *req) if (to_hash) { src_nents = sg_nents_for_len(req->src, -req->nbytes - (*next_buflen)); +req->nbytes - *next_buflen); if (src_nents < 0) { dev_err(jrdev, "Invalid number of src SG.\n"); return src_nents; @@ -1450,13 +1453,18 @@ static int ahash_update_first(struct ahash_request *req) to_hash = req->nbytes - *next_buflen; if (to_hash) { - src_nents = sg_count(req->src, req->nbytes - (*next_buflen)); + src_nents = sg_nents_for_len(req->src, +req->nbytes - *next_buflen); if (src_nents < 0) { dev_err(jrdev, "Invalid number of src SG.\n"); return src_nents; } - dma_map_sg(jrdev, req->src, src_nents ? : 1, DMA_TO_DEVICE); - sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); + if (src_nents > 1) + sec4_sg_bytes = src_nents * + sizeof(struct sec4_sg_entry); + else + sec4_sg_bytes = 0; /* * allocate space for base edesc and hw desc commands, @@ -1476,7 +1484,7 @@ static int ahash_update_first(struct ahash_request *req) DESC_JOB_IO_LEN; edesc->dst_dma = 0; - if (src_nents) { + if (src_nents > 1) { sg_to_sec4_sg_last(req->src, src_nents, edesc->sec4_sg, 0);
[PATCH 00/11] Further iMX CAAM updates
This is a re-post (with hopefully bugs fixed from December's review). Untested, because AF_ALG appears to be broken in 4.8-rc1. Maybe someone can provide some hints how to test using tcrypt please? Here are further imx-caam updates that I've had since before the previous merge window. Please review and (I guess) if Freescale folk can provide acks etc that would be nice. Thanks. drivers/crypto/caam/caamhash.c | 540 ++--- drivers/crypto/caam/intern.h | 1 - drivers/crypto/caam/jr.c | 25 +- 3 files changed, 305 insertions(+), 261 deletions(-) -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper
On Thu, Mar 31, 2016 at 04:45:57PM +0200, Boris Brezillon wrote: > Hi Russell, > > On Thu, 31 Mar 2016 15:14:13 +0100 > Russell King - ARM Linux <li...@arm.linux.org.uk> wrote: > > > On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote: > > > sg_alloc_table_from_buf() provides an easy solution to create an sg_table > > > from a virtual address pointer. This function takes care of dealing with > > > vmallocated buffers, buffer alignment, or DMA engine limitations (maximum > > > DMA transfer size). > > > > Please note that the DMA API does not take account of coherency of memory > > regions other than non-high/lowmem - there are specific extensions to > > deal with this. > > Ok, you said 'non-high/lowmem', this means vmalloced and kmapped buffers > already fall in this case, right? > > Could you tell me more about those specific extensions? I was slightly confused - the extensions I was thinking of are those listed at the bottom of Documentation/cachetlb.txt, which have nothing to do with DMA. However, it's probably worth reading Documentation/DMA-API-HOWTO.txt to read up on what kinds of memory are considered to be DMA-able in the kernel. > > What this means is that having an API that takes any virtual address > > pointer, converts it to a scatterlist which is then DMA mapped, is > > unsafe. > > Which means some implementations already get this wrong (see > spi_map_buf(), and I'm pretty sure it's not the only one). Quite possible, but that is driver stuff, and driver stuff gets things wrong all the time. :) > > It'll be okay for PIPT and non-aliasing VIPT cache architectures, but > > for other cache architectures this will hide this problem and make > > review harder. > > > > Ok, you lost me. I'll have to do my homework and try to understand what > this means :).

P = physical address
V = virtual address
I = indexed
T = tag

The tag is held in each cache line. 
When a location is looked up in the cache, an index is used to locate a
set of cache lines and the tag is compared to check which cache line in
the set is the correct one (or whether the address even exists in the
cache.) How the index and tag are derived varies between cache
architectures.

PIPT = indexed by physical address, tagged with physical address. Never
aliases with itself in the presence of multiple virtual mappings.

VIPT = indexed by virtual address, tagged with physical address. If the
bits from the virtual address do not overlap the MMU page size, it is
also alias free, otherwise aliases can exist, but can be eliminated by
"cache colouring" - ensuring that a physical address is always mapped
with the same overlapping bits.

VIVT = indexed by virtual address, tagged with virtual address. The
worst kind of cache, since every different mapping of the same physical
address is guaranteed by design to alias with other mappings. There is
little cache colouring between different kernel mappings (eg, between
lowmem and vmalloc space.)

What this means is that, while the DMA API takes care of DMA aliases in
the lowmem mappings, an alias-prone VIPT cache will remain incoherent
with DMA if it is remapped into vmalloc space, and the mapping happens
to have a different cache colour. In other words, this is a data
corruption issue.

Hence, taking a range of vmalloc() addresses, converting them into a
scatterlist, then using the DMA API on the scatterlist _only_
guarantees that the lowmem (and kmap'd highmem mappings) are coherent
with DMA. There is no way for the DMA API to know that other mappings
exist, and obviously flushing every possible cache line just because a
mapping might exist multiplies the expense of the DMA API: not only in
terms of time spent running through all the possibilities, which
doubles for every aliasing bit of VIPT, but also TLB pressure since
you'd have to create a mapping for each alias and tear it back down.
VIVT is even worse: no other virtual mapping is coherent, every mapping
would need to be known, and each mapping would need to be individually
flushed.
Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper
On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote:
> sg_alloc_table_from_buf() provides an easy solution to create an sg_table
> from a virtual address pointer. This function takes care of dealing with
> vmallocated buffers, buffer alignment, or DMA engine limitations (maximum
> DMA transfer size).

Please note that the DMA API does not take account of coherency of
memory regions other than non-high/lowmem - there are specific
extensions to deal with this.

What this means is that having an API that takes any virtual address
pointer, converts it to a scatterlist which is then DMA mapped, is
unsafe.

It'll be okay for PIPT and non-aliasing VIPT cache architectures, but
for other cache architectures this will hide this problem and make
review harder.
Re: [PATCH 1/3] crypto: marvell/cesa - replace dma_to_phys with dma_map_single
On Thu, Mar 17, 2016 at 07:17:24PM -0400, ok...@codeaurora.org wrote:
> What is the correct way? I don't want to write engine->sram_dma = sram

Well, what the driver _is_ wanting to do is to go from a CPU physical
address to a device DMA address. phys_to_dma() looks like the correct
thing there to me, but I guess that's just an offset and doesn't take
account of any IOMMU that may be in the way.

If you have an IOMMU, then the whole phys_to_dma() thing is a total
failure as it only does a linear translation, and there are no
interfaces in the kernel to take account of an IOMMU in the way. So,
it needs something designed for the job, implemented and discussed by
the normal methods of proposing a new cross-arch interface for drivers
to use.

What I'm certain of, though, is that the change proposed in this patch
will break current users of this driver: virt_to_page() on an address
returned by ioremap() is completely undefined, and will result either
in a kernel oops or, if not, in poking at memory which isn't a struct
page, ultimately resulting in something that isn't SRAM being pointed
to by "engine->sram_dma".
Re: [PATCH 1/3] crypto: marvell/cesa - replace dma_to_phys with dma_map_single
On Thu, Mar 17, 2016 at 06:02:15PM -0400, Sinan Kaya wrote:
> Getting ready to remove dma_to_phys API. Drivers should not be
> using this API for DMA operations. Instead, they should go
> through the dma_map or dma_alloc APIs.
>
> Signed-off-by: Sinan Kaya
> ---
>  drivers/crypto/marvell/cesa.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index c0656e7..52ddfa4 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -350,8 +350,8 @@ static int mv_cesa_get_sram(struct platform_device *pdev, int idx)
>  	if (IS_ERR(engine->sram))
>  		return PTR_ERR(engine->sram);
>
> -	engine->sram_dma = phys_to_dma(cesa->dev,
> -				       (phys_addr_t)res->start);
> +	engine->sram_dma = dma_map_single(cesa->dev, engine->sram,
> +					  DMA_TO_DEVICE);

Documentation/DMA-API.txt

dma_addr_t
dma_map_single(struct device *dev, void *cpu_addr, size_t size,
	       enum dma_data_direction direction)

Notes: Not all memory regions in a machine can be mapped by this API.
Further, contiguous kernel virtual space may not be contiguous as
physical memory. Since this API does not provide any scatter/gather
capability, it will fail if the user tries to map a non-physically
contiguous piece of memory. For this reason, memory to be mapped by
this API should be obtained from sources which guarantee it to be
physically contiguous (like kmalloc).

Specifically, ioremapped memory will *not* work as you expect with
dma_map_single().
Re: [PATCH RFC 06/11] crypto: caam: ensure that we clean up after an error
On Wed, Dec 09, 2015 at 05:08:41PM +0200, Horia Geantă wrote:
> On 12/7/2015 9:12 PM, Russell King wrote:
> > Ensure that we clean up allocations and DMA mappings after encountering
> > an error rather than just giving up and leaking memory and resources.
> >
> > Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
>
> I guess the error cleanup code should be grouped under an "err" label,
> instead of duplicating it.

It'd be useful if you could quote the code you were commenting on,
please. I now need to dig out the patch from my git tree to work out
what the relevance of your comment is.
Re: [PATCH RFC 07/11] crypto: caam: check and use dma_map_sg() return code
On Wed, Dec 09, 2015 at 05:20:45PM +0200, Horia Geantă wrote: > On 12/7/2015 9:12 PM, Russell King wrote: > > Strictly, dma_map_sg() may coalesce SG entries, but in practise on iMX > > hardware, this will never happen. However, dma_map_sg() can fail, and > > we completely fail to check its return value. So, fix this properly. > > > > Arrange the code to map the scatterlist early, so we know how many > > scatter table entries to allocate, and then fill them in. This allows > > us to keep relatively simple error cleanup paths. > > > > Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> > > Some tcrypt tests fail - looks like those with zero plaintext: > caam_jr ffe301000.jr: unable to map source for DMA > alg: hash: digest failed on test 1 for sha1-caam: ret=12 > [...] > > Need to be careful, dma_map_sg() returning zero is an error only if ptxt > is not null (alternatively src_nents returned by sg_nents_for_len() > could be checked). > > > @@ -1091,7 +1102,7 @@ static int ahash_digest(struct ahash_request *req) > > u32 *sh_desc = ctx->sh_desc_digest, *desc; > > dma_addr_t ptr = ctx->sh_desc_digest_dma; > > int digestsize = crypto_ahash_digestsize(ahash); > > - int src_nents, sec4_sg_bytes; > > + int src_nents, mapped_nents, sec4_sg_bytes; > > dma_addr_t src_dma; > > struct ahash_edesc *edesc; > > int ret = 0; > > @@ -1099,9 +1110,14 @@ static int ahash_digest(struct ahash_request *req) > > int sh_len; > > > > src_nents = sg_nents_for_len(req->src, req->nbytes); > > - dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); > > - if (src_nents > 1) > > - sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); > > + mapped_nents = dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); > > + if (mapped_nents == 0) { > > + dev_err(jrdev, "unable to map source for DMA\n"); > > + return -ENOMEM; > > + } > > This is at least one of the places where the error condition must change > to smth. like (src_nents != 0 && mapped_nents == 0). 
I guess we should avoid calling dma_map_sg() and dma_unmap_sg() when
src_nents is zero. Thanks, I'll work that into the next patch revision.
Re: [PATCH RFC 00/13] Further iMX CAAM updates
On Wed, Dec 09, 2015 at 05:06:03PM +0200, Horia Geantă wrote:
> On 12/7/2015 9:11 PM, Russell King - ARM Linux wrote:
> > Here are further imx-caam updates that I've had since before the
> > previous merge window. Please review and (I guess) if Freescale
> > folk can provide acks etc that would be nice. Thanks.
>
> Thanks Russell.
>
> Note that the patch set does not apply cleanly, please rebase on latest
> cryptodev-2.6 tree.

I'm afraid I'm not a fan of digging out random git trees, rebasing
patches on top of them, and then not being able to test them locally
without dragging the entire tree into my own tree. I'd rather wait
another kernel cycle instead of polluting my own tree with commits from
other trees.
[PATCH RFC 07/11] crypto: caam: check and use dma_map_sg() return code
Strictly, dma_map_sg() may coalesce SG entries, but in practise on iMX hardware, this will never happen. However, dma_map_sg() can fail, and we completely fail to check its return value. So, fix this properly. Arrange the code to map the scatterlist early, so we know how many scatter table entries to allocate, and then fill them in. This allows us to keep relatively simple error cleanup paths. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 109 - 1 file changed, 74 insertions(+), 35 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 076cfddf32bb..9638e9f4f001 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -187,15 +187,6 @@ static inline dma_addr_t buf_map_to_sec4_sg(struct device *jrdev, return buf_dma; } -/* Map req->src and put it in link table */ -static inline void src_map_to_sec4_sg(struct device *jrdev, - struct scatterlist *src, int src_nents, - struct sec4_sg_entry *sec4_sg) -{ - dma_map_sg(jrdev, src, src_nents, DMA_TO_DEVICE); - sg_to_sec4_sg_last(src, src_nents, sec4_sg, 0); -} - /* * Only put buffer in link table if it contains data, which is possible, * since a buffer has previously been used, and needs to be unmapped, @@ -791,7 +782,7 @@ static int ahash_update_ctx(struct ahash_request *req) int in_len = *buflen + req->nbytes, to_hash; u32 *sh_desc = ctx->sh_desc_update, *desc; dma_addr_t ptr = ctx->sh_desc_update_dma; - int src_nents, sec4_sg_bytes, sec4_sg_src_index; + int src_nents, mapped_nents, sec4_sg_bytes, sec4_sg_src_index; struct ahash_edesc *edesc; int ret = 0; int sh_len; @@ -803,8 +794,19 @@ static int ahash_update_ctx(struct ahash_request *req) if (to_hash) { src_nents = sg_nents_for_len(req->src, req->nbytes - (*next_buflen)); + if (src_nents) { + mapped_nents = dma_map_sg(jrdev, req->src, src_nents, + DMA_TO_DEVICE); + if (!mapped_nents) { + dev_err(jrdev, "unable to DMA map source\n"); + return -ENOMEM; + } 
+ } else { + mapped_nents = 0; + } + sec4_sg_src_index = 1 + (*buflen ? 1 : 0); - sec4_sg_bytes = (sec4_sg_src_index + src_nents) * + sec4_sg_bytes = (sec4_sg_src_index + mapped_nents) * sizeof(struct sec4_sg_entry); /* @@ -816,6 +818,7 @@ static int ahash_update_ctx(struct ahash_request *req) if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); + dma_unmap_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); return -ENOMEM; } @@ -836,9 +839,10 @@ static int ahash_update_ctx(struct ahash_request *req) buf, state->buf_dma, *buflen, last_buflen); - if (src_nents) { - src_map_to_sec4_sg(jrdev, req->src, src_nents, - edesc->sec4_sg + sec4_sg_src_index); + if (mapped_nents) { + sg_to_sec4_sg_last(req->src, mapped_nents, + edesc->sec4_sg + sec4_sg_src_index, + 0); if (*next_buflen) scatterwalk_map_and_copy(next_buf, req->src, to_hash - *buflen, @@ -1004,21 +1008,28 @@ static int ahash_finup_ctx(struct ahash_request *req) u32 *sh_desc = ctx->sh_desc_finup, *desc; dma_addr_t ptr = ctx->sh_desc_finup_dma; int sec4_sg_bytes, sec4_sg_src_index; - int src_nents; + int src_nents, mapped_nents; int digestsize = crypto_ahash_digestsize(ahash); struct ahash_edesc *edesc; int ret = 0; int sh_len; src_nents = sg_nents_for_len(req->src, req->nbytes); + mapped_nents = dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); + if (!mapped_nents) { + dev_err(jrdev, "unable to DMA map source\n"); + return -ENOMEM; + } + sec4_sg_src_index = 1 + (buflen ? 1 : 0); - sec4_sg_bytes = (sec4_sg_src_index + src_nents) * + sec4_sg_bytes = (sec4_sg_src_index + mapped_nents) *
[PATCH RFC 02/11] crypto: caam: ensure descriptor buffers are cacheline aligned
Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index eccde7207f92..6a6d74f38300 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -99,17 +99,17 @@ static struct list_head hash_list;
 
 /* ahash per-session context */
 struct caam_hash_ctx {
-	struct device *jrdev;
-	u32 sh_desc_update[DESC_HASH_MAX_USED_LEN];
-	u32 sh_desc_update_first[DESC_HASH_MAX_USED_LEN];
-	u32 sh_desc_fin[DESC_HASH_MAX_USED_LEN];
-	u32 sh_desc_digest[DESC_HASH_MAX_USED_LEN];
-	u32 sh_desc_finup[DESC_HASH_MAX_USED_LEN];
-	dma_addr_t sh_desc_update_dma;
+	u32 sh_desc_update[DESC_HASH_MAX_USED_LEN] cacheline_aligned;
+	u32 sh_desc_update_first[DESC_HASH_MAX_USED_LEN] cacheline_aligned;
+	u32 sh_desc_fin[DESC_HASH_MAX_USED_LEN] cacheline_aligned;
+	u32 sh_desc_digest[DESC_HASH_MAX_USED_LEN] cacheline_aligned;
+	u32 sh_desc_finup[DESC_HASH_MAX_USED_LEN] cacheline_aligned;
+	dma_addr_t sh_desc_update_dma cacheline_aligned;
 	dma_addr_t sh_desc_update_first_dma;
 	dma_addr_t sh_desc_fin_dma;
 	dma_addr_t sh_desc_digest_dma;
 	dma_addr_t sh_desc_finup_dma;
+	struct device *jrdev;
 	u32 alg_type;
 	u32 alg_op;
 	u8 key[CAAM_MAX_HASH_KEY_SIZE];
--
2.1.0
[PATCH RFC 04/11] crypto: caam: mark the hardware descriptor as cache line aligned
Mark the hardware descriptor as being cache line aligned; on DMA
incoherent architectures, the hardware descriptor should sit in a
separate cache line from the CPU accessed data to avoid polluting the
caches.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index cf557e1291b3..0a9665140d26 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -604,7 +604,7 @@ struct ahash_edesc {
 	int src_nents;
 	int sec4_sg_bytes;
 	struct sec4_sg_entry *sec4_sg;
-	u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)];
+	u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)] cacheline_aligned;
 };
 
 static inline void ahash_unmap(struct device *dev,
--
2.1.0
[PATCH RFC 05/11] crypto: caam: replace sec4_sg pointer with array
Since the extended descriptor includes the hardware descriptor, and the sec4 scatterlist immediately follows this, we can declare it as a array at the very end of the extended descriptor. This allows us to get rid of an initialiser for every site where we allocate an extended descriptor. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 0a9665140d26..d48974d1897d 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -595,16 +595,16 @@ static int ahash_setkey(struct crypto_ahash *ahash, * @sec4_sg_dma: physical mapped address of h/w link table * @src_nents: number of segments in input scatterlist * @sec4_sg_bytes: length of dma mapped sec4_sg space - * @sec4_sg: pointer to h/w link table * @hw_desc: the h/w job descriptor followed by any referenced link tables + * @sec4_sg: h/w link table */ struct ahash_edesc { dma_addr_t dst_dma; dma_addr_t sec4_sg_dma; int src_nents; int sec4_sg_bytes; - struct sec4_sg_entry *sec4_sg; u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)] cacheline_aligned; + struct sec4_sg_entry sec4_sg[0]; }; static inline void ahash_unmap(struct device *dev, @@ -821,7 +821,6 @@ static int ahash_update_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_BIDIRECTIONAL); @@ -931,7 +930,6 @@ static int ahash_final_ctx(struct ahash_request *req) init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); edesc->src_nents = 0; ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, @@ -1016,7 +1014,6 @@ static int ahash_finup_ctx(struct ahash_request *req) edesc->src_nents = src_nents; 
edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); @@ -1093,7 +1090,7 @@ static int ahash_digest(struct ahash_request *req) dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; } - edesc->sec4_sg = (void *)(edesc + 1); + edesc->sec4_sg_bytes = sec4_sg_bytes; edesc->src_nents = src_nents; @@ -1247,7 +1244,6 @@ static int ahash_update_no_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); edesc->dst_dma = 0; state->buf_dma = buf_map_to_sec4_sg(jrdev, edesc->sec4_sg, @@ -1354,7 +1350,6 @@ static int ahash_finup_no_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg, buf, state->buf_dma, buflen, @@ -1445,7 +1440,6 @@ static int ahash_update_first(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)(edesc + 1); edesc->dst_dma = 0; if (src_nents > 1) { -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH RFC 11/11] crypto: caam: get rid of tasklet
Threaded interrupts can perform the function of the tasklet, and much more safely too - without races when trying to take the tasklet and interrupt down on device removal. With the old code, there is a window where we call tasklet_kill(). If the interrupt handler happens to be running on a different CPU, and subsequently calls tasklet_schedule(), the tasklet will be re-scheduled for execution. Switching to a hardirq/threadirq combination implementation avoids this, and it also means generic code deals with the teardown sequencing of the threaded and non-threaded parts. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/intern.h | 1 - drivers/crypto/caam/jr.c | 25 + 2 files changed, 9 insertions(+), 17 deletions(-) diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h index e2bcacc1a921..5d4c05074a5c 100644 --- a/drivers/crypto/caam/intern.h +++ b/drivers/crypto/caam/intern.h @@ -41,7 +41,6 @@ struct caam_drv_private_jr { struct device *dev; int ridx; struct caam_job_ring __iomem *rregs;/* JobR's register space */ - struct tasklet_struct irqtask; int irq;/* One per queue */ /* Number of scatterlist crypt transforms active on the JobR */ diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c index f7e0d8d4c3da..b77ee2b88f37 100644 --- a/drivers/crypto/caam/jr.c +++ b/drivers/crypto/caam/jr.c @@ -73,8 +73,6 @@ int caam_jr_shutdown(struct device *dev) ret = caam_reset_hw_jr(dev); - tasklet_kill(>irqtask); - /* Release interrupt */ free_irq(jrp->irq, dev); @@ -130,7 +128,7 @@ static irqreturn_t caam_jr_interrupt(int irq, void *st_dev) /* * Check the output ring for ready responses, kick -* tasklet if jobs done. +* the threaded irq if jobs done. 
*/ irqstate = rd_reg32(>rregs->jrintstatus); if (!irqstate) @@ -152,18 +150,13 @@ static irqreturn_t caam_jr_interrupt(int irq, void *st_dev) /* Have valid interrupt at this point, just ACK and trigger */ wr_reg32(>rregs->jrintstatus, irqstate); - preempt_disable(); - tasklet_schedule(>irqtask); - preempt_enable(); - - return IRQ_HANDLED; + return IRQ_WAKE_THREAD; } -/* Deferred service handler, run as interrupt-fired tasklet */ -static void caam_jr_dequeue(unsigned long devarg) +static irqreturn_t caam_jr_threadirq(int irq, void *st_dev) { int hw_idx, sw_idx, i, head, tail; - struct device *dev = (struct device *)devarg; + struct device *dev = st_dev; struct caam_drv_private_jr *jrp = dev_get_drvdata(dev); void (*usercall)(struct device *dev, u32 *desc, u32 status, void *arg); u32 *userdesc, userstatus; @@ -237,6 +230,8 @@ static void caam_jr_dequeue(unsigned long devarg) /* reenable / unmask IRQs */ clrbits32(>rregs->rconfig_lo, JRCFG_IMSK); + + return IRQ_HANDLED; } /** @@ -394,11 +389,10 @@ static int caam_jr_init(struct device *dev) jrp = dev_get_drvdata(dev); - tasklet_init(>irqtask, caam_jr_dequeue, (unsigned long)dev); - /* Connect job ring interrupt handler. */ - error = request_irq(jrp->irq, caam_jr_interrupt, IRQF_SHARED, - dev_name(dev), dev); + error = request_threaded_irq(jrp->irq, caam_jr_interrupt, +caam_jr_threadirq, IRQF_SHARED, +dev_name(dev), dev); if (error) { dev_err(dev, "can't connect JobR %d interrupt (%d)\n", jrp->ridx, jrp->irq); @@ -460,7 +454,6 @@ static int caam_jr_init(struct device *dev) out_free_irq: free_irq(jrp->irq, dev); out_kill_deq: - tasklet_kill(>irqtask); return error; } -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH RFC 03/11] crypto: caam: incorporate job descriptor into struct ahash_edesc
Rather than giving the descriptor as hw_desc[0], give it's real size. All places where we allocate an ahash_edesc incorporate DESC_JOB_IO_LEN bytes of job descriptor. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 49 -- 1 file changed, 19 insertions(+), 30 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 6a6d74f38300..cf557e1291b3 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -604,7 +604,7 @@ struct ahash_edesc { int src_nents; int sec4_sg_bytes; struct sec4_sg_entry *sec4_sg; - u32 hw_desc[0]; + u32 hw_desc[DESC_JOB_IO_LEN / sizeof(u32)]; }; static inline void ahash_unmap(struct device *dev, @@ -811,8 +811,8 @@ static int ahash_update_ctx(struct ahash_request *req) * allocate space for base edesc and hw desc commands, * link tables */ - edesc = kzalloc(sizeof(*edesc) + DESC_JOB_IO_LEN + - sec4_sg_bytes, GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, + GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); @@ -821,8 +821,7 @@ static int ahash_update_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + -DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_BIDIRECTIONAL); @@ -921,8 +920,7 @@ static int ahash_final_ctx(struct ahash_request *req) sec4_sg_bytes = sec4_sg_src_index * sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + DESC_JOB_IO_LEN + sec4_sg_bytes, - GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; @@ -933,8 +931,7 @@ static int ahash_final_ctx(struct ahash_request 
*req) init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + -DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); edesc->src_nents = 0; ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, @@ -1007,8 +1004,7 @@ static int ahash_finup_ctx(struct ahash_request *req) sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + DESC_JOB_IO_LEN + sec4_sg_bytes, - GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; @@ -1020,8 +1016,7 @@ static int ahash_finup_ctx(struct ahash_request *req) edesc->src_nents = src_nents; edesc->sec4_sg_bytes = sec4_sg_bytes; - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + -DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); @@ -1093,14 +1088,12 @@ static int ahash_digest(struct ahash_request *req) sec4_sg_bytes = 0; /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes + DESC_JOB_IO_LEN, - GFP_DMA | flags); + edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); if (!edesc) { dev_err(jrdev, "could not allocate extended descriptor\n"); return -ENOMEM; } - edesc->sec4_sg = (void *)edesc + sizeof(struct ahash_edesc) + - DESC_JOB_IO_LEN; + edesc->sec4_sg = (void *)(edesc + 1); edesc->sec4_sg_bytes = sec4_sg_bytes; edesc->src_nents = src_nents; @@ -1166,7 +1159,7 @@ static int ahash_final_no_ctx(struct ahash_request *req) int sh_len; /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzallo
[PATCH RFC 08/11] crypto: caam: add ahash_edesc_alloc() for descriptor allocation
Add a helper function to perform the descriptor allocation. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 60 +++--- 1 file changed, 33 insertions(+), 27 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 9638e9f4f001..f35f4cfc27a7 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -765,6 +765,25 @@ static void ahash_done_ctx_dst(struct device *jrdev, u32 *desc, u32 err, req->base.complete(>base, err); } +/* + * Allocate an enhanced descriptor, which contains the hardware descriptor + * and space for hardware scatter table containing sg_num entries. + */ +static struct ahash_edesc *ahash_edesc_alloc(struct caam_hash_ctx *ctx, +int sg_num, gfp_t flags) +{ + struct ahash_edesc *edesc; + unsigned int sg_size = sg_num * sizeof(struct sec4_sg_entry); + + edesc = kzalloc(sizeof(*edesc) + sg_size, GFP_DMA | flags); + if (!edesc) { + dev_err(ctx->jrdev, "could not allocate extended descriptor\n"); + return NULL; + } + + return edesc; +} + /* submit update job descriptor */ static int ahash_update_ctx(struct ahash_request *req) { @@ -813,11 +832,9 @@ static int ahash_update_ctx(struct ahash_request *req) * allocate space for base edesc and hw desc commands, * link tables */ - edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, - GFP_DMA | flags); + edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index + mapped_nents, + flags); if (!edesc) { - dev_err(jrdev, - "could not allocate extended descriptor\n"); dma_unmap_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); return -ENOMEM; } @@ -930,11 +947,9 @@ static int ahash_final_ctx(struct ahash_request *req) sec4_sg_bytes = sec4_sg_src_index * sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); - if (!edesc) { - dev_err(jrdev, "could not allocate extended descriptor\n"); + edesc = ahash_edesc_alloc(ctx, 
sec4_sg_src_index, flags); + if (!edesc) return -ENOMEM; - } sh_len = desc_len(sh_desc); desc = edesc->hw_desc; @@ -1026,9 +1041,9 @@ static int ahash_finup_ctx(struct ahash_request *req) sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); + edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index + mapped_nents, + flags); if (!edesc) { - dev_err(jrdev, "could not allocate extended descriptor\n"); dma_unmap_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); return -ENOMEM; } @@ -1122,9 +1137,9 @@ static int ahash_digest(struct ahash_request *req) sec4_sg_bytes = 0; /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, GFP_DMA | flags); + edesc = ahash_edesc_alloc(ctx, mapped_nents > 1 ? mapped_nents : 0, + flags); if (!edesc) { - dev_err(jrdev, "could not allocate extended descriptor\n"); dma_unmap_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); return -ENOMEM; } @@ -1198,13 +1213,10 @@ static int ahash_final_no_ctx(struct ahash_request *req) int sh_len; /* allocate space for base edesc and hw desc commands, link tables */ - edesc = kzalloc(sizeof(*edesc), GFP_DMA | flags); - if (!edesc) { - dev_err(jrdev, "could not allocate extended descriptor\n"); + edesc = ahash_edesc_alloc(ctx, 0, flags); + if (!edesc) return -ENOMEM; - } - edesc->sec4_sg_bytes = 0; sh_len = desc_len(sh_desc); desc = edesc->hw_desc; init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); @@ -1287,11 +1299,8 @@ static int ahash_update_no_ctx(struct ahash_request *req) * allocate space for base edesc and hw desc commands, * link tables */ - edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes, - GFP_DMA | flags); + edesc = ahash_edesc_alloc(ctx, 1 + mapped_nents, flags);
[PATCH RFC 09/11] crypto: caam: move job descriptor initialisation to ahash_edesc_alloc()
Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 84 +- 1 file changed, 34 insertions(+), 50 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index f35f4cfc27a7..241268d108ec 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -770,7 +770,9 @@ static void ahash_done_ctx_dst(struct device *jrdev, u32 *desc, u32 err, * and space for hardware scatter table containing sg_num entries. */ static struct ahash_edesc *ahash_edesc_alloc(struct caam_hash_ctx *ctx, -int sg_num, gfp_t flags) +int sg_num, u32 *sh_desc, +dma_addr_t sh_desc_dma, +gfp_t flags) { struct ahash_edesc *edesc; unsigned int sg_size = sg_num * sizeof(struct sec4_sg_entry); @@ -781,6 +783,9 @@ static struct ahash_edesc *ahash_edesc_alloc(struct caam_hash_ctx *ctx, return NULL; } + init_job_desc_shared(edesc->hw_desc, sh_desc_dma, desc_len(sh_desc), +HDR_SHARE_DEFER | HDR_REVERSE); + return edesc; } @@ -799,12 +804,10 @@ static int ahash_update_ctx(struct ahash_request *req) int *next_buflen = state->current_buf ? 
>buflen_0 : >buflen_1, last_buflen; int in_len = *buflen + req->nbytes, to_hash; - u32 *sh_desc = ctx->sh_desc_update, *desc; - dma_addr_t ptr = ctx->sh_desc_update_dma; + u32 *desc; int src_nents, mapped_nents, sec4_sg_bytes, sec4_sg_src_index; struct ahash_edesc *edesc; int ret = 0; - int sh_len; last_buflen = *next_buflen; *next_buflen = in_len & (crypto_tfm_alg_blocksize(>base) - 1); @@ -833,7 +836,8 @@ static int ahash_update_ctx(struct ahash_request *req) * link tables */ edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index + mapped_nents, - flags); + ctx->sh_desc_update, + ctx->sh_desc_update_dma, flags); if (!edesc) { dma_unmap_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); return -ENOMEM; @@ -871,10 +875,7 @@ static int ahash_update_ctx(struct ahash_request *req) state->current_buf = !state->current_buf; - sh_len = desc_len(sh_desc); desc = edesc->hw_desc; - init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | -HDR_REVERSE); edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg, sec4_sg_bytes, @@ -935,25 +936,23 @@ static int ahash_final_ctx(struct ahash_request *req) int buflen = state->current_buf ? state->buflen_1 : state->buflen_0; int last_buflen = state->current_buf ? state->buflen_0 : state->buflen_1; - u32 *sh_desc = ctx->sh_desc_fin, *desc; - dma_addr_t ptr = ctx->sh_desc_fin_dma; + u32 *desc; int sec4_sg_bytes, sec4_sg_src_index; int digestsize = crypto_ahash_digestsize(ahash); struct ahash_edesc *edesc; int ret = 0; - int sh_len; sec4_sg_src_index = 1 + (buflen ? 
1 : 0); sec4_sg_bytes = sec4_sg_src_index * sizeof(struct sec4_sg_entry); /* allocate space for base edesc and hw desc commands, link tables */ - edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index, flags); + edesc = ahash_edesc_alloc(ctx, sec4_sg_src_index, + ctx->sh_desc_fin, ctx->sh_desc_fin_dma, + flags); if (!edesc) return -ENOMEM; - sh_len = desc_len(sh_desc); desc = edesc->hw_desc; - init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); edesc->sec4_sg_bytes = sec4_sg_bytes; edesc->src_nents = 0; @@ -1020,14 +1019,12 @@ static int ahash_finup_ctx(struct ahash_request *req) int buflen = state->current_buf ? state->buflen_1 : state->buflen_0; int last_buflen = state->current_buf ? state->buflen_0 : state->buflen_1; - u32 *sh_desc = ctx->sh_desc_finup, *desc; - dma_addr_t ptr = ctx->sh_desc_finup_dma; + u32 *desc; int sec4_sg_bytes, sec4_sg_src_index; int src_nents, mapped_nents; int digestsize = crypto_ahash_digestsize(ahash); struct ahash_edesc *edesc; int ret = 0; - int sh_len; src_nents = sg_nents_for_len(r
[PATCH RFC 10/11] crypto: caam: add ahash_edesc_add_src()
Add a helper to map the source scatterlist into the descriptor. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 106 +++-- 1 file changed, 49 insertions(+), 57 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 241268d108ec..4e73d3218481 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -789,6 +789,40 @@ static struct ahash_edesc *ahash_edesc_alloc(struct caam_hash_ctx *ctx, return edesc; } +static int ahash_edesc_add_src(struct caam_hash_ctx *ctx, + struct ahash_edesc *edesc, + struct ahash_request *req, int nents, + unsigned int first_sg, unsigned int first_bytes) +{ + dma_addr_t src_dma; + u32 options; + + if (nents > 1 || first_sg) { + struct sec4_sg_entry *sg = edesc->sec4_sg; + unsigned int sgsize = sizeof(*sg) * (first_sg + nents); + + sg_to_sec4_sg_last(req->src, nents, sg + first_sg, 0); + + src_dma = dma_map_single(ctx->jrdev, sg, sgsize, DMA_TO_DEVICE); + if (dma_mapping_error(ctx->jrdev, src_dma)) { + dev_err(ctx->jrdev, "unable to map S/G table\n"); + return -ENOMEM; + } + + edesc->sec4_sg_bytes = sgsize; + edesc->sec4_sg_dma = src_dma; + options = LDST_SGF; + } else { + src_dma = sg_dma_address(req->src); + options = 0; + } + + append_seq_in_ptr(edesc->hw_desc, src_dma, first_bytes + req->nbytes, + options); + + return 0; +} + /* submit update job descriptor */ static int ahash_update_ctx(struct ahash_request *req) { @@ -1112,11 +1146,9 @@ static int ahash_digest(struct ahash_request *req) CRYPTO_TFM_REQ_MAY_SLEEP)) ? 
GFP_KERNEL : GFP_ATOMIC; u32 *desc; int digestsize = crypto_ahash_digestsize(ahash); - int src_nents, mapped_nents, sec4_sg_bytes; - dma_addr_t src_dma; + int src_nents, mapped_nents; struct ahash_edesc *edesc; int ret = 0; - u32 options; src_nents = sg_nents_for_len(req->src, req->nbytes); mapped_nents = dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); @@ -1125,11 +1157,6 @@ static int ahash_digest(struct ahash_request *req) return -ENOMEM; } - if (mapped_nents > 1) - sec4_sg_bytes = mapped_nents * sizeof(struct sec4_sg_entry); - else - sec4_sg_bytes = 0; - /* allocate space for base edesc and hw desc commands, link tables */ edesc = ahash_edesc_alloc(ctx, mapped_nents > 1 ? mapped_nents : 0, ctx->sh_desc_digest, ctx->sh_desc_digest_dma, @@ -1139,28 +1166,16 @@ static int ahash_digest(struct ahash_request *req) return -ENOMEM; } - edesc->sec4_sg_bytes = sec4_sg_bytes; edesc->src_nents = src_nents; - desc = edesc->hw_desc; - - if (src_nents > 1) { - sg_to_sec4_sg_last(req->src, mapped_nents, edesc->sec4_sg, 0); - edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg, - sec4_sg_bytes, DMA_TO_DEVICE); - if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { - dev_err(jrdev, "unable to map S/G table\n"); - ahash_unmap(jrdev, edesc, req, digestsize); - kfree(edesc); - return -ENOMEM; - } - src_dma = edesc->sec4_sg_dma; - options = LDST_SGF; - } else { - src_dma = sg_dma_address(req->src); - options = 0; + ret = ahash_edesc_add_src(ctx, edesc, req, mapped_nents, 0, 0); + if (ret) { + ahash_unmap(jrdev, edesc, req, digestsize); + kfree(edesc); + return ret; } - append_seq_in_ptr(desc, src_dma, req->nbytes, options); + + desc = edesc->hw_desc; edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, digestsize); @@ -1471,9 +1486,7 @@ static int ahash_update_first(struct ahash_request *req) >buflen_1 : >buflen_0; int to_hash; u32 *desc; - int sec4_sg_bytes, src_nents, mapped_nents; - dma_addr_t src_dma; - u32 options; + int src_nents, mapped_nents; struct 
ahash_edesc *edesc; int ret = 0; @@ -1490,11 +1503,6 @@ static int ahash_update_first(struct ahash_request *req) dev_err(jrdev
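The helper's choice between pointing the descriptor straight at a single DMA address and building a sec4 scatter table hinges on one condition. A hypothetical model of just that decision (the function name is mine, not the driver's):

```c
#include <stdbool.h>

/*
 * Model of the branch in ahash_edesc_add_src(): a hardware S/G table is
 * needed whenever there is more than one mapped source entry, or when
 * earlier entries (e.g. the hash context) must precede the data in the
 * table (first_sg non-zero).  Otherwise the single DMA address is used
 * directly and the LDST_SGF option is omitted.
 */
static bool needs_sg_table(int nents, unsigned int first_sg)
{
	return nents > 1 || first_sg != 0;
}
```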
[PATCH RFC 06/11] crypto: caam: ensure that we clean up after an error
Ensure that we clean up allocations and DMA mappings after encountering an error rather than just giving up and leaking memory and resources. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 57 ++ 1 file changed, 52 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index d48974d1897d..076cfddf32bb 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -824,8 +824,12 @@ static int ahash_update_ctx(struct ahash_request *req) ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_BIDIRECTIONAL); - if (ret) + if (ret) { + ahash_unmap_ctx(jrdev, edesc, req, ctx->ctx_len, + DMA_BIDIRECTIONAL); + kfree(edesc); return ret; + } state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg + 1, @@ -856,6 +860,9 @@ static int ahash_update_ctx(struct ahash_request *req) DMA_TO_DEVICE); if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { dev_err(jrdev, "unable to map S/G table\n"); + ahash_unmap_ctx(jrdev, edesc, req, ctx->ctx_len, + DMA_BIDIRECTIONAL); + kfree(edesc); return -ENOMEM; } @@ -934,8 +941,11 @@ static int ahash_final_ctx(struct ahash_request *req) ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); - if (ret) + if (ret) { + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + kfree(edesc); return ret; + } state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg + 1, buf, state->buf_dma, buflen, @@ -946,6 +956,8 @@ static int ahash_final_ctx(struct ahash_request *req) sec4_sg_bytes, DMA_TO_DEVICE); if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { dev_err(jrdev, "unable to map S/G table\n"); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + kfree(edesc); return -ENOMEM; } @@ -956,6 +968,8 @@ static int ahash_final_ctx(struct ahash_request *req) digestsize); if (dma_mapping_error(jrdev, edesc->dst_dma)) { dev_err(jrdev, "unable to map 
dst\n"); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + kfree(edesc); return -ENOMEM; } @@ -1017,8 +1031,11 @@ static int ahash_finup_ctx(struct ahash_request *req) ret = ctx_map_to_sec4_sg(desc, jrdev, state, ctx->ctx_len, edesc->sec4_sg, DMA_TO_DEVICE); - if (ret) + if (ret) { + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + kfree(edesc); return ret; + } state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg + 1, buf, state->buf_dma, buflen, @@ -1031,6 +1048,8 @@ static int ahash_finup_ctx(struct ahash_request *req) sec4_sg_bytes, DMA_TO_DEVICE); if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { dev_err(jrdev, "unable to map S/G table\n"); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + kfree(edesc); return -ENOMEM; } @@ -1041,6 +1060,8 @@ static int ahash_finup_ctx(struct ahash_request *req) digestsize); if (dma_mapping_error(jrdev, edesc->dst_dma)) { dev_err(jrdev, "unable to map dst\n"); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + kfree(edesc); return -ENOMEM; } @@ -1104,6 +1125,8 @@ static int ahash_digest(struct ahash_request *req) sec4_sg_bytes, DMA_TO_DEVICE); if (dma_mapping_error(jrdev, edesc->sec4_sg_dma)) { dev_err(jrdev, "unable to map S/G table\n"); + ahash_unmap(jrdev, edesc, req, digestsize); + kfree(edesc);
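The pattern the patch enforces — every failure path undoes the DMA mapping and frees the descriptor before returning — can be modelled in plain C. The counter and the -ENOMEM value are stand-ins for the kernel resources and error codes:

```c
#include <stdlib.h>

#define ENOMEM 12

static int live_mappings;	/* stand-in for outstanding DMA mappings */

/*
 * Model of one caamhash operation: allocate, map, then either hand off
 * to the hardware or unwind.  After any failure, no mapping may remain
 * live and the descriptor must be freed.
 */
static int run_op(int fail)
{
	void *edesc = malloc(64);

	if (!edesc)
		return -ENOMEM;
	live_mappings++;		/* dma_map_single() equivalent */

	if (fail)
		goto err_unmap;		/* e.g. dma_mapping_error() */

	/* success: the completion handler would unmap and free */
	live_mappings--;
	free(edesc);
	return 0;

err_unmap:
	live_mappings--;		/* ahash_unmap_ctx() equivalent */
	free(edesc);			/* the kfree(edesc) the patch adds */
	return -ENOMEM;
}
```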
[PATCH RFC 01/11] crypto: caam: fix DMA API mapping leak
caamhash contains this weird code: src_nents = sg_count(req->src, req->nbytes); dma_map_sg(jrdev, req->src, src_nents ? : 1, DMA_TO_DEVICE); ... edesc->src_nents = src_nents; sg_count() returns zero when sg_nents_for_len() returns zero or one. This means we don't need to use a hardware scatterlist. However, setting src_nents to zero causes problems when we unmap: if (edesc->src_nents) dma_unmap_sg_chained(dev, req->src, edesc->src_nents, DMA_TO_DEVICE, edesc->chained); as zero here means that we have no entries to unmap. This causes us to leak DMA mappings, where we map one scatterlist entry and then fail to unmap it. This can be fixed in two ways: either by writing the number of entries that were requested of dma_map_sg(), or by reworking the "no SG required" case. We adopt the re-work solution here - we replace sg_count() with sg_nents_for_len(), so src_nents now contains the real number of scatterlist entries, and we then change the test for using the hardware scatterlist to src_nents > 1 rather than just non-zero. This change passes my sshd, openssl tests hashing /bin and tcrypt tests. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 28 ++-- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 49106ea42887..eccde7207f92 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -1085,9 +1085,12 @@ static int ahash_digest(struct ahash_request *req) u32 options; int sh_len; - src_nents = sg_count(req->src, req->nbytes); - dma_map_sg(jrdev, req->src, src_nents ? 
: 1, DMA_TO_DEVICE); - sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + src_nents = sg_nents_for_len(req->src, req->nbytes); + dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); + if (src_nents > 1) + sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + else + sec4_sg_bytes = 0; /* allocate space for base edesc and hw desc commands, link tables */ edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes + DESC_JOB_IO_LEN, @@ -1105,7 +1108,7 @@ static int ahash_digest(struct ahash_request *req) desc = edesc->hw_desc; init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); - if (src_nents) { + if (src_nents > 1) { sg_to_sec4_sg_last(req->src, src_nents, edesc->sec4_sg, 0); edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg, sec4_sg_bytes, DMA_TO_DEVICE); @@ -1232,8 +1235,8 @@ static int ahash_update_no_ctx(struct ahash_request *req) to_hash = in_len - *next_buflen; if (to_hash) { - src_nents = sg_nents_for_len(req->src, -req->nbytes - (*next_buflen)); + src_nents = sg_nents_for_len(req->src, req->nbytes - + *next_buflen); sec4_sg_bytes = (1 + src_nents) * sizeof(struct sec4_sg_entry); @@ -1429,9 +1432,14 @@ static int ahash_update_first(struct ahash_request *req) to_hash = req->nbytes - *next_buflen; if (to_hash) { - src_nents = sg_count(req->src, req->nbytes - (*next_buflen)); - dma_map_sg(jrdev, req->src, src_nents ? 
: 1, DMA_TO_DEVICE); - sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + src_nents = sg_nents_for_len(req->src, req->nbytes - + *next_buflen); + dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE); + if (src_nents > 1) + sec4_sg_bytes = src_nents * + sizeof(struct sec4_sg_entry); + else + sec4_sg_bytes = 0; /* * allocate space for base edesc and hw desc commands, @@ -1451,7 +1459,7 @@ static int ahash_update_first(struct ahash_request *req) DESC_JOB_IO_LEN; edesc->dst_dma = 0; - if (src_nents) { + if (src_nents > 1) { sg_to_sec4_sg_last(req->src, src_nents, edesc->sec4_sg, 0); edesc->sec4_sg_dma = dma_map_single(jrdev, -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
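The leak described above can be demonstrated with a toy model that just counts mappings (the helpers are hypothetical, not the DMA API):

```c
static int active;	/* outstanding mappings in the model */

static void map_sg(int nents)   { active += nents; }
static void unmap_sg(int nents) { active -= nents; }

/*
 * Old code path: sg_count() returns 0 for a 0/1-entry list, yet one
 * entry is still mapped via the "src_nents ? : 1" idiom.  The unmap is
 * guarded on src_nents being non-zero, so that single mapping leaks.
 */
static int old_path(int sg_count_result)
{
	int src_nents = sg_count_result;

	active = 0;
	map_sg(src_nents ? src_nents : 1);
	if (src_nents)
		unmap_sg(src_nents);
	return active;	/* non-zero means a leaked mapping */
}
```

With src_nents from sg_nents_for_len() instead, the map and unmap counts always agree, and only the "use a hardware S/G table" decision tests for more than one entry.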
[PATCH RFC 00/13] Further iMX CAAM updates
Here are further imx-caam updates that I've had since before the previous merge window. Please review; acks from the Freescale folk would be appreciated. Thanks.

drivers/crypto/caam/caamhash.c | 415 +++--
drivers/crypto/caam/intern.h | 1 -
drivers/crypto/caam/jr.c | 25 +--
3 files changed, 246 insertions(+), 195 deletions(-)

-- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCH 03/18] crypto: marvell: add flag to determine algorithm endianness
On Mon, Oct 19, 2015 at 03:04:51PM +0000, Jason Cooper wrote:
> Hey Russell,
>
> On Sun, Oct 18, 2015 at 05:23:40PM +0100, Russell King wrote:
> > static int mv_cesa_ahash_init(struct ahash_request *req,
> > -			      struct mv_cesa_op_ctx *tmpl)
> > +			      struct mv_cesa_op_ctx *tmpl, bool algo_le)
>
> nit: Would it make more sense to do bool algo_endian, and then ...

I don't like "bool"s that don't self-document what they mean. What does "if (algo_endian)" mean? It's pretty clear what "if (algo_le)" means in the context of endianness, though I'll give you that "if (algo_little_endian)" would be a lot better.

> > > @@ -861,7 +862,7 @@ static int mv_cesa_md5_init(struct ahash_request *req)
> > >
> > >	mv_cesa_set_op_cfg(&tmpl, CESA_SA_DESC_CFG_MACM_MD5);
> > >
> > > -	mv_cesa_ahash_init(req, &tmpl);
> > > +	mv_cesa_ahash_init(req, &tmpl, true);
>
> 	mv_cesa_ahash_init(req, &tmpl, ALGO_ENDIAN_LITTLE);
>
> 'true' doesn't seem as obvious. But again, nit-picky.

I did think about:

enum {
	ALGO_LITTLE_ENDIAN,
	ALGO_BIG_ENDIAN,
};

and passing an int algo_endian around, but that seemed like too much bloat... though if you want to insist, I could make that change.
[PATCH 00/18] crypto: further fixes for Marvell CESA hash
Following on from the previous series, this series addresses further problems with the Marvell CESA hash driver found while testing in my openssl/ssh scenarios.

The first patch improves one from the previous series: we can get the transform more directly using req->base.tfm rather than going round the houses.

The next few patches rework the algorithm endianness conversions. There are two things which depend on the algorithm endianness - the format of the result, and the format of the bit length in the last block. We introduce a flag to convey this information, and keep the creq->state format in CPU endian mode for consistency.

Some of the inconsistent hash results are down to the template operation not being properly initialised - so we zero-initialise all template operations.

The following patches (from "factor out first fragment decisions to helper") rework the digest handling to ensure that we always provide the hardware with a complete block of data to hash; otherwise it can be left mid-calculation, which then causes state to leak into subsequent operations. This requires a restructure of the way we put together the DMA entries, so it's done in relatively small steps.

This results in the CESA driver passing all tests I can throw at it via the AF_ALG openssl plugin, with the exception of asking for the hash of /dev/null, which returns an all-zero result rather than the correct hash value. This bug is pending further diagnosis, but it is believed not to be driver specific, as iMX6 CAAM behaves the same.

Unfortunately this is a large series, but the driver needs this level of bug fixing to work properly.

drivers/crypto/marvell/cesa.h | 11 +-
drivers/crypto/marvell/hash.c | 307 +-
2 files changed, 164 insertions(+), 154 deletions(-)
[PATCH 08/18] crypto: marvell: factor out adding an operation and launching it
Add a helper to add the fragment operation block followed by the DMA entry to launch the operation. Although at the moment this pattern only strictly appears at one site, two other sites can be factored as well by slightly changing the order in which the DMA operations are performed. This should be harmless as the only thing which matters is to have all the data loaded into SRAM prior to launching the operation. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 74 +-- 1 file changed, 36 insertions(+), 38 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index 938ecfeb8ffe..8111e73ca848 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -472,6 +472,29 @@ static int mv_cesa_ahash_cache_req(struct ahash_request *req, bool *cached) } static struct mv_cesa_op_ctx * +mv_cesa_dma_add_frag(struct mv_cesa_tdma_chain *chain, +struct mv_cesa_op_ctx *tmpl, unsigned int frag_len, +gfp_t flags) +{ + struct mv_cesa_op_ctx *op; + int ret; + + op = mv_cesa_dma_add_op(chain, tmpl, false, flags); + if (IS_ERR(op)) + return op; + + /* Set the operation block fragment length. 
*/ + mv_cesa_set_mac_op_frag_len(op, frag_len); + + /* Append dummy desc to launch operation */ + ret = mv_cesa_dma_add_dummy_launch(chain, flags); + if (ret) + return ERR_PTR(ret); + + return op; +} + +static struct mv_cesa_op_ctx * mv_cesa_ahash_dma_add_cache(struct mv_cesa_tdma_chain *chain, struct mv_cesa_ahash_dma_iter *dma_iter, struct mv_cesa_ahash_req *creq, @@ -493,18 +516,9 @@ mv_cesa_ahash_dma_add_cache(struct mv_cesa_tdma_chain *chain, if (ret) return ERR_PTR(ret); - if (!dma_iter->base.op_len) { - op = mv_cesa_dma_add_op(chain, >op_tmpl, false, flags); - if (IS_ERR(op)) - return op; - - mv_cesa_set_mac_op_frag_len(op, creq->cache_ptr); - - /* Add dummy desc to launch crypto operation */ - ret = mv_cesa_dma_add_dummy_launch(chain, flags); - if (ret) - return ERR_PTR(ret); - } + if (!dma_iter->base.op_len) + op = mv_cesa_dma_add_frag(chain, >op_tmpl, + creq->cache_ptr, flags); return op; } @@ -518,28 +532,22 @@ mv_cesa_ahash_dma_add_data(struct mv_cesa_tdma_chain *chain, struct mv_cesa_op_ctx *op; int ret; - op = mv_cesa_dma_add_op(chain, >op_tmpl, false, flags); + /* Add input transfers */ + ret = mv_cesa_dma_add_op_transfers(chain, _iter->base, + _iter->src, flags); + if (ret) + return ERR_PTR(ret); + + op = mv_cesa_dma_add_frag(chain, >op_tmpl, dma_iter->base.op_len, + flags); if (IS_ERR(op)) return op; - mv_cesa_set_mac_op_frag_len(op, dma_iter->base.op_len); - if (mv_cesa_mac_op_is_first_frag(>op_tmpl)) mv_cesa_update_op_cfg(>op_tmpl, CESA_SA_DESC_CFG_MID_FRAG, CESA_SA_DESC_CFG_FRAG_MSK); - /* Add input transfers */ - ret = mv_cesa_dma_add_op_transfers(chain, _iter->base, - _iter->src, flags); - if (ret) - return ERR_PTR(ret); - - /* Add dummy desc to launch crypto operation */ - ret = mv_cesa_dma_add_dummy_launch(chain, flags); - if (ret) - return ERR_PTR(ret); - return op; } @@ -603,12 +611,6 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain, CESA_SA_DESC_CFG_MID_FRAG, CESA_SA_DESC_CFG_FRAG_MSK); - op = mv_cesa_dma_add_op(chain, 
>op_tmpl, false, flags); - if (IS_ERR(op)) - return op; - - mv_cesa_set_mac_op_frag_len(op, trailerlen - padoff); - ret = mv_cesa_dma_add_data_transfer(chain, CESA_SA_DATA_SRAM_OFFSET, ahashdreq->padding_dma + @@ -619,12 +621,8 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain, if (ret) return ERR_PTR(ret); - /* Add dummy desc to launch crypto operation */ - ret = mv_cesa_dma_add_dummy_launch(chain, flags); - if (ret) - return ERR_PTR(ret); - - return op; + return mv_cesa_dma_add_frag(chain, >op_tmpl, trailerlen - padoff, + flags); } static int
[PATCH 12/18] crypto: marvell: ensure iter.base.op_len is the full op length
When we process the last request of data, and the request contains user data, the loop in mv_cesa_ahash_dma_req_init() marks the first data size as being iter.base.op_len which does not include the size of the cache data. This means we end up hashing an insufficient amount of data. Fix this by always including the cache size in the first operation length of any request. This has the effect that for a request containing no user data, iter.base.op_len === iter.src.op_offset === creq->cache_ptr As a result, we include one further change to use iter.base.op_len in the cache-but-no-user-data case to make the next change clearer. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index c4693321fdc8..b4da73d294af 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -27,10 +27,10 @@ mv_cesa_ahash_req_iter_init(struct mv_cesa_ahash_dma_iter *iter, struct ahash_request *req) { struct mv_cesa_ahash_req *creq = ahash_request_ctx(req); - unsigned int len = req->nbytes; + unsigned int len = req->nbytes + creq->cache_ptr; if (!creq->last_req) - len = (len + creq->cache_ptr) & ~CESA_HASH_BLOCK_SIZE_MSK; + len &= ~CESA_HASH_BLOCK_SIZE_MSK; mv_cesa_req_dma_iter_init(>base, len); mv_cesa_sg_dma_iter_init(>src, req->src, DMA_TO_DEVICE); @@ -646,10 +646,10 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) goto err_free_tdma; } } while (mv_cesa_ahash_req_iter_next_op()); - } else if (creq->cache_ptr) { + } else if (iter.base.op_len) { /* Account for the data that was in the cache. 
*/
 		op = mv_cesa_dma_add_frag(&chain, &creq->op_tmpl,
-					  creq->cache_ptr, flags);
+					  iter.base.op_len, flags);
 		if (IS_ERR(op)) {
 			ret = PTR_ERR(op);
 			goto err_free_tdma;
--
2.1.0
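The corrected length accounting can be checked in isolation. The block size and field names below are taken from the patch; the function itself is a hypothetical userspace model, not driver code:

```c
#include <stdbool.h>

#define CESA_HASH_BLOCK_SIZE	64u
#define CESA_HASH_BLOCK_MASK	(CESA_HASH_BLOCK_SIZE - 1)

/*
 * After the fix, the iterator length always includes the cached
 * left-over bytes; non-final requests are then rounded down to a whole
 * number of hash blocks, leaving the remainder for the cache.  For a
 * request with no user data, the result is exactly the cached amount
 * (or zero when it is less than one block and this is not the last
 * request).
 */
static unsigned int iter_len(unsigned int nbytes, unsigned int cache_ptr,
			     bool last_req)
{
	unsigned int len = nbytes + cache_ptr;

	if (!last_req)
		len &= ~CESA_HASH_BLOCK_MASK;
	return len;
}
```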
[PATCH 10/18] crypto: marvell: move mv_cesa_dma_add_frag() calls
Move the calls to mv_cesa_dma_add_frag() into the parent function, mv_cesa_ahash_dma_req_init(). This is in preparation to changing when we generate the operation blocks, as we need to avoid generating a block for a partial hash block at the end of the user data. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 71 ++- 1 file changed, 29 insertions(+), 42 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index f567243da005..787cec1715ed 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -499,51 +499,23 @@ mv_cesa_dma_add_frag(struct mv_cesa_tdma_chain *chain, return op; } -static struct mv_cesa_op_ctx * +static int mv_cesa_ahash_dma_add_cache(struct mv_cesa_tdma_chain *chain, struct mv_cesa_ahash_dma_iter *dma_iter, struct mv_cesa_ahash_req *creq, gfp_t flags) { struct mv_cesa_ahash_dma_req *ahashdreq = >req.dma; - struct mv_cesa_op_ctx *op = NULL; - int ret; if (!creq->cache_ptr) - return NULL; - - ret = mv_cesa_dma_add_data_transfer(chain, - CESA_SA_DATA_SRAM_OFFSET, - ahashdreq->cache_dma, - creq->cache_ptr, - CESA_TDMA_DST_IN_SRAM, - flags); - if (ret) - return ERR_PTR(ret); - - if (!dma_iter->base.op_len) - op = mv_cesa_dma_add_frag(chain, >op_tmpl, - creq->cache_ptr, flags); - - return op; -} - -static struct mv_cesa_op_ctx * -mv_cesa_ahash_dma_add_data(struct mv_cesa_tdma_chain *chain, - struct mv_cesa_ahash_dma_iter *dma_iter, - struct mv_cesa_ahash_req *creq, - gfp_t flags) -{ - int ret; - - /* Add input transfers */ - ret = mv_cesa_dma_add_op_transfers(chain, _iter->base, - _iter->src, flags); - if (ret) - return ERR_PTR(ret); + return 0; - return mv_cesa_dma_add_frag(chain, >op_tmpl, dma_iter->base.op_len, - flags); + return mv_cesa_dma_add_data_transfer(chain, +CESA_SA_DATA_SRAM_OFFSET, +ahashdreq->cache_dma, +creq->cache_ptr, +CESA_TDMA_DST_IN_SRAM, +flags); } static struct mv_cesa_op_ctx * @@ -647,19 +619,34 @@ static int 
mv_cesa_ahash_dma_req_init(struct ahash_request *req) mv_cesa_tdma_desc_iter_init(); mv_cesa_ahash_req_iter_init(, req); - op = mv_cesa_ahash_dma_add_cache(, , -creq, flags); - if (IS_ERR(op)) { - ret = PTR_ERR(op); + /* +* Add the cache (left-over data from a previous block) first. +* This will never overflow the SRAM size. +*/ + ret = mv_cesa_ahash_dma_add_cache(, , creq, flags); + if (ret) goto err_free_tdma; + + if (creq->cache_ptr && !iter.base.op_len) { + op = mv_cesa_dma_add_frag(, >op_tmpl, + creq->cache_ptr, flags); + if (IS_ERR(op)) { + ret = PTR_ERR(op); + goto err_free_tdma; + } } do { if (!iter.base.op_len) break; - op = mv_cesa_ahash_dma_add_data(, , - creq, flags); + ret = mv_cesa_dma_add_op_transfers(, , + , flags); + if (ret) + goto err_free_tdma; + + op = mv_cesa_dma_add_frag(, >op_tmpl, + iter.base.op_len, flags); if (IS_ERR(op)) { ret = PTR_ERR(op); goto err_free_tdma; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/18] crypto: marvell: rearrange last request handling
Move the test for the last request out of mv_cesa_ahash_dma_last_req() to its caller, and move the mv_cesa_dma_add_frag() down into this function. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 30 +++--- 1 file changed, 19 insertions(+), 11 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index dc8ab343ad9f..b8ed0478031a 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -522,15 +522,21 @@ static struct mv_cesa_op_ctx * mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain, struct mv_cesa_ahash_dma_iter *dma_iter, struct mv_cesa_ahash_req *creq, - struct mv_cesa_op_ctx *op, - gfp_t flags) + unsigned int frag_len, gfp_t flags) { struct mv_cesa_ahash_dma_req *ahashdreq = >req.dma; unsigned int len, trailerlen, padoff = 0; + struct mv_cesa_op_ctx *op; int ret; - if (!creq->last_req) - return op; + if (frag_len) { + op = mv_cesa_dma_add_frag(chain, >op_tmpl, frag_len, + flags); + if (IS_ERR(op)) + return op; + } else { + op = NULL; + } if (op && creq->len <= CESA_SA_DESC_MAC_SRC_TOTAL_LEN_MAX) { u32 frag = CESA_SA_DESC_CFG_NOT_FRAG; @@ -657,16 +663,18 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) frag_len = iter.base.op_len; } - if (frag_len) { + /* +* At this point, frag_len indicates whether we have any data +* outstanding which needs an operation. Queue up the final +* operation, which depends whether this is the final request. 
+	 */
+	if (creq->last_req)
+		op = mv_cesa_ahash_dma_last_req(&chain, &iter, creq, frag_len,
+						flags);
+	else if (frag_len)
 		op = mv_cesa_dma_add_frag(&chain, &creq->op_tmpl, frag_len,
 					  flags);

 	if (IS_ERR(op)) {
 		ret = PTR_ERR(op);
 		goto err_free_tdma;
--
2.1.0
-- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/18] crypto: marvell: fix the bit length endianness
The endianness of the bit length used in the final stage depends on the endianness of the algorithm - md5 hashes need it to be in little endian format, whereas SHA hashes need it in big endian format. Use the previously added algorithm endianness flag to control this. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index aa12274608ab..dc2c65b52d7a 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -179,7 +179,6 @@ static int mv_cesa_ahash_pad_len(struct mv_cesa_ahash_req *creq) static int mv_cesa_ahash_pad_req(struct mv_cesa_ahash_req *creq, u8 *buf) { - __be64 bits = cpu_to_be64(creq->len << 3); unsigned int index, padlen; buf[0] = 0x80; @@ -187,7 +186,14 @@ static int mv_cesa_ahash_pad_req(struct mv_cesa_ahash_req *creq, u8 *buf) index = creq->len & CESA_HASH_BLOCK_SIZE_MSK; padlen = mv_cesa_ahash_pad_len(creq); memset(buf + 1, 0, padlen - 1); - memcpy(buf + padlen, , sizeof(bits)); + + if (creq->algo_le) { + __le64 bits = cpu_to_le64(creq->len << 3); + memcpy(buf + padlen, , sizeof(bits)); + } else { + __be64 bits = cpu_to_be64(creq->len << 3); + memcpy(buf + padlen, , sizeof(bits)); + } return padlen + 8; } -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
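The two padding formats can be modelled in userspace C. This is a sketch of the logic in mv_cesa_ahash_pad_req(), not the driver function itself: the byte-shift loop stands in for cpu_to_le64()/cpu_to_be64(), which are not available outside the kernel:

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64u

/*
 * Model of the pad block: 0x80 marker, zero fill, then the message
 * length in bits appended in the algorithm's endianness -- little
 * endian for MD5, big endian for SHA.  Returns the number of pad bytes
 * written (padlen + 8), as the kernel helper does.
 */
static unsigned int pad_req(uint64_t msg_len, int algo_le, uint8_t *buf)
{
	unsigned int index = msg_len & (BLOCK_SIZE - 1);
	unsigned int padlen = index < 56 ? 56 - index : 120 - index;
	uint64_t bits = msg_len << 3;
	unsigned int i;

	buf[0] = 0x80;
	memset(buf + 1, 0, padlen - 1);
	for (i = 0; i < 8; i++) {
		unsigned int shift = algo_le ? 8 * i : 8 * (7 - i);
		buf[padlen + i] = (uint8_t)(bits >> shift);
	}
	return padlen + 8;
}
```

For a 3-byte message the bit length is 24: little-endian padding puts 24 in the first of the eight length bytes, big-endian in the last, which is exactly the difference the flag controls.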
[PATCH 03/18] crypto: marvell: add flag to determine algorithm endianness
Rather than determining whether we're using a MD5 hash by looking at
the digest size, switch to a cleaner solution using a per-request flag
initialised by the method type.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/marvell/cesa.h |  1 +
 drivers/crypto/marvell/hash.c | 17 +++++++++--------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index e19302c9dec9..5d5b66ea2ceb 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -612,6 +612,7 @@ struct mv_cesa_ahash_req {
 	u64 len;
 	int src_nents;
 	bool last_req;
+	bool algo_le;
 	u32 state[8];
 };
 
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index f59faabcd34f..aa12274608ab 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -351,7 +351,7 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 		 * Hardware's MD5 digest is in little endian format, but
 		 * SHA in big endian format
 		 */
-		if (digsize == MD5_DIGEST_SIZE) {
+		if (creq->algo_le) {
 			__le32 *result = (void *)ahashreq->result;
 
 			for (i = 0; i < digsize / 4; i++)
@@ -407,7 +407,7 @@ static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
 };
 
 static int mv_cesa_ahash_init(struct ahash_request *req,
-			      struct mv_cesa_op_ctx *tmpl)
+			      struct mv_cesa_op_ctx *tmpl, bool algo_le)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
@@ -421,6 +421,7 @@ static int mv_cesa_ahash_init(struct ahash_request *req,
 	mv_cesa_set_mac_op_frag_len(tmpl, 0);
 	creq->op_tmpl = *tmpl;
 	creq->len = 0;
+	creq->algo_le = algo_le;
 
 	return 0;
 }
@@ -861,7 +862,7 @@ static int mv_cesa_md5_init(struct ahash_request *req)
 
 	mv_cesa_set_op_cfg(&tmpl, CESA_SA_DESC_CFG_MACM_MD5);
 
-	mv_cesa_ahash_init(req, &tmpl);
+	mv_cesa_ahash_init(req, &tmpl, true);
 
 	return 0;
 }
@@ -924,7 +925,7 @@ static int mv_cesa_sha1_init(struct ahash_request *req)
 
 	mv_cesa_set_op_cfg(&tmpl, CESA_SA_DESC_CFG_MACM_SHA1);
 
-	mv_cesa_ahash_init(req, &tmpl);
+	mv_cesa_ahash_init(req, &tmpl, false);
 
 	return 0;
 }
@@ -987,7 +988,7 @@ static int mv_cesa_sha256_init(struct ahash_request *req)
 
 	mv_cesa_set_op_cfg(&tmpl, CESA_SA_DESC_CFG_MACM_SHA256);
 
-	mv_cesa_ahash_init(req, &tmpl);
+	mv_cesa_ahash_init(req, &tmpl, false);
 
 	return 0;
 }
@@ -1218,7 +1219,7 @@ static int mv_cesa_ahmac_md5_init(struct ahash_request *req)
 	mv_cesa_set_op_cfg(&tmpl, CESA_SA_DESC_CFG_MACM_HMAC_MD5);
 	memcpy(tmpl.ctx.hash.iv, ctx->iv, sizeof(ctx->iv));
 
-	mv_cesa_ahash_init(req, &tmpl);
+	mv_cesa_ahash_init(req, &tmpl, true);
 
 	return 0;
 }
@@ -1288,7 +1289,7 @@ static int mv_cesa_ahmac_sha1_init(struct ahash_request *req)
 	mv_cesa_set_op_cfg(&tmpl, CESA_SA_DESC_CFG_MACM_HMAC_SHA1);
 	memcpy(tmpl.ctx.hash.iv, ctx->iv, sizeof(ctx->iv));
 
-	mv_cesa_ahash_init(req, &tmpl);
+	mv_cesa_ahash_init(req, &tmpl, false);
 
 	return 0;
 }
@@ -1378,7 +1379,7 @@ static int mv_cesa_ahmac_sha256_init(struct ahash_request *req)
 	mv_cesa_set_op_cfg(&tmpl, CESA_SA_DESC_CFG_MACM_HMAC_SHA256);
 	memcpy(tmpl.ctx.hash.iv, ctx->iv, sizeof(ctx->iv));
 
-	mv_cesa_ahash_init(req, &tmpl);
+	mv_cesa_ahash_init(req, &tmpl, false);
 
 	return 0;
 }
-- 
2.1.0
[PATCH 02/18] crypto: marvell: keep creq->state in CPU endian format at all times
Currently, we read/write the state in CPU endian, but on the final
request, we convert its endian according to the requested algorithm.
(md5 is little endian, SHA are big endian.)

Always keep creq->state in CPU native endian format, and perform the
necessary conversion when copying the hash to the result.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/marvell/cesa.h |  2 +-
 drivers/crypto/marvell/hash.c | 25 ++++++++++++++-----------
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index bc2a55bc35e4..e19302c9dec9 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -612,7 +612,7 @@ struct mv_cesa_ahash_req {
 	u64 len;
 	int src_nents;
 	bool last_req;
-	__be32 state[8];
+	u32 state[8];
 };
 
 /* CESA functions */
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index d9cc49e9c43b..f59faabcd34f 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -347,18 +347,21 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 				   ahashreq->nbytes - creq->cache_ptr);
 
 	if (creq->last_req) {
-		for (i = 0; i < digsize / 4; i++) {
-			/*
-			 * Hardware provides MD5 digest in a different
-			 * endianness than SHA-1 and SHA-256 ones.
-			 */
-			if (digsize == MD5_DIGEST_SIZE)
-				creq->state[i] = cpu_to_le32(creq->state[i]);
-			else
-				creq->state[i] = cpu_to_be32(creq->state[i]);
-		}
+		/*
+		 * Hardware's MD5 digest is in little endian format, but
+		 * SHA in big endian format
+		 */
+		if (digsize == MD5_DIGEST_SIZE) {
+			__le32 *result = (void *)ahashreq->result;
 
-		memcpy(ahashreq->result, creq->state, digsize);
+			for (i = 0; i < digsize / 4; i++)
+				result[i] = cpu_to_le32(creq->state[i]);
+		} else {
+			__be32 *result = (void *)ahashreq->result;
+
+			for (i = 0; i < digsize / 4; i++)
+				result[i] = cpu_to_be32(creq->state[i]);
+		}
 	}
 
 	return ret;
-- 
2.1.0
[PATCH 05/18] crypto: marvell: ensure template operation is initialised
Ensure that the template operation is fully initialised, otherwise we end up loading data from the kernel stack into the engines, which can upset the hash results. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index dc2c65b52d7a..82d9e3d09331 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -864,7 +864,7 @@ static int mv_cesa_ahash_import(struct ahash_request *req, const void *hash, static int mv_cesa_md5_init(struct ahash_request *req) { - struct mv_cesa_op_ctx tmpl; + struct mv_cesa_op_ctx tmpl = { }; mv_cesa_set_op_cfg(, CESA_SA_DESC_CFG_MACM_MD5); @@ -927,7 +927,7 @@ struct ahash_alg mv_md5_alg = { static int mv_cesa_sha1_init(struct ahash_request *req) { - struct mv_cesa_op_ctx tmpl; + struct mv_cesa_op_ctx tmpl = { }; mv_cesa_set_op_cfg(, CESA_SA_DESC_CFG_MACM_SHA1); @@ -990,7 +990,7 @@ struct ahash_alg mv_sha1_alg = { static int mv_cesa_sha256_init(struct ahash_request *req) { - struct mv_cesa_op_ctx tmpl; + struct mv_cesa_op_ctx tmpl = { }; mv_cesa_set_op_cfg(, CESA_SA_DESC_CFG_MACM_SHA256); @@ -1220,7 +1220,7 @@ static int mv_cesa_ahmac_cra_init(struct crypto_tfm *tfm) static int mv_cesa_ahmac_md5_init(struct ahash_request *req) { struct mv_cesa_hmac_ctx *ctx = crypto_tfm_ctx(req->base.tfm); - struct mv_cesa_op_ctx tmpl; + struct mv_cesa_op_ctx tmpl = { }; mv_cesa_set_op_cfg(, CESA_SA_DESC_CFG_MACM_HMAC_MD5); memcpy(tmpl.ctx.hash.iv, ctx->iv, sizeof(ctx->iv)); @@ -1290,7 +1290,7 @@ struct ahash_alg mv_ahmac_md5_alg = { static int mv_cesa_ahmac_sha1_init(struct ahash_request *req) { struct mv_cesa_hmac_ctx *ctx = crypto_tfm_ctx(req->base.tfm); - struct mv_cesa_op_ctx tmpl; + struct mv_cesa_op_ctx tmpl = { }; mv_cesa_set_op_cfg(, CESA_SA_DESC_CFG_MACM_HMAC_SHA1); memcpy(tmpl.ctx.hash.iv, ctx->iv, sizeof(ctx->iv)); @@ -1380,7 +1380,7 @@ static int 
mv_cesa_ahmac_sha256_setkey(struct crypto_ahash *tfm, const u8 *key, static int mv_cesa_ahmac_sha256_init(struct ahash_request *req) { struct mv_cesa_hmac_ctx *ctx = crypto_tfm_ctx(req->base.tfm); - struct mv_cesa_op_ctx tmpl; + struct mv_cesa_op_ctx tmpl = { }; mv_cesa_set_op_cfg(, CESA_SA_DESC_CFG_MACM_HMAC_SHA256); memcpy(tmpl.ctx.hash.iv, ctx->iv, sizeof(ctx->iv)); -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/18] crypto: marvell: always ensure mid-fragments after first-fragment
If we add a template first-fragment operation, always update the template to be a mid-fragment. This ensures that mid-fragments always follow on from a first fragment in every case. This means we can move the first to mid-fragment update code out of mv_cesa_ahash_dma_add_data(). Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index 8111e73ca848..f567243da005 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -491,6 +491,11 @@ mv_cesa_dma_add_frag(struct mv_cesa_tdma_chain *chain, if (ret) return ERR_PTR(ret); + if (mv_cesa_mac_op_is_first_frag(tmpl)) + mv_cesa_update_op_cfg(tmpl, + CESA_SA_DESC_CFG_MID_FRAG, + CESA_SA_DESC_CFG_FRAG_MSK); + return op; } @@ -529,7 +534,6 @@ mv_cesa_ahash_dma_add_data(struct mv_cesa_tdma_chain *chain, struct mv_cesa_ahash_req *creq, gfp_t flags) { - struct mv_cesa_op_ctx *op; int ret; /* Add input transfers */ @@ -538,17 +542,8 @@ mv_cesa_ahash_dma_add_data(struct mv_cesa_tdma_chain *chain, if (ret) return ERR_PTR(ret); - op = mv_cesa_dma_add_frag(chain, >op_tmpl, dma_iter->base.op_len, - flags); - if (IS_ERR(op)) - return op; - - if (mv_cesa_mac_op_is_first_frag(>op_tmpl)) - mv_cesa_update_op_cfg(>op_tmpl, - CESA_SA_DESC_CFG_MID_FRAG, - CESA_SA_DESC_CFG_FRAG_MSK); - - return op; + return mv_cesa_dma_add_frag(chain, >op_tmpl, dma_iter->base.op_len, + flags); } static struct mv_cesa_op_ctx * -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/18] crypto: marvell: avoid adding final operation within loop
Avoid adding the final operation within the loop, but instead add it outside. We combine this with the handling for the no-data case. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index b4da73d294af..dc8ab343ad9f 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -602,6 +602,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) struct mv_cesa_tdma_chain chain; struct mv_cesa_ahash_dma_iter iter; struct mv_cesa_op_ctx *op = NULL; + unsigned int frag_len; int ret; dreq->chain.first = NULL; @@ -631,25 +632,34 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) /* * Add all the new data, inserting an operation block and * launch command between each full SRAM block-worth of -* data. +* data. We intentionally do not add the final op block. */ - do { + while (true) { ret = mv_cesa_dma_add_op_transfers(, , , flags); if (ret) goto err_free_tdma; + frag_len = iter.base.op_len; + + if (!mv_cesa_ahash_req_iter_next_op()) + break; + op = mv_cesa_dma_add_frag(, >op_tmpl, - iter.base.op_len, flags); + frag_len, flags); if (IS_ERR(op)) { ret = PTR_ERR(op); goto err_free_tdma; } - } while (mv_cesa_ahash_req_iter_next_op()); - } else if (iter.base.op_len) { + } + } else { /* Account for the data that was in the cache. */ - op = mv_cesa_dma_add_frag(, >op_tmpl, - iter.base.op_len, flags); + frag_len = iter.base.op_len; + } + + if (frag_len) { + op = mv_cesa_dma_add_frag(, >op_tmpl, frag_len, + flags); if (IS_ERR(op)) { ret = PTR_ERR(op); goto err_free_tdma; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/18] crypto: marvell: factor out first fragment decisions to helper
Multiple locations in the driver test the operation context fragment type, checking whether it is a first fragment or not. Introduce a mv_cesa_mac_op_is_first_frag() helper, which returns true if the fragment operation is for a first fragment. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/cesa.h | 6 ++ drivers/crypto/marvell/hash.c | 9 +++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h index cd646d7c14e2..e9f732138ba3 100644 --- a/drivers/crypto/marvell/cesa.h +++ b/drivers/crypto/marvell/cesa.h @@ -686,6 +686,12 @@ static inline u32 mv_cesa_get_int_mask(struct mv_cesa_engine *engine) return engine->int_mask; } +static inline bool mv_cesa_mac_op_is_first_frag(const struct mv_cesa_op_ctx *op) +{ + return (mv_cesa_get_op_cfg(op) & CESA_SA_DESC_CFG_FRAG_MSK) == + CESA_SA_DESC_CFG_FIRST_FRAG; +} + int mv_cesa_queue_req(struct crypto_async_request *req); /* diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index 82d9e3d09331..938ecfeb8ffe 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -524,8 +524,7 @@ mv_cesa_ahash_dma_add_data(struct mv_cesa_tdma_chain *chain, mv_cesa_set_mac_op_frag_len(op, dma_iter->base.op_len); - if ((mv_cesa_get_op_cfg(>op_tmpl) & CESA_SA_DESC_CFG_FRAG_MSK) == - CESA_SA_DESC_CFG_FIRST_FRAG) + if (mv_cesa_mac_op_is_first_frag(>op_tmpl)) mv_cesa_update_op_cfg(>op_tmpl, CESA_SA_DESC_CFG_MID_FRAG, CESA_SA_DESC_CFG_FRAG_MSK); @@ -561,8 +560,7 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain, if (op && creq->len <= CESA_SA_DESC_MAC_SRC_TOTAL_LEN_MAX) { u32 frag = CESA_SA_DESC_CFG_NOT_FRAG; - if ((mv_cesa_get_op_cfg(op) & CESA_SA_DESC_CFG_FRAG_MSK) != - CESA_SA_DESC_CFG_FIRST_FRAG) + if (!mv_cesa_mac_op_is_first_frag(op)) frag = CESA_SA_DESC_CFG_LAST_FRAG; mv_cesa_update_op_cfg(op, frag, CESA_SA_DESC_CFG_FRAG_MSK); @@ -600,8 +598,7 @@ 
mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain, if (padoff >= trailerlen) return op; - if ((mv_cesa_get_op_cfg(>op_tmpl) & CESA_SA_DESC_CFG_FRAG_MSK) != - CESA_SA_DESC_CFG_FIRST_FRAG) + if (!mv_cesa_mac_op_is_first_frag(>op_tmpl)) mv_cesa_update_op_cfg(>op_tmpl, CESA_SA_DESC_CFG_MID_FRAG, CESA_SA_DESC_CFG_FRAG_MSK); -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 17/18] crypto: marvell: fix first-fragment handling in mv_cesa_ahash_dma_last_req()
When adding the software padding, this must be done using the first/mid
fragment mode, and any subsequent operation needs to be a mid-fragment.
Fix this.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/marvell/hash.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index da541e59cc1d..34271e9eb3a5 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -575,20 +575,12 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
 		if (IS_ERR(op))
 			return op;
 
-		mv_cesa_update_op_cfg(op, CESA_SA_DESC_CFG_MID_FRAG,
-				      CESA_SA_DESC_CFG_FRAG_MSK);
-
 		if (len == trailerlen)
 			return op;
 
 		padoff += len;
 	}
 
-	if (!mv_cesa_mac_op_is_first_frag(&creq->op_tmpl))
-		mv_cesa_update_op_cfg(&creq->op_tmpl,
-				      CESA_SA_DESC_CFG_MID_FRAG,
-				      CESA_SA_DESC_CFG_FRAG_MSK);
-
 	ret = mv_cesa_dma_add_data_transfer(chain,
 					    CESA_SA_DATA_SRAM_OFFSET,
 					    ahashdreq->padding_dma +
-- 
2.1.0
[PATCH 06/18] crypto: marvell: const-ify argument to mv_cesa_get_op_cfg()
mv_cesa_get_op_cfg() does not write to its argument, it only reads.
So, let's make it const.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/marvell/cesa.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index 5d5b66ea2ceb..cd646d7c14e2 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -627,7 +627,7 @@ static inline void mv_cesa_update_op_cfg(struct mv_cesa_op_ctx *op,
 	op->desc.config |= cpu_to_le32(cfg);
 }
 
-static inline u32 mv_cesa_get_op_cfg(struct mv_cesa_op_ctx *op)
+static inline u32 mv_cesa_get_op_cfg(const struct mv_cesa_op_ctx *op)
 {
 	return le32_to_cpu(op->desc.config);
 }
-- 
2.1.0
[PATCH 11/18] crypto: marvell: use presence of scatterlist to determine data load
Use the presence of the scatterlist to determine whether we should load any new user data to the engine. The following shall always be true at this point: iter.base.op_len == 0 === iter.src.sg In doing so, we can: 1. eliminate the test for iter.base.op_len inside the loop, which makes the loop operation more obvious and understandable. 2. move the operation generation for the cache-only case. This prepares the code for the next step in its transformation, and also uncovers a bug that will be fixed in the next patch. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 39 +-- 1 file changed, 21 insertions(+), 18 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index 787cec1715ed..c4693321fdc8 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -627,7 +627,27 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) if (ret) goto err_free_tdma; - if (creq->cache_ptr && !iter.base.op_len) { + if (iter.src.sg) { + /* +* Add all the new data, inserting an operation block and +* launch command between each full SRAM block-worth of +* data. +*/ + do { + ret = mv_cesa_dma_add_op_transfers(, , + , flags); + if (ret) + goto err_free_tdma; + + op = mv_cesa_dma_add_frag(, >op_tmpl, + iter.base.op_len, flags); + if (IS_ERR(op)) { + ret = PTR_ERR(op); + goto err_free_tdma; + } + } while (mv_cesa_ahash_req_iter_next_op()); + } else if (creq->cache_ptr) { + /* Account for the data that was in the cache. 
*/ op = mv_cesa_dma_add_frag(, >op_tmpl, creq->cache_ptr, flags); if (IS_ERR(op)) { @@ -636,23 +656,6 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) } } - do { - if (!iter.base.op_len) - break; - - ret = mv_cesa_dma_add_op_transfers(, , - , flags); - if (ret) - goto err_free_tdma; - - op = mv_cesa_dma_add_frag(, >op_tmpl, - iter.base.op_len, flags); - if (IS_ERR(op)) { - ret = PTR_ERR(op); - goto err_free_tdma; - } - } while (mv_cesa_ahash_req_iter_next_op()); - op = mv_cesa_ahash_dma_last_req(, , creq, op, flags); if (IS_ERR(op)) { ret = PTR_ERR(op); -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 15/18] crypto: marvell: rearrange handling for hw finished hashes
Rearrange the last request handling for hardware finished hashes by moving the generation of the fragment operation into this path. This results in a simplified sequence to handle this case, and allows us to move the software padded case further down into the function. Add comments describing these parts. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 35 --- 1 file changed, 24 insertions(+), 11 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index b8ed0478031a..d2265beaaa6b 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -529,32 +529,45 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain, struct mv_cesa_op_ctx *op; int ret; - if (frag_len) { + /* +* If the transfer is smaller than our maximum length, and we have +* some data outstanding, we can ask the engine to finish the hash. +*/ + if (creq->len <= CESA_SA_DESC_MAC_SRC_TOTAL_LEN_MAX && frag_len) { op = mv_cesa_dma_add_frag(chain, >op_tmpl, frag_len, flags); if (IS_ERR(op)) return op; - } else { - op = NULL; - } - - if (op && creq->len <= CESA_SA_DESC_MAC_SRC_TOTAL_LEN_MAX) { - u32 frag = CESA_SA_DESC_CFG_NOT_FRAG; - - if (!mv_cesa_mac_op_is_first_frag(op)) - frag = CESA_SA_DESC_CFG_LAST_FRAG; - mv_cesa_update_op_cfg(op, frag, CESA_SA_DESC_CFG_FRAG_MSK); + mv_cesa_set_mac_op_total_len(op, creq->len); + mv_cesa_update_op_cfg(op, mv_cesa_mac_op_is_first_frag(op) ? + CESA_SA_DESC_CFG_NOT_FRAG : + CESA_SA_DESC_CFG_LAST_FRAG, + CESA_SA_DESC_CFG_FRAG_MSK); return op; } + /* +* The request is longer than the engine can handle, or we have +* no data outstanding. Manually generate the padding, adding it +* as a "mid" fragment. 
+*/ ret = mv_cesa_ahash_dma_alloc_padding(ahashdreq, flags); if (ret) return ERR_PTR(ret); trailerlen = mv_cesa_ahash_pad_req(creq, ahashdreq->padding); + if (frag_len) { + op = mv_cesa_dma_add_frag(chain, >op_tmpl, frag_len, + flags); + if (IS_ERR(op)) + return op; + } else { + op = NULL; + } + if (op) { len = min(CESA_SA_SRAM_PAYLOAD_SIZE - dma_iter->base.op_len, trailerlen); -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 18/18] crypto: marvell/cesa: fix memory leak
From: Boris Brezillon <boris.brezil...@free-electrons.com> To: Boris Brezillon <boris.brezil...@free-electrons.com>,Arnaud Ebalard <a...@natisbad.org>,Thomas Petazzoni <thomas.petazz...@free-electrons.com>,Jason Cooper <ja...@lakedaemon.net> The local chain variable is not cleaned up if an error occurs in the middle of DMA chain creation. Fix that by dropping the local chain variable and using the dreq->chain field which will be cleaned up by mv_cesa_dma_cleanup() in case of errors. Signed-off-by: Boris Brezillon <boris.brezil...@free-electrons.com> Reported-by: Thomas Petazzoni <thomas.petazz...@free-electrons.com> Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index 34271e9eb3a5..7cd0f0decf6c 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -602,7 +602,6 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) GFP_KERNEL : GFP_ATOMIC; struct mv_cesa_ahash_dma_req *ahashdreq = >req.dma; struct mv_cesa_tdma_req *dreq = >base; - struct mv_cesa_tdma_chain chain; struct mv_cesa_ahash_dma_iter iter; struct mv_cesa_op_ctx *op = NULL; unsigned int frag_len; @@ -620,14 +619,14 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) } } - mv_cesa_tdma_desc_iter_init(); + mv_cesa_tdma_desc_iter_init(>chain); mv_cesa_ahash_req_iter_init(, req); /* * Add the cache (left-over data from a previous block) first. * This will never overflow the SRAM size. */ - ret = mv_cesa_ahash_dma_add_cache(, , creq, flags); + ret = mv_cesa_ahash_dma_add_cache(>chain, , creq, flags); if (ret) goto err_free_tdma; @@ -638,7 +637,8 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) * data. We intentionally do not add the final op block. 
*/ while (true) { - ret = mv_cesa_dma_add_op_transfers(, , + ret = mv_cesa_dma_add_op_transfers(>chain, + , , flags); if (ret) goto err_free_tdma; @@ -648,7 +648,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) if (!mv_cesa_ahash_req_iter_next_op()) break; - op = mv_cesa_dma_add_frag(, >op_tmpl, + op = mv_cesa_dma_add_frag(>chain, >op_tmpl, frag_len, flags); if (IS_ERR(op)) { ret = PTR_ERR(op); @@ -666,11 +666,11 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) * operation, which depends whether this is the final request. */ if (creq->last_req) - op = mv_cesa_ahash_dma_last_req(, , creq, frag_len, - flags); + op = mv_cesa_ahash_dma_last_req(>chain, , creq, + frag_len, flags); else if (frag_len) - op = mv_cesa_dma_add_frag(, >op_tmpl, frag_len, - flags); + op = mv_cesa_dma_add_frag(>chain, >op_tmpl, + frag_len, flags); if (IS_ERR(op)) { ret = PTR_ERR(op); @@ -679,7 +679,7 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) if (op) { /* Add dummy desc to wait for crypto operation end */ - ret = mv_cesa_dma_add_dummy_end(, flags); + ret = mv_cesa_dma_add_dummy_end(>chain, flags); if (ret) goto err_free_tdma; } @@ -690,8 +690,6 @@ static int mv_cesa_ahash_dma_req_init(struct ahash_request *req) else creq->cache_ptr = 0; - dreq->chain = chain; - return 0; err_free_tdma: -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 16/18] crypto: marvell: rearrange handling for sw padded hashes
Rearrange the last request handling for hashes which require software padding. We prepare the padding to be appended, and then append as much of the padding to any existing data that's already queued up, adding an operation block and launching the operation. Any remainder is then appended as a separate operation. This ensures that the hardware only ever sees multiples of the hash block size to be operated on for software padded hashes, thus ensuring that the engine always indicates that it has finished the calculation. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/hash.c | 44 ++- 1 file changed, 18 insertions(+), 26 deletions(-) diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index d2265beaaa6b..da541e59cc1d 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -559,38 +559,30 @@ mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain, trailerlen = mv_cesa_ahash_pad_req(creq, ahashdreq->padding); - if (frag_len) { - op = mv_cesa_dma_add_frag(chain, >op_tmpl, frag_len, - flags); - if (IS_ERR(op)) - return op; - } else { - op = NULL; - } - - if (op) { - len = min(CESA_SA_SRAM_PAYLOAD_SIZE - dma_iter->base.op_len, - trailerlen); - if (len) { - ret = mv_cesa_dma_add_data_transfer(chain, + len = min(CESA_SA_SRAM_PAYLOAD_SIZE - frag_len, trailerlen); + if (len) { + ret = mv_cesa_dma_add_data_transfer(chain, CESA_SA_DATA_SRAM_OFFSET + - dma_iter->base.op_len, + frag_len, ahashdreq->padding_dma, len, CESA_TDMA_DST_IN_SRAM, flags); - if (ret) - return ERR_PTR(ret); + if (ret) + return ERR_PTR(ret); - mv_cesa_update_op_cfg(op, CESA_SA_DESC_CFG_MID_FRAG, - CESA_SA_DESC_CFG_FRAG_MSK); - mv_cesa_set_mac_op_frag_len(op, - dma_iter->base.op_len + len); - padoff += len; - } - } + op = mv_cesa_dma_add_frag(chain, >op_tmpl, frag_len + len, + flags); + if (IS_ERR(op)) + return op; - if (padoff >= trailerlen) - return op; + mv_cesa_update_op_cfg(op, CESA_SA_DESC_CFG_MID_FRAG, + 
CESA_SA_DESC_CFG_FRAG_MSK); + + if (len == trailerlen) + return op; + + padoff += len; + } if (!mv_cesa_mac_op_is_first_frag(>op_tmpl)) mv_cesa_update_op_cfg(>op_tmpl, -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/6] Fix CAAM hash driver
The following series fixes the CAAM hash driver, allowing it to work with the previously merged "crypto: ahash - ensure statesize is non- zero" patch. This is non-trivial, because CAAM exports a huge 1600 bytes of data, which, if we set .statesize to this, still results in the core code rejecting the driver. So, we need to shrink the amount of state exported. The first, most obvious one to get rid of is the export of the caam_hash_ctx structure, which is shared between the socket being exported from and imported to - copying it away and back again was a complete no-op. The second is that we don't need to export both pending-bytes buffers. Only one will be in use at any time. A problem was encountered while testing, where the size of the pending bytes buffer was not added to the scatterlist with the correct length. This is also fixed in this series, by patch 3. This bug was introduced by a prior commit trying to fix a tcrypt error. However, the change is wrong, but I have to question whether the test was correct or not - the backtrace contains a function "test_ahash_pnum" which doesn't seem to exist in mainline, nor does it seem to exist in any previous mainline kernel. Version 2 of this series addresses a mismerge in patch 5 of the original series, and adds further information to patch 3. Further testing with tcrypt showed up a problem identified by the DMA API debugging (different to the original one referred to in patch 3) where we leak DMA mappings. drivers/crypto/caam/caamhash.c | 101 + 1 file changed, 71 insertions(+), 30 deletions(-) -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/6] crypto: caam: print errno code when hash registration fails
Print the errno code when hash registration fails, so we know why the
failure occurred.  This aids debugging.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index 94433b9fc200..2faf71ccbd43 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -1952,8 +1952,9 @@ static int __init caam_algapi_hash_init(void)
 
 		err = crypto_register_ahash(&t_alg->ahash_alg);
 		if (err) {
-			pr_warn("%s alg registration failed\n",
-				t_alg->ahash_alg.halg.base.cra_driver_name);
+			pr_warn("%s alg registration failed: %d\n",
+				t_alg->ahash_alg.halg.base.cra_driver_name,
+				err);
 			kfree(t_alg);
 		} else
 			list_add_tail(&t_alg->entry, &hash_list);
@@ -1968,8 +1969,9 @@ static int __init caam_algapi_hash_init(void)
 
 		err = crypto_register_ahash(&t_alg->ahash_alg);
 		if (err) {
-			pr_warn("%s alg registration failed\n",
-				t_alg->ahash_alg.halg.base.cra_driver_name);
+			pr_warn("%s alg registration failed: %d\n",
+				t_alg->ahash_alg.halg.base.cra_driver_name,
+				err);
 			kfree(t_alg);
 		} else
 			list_add_tail(&t_alg->entry, &hash_list);
-- 
2.1.0
[PATCH 3/6] crypto: caam: fix non-block aligned hash calculation
caam does not properly calculate the size of the retained state when non-block aligned hashes are requested - it uses the wrong buffer sizes, which results in errors such as: caam_jr 2102000.jr1: 4501: DECO: desc idx 5: SGT Length Error. The descriptor is trying to read more data than is contained in the SGT table. We end up here with: in_len 0x46 blocksize 0x40 last_bufsize 0x0 next_bufsize 0x6 to_hash 0x40 ctx_len 0x28 nbytes 0x20 which results in a job descriptor of: jobdesc@889: ed03d918: b0861c08 3daa0080 f140 3d03d938 jobdesc@889: ed03d928: 0068 f840 3cde2a40 0028 where the word at 0xed03d928 is the expected data size (0x68), and a scatterlist containing: sg@892: ed03d938: 3cde2a40 0028 sg@892: ed03d948: 3d03d100 0006 sg@892: ed03d958: 7e8aa700 4020 0x68 comes from 0x28 (the context size) plus the "in_len" rounded down to a block size (0x40). in_len comes from 0x26 bytes of unhashed data from the previous operation, plus the 0x20 bytes from the latest operation. The fixed version would create: sg@892: ed03d938: 3cde2a40 0028 sg@892: ed03d948: 3d03d100 0026 sg@892: ed03d958: 7e8aa700 4020 which replaces the 0x06 length with the correct 0x26 bytes of previously unhashed data. This fixes a previous commit which erroneously "fixed" this due to a DMA-API bug report; that commit indicates that the bug was caused via a test_ahash_pnum() function in the tcrypt module. No such function has ever existed in the mainline kernel. Given that the change in this commit has been tested with DMA API debug enabled and shows no issue, I can only conclude that test_ahash_pnum() was triggering that bad behaviour by CAAM. 
Fixes: 7d5196aba3c8 ("crypto: caam - Correct DMA unmap size in ahash_update_ctx()") Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 3ce6083c2e43..dcee360065f3 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -829,7 +829,7 @@ static int ahash_update_ctx(struct ahash_request *req) state->buf_dma = try_buf_map_to_sec4_sg(jrdev, edesc->sec4_sg + 1, buf, state->buf_dma, - *next_buflen, *buflen); + *buflen, last_buflen); if (src_nents) { src_map_to_sec4_sg(jrdev, req->src, src_nents, -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6] crypto: caam: avoid needlessly saving and restoring caam_hash_ctx
When exporting and importing the hash state, we will only export and import into hashes which share the same struct crypto_ahash pointer. (See hash_accept->af_alg_accept->hash_accept_parent.) This means that saving the caam_hash_ctx structure on export, and restoring it on import is a waste of resources. So, remove this code. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 2faf71ccbd43..3ce6083c2e43 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -1575,24 +1575,20 @@ static int ahash_final(struct ahash_request *req) static int ahash_export(struct ahash_request *req, void *out) { struct crypto_ahash *ahash = crypto_ahash_reqtfm(req); - struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash); struct caam_hash_state *state = ahash_request_ctx(req); - memcpy(out, ctx, sizeof(struct caam_hash_ctx)); - memcpy(out + sizeof(struct caam_hash_ctx), state, - sizeof(struct caam_hash_state)); + memcpy(out, state, sizeof(struct caam_hash_state)); + return 0; } static int ahash_import(struct ahash_request *req, const void *in) { struct crypto_ahash *ahash = crypto_ahash_reqtfm(req); - struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash); struct caam_hash_state *state = ahash_request_ctx(req); - memcpy(ctx, in, sizeof(struct caam_hash_ctx)); - memcpy(state, in + sizeof(struct caam_hash_ctx), - sizeof(struct caam_hash_state)); + memcpy(state, in, sizeof(struct caam_hash_state)); + return 0; } -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/6] crypto: caam: fix indentation of close braces
The kernel's coding style suggests that closing braces for initialisers should not be aligned to the open brace column. The CodingStyle doc shows how this should be done. Remove the additional tab. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index 7dd80b0b3f51..f30c93840bba 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -1653,8 +1653,8 @@ static struct caam_hash_template driver_hash[] = { .halg = { .digestsize = SHA1_DIGEST_SIZE, .statesize = sizeof(struct caam_export_state), - }, }, + }, .alg_type = OP_ALG_ALGSEL_SHA1, .alg_op = OP_ALG_ALGSEL_SHA1 | OP_ALG_AAI_HMAC, }, { @@ -1675,8 +1675,8 @@ static struct caam_hash_template driver_hash[] = { .halg = { .digestsize = SHA224_DIGEST_SIZE, .statesize = sizeof(struct caam_export_state), - }, }, + }, .alg_type = OP_ALG_ALGSEL_SHA224, .alg_op = OP_ALG_ALGSEL_SHA224 | OP_ALG_AAI_HMAC, }, { @@ -1697,8 +1697,8 @@ static struct caam_hash_template driver_hash[] = { .halg = { .digestsize = SHA256_DIGEST_SIZE, .statesize = sizeof(struct caam_export_state), - }, }, + }, .alg_type = OP_ALG_ALGSEL_SHA256, .alg_op = OP_ALG_ALGSEL_SHA256 | OP_ALG_AAI_HMAC, }, { @@ -1719,8 +1719,8 @@ static struct caam_hash_template driver_hash[] = { .halg = { .digestsize = SHA384_DIGEST_SIZE, .statesize = sizeof(struct caam_export_state), - }, }, + }, .alg_type = OP_ALG_ALGSEL_SHA384, .alg_op = OP_ALG_ALGSEL_SHA384 | OP_ALG_AAI_HMAC, }, { @@ -1741,8 +1741,8 @@ static struct caam_hash_template driver_hash[] = { .halg = { .digestsize = SHA512_DIGEST_SIZE, .statesize = sizeof(struct caam_export_state), - }, }, + }, .alg_type = OP_ALG_ALGSEL_SHA512, .alg_op = OP_ALG_ALGSEL_SHA512 | OP_ALG_AAI_HMAC, }, { @@ -1763,8 +1763,8 @@ static struct caam_hash_template driver_hash[] = { .halg = { .digestsize = MD5_DIGEST_SIZE, .statesize = 
sizeof(struct caam_export_state), - }, }, + }, .alg_type = OP_ALG_ALGSEL_MD5, .alg_op = OP_ALG_ALGSEL_MD5 | OP_ALG_AAI_HMAC, }, -- 2.1.0
sizeof(struct caam_export_state), - }, }, + }, .alg_type = OP_ALG_ALGSEL_MD5, .alg_op = OP_ALG_ALGSEL_MD5 | OP_ALG_AAI_HMAC, }, -- 2.1.0
[PATCH 4/6] crypto: caam: only export the state we really need to export
Avoid exporting lots of state by only exporting what we really require, which is the buffer containing the set of pending bytes to be hashed, number of pending bytes, the context buffer, and the function pointer state. This reduces down the exported state size to 216 bytes from 576 bytes. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 44 ++ 1 file changed, 40 insertions(+), 4 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index dcee360065f3..7dd80b0b3f51 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -134,6 +134,15 @@ struct caam_hash_state { int current_buf; }; +struct caam_export_state { + u8 buf[CAAM_MAX_HASH_BLOCK_SIZE]; + u8 caam_ctx[MAX_CTX_LEN]; + int buflen; + int (*update)(struct ahash_request *req); + int (*final)(struct ahash_request *req); + int (*finup)(struct ahash_request *req); +}; + /* Common job descriptor seq in/out ptr routines */ /* Map state->caam_ctx, and append seq_out_ptr command that points to it */ @@ -1574,20 +1583,41 @@ static int ahash_final(struct ahash_request *req) static int ahash_export(struct ahash_request *req, void *out) { - struct crypto_ahash *ahash = crypto_ahash_reqtfm(req); struct caam_hash_state *state = ahash_request_ctx(req); + struct caam_export_state *export = out; + int len; + u8 *buf; - memcpy(out, state, sizeof(struct caam_hash_state)); + if (state->current_buf) { + buf = state->buf_1; + len = state->buflen_1; + } else { + buf = state->buf_0; + len = state->buflen_1; + } + + memcpy(export->buf, buf, len); + memcpy(export->caam_ctx, state->caam_ctx, sizeof(export->caam_ctx)); + export->buflen = len; + export->update = state->update; + export->final = state->final; + export->finup = state->finup; return 0; } static int ahash_import(struct ahash_request *req, const void *in) { - struct crypto_ahash *ahash = crypto_ahash_reqtfm(req); struct caam_hash_state *state = ahash_request_ctx(req); + 
const struct caam_export_state *export = in; - memcpy(state, in, sizeof(struct caam_hash_state)); + memset(state, 0, sizeof(*state)); + memcpy(state->buf_0, export->buf, export->buflen); + memcpy(state->caam_ctx, export->caam_ctx, sizeof(state->caam_ctx)); + state->buflen_0 = export->buflen; + state->update = export->update; + state->final = export->final; + state->finup = export->finup; return 0; } @@ -1622,6 +1652,7 @@ static struct caam_hash_template driver_hash[] = { .setkey = ahash_setkey, .halg = { .digestsize = SHA1_DIGEST_SIZE, + .statesize = sizeof(struct caam_export_state), }, }, .alg_type = OP_ALG_ALGSEL_SHA1, @@ -1643,6 +1674,7 @@ static struct caam_hash_template driver_hash[] = { .setkey = ahash_setkey, .halg = { .digestsize = SHA224_DIGEST_SIZE, + .statesize = sizeof(struct caam_export_state), }, }, .alg_type = OP_ALG_ALGSEL_SHA224, @@ -1664,6 +1696,7 @@ static struct caam_hash_template driver_hash[] = { .setkey = ahash_setkey, .halg = { .digestsize = SHA256_DIGEST_SIZE, + .statesize = sizeof(struct caam_export_state), }, }, .alg_type = OP_ALG_ALGSEL_SHA256, @@ -1685,6 +1718,7 @@ static struct caam_hash_template driver_hash[] = { .setkey = ahash_setkey, .halg = { .digestsize = SHA384_DIGEST_SIZE, + .statesize = sizeof(struct caam_export_state), }, }, .alg_type = OP_ALG_ALGSEL_SHA384, @@ -1706,6 +1740,7 @@ static struct caam_hash_template driver_hash[] = { .setkey = ahash_setkey, .halg = { .digestsize = SHA512_DIGEST_SIZE, + .statesize = sizeof(struct caam_export_state), }, }, .alg_type = OP_ALG_ALGSEL_SHA512, @@ -1727,6 +1762,7 @@ static struct caam_hash_templ
[PATCH 6/6] crypto: caam: fix DMA API leak
caamhash contains this weird code: src_nents = sg_count(req->src, req->nbytes, ); dma_map_sg_chained(jrdev, req->src, src_nents ? : 1, DMA_TO_DEVICE, chained); ... edesc->src_nents = src_nents; sg_count() returns zero when __sg_count() returns zero or one. This is used to mean that we don't need to use a hardware scatterlist. However, setting src_nents to zero causes problems when we unmap: if (edesc->src_nents) dma_unmap_sg_chained(dev, req->src, edesc->src_nents, DMA_TO_DEVICE, edesc->chained); as zero here means that we have no entries to unmap. This can be fixed in two ways: either by writing the number of entries that were requested of dma_map_sg_chained(), or by reworking the "no SG required" case. We adopt the re-work solution here - we replace sg_count() with __sg_count(), so src_nents now contains the real number of scatterlist entries, and we then change the test for using the hardware scatterlist to src_nents > 1 rather than just non-zero. This change passes my sshd, openssl tests hashing /bin and tcrypt tests. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/caam/caamhash.c | 25 - 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c index f30c93840bba..28434ad08cb4 100644 --- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -1095,10 +1095,13 @@ static int ahash_digest(struct ahash_request *req) u32 options; int sh_len; - src_nents = sg_count(req->src, req->nbytes, ); - dma_map_sg_chained(jrdev, req->src, src_nents ? 
: 1, DMA_TO_DEVICE, + src_nents = __sg_count(req->src, req->nbytes, ); + dma_map_sg_chained(jrdev, req->src, src_nents, DMA_TO_DEVICE, chained); - sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + if (src_nents > 1) + sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + else + sec4_sg_bytes = 0; /* allocate space for base edesc and hw desc commands, link tables */ edesc = kzalloc(sizeof(*edesc) + sec4_sg_bytes + DESC_JOB_IO_LEN, @@ -1117,7 +1120,7 @@ static int ahash_digest(struct ahash_request *req) desc = edesc->hw_desc; init_job_desc_shared(desc, ptr, sh_len, HDR_SHARE_DEFER | HDR_REVERSE); - if (src_nents) { + if (src_nents > 1) { sg_to_sec4_sg_last(req->src, src_nents, edesc->sec4_sg, 0); edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg, sec4_sg_bytes, DMA_TO_DEVICE); @@ -1447,11 +1450,15 @@ static int ahash_update_first(struct ahash_request *req) to_hash = req->nbytes - *next_buflen; if (to_hash) { - src_nents = sg_count(req->src, req->nbytes - (*next_buflen), -); - dma_map_sg_chained(jrdev, req->src, src_nents ? : 1, + src_nents = __sg_count(req->src, req->nbytes - (*next_buflen), + ); + dma_map_sg_chained(jrdev, req->src, src_nents, DMA_TO_DEVICE, chained); - sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry); + if (src_nents > 1) + sec4_sg_bytes = src_nents * + sizeof(struct sec4_sg_entry); + else + sec4_sg_bytes = 0; /* * allocate space for base edesc and hw desc commands, @@ -1472,7 +1479,7 @@ static int ahash_update_first(struct ahash_request *req) DESC_JOB_IO_LEN; edesc->dst_dma = 0; - if (src_nents) { + if (src_nents > 1) { sg_to_sec4_sg_last(req->src, src_nents, edesc->sec4_sg, 0); edesc->sec4_sg_dma = dma_map_single(jrdev, -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/6] crypto: marvell: fix missing cpu_to_le32() in mv_cesa_dma_add_op()
When tdma->src is freed in mv_cesa_dma_cleanup(), we convert the DMA address from a little-endian value prior to calling dma_pool_free(). However, mv_cesa_dma_add_op() assigns tdma->src without first converting the DMA address to little endian. Fix this. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/tdma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c index 2738e24bcc59..06bdeac27e3d 100644 --- a/drivers/crypto/marvell/tdma.c +++ b/drivers/crypto/marvell/tdma.c @@ -140,7 +140,7 @@ struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain, tdma = chain->last; tdma->op = op; tdma->byte_cnt = (skip_ctx ? sizeof(op->desc) : sizeof(*op)) | BIT(31); - tdma->src = dma_handle; + tdma->src = cpu_to_le32(dma_handle); tdma->flags = CESA_TDMA_DST_IN_SRAM | CESA_TDMA_OP; return op; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/6] crypto: marvell: use __le32 for hardware descriptors
Much of the driver uses cpu_to_le32() to convert values for descriptors to little endian before writing. Use __le32 to define the hardware- accessed parts of the descriptors, and ensure most places where it's reasonable to do so use cpu_to_le32() when assigning to these. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/cesa.h | 32 drivers/crypto/marvell/tdma.c | 9 ++--- 2 files changed, 22 insertions(+), 19 deletions(-) diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h index 752131ae1ef2..bd985e72520b 100644 --- a/drivers/crypto/marvell/cesa.h +++ b/drivers/crypto/marvell/cesa.h @@ -174,19 +174,19 @@ #define CESA_SA_DESC_MAC_DATA(offset) \ cpu_to_le32(CESA_SA_DATA_SRAM_OFFSET + (offset)) -#define CESA_SA_DESC_MAC_DATA_MSK GENMASK(15, 0) +#define CESA_SA_DESC_MAC_DATA_MSK cpu_to_le32(GENMASK(15, 0)) #define CESA_SA_DESC_MAC_TOTAL_LEN(total_len) cpu_to_le32((total_len) << 16) -#define CESA_SA_DESC_MAC_TOTAL_LEN_MSK GENMASK(31, 16) +#define CESA_SA_DESC_MAC_TOTAL_LEN_MSK cpu_to_le32(GENMASK(31, 16)) #define CESA_SA_DESC_MAC_SRC_TOTAL_LEN_MAX 0x #define CESA_SA_DESC_MAC_DIGEST(offset) \ cpu_to_le32(CESA_SA_MAC_DIG_SRAM_OFFSET + (offset)) -#define CESA_SA_DESC_MAC_DIGEST_MSKGENMASK(15, 0) +#define CESA_SA_DESC_MAC_DIGEST_MSKcpu_to_le32(GENMASK(15, 0)) #define CESA_SA_DESC_MAC_FRAG_LEN(frag_len)cpu_to_le32((frag_len) << 16) -#define CESA_SA_DESC_MAC_FRAG_LEN_MSK GENMASK(31, 16) +#define CESA_SA_DESC_MAC_FRAG_LEN_MSK cpu_to_le32(GENMASK(31, 16)) #define CESA_SA_DESC_MAC_IV(offset)\ cpu_to_le32((CESA_SA_MAC_IIV_SRAM_OFFSET + (offset)) | \ @@ -219,14 +219,14 @@ * to be executed. 
*/ struct mv_cesa_sec_accel_desc { - u32 config; - u32 enc_p; - u32 enc_len; - u32 enc_key_p; - u32 enc_iv; - u32 mac_src_p; - u32 mac_digest; - u32 mac_iv; + __le32 config; + __le32 enc_p; + __le32 enc_len; + __le32 enc_key_p; + __le32 enc_iv; + __le32 mac_src_p; + __le32 mac_digest; + __le32 mac_iv; }; /** @@ -293,10 +293,10 @@ struct mv_cesa_op_ctx { * operation. */ struct mv_cesa_tdma_desc { - u32 byte_cnt; - u32 src; - u32 dst; - u32 next_dma; + __le32 byte_cnt; + __le32 src; + __le32 dst; + __le32 next_dma; /* Software state */ dma_addr_t cur_dma; diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c index 06bdeac27e3d..76427981275b 100644 --- a/drivers/crypto/marvell/tdma.c +++ b/drivers/crypto/marvell/tdma.c @@ -126,6 +126,7 @@ struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain, struct mv_cesa_tdma_desc *tdma; struct mv_cesa_op_ctx *op; dma_addr_t dma_handle; + unsigned int size; tdma = mv_cesa_dma_add_desc(chain, flags); if (IS_ERR(tdma)) @@ -137,9 +138,11 @@ struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain, *op = *op_templ; + size = skip_ctx ? sizeof(op->desc) : sizeof(*op); + tdma = chain->last; tdma->op = op; - tdma->byte_cnt = (skip_ctx ? 
sizeof(op->desc) : sizeof(*op)) | BIT(31); + tdma->byte_cnt = cpu_to_le32(size | BIT(31)); tdma->src = cpu_to_le32(dma_handle); tdma->flags = CESA_TDMA_DST_IN_SRAM | CESA_TDMA_OP; @@ -156,7 +159,7 @@ int mv_cesa_dma_add_data_transfer(struct mv_cesa_tdma_chain *chain, if (IS_ERR(tdma)) return PTR_ERR(tdma); - tdma->byte_cnt = size | BIT(31); + tdma->byte_cnt = cpu_to_le32(size | BIT(31)); tdma->src = src; tdma->dst = dst; @@ -185,7 +188,7 @@ int mv_cesa_dma_add_dummy_end(struct mv_cesa_tdma_chain *chain, gfp_t flags) if (IS_ERR(tdma)) return PTR_ERR(tdma); - tdma->byte_cnt = BIT(31); + tdma->byte_cnt = cpu_to_le32(BIT(31)); return 0; } -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6] crypto: marvell: use dma_addr_t for cur_dma
cur_dma is part of the software state, not read by the hardware. Storing it in LE32 format is wrong, use dma_addr_t for this. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/cesa.h | 4 +++- drivers/crypto/marvell/tdma.c | 6 +++--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h index e19877402ec9..dbe4970d6c64 100644 --- a/drivers/crypto/marvell/cesa.h +++ b/drivers/crypto/marvell/cesa.h @@ -297,7 +297,9 @@ struct mv_cesa_tdma_desc { u32 src; u32 dst; u32 next_dma; - u32 cur_dma; + + /* Software state */ + dma_addr_t cur_dma; struct mv_cesa_tdma_desc *next; union { struct mv_cesa_op_ctx *op; diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c index e8e8a7f7659b..c8256f5916b0 100644 --- a/drivers/crypto/marvell/tdma.c +++ b/drivers/crypto/marvell/tdma.c @@ -69,7 +69,7 @@ void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq) tdma = tdma->next; dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma, - le32_to_cpu(old_tdma->cur_dma)); + old_tdma->cur_dma); } dreq->chain.first = NULL; @@ -105,9 +105,9 @@ mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags) return ERR_PTR(-ENOMEM); memset(new_tdma, 0, sizeof(*new_tdma)); - new_tdma->cur_dma = cpu_to_le32(dma_handle); + new_tdma->cur_dma = dma_handle; if (chain->last) { - chain->last->next_dma = new_tdma->cur_dma; + chain->last->next_dma = cpu_to_le32(dma_handle); chain->last->next = new_tdma; } else { chain->first = new_tdma; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/6] crypto: marvell: use memcpy_fromio()/memcpy_toio()
Use the IO memcpy() functions when copying from/to MMIO memory. These locations were found via sparse. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/cipher.c | 11 ++- drivers/crypto/marvell/hash.c | 16 2 files changed, 14 insertions(+), 13 deletions(-) diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c index 4db2d632204f..6edae64bb387 100644 --- a/drivers/crypto/marvell/cipher.c +++ b/drivers/crypto/marvell/cipher.c @@ -98,10 +98,10 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req) /* FIXME: only update enc_len field */ if (!sreq->skip_ctx) { - memcpy(engine->sram, >op, sizeof(sreq->op)); + memcpy_toio(engine->sram, >op, sizeof(sreq->op)); sreq->skip_ctx = true; } else { - memcpy(engine->sram, >op, sizeof(sreq->op.desc)); + memcpy_toio(engine->sram, >op, sizeof(sreq->op.desc)); } mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE); @@ -145,8 +145,9 @@ static int mv_cesa_ablkcipher_process(struct crypto_async_request *req, if (ret) return ret; - memcpy(ablkreq->info, engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, - crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq))); + memcpy_fromio(ablkreq->info, + engine->sram + CESA_SA_CRYPT_IV_SRAM_OFFSET, + crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(ablkreq))); return 0; } @@ -181,7 +182,7 @@ mv_cesa_ablkcipher_std_prepare(struct ablkcipher_request *req) sreq->size = 0; sreq->offset = 0; mv_cesa_adjust_op(engine, >op); - memcpy(engine->sram, >op, sizeof(sreq->op)); + memcpy_toio(engine->sram, >op, sizeof(sreq->op)); } static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req, diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index 84ddc4cbfa9d..8330815d72fc 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -209,8 +209,8 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req) size_t len; if (creq->cache_ptr) - memcpy(engine->sram + 
CESA_SA_DATA_SRAM_OFFSET, creq->cache, - creq->cache_ptr); + memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET, + creq->cache, creq->cache_ptr); len = min_t(size_t, req->nbytes + creq->cache_ptr - sreq->offset, CESA_SA_SRAM_PAYLOAD_SIZE); @@ -251,10 +251,10 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req) if (len + trailerlen > CESA_SA_SRAM_PAYLOAD_SIZE) { len &= CESA_HASH_BLOCK_SIZE_MSK; new_cache_ptr = 64 - trailerlen; - memcpy(creq->cache, - engine->sram + - CESA_SA_DATA_SRAM_OFFSET + len, - new_cache_ptr); + memcpy_fromio(creq->cache, + engine->sram + + CESA_SA_DATA_SRAM_OFFSET + len, + new_cache_ptr); } else { len += mv_cesa_ahash_pad_req(creq, engine->sram + len + @@ -272,7 +272,7 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req) mv_cesa_update_op_cfg(op, frag_mode, CESA_SA_DESC_CFG_FRAG_MSK); /* FIXME: only update enc_len field */ - memcpy(engine->sram, op, sizeof(*op)); + memcpy_toio(engine->sram, op, sizeof(*op)); if (frag_mode == CESA_SA_DESC_CFG_FIRST_FRAG) mv_cesa_update_op_cfg(op, CESA_SA_DESC_CFG_MID_FRAG, @@ -312,7 +312,7 @@ static void mv_cesa_ahash_std_prepare(struct ahash_request *req) sreq->offset = 0; mv_cesa_adjust_op(engine, >op_tmpl); - memcpy(engine->sram, >op_tmpl, sizeof(creq->op_tmpl)); + memcpy_toio(engine->sram, >op_tmpl, sizeof(creq->op_tmpl)); } static void mv_cesa_ahash_step(struct crypto_async_request *req) -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/6] crypto: marvell: use gfp_t for gfp flags
Use gfp_t not u32 for the GFP flags. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/cesa.h | 6 ++ drivers/crypto/marvell/tdma.c | 5 ++--- 2 files changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h index dbe4970d6c64..752131ae1ef2 100644 --- a/drivers/crypto/marvell/cesa.h +++ b/drivers/crypto/marvell/cesa.h @@ -798,10 +798,8 @@ int mv_cesa_dma_add_data_transfer(struct mv_cesa_tdma_chain *chain, dma_addr_t dst, dma_addr_t src, u32 size, u32 flags, gfp_t gfp_flags); -int mv_cesa_dma_add_dummy_launch(struct mv_cesa_tdma_chain *chain, -u32 flags); - -int mv_cesa_dma_add_dummy_end(struct mv_cesa_tdma_chain *chain, u32 flags); +int mv_cesa_dma_add_dummy_launch(struct mv_cesa_tdma_chain *chain, gfp_t flags); +int mv_cesa_dma_add_dummy_end(struct mv_cesa_tdma_chain *chain, gfp_t flags); int mv_cesa_dma_add_op_transfers(struct mv_cesa_tdma_chain *chain, struct mv_cesa_dma_iter *dma_iter, diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c index c8256f5916b0..2738e24bcc59 100644 --- a/drivers/crypto/marvell/tdma.c +++ b/drivers/crypto/marvell/tdma.c @@ -166,8 +166,7 @@ int mv_cesa_dma_add_data_transfer(struct mv_cesa_tdma_chain *chain, return 0; } -int mv_cesa_dma_add_dummy_launch(struct mv_cesa_tdma_chain *chain, -u32 flags) +int mv_cesa_dma_add_dummy_launch(struct mv_cesa_tdma_chain *chain, gfp_t flags) { struct mv_cesa_tdma_desc *tdma; @@ -178,7 +177,7 @@ int mv_cesa_dma_add_dummy_launch(struct mv_cesa_tdma_chain *chain, return 0; } -int mv_cesa_dma_add_dummy_end(struct mv_cesa_tdma_chain *chain, u32 flags) +int mv_cesa_dma_add_dummy_end(struct mv_cesa_tdma_chain *chain, gfp_t flags) { struct mv_cesa_tdma_desc *tdma; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/6] crypto: marvell: use readl_relaxed()/writel_relaxed()
Use relaxed IO accessors where appropriate. Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> --- drivers/crypto/marvell/cesa.h | 2 +- drivers/crypto/marvell/cipher.c | 2 +- drivers/crypto/marvell/hash.c | 7 +++ drivers/crypto/marvell/tdma.c | 20 ++-- 4 files changed, 15 insertions(+), 16 deletions(-) diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h index e9f732138ba3..e19877402ec9 100644 --- a/drivers/crypto/marvell/cesa.h +++ b/drivers/crypto/marvell/cesa.h @@ -677,7 +677,7 @@ static inline void mv_cesa_set_int_mask(struct mv_cesa_engine *engine, if (int_mask == engine->int_mask) return; - writel(int_mask, engine->regs + CESA_SA_INT_MSK); + writel_relaxed(int_mask, engine->regs + CESA_SA_INT_MSK); engine->int_mask = int_mask; } diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c index 3df2f4e7adb2..4db2d632204f 100644 --- a/drivers/crypto/marvell/cipher.c +++ b/drivers/crypto/marvell/cipher.c @@ -105,7 +105,7 @@ static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req) } mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE); - writel(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG); + writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG); writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD); } diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c index 7cd0f0decf6c..84ddc4cbfa9d 100644 --- a/drivers/crypto/marvell/hash.c +++ b/drivers/crypto/marvell/hash.c @@ -281,7 +281,7 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req) creq->cache_ptr = new_cache_ptr; mv_cesa_set_int_mask(engine, CESA_SA_INT_ACCEL0_DONE); - writel(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG); + writel_relaxed(CESA_SA_CFG_PARA_DIS, engine->regs + CESA_SA_CFG); writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD); } @@ -344,7 +344,7 @@ static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status) digsize = 
crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq)); for (i = 0; i < digsize / 4; i++) - creq->state[i] = readl(engine->regs + CESA_IVDIG(i)); + creq->state[i] = readl_relaxed(engine->regs + CESA_IVDIG(i)); if (creq->cache_ptr) sg_pcopy_to_buffer(ahashreq->src, creq->src_nents, @@ -390,8 +390,7 @@ static void mv_cesa_ahash_prepare(struct crypto_async_request *req, digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq)); for (i = 0; i < digsize / 4; i++) - writel(creq->state[i], - engine->regs + CESA_IVDIG(i)); + writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i)); } static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req) diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c index 64a366c50174..e8e8a7f7659b 100644 --- a/drivers/crypto/marvell/tdma.c +++ b/drivers/crypto/marvell/tdma.c @@ -41,18 +41,18 @@ void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq) { struct mv_cesa_engine *engine = dreq->base.engine; - writel(0, engine->regs + CESA_SA_CFG); + writel_relaxed(0, engine->regs + CESA_SA_CFG); mv_cesa_set_int_mask(engine, CESA_SA_INT_ACC0_IDMA_DONE); - writel(CESA_TDMA_DST_BURST_128B | CESA_TDMA_SRC_BURST_128B | - CESA_TDMA_NO_BYTE_SWAP | CESA_TDMA_EN, - engine->regs + CESA_TDMA_CONTROL); - - writel(CESA_SA_CFG_ACT_CH0_IDMA | CESA_SA_CFG_MULTI_PKT | - CESA_SA_CFG_CH0_W_IDMA | CESA_SA_CFG_PARA_DIS, - engine->regs + CESA_SA_CFG); - writel(dreq->chain.first->cur_dma, - engine->regs + CESA_TDMA_NEXT_ADDR); + writel_relaxed(CESA_TDMA_DST_BURST_128B | CESA_TDMA_SRC_BURST_128B | + CESA_TDMA_NO_BYTE_SWAP | CESA_TDMA_EN, + engine->regs + CESA_TDMA_CONTROL); + + writel_relaxed(CESA_SA_CFG_ACT_CH0_IDMA | CESA_SA_CFG_MULTI_PKT | + CESA_SA_CFG_CH0_W_IDMA | CESA_SA_CFG_PARA_DIS, + engine->regs + CESA_SA_CFG); + writel_relaxed(dreq->chain.first->cur_dma, + engine->regs + CESA_TDMA_NEXT_ADDR); writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD); } -- 2.1.0 -- To unsubscribe from this list: send the line 
"unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] Sparse related fixes
Continuing on from the previous set of 18 patches, I also fixed a number of sparse problems and other cleanups. I don't deem these suitable for -rc merging, especially now that we're basically at -rc6. The first patch switches the driver over to appropriately using the relaxed IO accessors - this avoids calling out to the heavy barrier on every read and write operation, but only calling out on those which really matter. We switch to using dma_addr_t for DMA addresses which are not accessed by hardware, and using gfp_t for the get_free_page flags. String-based MMIO accesses are used instead of plain memcpy()/memset() which prevents us potentially stumbling over GCC optimisations that it thinks it may make with these functions. We convert as much of the hardware state to __le32 endian markings, and use cpu_to_le32() as appropriate. A number of places are left unfixed, as we temporarily store CPU native endian values at these locations; these warnings should not be fixed (basically, only appropriate sparse warnings should be fixed without penalising code.) drivers/crypto/marvell/cesa.h | 44 - drivers/crypto/marvell/cipher.c | 13 ++-- drivers/crypto/marvell/hash.c | 23 +++-- drivers/crypto/marvell/tdma.c | 42 --- 4 files changed, 62 insertions(+), 60 deletions(-) -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: caam - Add support for hashing export and import functions
On Sat, Oct 17, 2015 at 11:52:54AM +0100, Russell King - ARM Linux wrote: > Now, with that change, and with your change to buf_0/buf_1, I see > (before the import/export functions are used) several of these errors: > > caam_jr 2101000.jr0: 4501: DECO: desc idx 5: SGT Length Error. The > descriptor is trying to read more data than is contained in the SGT table. > > when checking using openssh. They don't occur when testing with openssl > performing normal hashes (as per my previously posted script) but only > with openssh. The hardware seems to be quite justified in complaining - I've added code to dump out the scatter list as well: jobdesc@879: e4bd7c18: b0861c08 3cd29080 f140 34bd7c38 ...<..@.8|.4 jobdesc@879: e4bd7c28: 0068 f840 3ccf3640 0028 h.@.@6.<(... sg@882: e4bd7c38: 3ccf3640 0028 sg@882: e4bd7c48: 34bd7300 0006 sg@882: e4bd7c58: 7528a070 4020 If I'm interpreting this correctly: f140 34bd7c38 0068 tells the CAAM to read 0x68 bytes using the scatterlist at 0x34bd7c38. The scatterlist here contains three entries, which have lengths 0x28, 0x6 and 0x20, which total up to 0x4e bytes, not 0x68. Now, looking at how the scatterlist is created, I have yet more questions. 1. What's this dma_map_sg_chained() and dma_unmap_sg_chained() stuff? What's wrong with just calling dma_map_sg() on the scatterlist, giving it the number of entries we want to map? dma_map_sg() is supposed to deal with chained scatterlists, if it doesn't, it's buggy. 2. src_map_to_sec4_sg() fails to check whether dma_map_sg_chained() failed. if it failed, we continue anyway, poking invalid addresses into the hardware scatterlist, thereby causing memory corruption. 3. __sg_count() - what's going on with this. This seems to assume that scatterlists aren't chained as: if (!sg_is_last(sg) && (sg + 1)->length == 0) sg + 1 may _not_ be sg_next(sg) - and this will walk off the end of the array. 4. Is the try_buf_map_to_sec4_sg() call correct? 
Shouldn't it be:

@@ -838,7 +838,7 @@ static int ahash_update_ctx(struct ahash_request *req)
 		state->buf_dma = try_buf_map_to_sec4_sg(jrdev,
 							edesc->sec4_sg + 1,
 							buf, state->buf_dma,
-							*next_buflen, *buflen);
+							*buflen, last_buflen);
 
 	if (src_nents) {
 		src_map_to_sec4_sg(jrdev, req->src, src_nents,

Indeed, with this change, my ssh test works, and the DECO errors are gone.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: caam - Add support for hashing export and import functions
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -1569,6 +1601,10 @@ static int ahash_import(struct ahash_request *req,
> const void *in)
>  	struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>  	struct caam_hash_state *state = ahash_request_ctx(req);
> 
> +	/* Allocate new data buffers to use for this request */
> +	state->buf_0 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +	state->buf_1 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +

I'm really not sure you can do this at all. What if the previous digest
calculation, prior to the accept() cloning the state, was for a
non-hash-block aligned size of data? The above will lose that state, and
produce an incorrect hash result.

Herbert, can you confirm please?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state
On Fri, Oct 16, 2015 at 04:24:54PM -0700, Victoria Milhoan wrote:
> On Thu, 15 Oct 2015 21:13:38 +0800
> Herbert Xu <herb...@gondor.apana.org.au> wrote:
> 
> > On Thu, Oct 15, 2015 at 01:59:44PM +0100, Russell King - ARM Linux wrote:
> > >
> > > I think the CAAM driver is pretty unfixable from a trivial point of
> > > view. This driver exports a huge amount of state - it contains both a
> > > struct caam_hash_ctx and a struct caam_hash_state, which totals up to
> > > 1600 bytes. This fails the:
> > 
> > Right, just dumping the state out as-is is not going to work. This
> > is not supposed to be how export works anyway. But it doesn't
> > look too bad as most of that 1600 is consumed by the hardware
> > program descriptor which can easily be regenerated upon import.
> > 
> > The only things that need to be exported AFAICS are key and buf_X.
> 
> I just pushed out a patch for export/import functions in the CAAM driver.
> The patch has been through testing with OpenSSL and the AF_ALG plugin on
> the MX6.

Be careful with that. There's two ways to test:

1. Checking hash output.

Preparation - copy openssl.cnf and add this to openssl.cnf:

openssl_conf = openssl_def

[openssl_def]
engines = engine_section

[engine_section]
af_alg = af_alg_engine

[af_alg_engine]
CIPHERS=aes-128-cbc aes-192-cbc aes-256-cbc des-cbc des-ede3-cbc
DIGESTS=md5 sha1 sha256 sha512

# Putting this last means we register the above as the default algorithms
default_algorithms = ALL

Then:

#!/bin/sh
for type in md5 sha1 sha256 sha512; do
	echo -n "Checking $type hash:"
	for file in /bin/*; do
		echo -n " $file"
		if ! OPENSSL_CONF=./openssl.cnf openssl dgst -$type < $file | sed "s,(stdin)= ,,;s,\$,\t$file," | ${type}sum -c > /dev/null; then
			echo " ... failed"
			echo -n "Openssl says: " >&2
			OPENSSL_CONF=./openssl.cnf openssl dgst -$type < $file | sed "s,(stdin)= ,,;s,\$,\t$file," >&2
			echo -n "${type}sum says: " >&2
			${type}sum $file >&2
			exit 1
		fi
	done
	echo " ... ok"
done
echo "All hashes passed"

The above will verify that the hashes are producing the correct answers
for a range of files. This does _not_ test the export/import paths.

2. Backup the existing openssl.cnf and replace it with the above modified
version. Then try to ssh into the platform. This will verify the
export/import side of things. If ssh fails to connect to the target, you
know that the crypto drivers for SHA1 are broken, probably due to
export/import.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH] crypto: caam - Add support for hashing export and import functions
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -1569,6 +1601,10 @@ static int ahash_import(struct ahash_request *req,
> const void *in)
>  	struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>  	struct caam_hash_state *state = ahash_request_ctx(req);
> 
> +	/* Allocate new data buffers to use for this request */
> +	state->buf_0 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +	state->buf_1 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +
>  	memcpy(ctx, in, sizeof(struct caam_hash_ctx));
>  	memcpy(state, in + sizeof(struct caam_hash_ctx),
>  	       sizeof(struct caam_hash_state));

This is buggy. You lose the reference to the two buffers you've just
allocated because this memcpy() overwrites it.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH] crypto: caam - Add support for hashing export and import functions
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -319,7 +319,7 @@ static int ahash_set_sh_desc(struct crypto_ahash *ahash)
>  		have_key = OP_ALG_AAI_HMAC_PRECOMP;
> 
>  	/* ahash_update shared descriptor */
> -	desc = ctx->sh_desc_update;
> +	desc = kmalloc(DESC_HASH_MAX_USED_BYTES, GFP_KERNEL | GFP_DMA);

What if kmalloc() fails? Should this really oops the kernel? Ditto for
every other kmalloc you've added below.

> @@ -1557,6 +1575,20 @@ static int ahash_export(struct ahash_request *req,
> void *out)
>  	struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>  	struct caam_hash_state *state = ahash_request_ctx(req);
> 
> +	/*
> +	 * Do not export the data buffers. New buffers are
> +	 * allocated during import.
> +	 */
> +	kfree(state->buf_0);
> +	kfree(state->buf_1);
> +	state->buf_0 = NULL;
> +	state->buf_1 = NULL;
> +
> +	state->current_buf = 0;
> +	state->buf_dma = 0;
> +	state->buflen_0 = 0;
> +	state->buflen_1 = 0;
> +

So what if I export, and then continue using _this_ context later?

> @@ -1605,6 +1641,8 @@ static struct caam_hash_template driver_hash[] = {
>  		.setkey = ahash_setkey,
>  		.halg = {
>  			.digestsize = SHA1_DIGEST_SIZE,
> +			.statesize = sizeof(struct caam_hash_ctx) +
> +				     sizeof(struct caam_hash_state),

Much prefer to have a 'struct caam_hash_export_state' thing rather than
litter the code with the knowledge that these two go together.

>  		},
>  	},
>  	.alg_type = OP_ALG_ALGSEL_SHA1,
> @@ -1626,6 +1664,8 @@ static struct caam_hash_template driver_hash[] = {
>  		.setkey = ahash_setkey,
>  		.halg = {
>  			.digestsize = SHA224_DIGEST_SIZE,
> +			.statesize = sizeof(struct caam_hash_ctx) +
> +				     sizeof(struct caam_hash_state),
>  		},
>  	},
>  	.alg_type = OP_ALG_ALGSEL_SHA224,
> @@ -1647,6 +1687,8 @@ static struct caam_hash_template driver_hash[] = {
>  		.setkey = ahash_setkey,
>  		.halg = {
>  			.digestsize = SHA256_DIGEST_SIZE,
> +			.statesize = sizeof(struct caam_hash_ctx) +
> +				     sizeof(struct caam_hash_state),
>  		},
>  	},
>  	.alg_type = OP_ALG_ALGSEL_SHA256,
> @@ -1668,6 +1710,8 @@ static struct caam_hash_template driver_hash[] = {
>  		.setkey = ahash_setkey,
>  		.halg = {
>  			.digestsize = SHA384_DIGEST_SIZE,
> +			.statesize = sizeof(struct caam_hash_ctx) +
> +				     sizeof(struct caam_hash_state),
>  		},
>  	},
>  	.alg_type = OP_ALG_ALGSEL_SHA384,
> @@ -1689,6 +1733,8 @@ static struct caam_hash_template driver_hash[] = {
>  		.setkey = ahash_setkey,
>  		.halg = {
>  			.digestsize = SHA512_DIGEST_SIZE,
> +			.statesize = sizeof(struct caam_hash_ctx) +
> +				     sizeof(struct caam_hash_state),
>  		},
>  	},
>  	.alg_type = OP_ALG_ALGSEL_SHA512,
> @@ -1710,6 +1756,8 @@ static struct caam_hash_template driver_hash[] = {
>  		.setkey = ahash_setkey,
>  		.halg = {
>  			.digestsize = MD5_DIGEST_SIZE,
> +			.statesize = sizeof(struct caam_hash_ctx) +
> +				     sizeof(struct caam_hash_state),
>  		},
>  	},
>  	.alg_type = OP_ALG_ALGSEL_MD5,
> @@ -1796,6 +1844,12 @@ static void caam_hash_cra_exit(struct crypto_tfm *tfm)
>  		dma_unmap_single(ctx->jrdev, ctx->sh_desc_finup_dma,
>  				 desc_bytes(ctx->sh_desc_finup), DMA_TO_DEVICE);
> 
> +	kfree(ctx->sh_desc_update);
> +	kfree(ctx->sh_desc_update_first);
> +	kfree(ctx->sh_desc_fin);
> +	kfree(ctx->sh_desc_digest);
> +	kfree(ctx->sh_desc_finup);
> +

What happens to these when ahash_import() overwrites all the context
state? Doesn't this mean we end up with double-kfree()s?

Also, did you test this with DMA debug enabled?
-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH] crypto: caam - Add support for hashing export and import functions
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -1557,6 +1575,20 @@ static int ahash_export(struct ahash_request *req,
> void *out)
>  	struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>  	struct caam_hash_state *state = ahash_request_ctx(req);
> 
> +	/*
> +	 * Do not export the data buffers. New buffers are
> +	 * allocated during import.
> +	 */
> +	kfree(state->buf_0);
> +	kfree(state->buf_1);
> +	state->buf_0 = NULL;
> +	state->buf_1 = NULL;
> +
> +	state->current_buf = 0;
> +	state->buf_dma = 0;
> +	state->buflen_0 = 0;
> +	state->buflen_1 = 0;
> +
>  	memcpy(out, ctx, sizeof(struct caam_hash_ctx));
>  	memcpy(out + sizeof(struct caam_hash_ctx), state,
>  	       sizeof(struct caam_hash_state));
> @@ -1569,6 +1601,10 @@ static int ahash_import(struct ahash_request *req,
> const void *in)
>  	struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>  	struct caam_hash_state *state = ahash_request_ctx(req);
> 
> +	/* Allocate new data buffers to use for this request */
> +	state->buf_0 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +	state->buf_1 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +
>  	memcpy(ctx, in, sizeof(struct caam_hash_ctx));
>  	memcpy(state, in + sizeof(struct caam_hash_ctx),
>  	       sizeof(struct caam_hash_state));

Why are we even saving and restoring the caam_hash_ctx here? That comes
from:

	crypto_tfm_ctx(crypto_ahash_tfm(tfm))

which, tracing that through the inline functions, comes out as:

	tfm->base.__crt_ctx

tfm comes from:

	__crypto_ahash_cast(req->base.tfm)

where req->base.tfm is a pointer to struct crypto_ahash's base member.

When a hash is accept()ed, req->base.tfm is copied from the original
socket handle to the new socket handle (see ahash_request_set_tfm(),
called from hash_accept_parent(), which is in turn called from
af_alg_accept() in hash_accept()). So both the exporting and importing
hashes end up with the same struct caam_hash_ctx.

We can see this if we add a debug print:

[156147.994219] ahash_export(ctx=ed087080,state=e2a6e600)
[156147.994257] ahash_import(ctx=ed087080,state=e2a6d600)
[156147.998256] ahash_export(ctx=ed083080,state=e2a6c600)
[156147.998294] ahash_import(ctx=ed083080,state=edb9ee00)
[156147.998659] ahash_export(ctx=ed087080,state=e2a6e600)
[156147.998700] ahash_import(ctx=ed087080,state=e2a6fa00)
[156147.999002] ahash_export(ctx=ed080880,state=e2a6d200)
[156147.999040] ahash_import(ctx=ed080880,state=e2a6d600)

Notice that the state changes between each export and import, but the
context doesn't. So, I think we can (and should) completely drop the
saving and restoring of context in these functions.

Now, with that change, and with your change to buf_0/buf_1, I see (before
the import/export functions are used) several of these errors:

caam_jr 2101000.jr0: 4501: DECO: desc idx 5: SGT Length Error. The
descriptor is trying to read more data than is contained in the SGT table.

when checking using openssh. They don't occur when testing with openssl
performing normal hashes (as per my previously posted script) but only
with openssh.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH] crypto: caam - Add support for hashing export and import functions
On Sat, Oct 17, 2015 at 06:57:44AM -0700, Victoria Milhoan wrote:
> Correct - this was apparently the wrong patch I pushed out. The one I'm
> actively using has this fixed (this is the only difference). I will make
> this change in v2 after reviewing your other comments.

Thanks Victoria, but please perform the tests I've suggested in one of my
previous emails - merely testing with openssl's built-in tests isn't
sufficient. Also, openssl is non-obvious with what it's testing,
especially if you're using "openssl speed". I'm afraid it's a case of
"you don't know what's being tested unless you've read the openssl
source" :(

However, while trying to debug the DECO stuff, I've stumbled across this:

static inline void append_ptr(u32 *desc, dma_addr_t ptr)
{
	dma_addr_t *offset = (dma_addr_t *)desc_end(desc);

	*offset = ptr;

	(*desc) += CAAM_PTR_SZ / CAAM_CMD_SZ;
}

where I think 'desc' points to memory used for the descriptor, which is
read by hardware, correct?

If that's true, why are we using dma_addr_t here? dma_addr_t can be
either 32-bit or 64-bit depending on the kernel configuration, and I
suspect if this driver is built with dma_addr_t set to 64-bit, things
will go awry.

Also, what endianness are the hardware descriptors? Are they always
native CPU endian, or are they fixed as little endian? If they're always
little endian, is there any reason not to use __le32 and cpu_to_le32()/
le32_to_cpu() as appropriate to ensure that it works on big endian
systems?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
[PATCH 1/5] crypto: caam: print errno code when hash registration fails
Print the errno code when hash registration fails, so we know why the
failure occurred. This aids debugging.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index 94433b9fc200..2faf71ccbd43 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -1952,8 +1952,9 @@ static int __init caam_algapi_hash_init(void)
 
 		err = crypto_register_ahash(&t_alg->ahash_alg);
 		if (err) {
-			pr_warn("%s alg registration failed\n",
-				t_alg->ahash_alg.halg.base.cra_driver_name);
+			pr_warn("%s alg registration failed: %d\n",
+				t_alg->ahash_alg.halg.base.cra_driver_name,
+				err);
 			kfree(t_alg);
 		} else
 			list_add_tail(&t_alg->entry, &hash_list);
@@ -1968,8 +1969,9 @@ static int __init caam_algapi_hash_init(void)
 
 		err = crypto_register_ahash(&t_alg->ahash_alg);
 		if (err) {
-			pr_warn("%s alg registration failed\n",
-				t_alg->ahash_alg.halg.base.cra_driver_name);
+			pr_warn("%s alg registration failed: %d\n",
+				t_alg->ahash_alg.halg.base.cra_driver_name,
+				err);
 			kfree(t_alg);
 		} else
 			list_add_tail(&t_alg->entry, &hash_list);
-- 
2.1.0
[PATCH 4/5] crypto: caam: only export the state we really need to export
Avoid exporting lots of state by only exporting what we really require,
which is the buffer containing the set of pending bytes to be hashed,
number of pending bytes, the context buffer, and the function pointer
state. This reduces down the exported state size to 216 bytes from 576
bytes.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 44 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index dcee360065f3..7dd80b0b3f51 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -134,6 +134,15 @@ struct caam_hash_state {
 	int current_buf;
 };
 
+struct caam_export_state {
+	u8 buf[CAAM_MAX_HASH_BLOCK_SIZE];
+	u8 caam_ctx[MAX_CTX_LEN];
+	int buflen;
+	int (*update)(struct ahash_request *req);
+	int (*final)(struct ahash_request *req);
+	int (*finup)(struct ahash_request *req);
+};
+
 /* Common job descriptor seq in/out ptr routines */
 
 /* Map state->caam_ctx, and append seq_out_ptr command that points to it */
@@ -1574,20 +1583,41 @@ static int ahash_final(struct ahash_request *req)
 
 static int ahash_export(struct ahash_request *req, void *out)
 {
-	struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
 	struct caam_hash_state *state = ahash_request_ctx(req);
+	struct caam_export_state *export = out;
+	int len;
+	u8 *buf;
 
-	memcpy(out, state, sizeof(struct caam_hash_state));
+	if (state->current_buf) {
+		buf = state->buf_1;
+		len = state->buflen_1;
+	} else {
+		buf = state->buf_0;
+		len = state->buflen_0;
+	}
+
+	memcpy(export->buf, buf, len);
+	memcpy(export->caam_ctx, state->caam_ctx, sizeof(export->caam_ctx));
+	export->buflen = len;
+	export->update = state->update;
+	export->final = state->final;
+	export->finup = state->finup;
 
 	return 0;
 }
 
 static int ahash_import(struct ahash_request *req, const void *in)
 {
-	struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
 	struct caam_hash_state *state = ahash_request_ctx(req);
+	const struct caam_export_state *export = in;
 
-	memcpy(state, in, sizeof(struct caam_hash_state));
+	memset(state, 0, sizeof(*state));
+	memcpy(state->buf_0, export->buf, export->buflen);
+	memcpy(state->caam_ctx, export->caam_ctx, sizeof(state->caam_ctx));
+	state->buflen_0 = export->buflen;
+	state->update = export->update;
+	state->final = export->final;
+	state->finup = export->finup;
 
 	return 0;
 }
@@ -1622,6 +1652,7 @@ static struct caam_hash_template driver_hash[] = {
 		.setkey = ahash_setkey,
 		.halg = {
 			.digestsize = SHA1_DIGEST_SIZE,
+			.statesize = sizeof(struct caam_export_state),
 		},
 	},
 	.alg_type = OP_ALG_ALGSEL_SHA1,
@@ -1643,6 +1674,7 @@ static struct caam_hash_template driver_hash[] = {
 		.setkey = ahash_setkey,
 		.halg = {
 			.digestsize = SHA224_DIGEST_SIZE,
+			.statesize = sizeof(struct caam_export_state),
 		},
 	},
 	.alg_type = OP_ALG_ALGSEL_SHA224,
@@ -1664,6 +1696,7 @@ static struct caam_hash_template driver_hash[] = {
 		.setkey = ahash_setkey,
 		.halg = {
 			.digestsize = SHA256_DIGEST_SIZE,
+			.statesize = sizeof(struct caam_export_state),
 		},
 	},
 	.alg_type = OP_ALG_ALGSEL_SHA256,
@@ -1685,6 +1718,7 @@ static struct caam_hash_template driver_hash[] = {
 		.setkey = ahash_setkey,
 		.halg = {
 			.digestsize = SHA384_DIGEST_SIZE,
+			.statesize = sizeof(struct caam_export_state),
 		},
 	},
 	.alg_type = OP_ALG_ALGSEL_SHA384,
@@ -1706,6 +1740,7 @@ static struct caam_hash_template driver_hash[] = {
 		.setkey = ahash_setkey,
 		.halg = {
 			.digestsize = SHA512_DIGEST_SIZE,
+			.statesize = sizeof(struct caam_export_state),
 		},
 	},
 	.alg_type = OP_ALG_ALGSEL_SHA512,
@@ -1727,6 +1762,7 @@ static struct caam_hash_templ
[PATCH 3/5] crypto: caam: fix non-block aligned hash calculation
caam does not properly calculate the size of the retained state when
non-block aligned hashes are requested - it uses the wrong buffer sizes,
which results in errors such as:

caam_jr 2102000.jr1: 4501: DECO: desc idx 5: SGT Length Error. The
descriptor is trying to read more data than is contained in the SGT table.

We end up here with:

	in_len 0x46  blocksize 0x40  last_bufsize 0x0  next_bufsize 0x6
	to_hash 0x40  ctx_len 0x28  nbytes 0x20

which results in a job descriptor of:

jobdesc@889: ed03d918: b0861c08 3daa0080 f140 3d03d938
jobdesc@889: ed03d928: 0068 f840 3cde2a40 0028

where the word at 0xed03d928 is the expected data size (0x68), and a
scatterlist containing:

sg@892: ed03d938: 3cde2a40 0028
sg@892: ed03d948: 3d03d100 0006
sg@892: ed03d958: 7e8aa700 4020

0x68 comes from 0x28 (the context size) plus the "in_len" rounded down to
a block size (0x40). in_len comes from 0x26 bytes of unhashed data from
the previous operation, plus the 0x20 bytes from the latest operation.

The fixed version would create:

sg@892: ed03d938: 3cde2a40 0028
sg@892: ed03d948: 3d03d100 0026
sg@892: ed03d958: 7e8aa700 4020

which replaces the 0x06 length with the correct 0x26 bytes of previously
unhashed data.
Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index 3ce6083c2e43..dcee360065f3 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -829,7 +829,7 @@ static int ahash_update_ctx(struct ahash_request *req)
 			state->buf_dma = try_buf_map_to_sec4_sg(jrdev,
 								edesc->sec4_sg + 1,
 								buf, state->buf_dma,
-								*next_buflen, *buflen);
+								*buflen, last_buflen);
 
 		if (src_nents) {
 			src_map_to_sec4_sg(jrdev, req->src, src_nents,
-- 
2.1.0
[PATCH 2/5] crypto: caam: avoid needlessly saving and restoring caam_hash_ctx
When exporting and importing the hash state, we will only export and
import into hashes which share the same struct crypto_ahash pointer.
(See hash_accept -> af_alg_accept -> hash_accept_parent.)

This means that saving the caam_hash_ctx structure on export, and
restoring it on import, is a waste of resources. So, remove this code.

Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index 2faf71ccbd43..3ce6083c2e43 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -1575,24 +1575,20 @@ static int ahash_final(struct ahash_request *req)
 
 static int ahash_export(struct ahash_request *req, void *out)
 {
 	struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
-	struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
 	struct caam_hash_state *state = ahash_request_ctx(req);
 
-	memcpy(out, ctx, sizeof(struct caam_hash_ctx));
-	memcpy(out + sizeof(struct caam_hash_ctx), state,
-	       sizeof(struct caam_hash_state));
+	memcpy(out, state, sizeof(struct caam_hash_state));
+
 	return 0;
 }
 
 static int ahash_import(struct ahash_request *req, const void *in)
 {
 	struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
-	struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
 	struct caam_hash_state *state = ahash_request_ctx(req);
 
-	memcpy(ctx, in, sizeof(struct caam_hash_ctx));
-	memcpy(state, in + sizeof(struct caam_hash_ctx),
-	       sizeof(struct caam_hash_state));
+	memcpy(state, in, sizeof(struct caam_hash_state));
+
 	return 0;
 }
-- 
2.1.0
[PATCH 0/5] Fix CAAM hash driver
The following series fixes the CAAM hash driver, allowing it to work with
the previously merged "crypto: ahash - ensure statesize is non-zero"
patch.

This is non-trivial, because CAAM exports a huge 1600 bytes of data,
which, if we set .statesize to this, still results in the core code
rejecting the driver. So, we need to shrink the amount of state exported.

The first, most obvious one to get rid of is the export of the
caam_hash_ctx structure, which is shared between the socket being
exported from and imported to - copying it away and back again was a
complete no-op.

The second is that we don't need to export both pending-bytes buffers.
Only one will be in use at any time.

A problem was encountered while testing, where the size of the pending
bytes buffer was not added to the scatterlist with the correct length.
This is also fixed in this series, by patch 3.

 drivers/crypto/caam/caamhash.c | 79 +++---
 1 file changed, 59 insertions(+), 20 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH 0/5] Fix CAAM hash driver
On Sat, Oct 17, 2015 at 07:50:55PM +0100, Russell King - ARM Linux wrote:
> The following series fixes the CAAM hash driver, allowing it to work
> with the previously merged "crypto: ahash - ensure statesize is
> non-zero" patch.
> 
> This is non-trivial, because CAAM exports a huge 1600 bytes of data,
> which, if we set .statesize to this, still results in the core code
> rejecting the driver. So, we need to shrink the amount of state
> exported.
> 
> The first, most obvious one to get rid of is the export of the
> caam_hash_ctx structure, which is shared between the socket being
> exported from and imported to - copying it away and back again was
> a complete no-op.
> 
> The second is that we don't need to export both pending-bytes buffers.
> Only one will be in use at any time.
> 
> A problem was encountered while testing, where the size of the pending
> bytes buffer was not added to the scatterlist with the correct length.
> This is also fixed in this series, by patch 3.
> 
>  drivers/crypto/caam/caamhash.c | 79 +++---
>  1 file changed, 59 insertions(+), 20 deletions(-)

This series has been tested using openssl to validate the hashes against
the md5sum, sha1sum, sha224sum, sha256sum and sha512sum programs for
every program in my /bin directory, which checks correct functioning of
the hashing.

It has also been tested with openssh, where the crypto hardware is used
via openssl by sshd, which makes use of accept() and hence tests the
export/import paths.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state
On Thu, Oct 15, 2015 at 05:41:47PM +0800, Herbert Xu wrote:
> On Thu, Oct 15, 2015 at 10:39:30AM +0100, Russell King - ARM Linux wrote:
> >
> > The CAAM driver is similarly buggy - it has export/import functions in
> > its ahash drivers, but zero statesize.
> >
> > User exploitable kernel stack smashing... I'd suggest putting this patch
> > into stable kernels as high priority, as I'm pretty sure this could be
> 
> I agree. It should already be on its way to stable as Linus has
> pulled it into his tree and it carries a stable cc.

Thanks.

I think the CAAM driver is pretty unfixable from a trivial point of view.
This driver exports a huge amount of state - it contains both a struct
caam_hash_ctx and a struct caam_hash_state, which totals up to 1600
bytes. This fails the:

	alg->halg.statesize > PAGE_SIZE / 8

check in ahash_prepare_alg() if we set .statesize. For ARM, this places a
limit of 512 bytes on the state size.

The CAAM authors need to come up with a better solution (and quickly, as
caamhash is going to fail in all kernels soon), or we need to support
larger exported states.

BTW, I can't find a MAINTAINERS entry for CAAM, so I've just grabbed a
couple of addresses from recent git history in the hope they'll know
who's responsible.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state
On Tue, Oct 13, 2015 at 10:33:12PM +0800, Herbert Xu wrote: > On Fri, Oct 09, 2015 at 08:43:33PM +0100, Russell King wrote: > > If the algorithm passed a zero statesize, do not pass a valid pointer > > into the export/import functions. Passing a valid pointer covers up > > bugs in driver code which then go on to smash the kernel stack. > > Instead, pass NULL, which will cause any attempt to write to the > > pointer to fail. > > > > Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> > > Patch applied without the shash hunk. I also replaced your commit > message as it no longer makes any sense: > > crypto: ahash - ensure statesize is non-zero > > Unlike shash algorithms, ahash drivers must implement export > and import as their descriptors may contain hardware state and > cannot be exported as is. Unfortunately some ahash drivers did > not provide them and end up causing crashes with algif_hash. > > This patch adds a check to prevent these drivers from registering > ahash algorithms until they are fixed. > > Thanks, There will be fallout from this. The CAAM driver is similarly buggy - it has export/import functions in its ahash drivers, but zero statesize. User exploitable kernel stack smashing... I'd suggest putting this patch into stable kernels as high priority, as I'm pretty sure this could be used to gain privileges via carefully crafted md5 hashes. I've not proven it, but given that the md5 hash and state data get copied over the kernel stack, it's highly likely that it _is_ exploitable from any user that can create an AF_ALG socket. Yes, it means regressions in the form of various hw crypto no longer being loadable, but I think that's preferable to the security hole here. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. 
Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions
On Tue, Oct 13, 2015 at 09:00:48PM +0800, Herbert Xu wrote: > On Sun, Oct 11, 2015 at 02:55:05PM +0800, Herbert Xu wrote: > > On Sat, Oct 10, 2015 at 05:17:54PM +0100, Russell King - ARM Linux wrote: > > > > > > I _also_ notice that despite having the ARM assembly crypto functions > > > enabled and built-in to the kernel, they don't appear in /proc/crypto - > > > and this is because of patch 1 in this series, which blocks out any > > > crypto driver which has a zero statesize (as the ARM crypto functions > > > appear to have.) I found this by rebuilding the ARM crypto stuff as > > > modules, and then trying to insert them: > > > > They're buggy and unfortunately this wasn't caught during the > > review process. The import/export functions are required and > > certainly not optional. > > OK it looks like I was confused. Yes the import and export functions > are required but for shash algorithms we provide a simple default. > This means that implementations can simply leave it NULL and they > will then use the default import/export functions which exports > the shash_desc as the state. > > So Russell what I'll do is apply your patch without the hunk > for shash_prepare_alg. IOW we will block any ahash algorithms > that leave state as zero since ahash does not provide a default > import/export function (it can't because it may involve hardware > state). shash algorithms on the other hand won't be affected. Okay. I've a larger queue of patches building at the moment for the Marvell driver - I've found that it's touch-and-go whether it produces the correct hash result, and previous hashes can affect subsequent hashes in some circumstances. 
I'm currently at another 7 patches, plus a "big" as-yet uncommitted patch which is virtually a rewrite of the DMA preparation - diffstat-wise, in total it's looking like:

 drivers/crypto/marvell/cesa.h |   8 +-
 drivers/crypto/marvell/hash.c | 258 ++
 drivers/crypto/marvell/tdma.c |   7 ++
 3 files changed, 151 insertions(+), 122 deletions(-)

and there are still a few more cases that need solving: zero bytes of input should not give a zero hash, and an input which is a multiple of the block size causes the hardware to hang - though I'm not yet sure whether that's a result of my above changes. At least with my above changes, I now get stable and correct hash values for sizes which do not hit any of those remaining bugs. Given its size so far, I'm not sure it makes much sense to backport any of these fixes to stable; I think maybe we should just say "okay, before these fixes, it's just broken." -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions
On Tue, Oct 13, 2015 at 09:57:22PM +0800, Herbert Xu wrote: > On Tue, Oct 13, 2015 at 02:55:18PM +0100, Russell King - ARM Linux wrote: > > > > Given its size so far, I'm not sure it makes much sense to backport > > any of these fixes to stable; I think maybe we should just say "okay, > > before these fixes, it's just broken." > > OK as long as your first patch goes into stable it should be fine > as it'll essentially disable the marvell driver. It will disable the Marvell driver, along with the SW implementations in arch/arm/crypto/ which don't set .statesize, .import and .export. I would not be surprised if it also affects others too. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
hash import/export
Herbert, I wonder if you can clear something up about the hash export/import functionality. In: http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/344120.html you seem to imply that the exported and imported state can't be defined by the driver. Boris tells me, "AFAIR, crypto users are expecting md5/sha1/sha256_state depending on the crypto req state they are exporting". From what I can see, there is only one place in the core crypto code where hash state is exported and imported, and that's in hash_accept(), and that's always done with the same driver. The only other place is in the marvell cesa driver itself when initialising the hmac state. Is there any reason a driver can't define its own structure to be exported here which can be shared between each of the different methods it supports? Thanks. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: hash import/export
On Sun, Oct 11, 2015 at 08:34:27PM +0100, Russell King - ARM Linux wrote: > Herbert, > > I wonder if you can clear something up about the hash export/import > functionality. In: > > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/344120.html > > you seem to imply that the exported and imported state can't be defined > by the driver. > > Boris tells me, "AFAIR, crypto users are expecting md5/sha1/sha256_state > depending on the crypto req state they are exporting". > > From what I can see, there is only one place in the core crypto code > where hash state is exported and imported, and that's in hash_accept(), > and that's always done with the same driver. The only other place is > in the marvell cesa driver itself when initialising the hmac state. > > Is there any reason a driver can't define its own structure to be > exported here which can be shared between each of the different methods > it supports? A further question: it looks like struct md5_state's hash member is always supposed to be in little endian format, whereas struct sha*_state's state is always big endian format. Is there a reason why these are typed 'u32', rather than __le32 and __be32? This would make the intention here more obvious, and should also allow the opportunity for sparse to verify that we're getting the endianness correct everywhere. Thanks. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state
On Sat, Oct 10, 2015 at 06:46:07PM +0200, Boris Brezillon wrote: > On Fri, 09 Oct 2015 20:43:33 +0100 > Russell King <rmk+ker...@arm.linux.org.uk> wrote: > > > If the algorithm passed a zero statesize, do not pass a valid pointer > > into the export/import functions. Passing a valid pointer covers up > > bugs in driver code which then go on to smash the kernel stack. > > Instead, pass NULL, which will cause any attempt to write to the > > pointer to fail. > > > > Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> > > --- > > crypto/ahash.c | 3 ++- > > crypto/shash.c | 3 ++- > > 2 files changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/crypto/ahash.c b/crypto/ahash.c > > index 8acb886032ae..9c1dc8d6106a 100644 > > --- a/crypto/ahash.c > > +++ b/crypto/ahash.c > > @@ -544,7 +544,8 @@ static int ahash_prepare_alg(struct ahash_alg *alg) > > struct crypto_alg *base = &alg->halg.base; > > > > if (alg->halg.digestsize > PAGE_SIZE / 8 || > > - alg->halg.statesize > PAGE_SIZE / 8) > > + alg->halg.statesize > PAGE_SIZE / 8 || > > + alg->halg.statesize == 0) > > Just read Russel's answer to the cover letter, and I wonder if the > following test wouldn't fix the problem: > > (alg->halg.statesize == 0 && (alg->import || alg->export)) > > I mean, the only valid case where statesize can be zero is when you > don't have any state associated to the crypto algorithm, and if that's > the case, ->import() and ->export() functions are useless, isn't ? However, that brings up a question. If you're using AF_ALG, and you attach to (say) the ARM Neon SHA512 implementation through it, and then use accept() to duplicate its state, what prevents the kernel from oopsing when hash_accept() calls crypto_ahash_export(), which then dereferences the NULL alg->export function pointer?
Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions
On Sat, Oct 10, 2015 at 12:29:25PM +0100, Russell King - ARM Linux wrote: > On Sat, Oct 10, 2015 at 12:31:29PM +0200, Arnaud Ebalard wrote: > > Russell King - ARM Linux <li...@arm.linux.org.uk> writes: > > > Software: > > > The 'numbers' are in 1000s of bytes per second processed.
> > > type         16 bytes     64 bytes    256 bytes   1024 bytes    8192 bytes
> > > md5         13948.89k    42477.61k   104619.41k   165140.82k    199273.13k
> > > sha1        13091.91k    36463.89k    75393.88k   103893.33k    117104.50k
> > > sha256      13573.92k    30492.25k    52700.33k    64247.81k     68722.69k
> > >
> > > Hardware:
> > > The 'numbers' are in 1000s of bytes per second processed.
> > > type         16 bytes     64 bytes    256 bytes   1024 bytes    8192 bytes
> > > md5          3964.55k    13782.11k    43181.71k   180263.38k   1446616.18k
> > > sha1         4609.16k     8922.35k    35422.87k   333575.31k   2122547.20k
> > > sha256      13519.62k    30484.10k    52547.47k    64285.21k     68530.60k
> Okay, the reason for the difference in SHA256 speed is because the > "openssl speed" code *totally* *bypasses* the engine support, whereas > the md5 and sha1 do not. It even bypasses the normal method used to > get hold of the sha256 implementation (EVP_sha256), and goes straight > to using SHA256() directly in openssl/crypto/sha/sha256.c. It looks > like the same goes for the AES tests too. Okay, what's required is openssl speed -evp <cipher|digest> and it can only do one at a time. That's really confusing and non-intuitive, given that some of the non-evp options end up testing the EVP methods. So, running: $ openssl speed -evp aes-128-cbc I get the kernel spitting out a number of these warnings during its run: DRBG: could not allocate digest TFM handle: hmac(sha512) I also notice that 50% of CPU time is spent in the kernel, presumably because the lack of hardware sha512 means that we end up having to do sha512 in software in the kernel.
I _also_ notice that despite having the ARM assembly crypto functions enabled and built-in to the kernel, they don't appear in /proc/crypto - and this is because of patch 1 in this series, which blocks out any crypto driver which has a zero statesize (as the ARM crypto functions appear to have.) I found this by rebuilding the ARM crypto stuff as modules, and then trying to insert them:

# modprobe sha512-arm
modprobe: ERROR: could not insert 'sha512_arm': Invalid argument

So, I think it's best if this patch series is *NOT* merged until someone who knows the kernel crypto code gets to grips with what's supposed to be done in various parts of the code. Yes, I know that Herbert suggested the approach in patch 1, but that _will_ cause regressions like the above if it's merged. Either the ARM crypto code in arch/arm/crypto is buggy, or the approach in patch 1 is buggy. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions
On Sat, Oct 10, 2015 at 12:31:29PM +0200, Arnaud Ebalard wrote: > Hi Russel, ^ > Russell King - ARM Linux <li...@arm.linux.org.uk> writes: > > Software: > > The 'numbers' are in 1000s of bytes per second processed.
> > type         16 bytes     64 bytes    256 bytes   1024 bytes    8192 bytes
> > md5         13948.89k    42477.61k   104619.41k   165140.82k    199273.13k
> > sha1        13091.91k    36463.89k    75393.88k   103893.33k    117104.50k
> > sha256      13573.92k    30492.25k    52700.33k    64247.81k     68722.69k
> >
> > Hardware:
> > The 'numbers' are in 1000s of bytes per second processed.
> > type         16 bytes     64 bytes    256 bytes   1024 bytes    8192 bytes
> > md5          3964.55k    13782.11k    43181.71k   180263.38k   1446616.18k
> > sha1         4609.16k     8922.35k    35422.87k   333575.31k   2122547.20k
> > sha256      13519.62k    30484.10k    52547.47k    64285.21k     68530.60k
Okay, the reason for the difference in SHA256 speed is because the "openssl speed" code *totally* *bypasses* the engine support, whereas the md5 and sha1 do not. It even bypasses the normal method used to get hold of the sha256 implementation (EVP_sha256), and goes straight to using SHA256() directly in openssl/crypto/sha/sha256.c. It looks like the same goes for the AES tests too. > I had a lot of performance results at various levels (tcrypt module on > variations of the drivers (tasklet, threaded irq, full polling, etc), > IPsec tunnel and transport mode through to see how it behaves w/ two > mvneta instances also eating CPU cycles for incoming/outgoing packets) > but those were done on an encryption use case. Some are provided > in [2]. In an early (read dirty) polling-based version of the driver, > the CESA on an Armada 370 (mirabox) was verified to be capable of near > 100MB/s on buffers of 1500+ bytes for AES CBC encryption. Current > version of the driver is not as good (say half that value) but it > behaves better.
A Mirabox can easily route 1500-byte packets at 100MB/s > between its two interfaces but when you mix both using IPsec in tunnel > mode on one side, you end up w/ perfs between 10 and 15MB/s, IIRC. I > think it's interesting to see where it ends up w/ the engine exposed to > userland consumers (e.g. sth like SSH). > > I cannot promise a huge amount of time but I'll try and find some to > play w/ AF_ALG using openssl and CESA in the coming weeks. I think what we draw from my investigation is that "openssl speed" is utterly crap - you don't actually know what's being tested there. Some things test the engine, others bypass the engine infrastructure totally and test the openssl software implementation instead. So, if you think "openssl speed" is a good way to measure the speed of digests and ciphers that openssl supplies to applications, *think again*. It doesn't. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions
On Sat, Oct 10, 2015 at 12:37:33AM +0200, Arnaud Ebalard wrote: > Hi Russel, > > Russell King <rmk+ker...@arm.linux.org.uk> writes: > > > As all the import functions and export functions are virtually > > identical, factor out their common parts into a generic > > mv_cesa_ahash_import() and mv_cesa_ahash_export() respectively. This > > performs the actual import or export, and we pass the data pointers and > > length into these functions. > > > > We have to switch a % const operation to do_div() in the common import > > function to avoid provoking gcc to use the expensive 64-bit by 64-bit > > modulus operation. > > > > Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk> > > Thanks for the refactoring and for the fixes. All patches look good to > me. Out of curiosity, can I ask what perf you get w/ openssh or openssl > using AF_ALG and the CESA? I would do, but it seems this AF_ALG plugin for openssl isn't actually using it for encryption. When I try: openssl speed -engine af_alg aes-128-cbc I get results for using openssl's software implementation. If I do: openssl speed -engine af_alg md5 then I get results from using the kernel's MD5. Hence, I think the only thing that I think openssh is using it for is the digest stuff, not the crypto itself. I can't be certain about that though. I've tried debugging the af_alg engine plugin, but I'm not getting very far (I'm not an openssl hacker!) I see it registering the function to get the ciphers (via ENGINE_set_ciphers), and I see this called several times, returning a list of NID_xxx values describing the methods it supports, which includes aes-128-cbc. However, unlike the equivalent digest function, I never see it called requesting any of the ciphers. Maybe it's an openssl bug, or a "feature" preventing hardware crypto? Maybe something is missing from its initialisation? I've no idea yet. 
It seems I'm not alone in this - this report from April 2015 is exactly what I'm seeing: https://mta.openssl.org/pipermail/openssl-users/2015-April/001124.html However, I'm coming to the conclusion that AF_ALG with openssl is a dead project, and the only interface that everyone is using for that is cryptodev - probably contrary to Herbert and/or DaveM's wishes. For example, the openwrt guys seem to only support cryptodev, according to their wiki page on the subject of hardware crypto: http://wiki.openwrt.org/doc/hardware/cryptographic.hardware.accelerators Here are the references to code for AF_ALG with openssl I've found so far: Original af_alg plugin (dead): http://src.carnivore.it/users/common/af_alg/ 3rd party "maintained" af_alg openssl plugin, derived from commit 1851bbb010c38878c83729be844f168192059189 in the above repo but with no history: https://github.com/RidgeRun/af-alg-rr and that doesn't contain any changes to the C code originally committed. Whether this C code contains changes or not is anyone's guess: there's no way to refer back to the original repository. Anyway, here are the digest results:

Software:
The 'numbers' are in 1000s of bytes per second processed.
type         16 bytes     64 bytes    256 bytes   1024 bytes    8192 bytes
md5         13948.89k    42477.61k   104619.41k   165140.82k    199273.13k
sha1        13091.91k    36463.89k    75393.88k   103893.33k    117104.50k
sha256      13573.92k    30492.25k    52700.33k    64247.81k     68722.69k

Hardware:
The 'numbers' are in 1000s of bytes per second processed.
type         16 bytes     64 bytes    256 bytes   1024 bytes    8192 bytes
md5          3964.55k    13782.11k    43181.71k   180263.38k   1446616.18k
sha1         4609.16k     8922.35k    35422.87k   333575.31k   2122547.20k
sha256      13519.62k    30484.10k    52547.47k    64285.21k     68530.60k

There's actually something suspicious while running these tests:

Doing md5 for 3s on 16 size blocks: 32212 md5's in 0.13s
Doing md5 for 3s on 64 size blocks: 23688 md5's in 0.11s
Doing md5 for 3s on 256 size blocks: 23615 md5's in 0.14s
Doing md5 for 3s on 1024 size blocks: 22885 md5's in 0.13s
Doing md5 for 3s on 8192 size blocks: 15893 md5's in 0.09s
Doing sha1 for 3s on 16 size blocks: 31688 sha1's in 0.11s
Doing sha1 for 3s on 64 size blocks: 23700 sha1's in 0.17s
Doing sha1 for 3s on 256 size blocks: 23523 sha1's in 0.17s
Doing sha1 for 3s on 1024 size blocks: 22803 sha1's in 0.07s
Doing sha1 for 3s on 8192 size blocks: 15546 sha1's in 0.06s
Doing sha256 for 3s on 16 size blocks: 2518030 sha256's in 2.98s
Doing sha256 for 3s on 64 size blocks: 1419416 sha256's in 2.98s
Doing sha256 for 3s on 256 size blocks: 613738 sha256's in 2.99s
Doing sha256 for 3s on 1024 size blocks: 187080 sha256's in 2.98s
Doing sha256 for 3s on 8192 size blocks: 25013 sha256's in 2.99s

from the hardware - note the "in" figures are ridiculously low, yet t
[PATCH 3/3] crypto: marvell: initialise struct mv_cesa_ahash_req
When an AF_ALG fd is accepted a second time (hence hash_accept() is used), hash_accept_parent() allocates a new private context using sock_kmalloc(). This context is uninitialised. After use of the new fd, we eventually end up with the kernel complaining:

  marvell-cesa f109.crypto: dma_pool_free cesa_padding, c0627770/0 (bad dma)

where c0627770 is a random address. Poisoning the memory allocated by the above sock_kmalloc() produces kernel oopses within the marvell hash code, particularly the interrupt handling. The following simplified call sequence occurs:

  hash_accept()
    crypto_ahash_export()
      marvell hash export function
    af_alg_accept()
      hash_accept_parent()    <== allocates uninitialised struct hash_ctx
    crypto_ahash_import()
      marvell hash import function

hash_ctx contains the struct mv_cesa_ahash_req in its req.__ctx member, and, as the marvell hash import function only partially initialises this structure, we end up with a lot of members which are left with whatever data was in memory prior to sock_kmalloc(). Add zero-initialisation of this structure.
Signed-off-by: Russell King <rmk+ker...@arm.linux.org.uk>
---
 drivers/crypto/marvell/hash.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index a259aced3b42..699839816505 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -831,6 +831,8 @@ static int mv_cesa_md5_import(struct ahash_request *req, const void *in)
 	unsigned int cache_ptr;
 	int ret;
 
+	memset(creq, 0, sizeof(*creq));
+
 	creq->len = in_state->byte_count;
 	memcpy(creq->state, in_state->hash, digsize);
 	creq->cache_ptr = 0;
@@ -921,6 +923,8 @@ static int mv_cesa_sha1_import(struct ahash_request *req, const void *in)
 	unsigned int cache_ptr;
 	int ret;
 
+	memset(creq, 0, sizeof(*creq));
+
 	creq->len = in_state->count;
 	memcpy(creq->state, in_state->state, digsize);
 	creq->cache_ptr = 0;
@@ -1022,6 +1026,8 @@ static int mv_cesa_sha256_import(struct ahash_request *req, const void *in)
 	unsigned int cache_ptr;
 	int ret;
 
+	memset(creq, 0, sizeof(*creq));
+
 	creq->len = in_state->count;
 	memcpy(creq->state, in_state->state, digsize);
 	creq->cache_ptr = 0;
-- 
2.1.0