Re: [PATCH v9 00/20] simplify crypto wait for async op

2017-10-17 Thread Russell King - ARM Linux
On Sun, Oct 15, 2017 at 10:19:45AM +0100, Gilad Ben-Yossef wrote:
> Many users of kernel async. crypto services have a pattern of
> starting an async. crypto op and than using a completion
> to wait for it to end.
> 
> This patch set simplifies this common use case in two ways:
> 
> First, by separating the return codes of the case where a
> request is queued to a backlog due to the provider being
> busy (-EBUSY) from the case the request has failed due
> to the provider being busy and backlogging is not enabled
> (-EAGAIN).
> 
> Next, this change is than built on to create a generic API
> to wait for a async. crypto operation to complete.
> 
> The end result is a smaller code base and an API that is
> easier to use and more difficult to get wrong.
> 
> The patch set was boot tested on x86_64 and arm64 which
> at the very least tests the crypto users via testmgr and
> tcrypt but I do note that I do not have access to some
> of the HW whose drivers are modified nor do I claim I was
> able to test all of the corner cases.
> 
> The patch set is based upon linux-next release tagged
> next-20171013.

Has there been any performance impact analysis of these changes?  I
ended up with patches for one of the crypto drivers which converted
its interrupt handling to threaded interrupts being reverted because
it caused a performance degredation.

Moving code to latest APIs to simplify it is not always beneficial.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [BISECT] ARM build errors on GCC v6.2 (crypto: arm/aes - replace scalar AES cipher)

2017-01-14 Thread Russell King - ARM Linux
On Sat, Jan 14, 2017 at 04:24:35PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> allyesconfig and multi_v7_defconfig fail to build on recent linux-next
> on GCC 6.2.0.
> 
> Errors:
> ../arch/arm/crypto/aes-cipher-core.S: Assembler messages:
> ../arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not 
> support `tt .req ip' in ARM mode
> ../arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- 
> `movw tt,#:lower16:crypto_ft_tab'
> ../arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- 
> `movt tt,#:upper16:crypto_ft_tab'
> 
> Compiler: arm-linux-gnueabi-gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005

It's worth pointing out that _assembler_ messages come from _binutils_
rather than _gcc_.   Please advise which version of _binutils_ you
are using.  Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] ARM: wire up HWCAP2 feature bits to the CPU modalias

2017-01-02 Thread Russell King - ARM Linux
On Mon, Jan 02, 2017 at 09:06:04PM +, Ard Biesheuvel wrote:
> On 31 October 2016 at 16:13, Russell King - ARM Linux
> <li...@armlinux.org.uk> wrote:
> > On Sat, Oct 29, 2016 at 11:08:36AM +0100, Ard Biesheuvel wrote:
> >> On 18 October 2016 at 11:52, Ard Biesheuvel <ard.biesheu...@linaro.org> 
> >> wrote:
> >> > Wire up the generic support for exposing CPU feature bits via the
> >> > modalias in /sys/device/system/cpu. This allows udev to automatically
> >> > load modules for things like crypto algorithms that are implemented
> >> > using optional instructions.
> >> >
> >> > Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
> >> > ---
> >> >  arch/arm/Kconfig  |  1 +
> >> >  arch/arm/include/asm/cpufeature.h | 32 
> >> >  2 files changed, 33 insertions(+)
> >> >
> >>
> >> Russell,
> >>
> >> do you have any concerns regarding this patch? If not, I will drop it
> >> into the patch system.
> >
> > It's still something I need to look at... I've been offline last week,
> > and sort-of offline the previous week, so I'm catching up.
> >
> 
> Hi Russell,
> 
> Any thoughts yet?

None, and the patch is well buried now that it'll take me a while to
find... back in mid-October?  Yea, I'll have to drop everything and
go digging through my mailboxes to find it... and I'm just catching
up (again) after a week and a bit's time offline - yep, it's wonderful
timing.  Sorry, no time to look at it right now, you're not the only
one wanting my attention at the moment.

Please try again in about a week's time - don't leave it a few months,
and please include the patch.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/14] Revert "crypto: caam - get rid of tasklet"

2016-11-09 Thread Russell King - ARM Linux
Please include Thomas in this.

On Wed, Nov 09, 2016 at 10:46:21AM +0200, Horia Geantă wrote:
> This reverts commit 66d2e2028091a074aa1290d2eeda5ddb1a6c329c.
> 
> Quoting from Russell's findings:
> https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg21136.html
> 
> [quote]
> Okay, I've re-tested, using a different way of measuring, because using
> openssl speed is impractical for off-loaded engines.  I've decided to
> use this way to measure the performance:
> 
> dd if=/dev/zero bs=1048576 count=128 | /usr/bin/time openssl dgst -md5
> 
> For the threaded IRQs case gives:
> 
> 0.05user 2.74system 0:05.30elapsed 52%CPU (0avgtext+0avgdata 2400maxresident)k
> 0.06user 2.52system 0:05.18elapsed 49%CPU (0avgtext+0avgdata 2404maxresident)k
> 0.12user 2.60system 0:05.61elapsed 48%CPU (0avgtext+0avgdata 2460maxresident)k
>   => 5.36s => 25.0MB/s
> 
> and the tasklet case:
> 
> 0.08user 2.53system 0:04.83elapsed 54%CPU (0avgtext+0avgdata 2468maxresident)k
> 0.09user 2.47system 0:05.16elapsed 49%CPU (0avgtext+0avgdata 2368maxresident)k
> 0.10user 2.51system 0:04.87elapsed 53%CPU (0avgtext+0avgdata 2460maxresident)k
>   => 4.95 => 27.1MB/s
> 
> which corresponds to an 8% slowdown for the threaded IRQ case.  So,
> tasklets are indeed faster than threaded IRQs.
> 
> [...]
> 
> I think I've proven from the above that this patch needs to be reverted
> due to the performance regression, and that there _is_ most definitely
> a deterimental effect of switching from tasklets to threaded IRQs.
> [/quote]
> 
> Signed-off-by: Horia Geantă 
> ---
>  drivers/crypto/caam/intern.h |  1 +
>  drivers/crypto/caam/jr.c | 25 -
>  2 files changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h
> index 5d4c05074a5c..e2bcacc1a921 100644
> --- a/drivers/crypto/caam/intern.h
> +++ b/drivers/crypto/caam/intern.h
> @@ -41,6 +41,7 @@ struct caam_drv_private_jr {
>   struct device   *dev;
>   int ridx;
>   struct caam_job_ring __iomem *rregs;/* JobR's register space */
> + struct tasklet_struct irqtask;
>   int irq;/* One per queue */
>  
>   /* Number of scatterlist crypt transforms active on the JobR */
> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index 7331ea734f37..c8604dfadbf5 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -73,6 +73,8 @@ static int caam_jr_shutdown(struct device *dev)
>  
>   ret = caam_reset_hw_jr(dev);
>  
> + tasklet_kill(>irqtask);
> +
>   /* Release interrupt */
>   free_irq(jrp->irq, dev);
>  
> @@ -128,7 +130,7 @@ static irqreturn_t caam_jr_interrupt(int irq, void 
> *st_dev)
>  
>   /*
>* Check the output ring for ready responses, kick
> -  * the threaded irq if jobs done.
> +  * tasklet if jobs done.
>*/
>   irqstate = rd_reg32(>rregs->jrintstatus);
>   if (!irqstate)
> @@ -150,13 +152,18 @@ static irqreturn_t caam_jr_interrupt(int irq, void 
> *st_dev)
>   /* Have valid interrupt at this point, just ACK and trigger */
>   wr_reg32(>rregs->jrintstatus, irqstate);
>  
> - return IRQ_WAKE_THREAD;
> + preempt_disable();
> + tasklet_schedule(>irqtask);
> + preempt_enable();
> +
> + return IRQ_HANDLED;
>  }
>  
> -static irqreturn_t caam_jr_threadirq(int irq, void *st_dev)
> +/* Deferred service handler, run as interrupt-fired tasklet */
> +static void caam_jr_dequeue(unsigned long devarg)
>  {
>   int hw_idx, sw_idx, i, head, tail;
> - struct device *dev = st_dev;
> + struct device *dev = (struct device *)devarg;
>   struct caam_drv_private_jr *jrp = dev_get_drvdata(dev);
>   void (*usercall)(struct device *dev, u32 *desc, u32 status, void *arg);
>   u32 *userdesc, userstatus;
> @@ -230,8 +237,6 @@ static irqreturn_t caam_jr_threadirq(int irq, void 
> *st_dev)
>  
>   /* reenable / unmask IRQs */
>   clrsetbits_32(>rregs->rconfig_lo, JRCFG_IMSK, 0);
> -
> - return IRQ_HANDLED;
>  }
>  
>  /**
> @@ -389,10 +394,11 @@ static int caam_jr_init(struct device *dev)
>  
>   jrp = dev_get_drvdata(dev);
>  
> + tasklet_init(>irqtask, caam_jr_dequeue, (unsigned long)dev);
> +
>   /* Connect job ring interrupt handler. */
> - error = request_threaded_irq(jrp->irq, caam_jr_interrupt,
> -  caam_jr_threadirq, IRQF_SHARED,
> -  dev_name(dev), dev);
> + error = request_irq(jrp->irq, caam_jr_interrupt, IRQF_SHARED,
> + dev_name(dev), dev);
>   if (error) {
>   dev_err(dev, "can't connect JobR %d interrupt (%d)\n",
>   jrp->ridx, jrp->irq);
> @@ -454,6 +460,7 @@ static int caam_jr_init(struct device *dev)
>  out_free_irq:
>   free_irq(jrp->irq, dev);
>  out_kill_deq:
> + tasklet_kill(>irqtask);
>   return error;
>  }
>  
> -- 
> 

Re: [PATCH 1/5] ARM: wire up HWCAP2 feature bits to the CPU modalias

2016-10-31 Thread Russell King - ARM Linux
On Sat, Oct 29, 2016 at 11:08:36AM +0100, Ard Biesheuvel wrote:
> On 18 October 2016 at 11:52, Ard Biesheuvel  wrote:
> > Wire up the generic support for exposing CPU feature bits via the
> > modalias in /sys/device/system/cpu. This allows udev to automatically
> > load modules for things like crypto algorithms that are implemented
> > using optional instructions.
> >
> > Signed-off-by: Ard Biesheuvel 
> > ---
> >  arch/arm/Kconfig  |  1 +
> >  arch/arm/include/asm/cpufeature.h | 32 
> >  2 files changed, 33 insertions(+)
> >
> 
> Russell,
> 
> do you have any concerns regarding this patch? If not, I will drop it
> into the patch system.

It's still something I need to look at... I've been offline last week,
and sort-of offline the previous week, so I'm catching up.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND][PATCH] crypto: caam: add support for iMX6UL

2016-10-31 Thread Russell King - ARM Linux
On Mon, Oct 17, 2016 at 01:28:00PM +0200, Marcus Folkesson wrote:
> i.MX6UL does only require three clocks to enable CAAM module.
> 
> Signed-off-by: Marcus Folkesson 
> Acked-by: Rob Herring 
> Reviewed-by: Horia Geantă 
> ---
>  .../devicetree/bindings/crypto/fsl-sec4.txt| 20 +
>  drivers/crypto/caam/ctrl.c | 35 
> --
>  2 files changed, 40 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt 
> b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
> index adeca34..10a425f 100644
> --- a/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
> +++ b/Documentation/devicetree/bindings/crypto/fsl-sec4.txt
> @@ -123,6 +123,9 @@ PROPERTIES
>  
>  
>  EXAMPLE
> +
> +iMX6QDL/SX requires four clocks
> +
>   crypto@30 {
>   compatible = "fsl,sec-v4.0";
>   fsl,sec-era = <2>;
> @@ -139,6 +142,23 @@ EXAMPLE
>   clock-names = "mem", "aclk", "ipg", "emi_slow";
>   };
>  
> +
> +iMX6UL does only require three clocks
> +
> + crypto: caam@214 {
> + compatible = "fsl,sec-v4.0";
> + #address-cells = <1>;
> + #size-cells = <1>;
> + reg = <0x214 0x3c000>;
> + ranges = <0 0x214 0x3c000>;
> + interrupts = ;
> +
> + clocks = < IMX6UL_CLK_CAAM_MEM>,
> +  < IMX6UL_CLK_CAAM_ACLK>,
> +  < IMX6UL_CLK_CAAM_IPG>;
> + clock-names = "mem", "aclk", "ipg";
> + };
> +
>  =
>  Job Ring (JR) Node
>  
> diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
> index 0ec112e..5abaf37 100644
> --- a/drivers/crypto/caam/ctrl.c
> +++ b/drivers/crypto/caam/ctrl.c
> @@ -329,8 +329,8 @@ static int caam_remove(struct platform_device *pdev)
>   clk_disable_unprepare(ctrlpriv->caam_ipg);
>   clk_disable_unprepare(ctrlpriv->caam_mem);
>   clk_disable_unprepare(ctrlpriv->caam_aclk);
> - clk_disable_unprepare(ctrlpriv->caam_emi_slow);
> -
> + if (!of_machine_is_compatible("fsl,imx6ul"))
> + clk_disable_unprepare(ctrlpriv->caam_emi_slow);

There is no need to re-lookup the platform here.  Can't you check the
validity of ctrlpriv->caam_emi_slow ?

>   return 0;
>  }
>  
> @@ -481,14 +481,16 @@ static int caam_probe(struct platform_device *pdev)
>   }
>   ctrlpriv->caam_aclk = clk;
>  
> - clk = caam_drv_identify_clk(>dev, "emi_slow");
> - if (IS_ERR(clk)) {
> - ret = PTR_ERR(clk);
> - dev_err(>dev,
> - "can't identify CAAM emi_slow clk: %d\n", ret);
> - return ret;
> + if (!of_machine_is_compatible("fsl,imx6ul")) {
> + clk = caam_drv_identify_clk(>dev, "emi_slow");
> + if (IS_ERR(clk)) {
> + ret = PTR_ERR(clk);
> + dev_err(>dev,
> + "can't identify CAAM emi_slow clk: %d\n", ret);
> + return ret;
> + }
> + ctrlpriv->caam_emi_slow = clk;
>   }
> - ctrlpriv->caam_emi_slow = clk;
>  
>   ret = clk_prepare_enable(ctrlpriv->caam_ipg);
>   if (ret < 0) {
> @@ -509,11 +511,13 @@ static int caam_probe(struct platform_device *pdev)
>   goto disable_caam_mem;
>   }
>  
> - ret = clk_prepare_enable(ctrlpriv->caam_emi_slow);
> - if (ret < 0) {
> - dev_err(>dev, "can't enable CAAM emi slow clock: %d\n",
> - ret);
> - goto disable_caam_aclk;
> + if (!of_machine_is_compatible("fsl,imx6ul")) {

Same here.

> + ret = clk_prepare_enable(ctrlpriv->caam_emi_slow);
> + if (ret < 0) {
> + dev_err(>dev, "can't enable CAAM emi slow clock: 
> %d\n",
> + ret);
> + goto disable_caam_aclk;
> + }
>   }
>  
>   /* Get configuration properties from device tree */
> @@ -829,7 +833,8 @@ caam_remove:
>  iounmap_ctrl:
>   iounmap(ctrl);
>  disable_caam_emi_slow:
> - clk_disable_unprepare(ctrlpriv->caam_emi_slow);
> + if (!of_machine_is_compatible("fsl,imx6ul"))
> + clk_disable_unprepare(ctrlpriv->caam_emi_slow);

and here.

>  disable_caam_aclk:
>   clk_disable_unprepare(ctrlpriv->caam_aclk);
>  disable_caam_mem:
> -- 
> 2.8.0
> 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: crypto: caam from tasklet to threadirq

2016-09-20 Thread Russell King - ARM Linux
Okay, I've re-tested, using a different way of measuring, because using
openssl speed is impractical for off-loaded engines.  I've decided to
use this way to measure the performance:

dd if=/dev/zero bs=1048576 count=128 | /usr/bin/time openssl dgst -md5

For the threaded IRQs case gives:

0.05user 2.74system 0:05.30elapsed 52%CPU (0avgtext+0avgdata 2400maxresident)k
0.06user 2.52system 0:05.18elapsed 49%CPU (0avgtext+0avgdata 2404maxresident)k
0.12user 2.60system 0:05.61elapsed 48%CPU (0avgtext+0avgdata 2460maxresident)k
=> 5.36s => 25.0MB/s

and the tasklet case:

0.08user 2.53system 0:04.83elapsed 54%CPU (0avgtext+0avgdata 2468maxresident)k
0.09user 2.47system 0:05.16elapsed 49%CPU (0avgtext+0avgdata 2368maxresident)k
0.10user 2.51system 0:04.87elapsed 53%CPU (0avgtext+0avgdata 2460maxresident)k
=> 4.95 => 27.1MB/s

which corresponds to an 8% slowdown for the threaded IRQ case.  So,
tasklets are indeed faster than threaded IRQs.

I guess the reason is that tasklets are much simpler, being able to
run just before we return to userspace without involving scheduler
overheads, but that's speculation.

I've tried to perf it, but...

Samples: 31K of event 'cycles', Event count (approx.): 3552246846
  Overhead  Command  Shared Object Symbol
+   33.22%  kworker/0:1  [kernel.vmlinux]  [k] __do_softirq
+   15.78%  irq/311-2101000  [kernel.vmlinux]  [k] __do_softirq
+7.49%  irqbalance   [kernel.vmlinux]  [k] __do_softirq
+7.26%  openssl  [kernel.vmlinux]  [k] __do_softirq
+5.71%  ksoftirqd/0  [kernel.vmlinux]  [k] __do_softirq
+3.64%  kworker/0:2  [kernel.vmlinux]  [k] __do_softirq
+3.52%  swapper  [kernel.vmlinux]  [k] __do_softirq
+3.14%  kworker/0:1  [kernel.vmlinux]  [k] _raw_spin_unlock_irq

I was going to try to get the threaded IRQ case, but I've ended up with
perf getting buggered because of the iMX6 SMP perf disfunctionality:

[ 3448.810416] irq 24: nobody cared (try booting with the "irqpoll" option)
...
[ 3448.824528] Disabling IRQ #24

caused by FSL's utterly brain-dead idea of routing all the perf
interrupts to single non-CPU local interrupt input, and the refusal of
kernel folk to find an acceptable solution to support this.

So, sorry, I'm not going to bother trying to get any further with this.
If the job was not made harder stupid hardware design and kernel
politics, then I might be more inclined to do deeper investigation, but
right now I'm finding that I'm not interested in trying to jump through
these stupid hoops.

I think I've proven from the above that this patch needs to be reverted
due to the performance regression, and that there _is_ most definitely
a deterimental effect of switching from tasklets to threaded IRQs.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: crypto: caam from tasklet to threadirq

2016-09-16 Thread Russell King - ARM Linux
On Fri, Sep 16, 2016 at 02:01:00PM +, Cata Vasile wrote:
> Hi,
> 
> We've tried to test and benchmark your submitted work[1].
> 
> Cryptographic offloading is also used in IPsec in the Linux Kernel. In
> heavy traffic scenarios, the NIC driver competes with the crypto device
> driver. Most NICs use the NAPI context, which is one of the most
> prioritized context types. In IPsec scenarios  the performance is
> trashed because, although raw data gets in to device, the data is
> encrypted/decrypted and the dequeue code in CAAM driver has a hard time
> being scheduled to actually call the callback to notify the networking
> stack it can continue working with  that data.

Having received a reply from Thomas Gleixner today, there appears to be
some disagreement with your findings, and a suggestion that the problem
needs proper and more in-depth investigation.

Thomas indicates that the NAPI processing shows an improvement when
moved to the same context that threaded interrupts run in, as opposed
to the current softirq context - which also would run the tasklets.

What I would say is that if threaded IRQs are causing harm, then there
seems to be something very wrong somewhere.

> Being this scenario, at heavy load, the Kernel warns on rcu stalls and
> the forwarding path has a lot of latency.  Have you tried benchmarking
> the board you used for testing?

It's way too long ago for me to remember - these patches were created
almost a year ago - October 20th 2015, which is when I'd have tested
them.  So, I'm afraid I can't help very much at this point, apart from
trying to re-run some benchmarks.

I'd suggest testing the openssl (with AF_ALG support), which is probably
what I tested and benchmarked.  However, as I say, it's far too long
ago for me to really remember at this point.

> I have ran some on our other platforms. The after benchmark fails to
> run at the top level of the before results.

Sorry, that last sentence doesn't make any sense to me.

I don't have the bandwidth to look at this, and IPsec doesn't interest
me one bit - I've never been able to work out how to setup IPsec
locally.  From what I remember when I looked into it many years ago,
you had to have significant information about ipsec to get it up and
running.  Maybe things have changed since then, I don't know.

If you want me to reproduce it, please send me a step-by-step idiots
guide on setting up a working test scenario which reproduces your
problem.

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


AF_ALG zero-size hash fails

2016-08-09 Thread Russell King - ARM Linux
Hi,

While testing AF_ALG with openssl af-alg-rr, I've found that:

OPENSSL_CONF=/shared/crypto/openssl-imx.cnf openssl dgst -sha1 http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AF_ALG broken?

2016-08-09 Thread Russell King - ARM Linux
On Tue, Aug 09, 2016 at 03:14:02PM +0800, Herbert Xu wrote:
> On Tue, Aug 09, 2016 at 08:08:59AM +0100, Russell King - ARM Linux wrote:
> > 
> > I thought I gave the commands and link to your example code.  The
> > openssl case is md5, though sha* also gives the same result.  Your
> > example code was sha1 iirc.  I guess none of these would be using
> > HMAC - the openssl cases used to give results compatible with the
> > md5sum/ sha1sum etc userspace commands.
> > 
> > /proc/crypto:
> > 
> > name : md5
> > driver   : md5-caam
> 
> Right, caam is providing a setkey function for md5, which leads the
> API to think that a key is required.  We should fix it so that setkey
> is only set for the HMAC-variant.

Thanks, that works nicely again, and passes my tests.

8<
From: Russell King <rmk+ker...@armlinux.org.uk>
Subject: [PATCH] crypto: caam - fix non-hmac hashes

Since 6de62f15b581 ("crypto: algif_hash - Require setkey before
accept(2)"), the AF_ALG interface requires userspace to provide a key
to any algorithm that has a setkey method.  However, the non-HMAC
algorithms are not keyed, so setting a key is unnecessary.

Fix this by removing the setkey method from the non-keyed hash
algorithms.

Fixes: 6de62f15b581 ("crypto: algif_hash - Require setkey before accept(2)")
Cc: <sta...@vger.kernel.org>
Signed-off-by: Russell King <rmk+ker...@armlinux.org.uk>
---
 drivers/crypto/caam/caamhash.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index ea284e3909ef..9d7fc9ec0b7e 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -1950,6 +1950,7 @@ caam_hash_alloc(struct caam_hash_template *template,
 template->name);
snprintf(alg->cra_driver_name, CRYPTO_MAX_ALG_NAME, "%s",
 template->driver_name);
+   t_alg->ahash_alg.setkey = NULL;
}
alg->cra_module = THIS_MODULE;
alg->cra_init = caam_hash_cra_init;
-- 
2.1.0

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AF_ALG broken?

2016-08-08 Thread Russell King - ARM Linux
On Mon, Aug 08, 2016 at 01:47:33PM -0400, Jeffrey Walton wrote:
> > When trying to use the openssl AF_ALG module with 4.8-rc1 with imx
> > caam, I get this:
> >
> > $ OPENSSL_CONF=/shared/crypto/openssl-imx.cnf strace openssl dgst -md5 
> >  > ...
> > socket(PF_ALG, SOCK_SEQPACKET, 0)   = 3
> > close(3)= 0
> > socket(PF_ALG, SOCK_SEQPACKET, 0)   = 3
> > bind(3, {sa_family=AF_ALG, sa_data="hash\0\0\0\0\0\0\0\0\0\0"}, 88) = 0
> > accept(3, 0, NULL)  = 4
> > fstat64(0, {st_mode=S_IFREG|0755, st_size=666864, ...}) = 0
> > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
> > 0xb6fab000
> > read(0, 
> > "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0(\0\1\0\0\0\21'\2\0004\0\0\0"..., 8192) 
> > = 8192
> > send(4, 
> > "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0(\0\1\0\0\0\21'\2\0004\0\0\0"..., 8192, 
> > MSG_MORE) = -1 ENOKEY (Required key not available)
> 
> As far as I know from testing on x86, it has never worked as expected.
> I believe you have to use 'sendto' and 'recvfrom' because 'send' and
> 'recv' use default structures, and they configure the object
> incorrectly.

This used to work, because that's how I've tested my previous CAAM
and Marvell CESA patches.  So, this is not a case of "never worked"
but is definitely a regression caused by some kernel change.

There's also people's presentations that illustrate example code:

https://events.linuxfoundation.org/sites/events/files/slides/lcj-2014-crypto-user.pdf

which can also be found at:

https://lwn.net/Articles/410833/

and that example code comes from Herbert, so one would assume that
it's tested and was working.  Note that it doesn't use any send*,
but uses write().

Testing that code on 4.8-rc (and 4.7 fwiw) gives:

socket(PF_ALG, SOCK_SEQPACKET, 0)   = 3
bind(3, {sa_family=AF_ALG, sa_data="hash\0\0\0\0\0\0\0\0\0\0"}, 88) = 0
accept(3, 0, NULL)  = 4
write(4, "abc", 3)  = -1 ENOKEY (Required key not available)
read(4, 0xbec50508, 20) = -1 ENOKEY (Required key not available)

IOW, the same problem - and it seems not to be a recent regression.

Since the last time I tested CESA or CAAM was back in 4.4 times,
it's got to be something between 4.4 and 4.7.

Looking at the history, my guess would be the setkey changes -
crypto: algif_skcipher - Require setkey before accept(2)
crypto: af_alg - Disallow bind/setkey/... after accept(2)
crypto: af_alg - Add nokey compatibility path
crypto: hash - Add crypto_ahash_has_setkey
crypto: algif_hash - Require setkey before accept(2)

-- 
Rmk's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/11] Further iMX CAAM updates

2016-08-08 Thread Russell King - ARM Linux
This is a re-post (with hopefully bugs fixed from December's review).
Untested, because AF_ALG appears to be broken in 4.8-rc1.  Maybe
someone can provide some hints how to test using tcrypt please?

Here are further imx-caam updates that I've had since before the
previous merge window.  Please review and (I guess) if Freescale
folk can provide acks etc that would be nice.  Thanks.

 drivers/crypto/caam/caamhash.c | 540 ++---
 drivers/crypto/caam/intern.h   |   1 -
 drivers/crypto/caam/jr.c   |  25 +-
 3 files changed, 305 insertions(+), 261 deletions(-)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper

2016-03-31 Thread Russell King - ARM Linux
On Thu, Mar 31, 2016 at 04:45:57PM +0200, Boris Brezillon wrote:
> Hi Russell,
> 
> On Thu, 31 Mar 2016 15:14:13 +0100
> Russell King - ARM Linux <li...@arm.linux.org.uk> wrote:
> 
> > On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote:
> > > sg_alloc_table_from_buf() provides an easy solution to create an sg_table
> > > from a virtual address pointer. This function takes care of dealing with
> > > vmallocated buffers, buffer alignment, or DMA engine limitations (maximum
> > > DMA transfer size).
> > 
> > Please note that the DMA API does not take account of coherency of memory
> > regions other than non-high/lowmem - there are specific extensions to
> > deal with this.
> 
> Ok, you said 'non-high/lowmem', this means vmalloced and kmapped buffers
> already fall in this case, right?
> 
> Could you tell me more about those specific extensions?

I was slightly confused - the extensions I was thinking of are those
listed at the bottom of Documentation/cachetlb.txt, which have nothing
to do with DMA.

However, it's probably worth reading Documentation/DMA-API-HOWTO.txt
to read up on what kinds of memory are considered to be DMA-able in
the kernel.

> > What this means is that having an API that takes any virtual address
> > pointer, converts it to a scatterlist which is then DMA mapped, is
> > unsafe.
> 
> Which means some implementations already get this wrong (see
> spi_map_buf(), and I'm pretty sure it's not the only one).

Quite possible, but that is driver stuff, and driver stuff gets things
wrong all the time. :)

> > It'll be okay for PIPT and non-aliasing VIPT cache architectures, but
> > for other cache architectures this will hide this problem and make
> > review harder.
> > 
> 
> Ok, you lost me. I'll have to do my homework and try to understand what
> this means :).

P = physical address
V = virtual address
I = indexed
T = tag

The tag is held in each cache line.  When a location is looked up in the
cache, an index is used to locate a set of cache lines and the tag is
compared to check which cache line in the set is the correct one (or
whether the address even exists in the cache.)

How the index and tag are derived varies between cache architectures.

PIPT = indexed by physical address, tagged with physical address.  Never
aliases with itself in the presence of multiple virtual mappings.

VIPT = indexed by virtual address, tagged with physical address.  If the
bits from the virtual address do not overlap the MMU page size, it is
also alias free, otherwise aliases can exist, but can be eliminated by
"cache colouring" - ensuring that a physical address is always mapped
with the same overlapping bits.

VIVT = indexed by virtual address, tagged with virtual address.  The
worst kind of cache, since every different mapping of the same physical
address is guaranteed by design to alias with other mappings.

There is little cache colouring between different kernel mappings (eg,
between lowmem and vmalloc space.)

What this means is that, while the DMA API takes care of DMA aliases
in the lowmem mappings, an alias-prone VIPT cache will remain incoherent
with DMA if it is remapped into vmalloc space, and the mapping happens
to have a different cache colour.  In other words, this is a data
corruption issue.

Hence, taking a range of vmalloc() addresses, converting them into a
scatterlist, then using the DMA API on the scatterlist _only_ guarantees
that the lowmem (and kmap'd highmem mappings) are coherent with DMA.
There is no way for the DMA API to know that other mappings exist, and
obviously flushing every possible cache line just because a mapping might
exist multiplies the expense of the DMA API: not only in terms of time
spent running through all the possibilities, which doubles for every
aliasing bit of VIPT, but also TLB pressure since you'd have to create
a mapping for each alias and tear it back down.

VIVT is even worse, since there is no other virtual mapping which is
coherent, would need to be known, and each mapping would need to be
individually flushed.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] scatterlist: add sg_alloc_table_from_buf() helper

2016-03-31 Thread Russell King - ARM Linux
On Thu, Mar 31, 2016 at 02:29:42PM +0200, Boris Brezillon wrote:
> sg_alloc_table_from_buf() provides an easy solution to create an sg_table
> from a virtual address pointer. This function takes care of dealing with
> vmallocated buffers, buffer alignment, or DMA engine limitations (maximum
> DMA transfer size).

Please note that the DMA API does not take account of coherency of memory
regions other than non-high/lowmem - there are specific extensions to
deal with this.

What this means is that having an API that takes any virtual address
pointer, converts it to a scatterlist which is then DMA mapped, is
unsafe.

It'll be okay for PIPT and non-aliasing VIPT cache architectures, but
for other cache architectures this will hide this problem and make
review harder.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] crypto: marvell/cesa - replace dma_to_phys with dma_map_single

2016-03-19 Thread Russell King - ARM Linux
On Thu, Mar 17, 2016 at 07:17:24PM -0400, ok...@codeaurora.org wrote:
> What is the correct way? I don't want to write engine->sram_dma = sram

Well, what the driver _is_ wanting to do is to go from a CPU physical
address to a device DMA address.  phys_to_dma() looks like the correct
thing there to me, but I guess that's just an offset and doesn't take
account of any IOMMU that may be in the way.

If you have an IOMMU, then the whole phys_to_dma() thing is a total
failure as it only does a linear translation, and there are no
interfaces in the kernel to take account of an IOMMU in the way.  So,
it needs something designed for the job, implemented and discussed by
the normal methods of proposing a new cross-arch interface for drivers
to use.

What I'm certain of, though, is that the change proposed in this patch
will break current users of this driver: virt_to_page() on an address
returned by ioremap() is completely undefined, and will result in
either a kernel oops, or if not poking at memory which isn't a struct
page, ultimately resulting in something that isn't SRAM being pointed
to by "engine->sram_dma".

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] crypto: marvell/cesa - replace dma_to_phys with dma_map_single

2016-03-19 Thread Russell King - ARM Linux
On Thu, Mar 17, 2016 at 06:02:15PM -0400, Sinan Kaya wrote:
> Getting ready to remove dma_to_phys API. Drivers should not be
> using this API for DMA operations. Instead, they should go
> through the dma_map or dma_alloc APIs.
> 
> Signed-off-by: Sinan Kaya 
> ---
>  drivers/crypto/marvell/cesa.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
> index c0656e7..52ddfa4 100644
> --- a/drivers/crypto/marvell/cesa.c
> +++ b/drivers/crypto/marvell/cesa.c
> @@ -350,8 +350,8 @@ static int mv_cesa_get_sram(struct platform_device *pdev, 
> int idx)
>   if (IS_ERR(engine->sram))
>   return PTR_ERR(engine->sram);
>  
> - engine->sram_dma = phys_to_dma(cesa->dev,
> -(phys_addr_t)res->start);
> + engine->sram_dma = dma_map_single(cesa->dev, engine->sram,
> +   DMA_TO_DEVICE);

Documentation/DMA-API.txt

dma_addr_t
dma_map_single(struct device *dev, void *cpu_addr, size_t size,
  enum dma_data_direction direction)

Notes:  Not all memory regions in a machine can be mapped by this API.
Further, contiguous kernel virtual space may not be contiguous as
physical memory.  Since this API does not provide any scatter/gather
capability, it will fail if the user tries to map a non-physically
contiguous piece of memory.  For this reason, memory to be mapped by
this API should be obtained from sources which guarantee it to be
physically contiguous (like kmalloc).

Specifically, ioremapped memory will *not* work as you expect with
dma_map_single().

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC 06/11] crypto: caam: ensure that we clean up after an error

2015-12-09 Thread Russell King - ARM Linux
On Wed, Dec 09, 2015 at 05:08:41PM +0200, Horia Geantă wrote:
> On 12/7/2015 9:12 PM, Russell King wrote:
> > Ensure that we clean up allocations and DMA mappings after encountering
> > an error rather than just giving up and leaking memory and resources.
> > 
> > Signed-off-by: Russell King 
> 
> I guess the error cleanup code should be grouped under an "err" label,
> instead of duplicating it.

It'd be useful if you could quote the code you were commenting on please.
I now need to dig out the patch from my git tree to work out the relevance
of your comment is.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC 07/11] crypto: caam: check and use dma_map_sg() return code

2015-12-09 Thread Russell King - ARM Linux
On Wed, Dec 09, 2015 at 05:20:45PM +0200, Horia Geantă wrote:
> On 12/7/2015 9:12 PM, Russell King wrote:
> > Strictly, dma_map_sg() may coalesce SG entries, but in practise on iMX
> > hardware, this will never happen.  However, dma_map_sg() can fail, and
> > we completely fail to check its return value.  So, fix this properly.
> > 
> > Arrange the code to map the scatterlist early, so we know how many
> > scatter table entries to allocate, and then fill them in.  This allows
> > us to keep relatively simple error cleanup paths.
> > 
> > Signed-off-by: Russell King 
> 
> Some tcrypt tests fail - looks like those with zero plaintext:
> caam_jr ffe301000.jr: unable to map source for DMA
> alg: hash: digest failed on test 1 for sha1-caam: ret=12
> [...]
> 
> Need to be careful, dma_map_sg() returning zero is an error only if ptxt
> is not null (alternatively src_nents returned by sg_nents_for_len()
> could be checked).
> 
> > @@ -1091,7 +1102,7 @@ static int ahash_digest(struct ahash_request *req)
> > u32 *sh_desc = ctx->sh_desc_digest, *desc;
> > dma_addr_t ptr = ctx->sh_desc_digest_dma;
> > int digestsize = crypto_ahash_digestsize(ahash);
> > -   int src_nents, sec4_sg_bytes;
> > +   int src_nents, mapped_nents, sec4_sg_bytes;
> > dma_addr_t src_dma;
> > struct ahash_edesc *edesc;
> > int ret = 0;
> > @@ -1099,9 +1110,14 @@ static int ahash_digest(struct ahash_request *req)
> > int sh_len;
> >  
> > src_nents = sg_nents_for_len(req->src, req->nbytes);
> > -   dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE);
> > -   if (src_nents > 1)
> > -   sec4_sg_bytes = src_nents * sizeof(struct sec4_sg_entry);
> > +   mapped_nents = dma_map_sg(jrdev, req->src, src_nents, DMA_TO_DEVICE);
> > +   if (mapped_nents == 0) {
> > +   dev_err(jrdev, "unable to map source for DMA\n");
> > +   return -ENOMEM;
> > +   }
> 
> This is at least one of the places where the error condition must change
> to smth. like (src_nents != 0 && mapped_nents == 0).

I guess we should avoid calling dma_map_sg() and dma_unmap_sg() when
src_nents is zero.  Thanks, I'll work that into the next patch revision.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC 00/13] Further iMX CAAM updates

2015-12-09 Thread Russell King - ARM Linux
On Wed, Dec 09, 2015 at 05:06:03PM +0200, Horia Geantă wrote:
> On 12/7/2015 9:11 PM, Russell King - ARM Linux wrote:
> > Here are further imx-caam updates that I've had since before the
> > previous merge window.  Please review and (I guess) if Freescale
> > folk can provide acks etc that would be nice.  Thanks.
> 
> Thanks Russell.
> 
> Note that the patch set does not apply cleanly, please rebase on latest
> cryptodev-2.6 tree.

I'm afraid I'm not a fan of digging out random git trees, rebasing
patches on top of them, and then not being able to test them locally
without dragging the entire tree into my own tree.  I'd rather wait
another kernel cycle instead of polluting my own tree with commits
from other trees.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC 00/13] Further iMX CAAM updates

2015-12-07 Thread Russell King - ARM Linux
Here are further imx-caam updates that I've had since before the
previous merge window.  Please review and (I guess) if Freescale
folk can provide acks etc that would be nice.  Thanks.

 drivers/crypto/caam/caamhash.c | 415 +++--
 drivers/crypto/caam/intern.h   |   1 -
 drivers/crypto/caam/jr.c   |  25 +--
 3 files changed, 246 insertions(+), 195 deletions(-)


-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/18] crypto: marvell: add flag to determine algorithm endianness

2015-10-19 Thread Russell King - ARM Linux
On Mon, Oct 19, 2015 at 03:04:51PM +, Jason Cooper wrote:
> Hey Russell,
> 
> On Sun, Oct 18, 2015 at 05:23:40PM +0100, Russell King wrote:
> >  static int mv_cesa_ahash_init(struct ahash_request *req,
> > - struct mv_cesa_op_ctx *tmpl)
> > + struct mv_cesa_op_ctx *tmpl, bool algo_le)
> 
> nit: Would it make more sense the do bool algo_endian, and then ...

I don't like "bool"s that don't self-document what they mean.
What does "if (algo_endian)" mean?  It's pretty clear what
"if (algo_le)" means in the context of endianness, though I'll
give you that "if (algo_little_endian)" would be a lot better.

> 
> > @@ -861,7 +862,7 @@ static int mv_cesa_md5_init(struct ahash_request *req)
> >  
> > mv_cesa_set_op_cfg(, CESA_SA_DESC_CFG_MACM_MD5);
> >  
> > -   mv_cesa_ahash_init(req, );
> > +   mv_cesa_ahash_init(req, , true);
> 
>   mv_cesa_ahash_init(req, , ALGO_ENDIAN_LITTLE);
> 
> 'true' doesn't seem as obvious.  But again, nit-picky.

I did think about:

enum {
ALGO_LITTLE_ENDIAN,
ALGO_BIG_ENDIAN,
};

and passing an int algo_endian around, but that seemed to be like too
much bloat... though if you want to insist, I could make that change.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/18] crypto: further fixes for Marvell CESA hash

2015-10-18 Thread Russell King - ARM Linux
Following on from the previous series, this series addresses further
problems with the Marvell CESA hash driver found while testing it my
openssl/ssh scenarios.

The first patch improves one from the previous series: we can get the
transform more directly using req->base.tfm rather than going round
the houses.

The next few patches rework the algorithm endianness conversions.
There are two things which depend on the algorithm endianness - the
format of the result, and the format of the bit length in the last
block.  We introduce a flag to convey this information, and keep
the creq->state format in CPU endian mode for consistency.

Some of the inconsistent hash results are down to the template
operation not being properly initialised - so we zero initialise all
template operations.

The following patches (from "factor out first fragment decisions to
helper") rework the digest handling to ensure that we always provide
the hardware with a complete block of data to hash, otherwise it can
be left mid-calculation, which then causes state to leak to
subsequent operations.  This requires a re-structure of the way we
put together the DMA entries, so it's done in relatively small steps.

This results in the CESA driver passing all tests I can throw at it
via the AF_ALG openssl plugin with the exception of asking for the
hash of /dev/null.  This returns an all zero result, rather than the
correct hash value.  This bug is pending further diagnosis, but it
is believed not to be a driver specific bug as iMX6 CAAM also behaves
the same.

Unfortunately, this is a large series, but the driver unfortunately
needs this level of bug fixing to work properly.

 drivers/crypto/marvell/cesa.h |  11 +-
 drivers/crypto/marvell/hash.c | 307 +-
 2 files changed, 164 insertions(+), 154 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/6] Fix CAAM hash driver

2015-10-18 Thread Russell King - ARM Linux
The following series fixes the CAAM hash driver, allowing it to work
with the previously merged "crypto: ahash - ensure statesize is non-
zero" patch.

This is non-trivial, because CAAM exports a huge 1600 bytes of data,
which, if we set .statesize to this, still results in the core code
rejecting the driver.  So, we need to shrink the amount of state
exported.

The first, most obvious one to get rid of is the export of the
caam_hash_ctx structure, which is shared between the socket being
exported from and imported to - copying it away and back again was
a complete no-op.

The second is that we don't need to export both pending-bytes buffers.
Only one will be in use at any time.

A problem was encountered while testing, where the size of the pending
bytes buffer was not added to the scatterlist with the correct length.
This is also fixed in this series, by patch 3.  This bug was introduced
by a prior commit trying to fix a tcrypt error.  However, the change is
wrong, but I have to question whether the test was correct or not - the
backtrace contains a function "test_ahash_pnum" which doesn't seem to
exist in mainline, nor does it seem to exist in any previous mainline
kernel.

Version 2 of this series addresses a mismerge in patch 5 of the
original series, and adds further information to patch 3.

Further testing with tcrypt showed up a problem identified by the
DMA API debugging (different to the original one referred to in
patch 3) where we leak DMA mappings.

 drivers/crypto/caam/caamhash.c | 101 +
 1 file changed, 71 insertions(+), 30 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/6] Sparse related fixes

2015-10-18 Thread Russell King - ARM Linux
Continuing on from the previous set of 18 patches, I also fixed a
number of sparse problems and other cleanups.  I don't deem these
suitable for -rc merging, especially now that we're basically at
-rc6.

The first patch switches the driver over to appropriately using
the relaxed IO accessors - this avoids calling out to the heavy
barrier on every read and write operation, but only calling out on
those which really matter.

We switch to using dma_addr_t for DMA addresses which are not accessed
by hardware, and using gfp_t for the get_free_page flags.  String-based
MMIO accesses are used instead of plain memcpy()/memset() which prevents
us potentially stumbling over GCC optimisations that it thinks it may
make with these functions.

We convert as much of the hardware state to __le32 endian markings,
and use cpu_to_le32() as appropriate.  A number of places are left
unfixed, as we temporarily store CPU native endian values at these
locations; these warnings should not be fixed (basically, only
appropriate sparse warnings should be fixed without penalising code.)

 drivers/crypto/marvell/cesa.h   | 44 -
 drivers/crypto/marvell/cipher.c | 13 ++--
 drivers/crypto/marvell/hash.c   | 23 +++--
 drivers/crypto/marvell/tdma.c   | 42 ---
 4 files changed, 62 insertions(+), 60 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: caam - Add support for hashing export and import functions

2015-10-17 Thread Russell King - ARM Linux
On Sat, Oct 17, 2015 at 11:52:54AM +0100, Russell King - ARM Linux wrote:
> Now, with that change, and with your change to buf_0/buf_1, I see
> (before the import/export functions are used) several of these errors:
> 
> caam_jr 2101000.jr0: 4501: DECO: desc idx 5: SGT Length Error. The 
> descriptor is trying to read more data than is contained in the SGT table.
> 
> when checking using openssh.  They don't occur when testing with openssl
> performing normal hashes (as per my previously posted script) but only
> with openssh.

The hardware seems to be quite justified in complaining - I've added code
to dump out the scatter list as well:

jobdesc@879: e4bd7c18: b0861c08 3cd29080 f140 34bd7c38  ...<..@.8|.4
jobdesc@879: e4bd7c28: 0068 f840 3ccf3640 0028  h.@.@6.<(...
sg@882: e4bd7c38:  3ccf3640 0028 
sg@882: e4bd7c48:  34bd7300 0006 
sg@882: e4bd7c58:  7528a070 4020 

If I'm interpreting this correctly:

f140 34bd7c38 0068

tells the CAAM to read 0x68 bytes using the scatterlist at 0x34bd7c38.

The scatterlist here contains three entries, which have lengths 0x28,
0x6 and 0x20, which total up to 0x4e bytes, not 0x68.

Now, looking at how the scatterlist is created, I have yet more questions.

1. What's this dma_map_sg_chained() and dma_unmap_sg_chained() stuff?
   What's wrong with just calling dma_map_sg() on the scatterlist, giving
   it the number of entries we want to map?  dma_map_sg() is supposed to
   deal with chained scatterlists, if it doesn't, it's buggy.

2. src_map_to_sec4_sg() fails to check whether dma_map_sg_chained() failed.
   if it failed, we continue anyway, poking invalid addresses into the
   hardware scatterlist, thereby causing memory corruption.

3. __sg_count() - what's going on with this.  This seems to assume that
   scatterlists aren't chained as:

if (!sg_is_last(sg) && (sg + 1)->length == 0)

   sg + 1 may _not_ be sg_next(sg) - and this will walk off the end of
   the array.

4. Is the try_buf_map_to_sec4_sg() call correct?  Shouldn't it be:

@@ -838,7 +838,7 @@ static int ahash_update_ctx(struct ahash_request *req)
state->buf_dma = try_buf_map_to_sec4_sg(jrdev,
edesc->sec4_sg + 1,
buf, state->buf_dma,
-   *next_buflen, *buflen);
+   *buflen, last_buflen);
 
if (src_nents) {
src_map_to_sec4_sg(jrdev, req->src, src_nents,

   Indeed, with this change, my ssh test works, and the DECO errors are
   gone.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: caam - Add support for hashing export and import functions

2015-10-17 Thread Russell King - ARM Linux
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -1569,6 +1601,10 @@ static int ahash_import(struct ahash_request *req, 
> const void *in)
>   struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>   struct caam_hash_state *state = ahash_request_ctx(req);
>  
> + /* Allocate new data buffers to use for this request */
> + state->buf_0 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> + state->buf_1 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +

I'm really not sure you can do this at all.  What if the previous
digest calculation prior to the accept() cloning the state was for
a non-hash-block aligned size of data.  The above will lose that
state, and produce an incorrect hash result.  Herbert, can you
confirm please?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state

2015-10-17 Thread Russell King - ARM Linux
On Fri, Oct 16, 2015 at 04:24:54PM -0700, Victoria Milhoan wrote:
> On Thu, 15 Oct 2015 21:13:38 +0800
> Herbert Xu <herb...@gondor.apana.org.au> wrote:
> 
> > On Thu, Oct 15, 2015 at 01:59:44PM +0100, Russell King - ARM Linux wrote:
> > >
> > > I think the CAAM driver is pretty unfixable from a trivial point of
> > > view.  This driver exports a huge amount of state - it contains both a
> > > struct caam_hash_ctx and a struct caam_hash_state, which totals up to
> > > 1600 bytes.  This fails the:
> > 
> > Right just dumping the state out as is not going to work.  This
> > is not supposed to be how export works anyway.  But it doesn't
> > look too bad as most of that 1600 is consumed by the hardware
> > program descriptor which can easily be regenerated upon import.
> > 
> > The only things that need to be exported AFAICS are key and buf_X.
> 
> I just pushed out a patch for export/import functions in the CAAM driver. The
> patch has been through testing with OpenSSL and the AF_ALG plugin on the MX6.

Be careful with that.  There's two ways to test:

1. Checking hash output.

Preparation - copy openssl.cnf and add this to openssl.cnf:

openssl_conf= openssl_def

[openssl_def]
engines = engine_section

[engine_section]
af_alg  = af_alg_engine

[af_alg_engine]
CIPHERS=aes-128-cbc aes-192-cbc aes-256-cbc des-cbc des-ede3-cbc

DIGESTS=md5 sha1 sha256 sha512

# Putting this last means we register the above as the default algorithms
default_algorithms  = ALL

Then:

#!/bin/sh

for type in md5 sha1 sha256 sha512; do
   echo -n "Checking $type hash:"
   for file in /bin/*; do
  echo -n " $file"
  if ! OPENSSL_CONF=./openssl.cnf openssl dgst -$type < $file | sed 
"s,(stdin)= ,,;s,\$,\t$file," | ${type}sum -c > /dev/null; then
 echo " ... failed"
 echo -n "Openssl says: " >&2
 OPENSSL_CONF=./openssl.cnf openssl dgst -$type < $file | sed 
"s,(stdin)= ,,;s,\$,\t$file," >&2
 echo -n "${type}sum says: " >&2
 ${type}sum $file >&2
 exit 1
  fi
   done
   echo " ... ok"
done
echo "All hashes passed"

The above will verify that the hashes are producing the correct answers
for a range of files.  This does _not_ test the export/import paths.

2. Backup the existing openssl.cnf and replace it with the above modified
   version.  Then try to ssh into the platform.  This will verify the
   export/import side of things.  If ssh fails to connect to the target,
   you know that the crypto drivers for SHA1 are broken, probably due to
   export/import.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: caam - Add support for hashing export and import functions

2015-10-17 Thread Russell King - ARM Linux
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -1569,6 +1601,10 @@ static int ahash_import(struct ahash_request *req, 
> const void *in)
>   struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>   struct caam_hash_state *state = ahash_request_ctx(req);
>  
> + /* Allocate new data buffers to use for this request */
> + state->buf_0 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> + state->buf_1 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +
>   memcpy(ctx, in, sizeof(struct caam_hash_ctx));
>   memcpy(state, in + sizeof(struct caam_hash_ctx),
>  sizeof(struct caam_hash_state));

This is buggy.  You loose the reference to the two buffers you've just
allocated because this memcpy() overwrites it.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: caam - Add support for hashing export and import functions

2015-10-17 Thread Russell King - ARM Linux
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -319,7 +319,7 @@ static int ahash_set_sh_desc(struct crypto_ahash *ahash)
>   have_key = OP_ALG_AAI_HMAC_PRECOMP;
>  
>   /* ahash_update shared descriptor */
> - desc = ctx->sh_desc_update;
> + desc = kmalloc(DESC_HASH_MAX_USED_BYTES, GFP_KERNEL | GFP_DMA);

What if kmalloc() fails?  Should this really oops the kernel?  Ditto
for every other kmalloc you've added below.
 
> @@ -1557,6 +1575,20 @@ static int ahash_export(struct ahash_request *req, 
> void *out)
>   struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>   struct caam_hash_state *state = ahash_request_ctx(req);
>  
> + /*
> +  * Do not export the data buffers. New buffers are
> +  * allocated during import.
> +  */
> + kfree(state->buf_0);
> + kfree(state->buf_1);
> + state->buf_0 = NULL;
> + state->buf_1 = NULL;
> +
> + state->current_buf = 0;
> + state->buf_dma = 0;
> + state->buflen_0 = 0;
> + state->buflen_1 = 0;
> +

So what if I export, and then continue using _this_ context later?

> @@ -1605,6 +1641,8 @@ static struct caam_hash_template driver_hash[] = {
>   .setkey = ahash_setkey,
>   .halg = {
>   .digestsize = SHA1_DIGEST_SIZE,
> + .statesize = sizeof(struct caam_hash_ctx) +
> +  sizeof(struct caam_hash_state),

Much prefer to have a 'struct caam_hash_export_state' thing rather than
litter the code with the knowledge that these two go together.

>   },
>   },
>   .alg_type = OP_ALG_ALGSEL_SHA1,
> @@ -1626,6 +1664,8 @@ static struct caam_hash_template driver_hash[] = {
>   .setkey = ahash_setkey,
>   .halg = {
>   .digestsize = SHA224_DIGEST_SIZE,
> + .statesize = sizeof(struct caam_hash_ctx) +
> +  sizeof(struct caam_hash_state),
>   },
>   },
>   .alg_type = OP_ALG_ALGSEL_SHA224,
> @@ -1647,6 +1687,8 @@ static struct caam_hash_template driver_hash[] = {
>   .setkey = ahash_setkey,
>   .halg = {
>   .digestsize = SHA256_DIGEST_SIZE,
> + .statesize = sizeof(struct caam_hash_ctx) +
> +  sizeof(struct caam_hash_state),
>   },
>   },
>   .alg_type = OP_ALG_ALGSEL_SHA256,
> @@ -1668,6 +1710,8 @@ static struct caam_hash_template driver_hash[] = {
>   .setkey = ahash_setkey,
>   .halg = {
>   .digestsize = SHA384_DIGEST_SIZE,
> + .statesize = sizeof(struct caam_hash_ctx) +
> +  sizeof(struct caam_hash_state),
>   },
>   },
>   .alg_type = OP_ALG_ALGSEL_SHA384,
> @@ -1689,6 +1733,8 @@ static struct caam_hash_template driver_hash[] = {
>   .setkey = ahash_setkey,
>   .halg = {
>   .digestsize = SHA512_DIGEST_SIZE,
> + .statesize = sizeof(struct caam_hash_ctx) +
> +  sizeof(struct caam_hash_state),
>   },
>   },
>   .alg_type = OP_ALG_ALGSEL_SHA512,
> @@ -1710,6 +1756,8 @@ static struct caam_hash_template driver_hash[] = {
>   .setkey = ahash_setkey,
>   .halg = {
>   .digestsize = MD5_DIGEST_SIZE,
> + .statesize = sizeof(struct caam_hash_ctx) +
> +  sizeof(struct caam_hash_state),
>   },
>   },
>   .alg_type = OP_ALG_ALGSEL_MD5,
> @@ -1796,6 +1844,12 @@ static void caam_hash_cra_exit(struct crypto_tfm *tfm)
>   dma_unmap_single(ctx->jrdev, ctx->sh_desc_finup_dma,
>desc_bytes(ctx->sh_desc_finup), DMA_TO_DEVICE);
>  
> + kfree(ctx->sh_desc_update);
> + kfree(ctx->sh_desc_update_first);
> + kfree(ctx->sh_desc_fin);
> + kfree(ctx->sh_desc_digest);
> + kfree(ctx->sh_desc_finup);
> +

What happens to these when ahash_import() overwrites all the context
state?  Doesn't this mean we end up with double-kfree()s ?

Also, did you test this DMA debug enabled?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org

Re: [PATCH] crypto: caam - Add support for hashing export and import functions

2015-10-17 Thread Russell King - ARM Linux
On Fri, Oct 16, 2015 at 04:19:33PM -0700, Victoria Milhoan wrote:
> @@ -1557,6 +1575,20 @@ static int ahash_export(struct ahash_request *req, 
> void *out)
>   struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>   struct caam_hash_state *state = ahash_request_ctx(req);
>  
> + /*
> +  * Do not export the data buffers. New buffers are
> +  * allocated during import.
> +  */
> + kfree(state->buf_0);
> + kfree(state->buf_1);
> + state->buf_0 = NULL;
> + state->buf_1 = NULL;
> +
> + state->current_buf = 0;
> + state->buf_dma = 0;
> + state->buflen_0 = 0;
> + state->buflen_1 = 0;
> +
>   memcpy(out, ctx, sizeof(struct caam_hash_ctx));
>   memcpy(out + sizeof(struct caam_hash_ctx), state,
>  sizeof(struct caam_hash_state));
> @@ -1569,6 +1601,10 @@ static int ahash_import(struct ahash_request *req, 
> const void *in)
>   struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
>   struct caam_hash_state *state = ahash_request_ctx(req);
>  
> + /* Allocate new data buffers to use for this request */
> + state->buf_0 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> + state->buf_1 = kmalloc(CAAM_MAX_HASH_BLOCK_SIZE, GFP_KERNEL | GFP_DMA);
> +
>   memcpy(ctx, in, sizeof(struct caam_hash_ctx));
>   memcpy(state, in + sizeof(struct caam_hash_ctx),
>  sizeof(struct caam_hash_state));

Why are we even saving and restoring the caam_hash_ctx here?

That comes from: crypto_tfm_ctx(crypto_ahash_tfm(tfm)), which, tracing
that through the inline functions comes out as:

tfm->base.__crt_ctx

tfm comes from: __crypto_ahash_cast(req->base.tfm), where req->base.tfm
is a pointer to struct crypto_ahash's base member.

When a hash is accept()ed, req->base.tfm is copied from the original
socket handle to the new socket handle (see ahash_request_set_tfm(),
called from hash_accept_parent(), which is in turn called from
af_alg_accept() in hash_accept()).  So both the exporting and importing
hashes end up with the same struct caam_hash_ctx.

We can see this if we add debug print:

[156147.994219] ahash_export(ctx=ed087080,state=e2a6e600)
[156147.994257] ahash_import(ctx=ed087080,state=e2a6d600)
[156147.998256] ahash_export(ctx=ed083080,state=e2a6c600)
[156147.998294] ahash_import(ctx=ed083080,state=edb9ee00)
[156147.998659] ahash_export(ctx=ed087080,state=e2a6e600)
[156147.998700] ahash_import(ctx=ed087080,state=e2a6fa00)
[156147.999002] ahash_export(ctx=ed080880,state=e2a6d200)
[156147.999040] ahash_import(ctx=ed080880,state=e2a6d600)

Notice that the state changes between each export and import, but the
context doesn't.

So, I think we can (and should) completely drop the saving and restoring
of context in these functions.

Now, with that change, and with your change to buf_0/buf_1, I see
(before the import/export functions are used) several of these errors:

caam_jr 2101000.jr0: 4501: DECO: desc idx 5: SGT Length Error. The 
descriptor is trying to read more data than is contained in the SGT table.

when checking using openssh.  They don't occur when testing with openssl
performing normal hashes (as per my previously posted script) but only
with openssh.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: caam - Add support for hashing export and import functions

2015-10-17 Thread Russell King - ARM Linux
On Sat, Oct 17, 2015 at 06:57:44AM -0700, Victoria Milhoan wrote:
> Correct - this was apparently the wrong patch I pushed out. The one I'm
> actively using has this fixed (this is the only difference). I will make
> this change in v2 after reviewing your other comments.

Thanks Victoria, but please perform the tests I've suggested in one of
my previous emails - merely testing with openssl's built-in tests aren't
sufficient.  Also, openssl is non-obvious with what it's testing,
especially if you're using "openssl speed".  I'm afraid it's a case of
"you don't know what's being tested unless you've read the openssl
source" :(

However, while trying to debug the DECO stuff, I've stumbled across this:

static inline void append_ptr(u32 *desc, dma_addr_t ptr)
{
dma_addr_t *offset = (dma_addr_t *)desc_end(desc);

*offset = ptr;

(*desc) += CAAM_PTR_SZ / CAAM_CMD_SZ;
}

where I think 'desc' points to memory used for the descriptor, which is
read by hardware, correct?

If that's true, why are we using dma_addr_t here?  dma_addr_t can be
either 32-bit or 64-bit depending on the kernel configuration, and I
suspect if this driver is built with dma_addr_t set to 64-bit, things
will go awry.

Also, what endianness are the hardware descriptors?  Are they always
native CPU endian, or are they fixed as little endian?  If they're always
little endian, is there any reason not to use __le32 and cpu_to_le32()/
le32_to_cpu() as appropriate to ensure that it works on big endian
systems?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/5] Fix CAAM hash driver

2015-10-17 Thread Russell King - ARM Linux
The following series fixes the CAAM hash driver, allowing it to work
with the previously merged "crypto: ahash - ensure statesize is non-
zero" patch.

This is non-trivial, because CAAM exports a huge 1600 bytes of data,
which, if we set .statesize to this, still results in the core code
rejecting the driver.  So, we need to shrink the amount of state
exported.

The first, most obvious one to get rid of is the export of the
caam_hash_ctx structure, which is shared between the socket being
exported from and imported to - copying it away and back again was
a complete no-op.

The second is that we don't need to export both pending-bytes buffers.
Only one will be in use at any time.

A problem was encountered while testing, where the size of the pending
bytes buffer was not added to the scatterlist with the correct length.
This is also fixed in this series, by patch 3.

 drivers/crypto/caam/caamhash.c | 79 +++---
 1 file changed, 59 insertions(+), 20 deletions(-)

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] Fix CAAM hash driver

2015-10-17 Thread Russell King - ARM Linux
On Sat, Oct 17, 2015 at 07:50:55PM +0100, Russell King - ARM Linux wrote:
> The following series fixes the CAAM hash driver, allowing it to work
> with the previously merged "crypto: ahash - ensure statesize is non-
> zero" patch.
> 
> This is non-trivial, because CAAM exports a huge 1600 bytes of data,
> which, if we set .statesize to this, still results in the core code
> rejecting the driver.  So, we need to shrink the amount of state
> exported.
> 
> The first, most obvious one to get rid of is the export of the
> caam_hash_ctx structure, which is shared between the socket being
> exported from and imported to - copying it away and back again was
> a complete no-op.
> 
> The second is that we don't need to export both pending-bytes buffers.
> Only one will be in use at any time.
> 
> A problem was encountered while testing, where the size of the pending
> bytes buffer was not added to the scatterlist with the correct length.
> This is also fixed in this series, by patch 3.
> 
>  drivers/crypto/caam/caamhash.c | 79 
> +++---
>  1 file changed, 59 insertions(+), 20 deletions(-)

This series has been tested using openssl to validate the hashes against
md5sum, sha1sum, sha224sum, sha256sum and sha512sum programs of every
program in my /bin directory, which checks correct functioning of the
hashing.

It has also been tested with openssh, where the crypto hardware is used
via openssl by sshd, which makes use of accept() and hence tests the
export/import paths.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state

2015-10-15 Thread Russell King - ARM Linux
On Thu, Oct 15, 2015 at 05:41:47PM +0800, Herbert Xu wrote:
> On Thu, Oct 15, 2015 at 10:39:30AM +0100, Russell King - ARM Linux wrote:
> >
> > The CAAM driver is similarly buggy - it has export/import functions in
> > its ahash drivers, but zero statesize.
> > 
> > User exploitable kernel stack smashing... I'd suggest putting this patch
> > into stable kernels as high priority, as I'm pretty sure this could be
> 
> I agree.  It should already be on its way to stable as Linus has
> pulled it into his tree and it carries a stable cc.

Thanks.

I think the CAAM driver is pretty unfixable from a trivial point of
view.  This driver exports a huge amount of state - it contains both a
struct caam_hash_ctx and a struct caam_hash_state, which totals up to
1600 bytes.  This fails the:

alg->halg.statesize > PAGE_SIZE / 8 

in ahash_prepare_alg() if we set .statesize.  For ARM, this places a
limit of 512 bytes on the state size.

The CAAM authors need to come up with a better solution (and quickly,
as caamhash is going to fail in all kernels soon), or we need to
support larger exported states.

BTW, I can't find a MAINTAINERS entry for CAAM, so I've just grabbed
a couple of addresses from recent git history in the hope they'll know
who's responsible.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state

2015-10-15 Thread Russell King - ARM Linux
On Tue, Oct 13, 2015 at 10:33:12PM +0800, Herbert Xu wrote:
> On Fri, Oct 09, 2015 at 08:43:33PM +0100, Russell King wrote:
> > If the algorithm passed a zero statesize, do not pass a valid pointer
> > into the export/import functions.  Passing a valid pointer covers up
> > bugs in driver code which then go on to smash the kernel stack.
> > Instead, pass NULL, which will cause any attempt to write to the
> > pointer to fail.
> > 
> > Signed-off-by: Russell King 
> 
> Patch applied without the shash hunk.  I also replaced your commit
> message as it no longer makes any sense:
> 
> crypto: ahash - ensure statesize is non-zero
> 
> Unlike shash algorithms, ahash drivers must implement export
> and import as their descriptors may contain hardware state and
> cannot be exported as is.  Unfortunately some ahash drivers did
> not provide them and end up causing crashes with algif_hash.
> 
> This patch adds a check to prevent these drivers from registering
> ahash algorithms until they are fixed.
> 
> Thanks,

There will be fallout from this.

The CAAM driver is similarly buggy - it has export/import functions in
its ahash drivers, but zero statesize.

User exploitable kernel stack smashing... I'd suggest putting this patch
into stable kernels as high priority, as I'm pretty sure this could be
used to gain privileges via carefully crafted md5 hashes.  I've not
proven it, but given that the md5 hash and state data get copied over
the kernel stack, it's highly likely that it _is_ exploitable from any
user that can create an AF_ALG socket.

Yes, it means regressions in the form of various hw crypto no longer
being loadable, but I think that's preferable to the security hole here.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions

2015-10-13 Thread Russell King - ARM Linux
On Tue, Oct 13, 2015 at 09:00:48PM +0800, Herbert Xu wrote:
> On Sun, Oct 11, 2015 at 02:55:05PM +0800, Herbert Xu wrote:
> > On Sat, Oct 10, 2015 at 05:17:54PM +0100, Russell King - ARM Linux wrote:
> > >
> > > I _also_ notice that despite having the ARM assembly crypto functions
> > > enabled and built-in to the kernel, they don't appear in /proc/crypto -
> > > and this is because of patch 1 in this series, which blocks out any
> > > crypto driver which has a zero statesize (as the ARM crypto functions
> > > appear to have.)  I found this by rebuilding the ARM crypto stuff as
> > > modules, and then trying to insert them:
> > 
> > They're buggy and unfortunately this wasn't caught during the
> > review process.  The import/export functions are required and
> > certainly not optional.
> 
> OK it looks like I was confused.  Yes the import and export functions
> are required but for shash algorithms we provide a simple default.
> This means that implementations can simply leave it NULL and they
> will then use the default import/export functions which exports
> the shash_desc as the state.
> 
> So Russell what I'll do is apply your patch without the hunk
> for shash_prepare_alg.  IOW we will block any ahash algorithms
> that leave state as zero since ahash does not provide a default
> import/export function (it can't because it may involve hardware
> state).  shash algorithms on the other hand won't be affected.

Okay.

I've a larger queue of patches building at the moment for the Marvell
driver - I've found that it's touch-and-go whether it produces the
correct hash result, and previous hashes can affect subsequent hashes
in some circumstances.  I'm currently at another 7 patches, plus a
"big" as-yet uncommitted patch which is virtually a rewrite of the
DMA preparation - diffstat wise, in total it's looking like:

 drivers/crypto/marvell/cesa.h |   8 +-
 drivers/crypto/marvell/hash.c | 258 ++
 drivers/crypto/marvell/tdma.c |   7 ++
 3 files changed, 151 insertions(+), 122 deletions(-)

and there's still a few more cases that need solving: zero bytes of
input should not give a zero hash, and an input which is the multiple
of the blocksize causes the hardware to hang - though I'm not yet
sure whether that's a result of my above changes.  At least with my
above changes, I now get stable and correct hash values for sizes
which do not hit any of those remaining bugs.

Given it's size so far, I'm not sure it makes much sense to backport
any of these fixes to stable; I think maybe we should just say "okay,
before these fixes, it's just broken."

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions

2015-10-13 Thread Russell King - ARM Linux
On Tue, Oct 13, 2015 at 09:57:22PM +0800, Herbert Xu wrote:
> On Tue, Oct 13, 2015 at 02:55:18PM +0100, Russell King - ARM Linux wrote:
> >
> > Given it's size so far, I'm not sure it makes much sense to backport
> > any of these fixes to stable; I think maybe we should just say "okay,
> > before these fixes, it's just broken."
> 
> OK as long as your first patch goes into stable it should be fine
> as it'll essentially disable the marvell driver.

It will disable the Marvell driver, along with the SW implementations in
arch/arm/crypto/ which don't set .statesize, .import and .export.  I
would not be surprised if it also affects others too.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


hash import/export

2015-10-11 Thread Russell King - ARM Linux
Herbert,

I wonder if you can clear something up about the hash export/import
functionality.  In:

http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/344120.html

you seem to imply that the exported and imported state can't be defined
by the driver.

Boris tells me, "AFAIR, crypto users are expecting md5/sha1/sha256_state
depending on the crypto req state they are exporting".

>From what I can see, there is only one place in the core crypto code
where hash state is exported and imported, and that's in hash_accept(),
and that's always done with the same driver.  The only other place is
in the marvell cesa driver itself when initialising the hmac state.

Is there any reason a driver can't define its own structure to be
exported here which can be shared between each of the different methods
it supports?

Thanks.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: hash import/export

2015-10-11 Thread Russell King - ARM Linux
On Sun, Oct 11, 2015 at 08:34:27PM +0100, Russell King - ARM Linux wrote:
> Herbert,
> 
> I wonder if you can clear something up about the hash export/import
> functionality.  In:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/344120.html
> 
> you seem to imply that the exported and imported state can't be defined
> by the driver.
> 
> Boris tells me, "AFAIR, crypto users are expecting md5/sha1/sha256_state
> depending on the crypto req state they are exporting".
> 
> From what I can see, there is only one place in the core crypto code
> where hash state is exported and imported, and that's in hash_accept(),
> and that's always done with the same driver.  The only other place is
> in the marvell cesa driver itself when initialising the hmac state.
> 
> Is there any reason a driver can't define its own structure to be
> exported here which can be shared between each of the different methods
> it supports?

A further question: it looks like struct md5_state's hash member is
always supposed to be in little endian format, whereas struct
sha*_state's state is always big endian format.  Is there a reason
why these are typed 'u32', rather than __le32 and __be32?

This would make the intention here more obvious, and should also allow
the opportunity for sparse to verify that we're getting the endianness
correct everywhere.

Thanks.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/5] crypto: ensure algif_hash does not pass a zero-sized state

2015-10-10 Thread Russell King - ARM Linux
On Sat, Oct 10, 2015 at 06:46:07PM +0200, Boris Brezillon wrote:
> On Fri, 09 Oct 2015 20:43:33 +0100
> Russell King  wrote:
> 
> > If the algorithm passed a zero statesize, do not pass a valid pointer
> > into the export/import functions.  Passing a valid pointer covers up
> > bugs in driver code which then go on to smash the kernel stack.
> > Instead, pass NULL, which will cause any attempt to write to the
> > pointer to fail.
> > 
> > Signed-off-by: Russell King 
> > ---
> >  crypto/ahash.c | 3 ++-
> >  crypto/shash.c | 3 ++-
> >  2 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/crypto/ahash.c b/crypto/ahash.c
> > index 8acb886032ae..9c1dc8d6106a 100644
> > --- a/crypto/ahash.c
> > +++ b/crypto/ahash.c
> > @@ -544,7 +544,8 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
> > struct crypto_alg *base = >halg.base;
> >  
> > if (alg->halg.digestsize > PAGE_SIZE / 8 ||
> > -   alg->halg.statesize > PAGE_SIZE / 8)
> > +   alg->halg.statesize > PAGE_SIZE / 8 ||
> > +   alg->halg.statesize == 0)
> 
> Just read Russel's answer to the cover letter, and I wonder if the
> following test wouldn't fix the problem:
> 
>   (alg->halg.statesize == 0 && (alg->import || alg->export))
> 
> I mean, the only valid case where statesize can be zero is when you
> don't have any state associated to the crypto algorithm, and if that's
> the case, ->import() and ->export() functions are useless, isn't ?

However, that brings up a question.

If you're using AF_ALG, and you attach to (say) the ARM Neon SHA512
implementation through it, and then use accept() to duplicate it's
state, what prevents the kernel from oopsing when hash_accept() calls
crypto_ahash_export(), which then dereferences the NULL alg->export
function pointer?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions

2015-10-10 Thread Russell King - ARM Linux
On Sat, Oct 10, 2015 at 12:29:25PM +0100, Russell King - ARM Linux wrote:
> On Sat, Oct 10, 2015 at 12:31:29PM +0200, Arnaud Ebalard wrote:
> > Russell King - ARM Linux <li...@arm.linux.org.uk> writes:
> > > Software:
> > > The 'numbers' are in 1000s of bytes per second processed.
> > > type 16 bytes 64 bytes256 bytes   1024 bytes   8192 
> > > bytes
> > > md5  13948.89k42477.61k   104619.41k   165140.82k   
> > > 199273.13k
> > > sha1 13091.91k36463.89k75393.88k   103893.33k   
> > > 117104.50k
> > > sha256   13573.92k30492.25k52700.33k64247.81k
> > > 68722.69k
> > >
> > > Hardware:
> > > The 'numbers' are in 1000s of bytes per second processed.
> > > type 16 bytes 64 bytes256 bytes   1024 bytes   8192 
> > > bytes
> > > md5   3964.55k13782.11k43181.71k   180263.38k  
> > > 1446616.18k
> > > sha1  4609.16k 8922.35k35422.87k   333575.31k  
> > > 2122547.20k
> > > sha256   13519.62k30484.10k52547.47k64285.21k
> > > 68530.60k
> 
> Okay, the reason for the difference in SHA256 speed is because the
> "openssl speed" code *totally* *bypasses* the engine support, whereas
> the md5 and sha1 do not.  It even bypasses the normal method used to
> get hold of the sha256 implementation (EVP_sha256), and goes straight
> to using SHA256() directly in openssl/crypto/sha/sha256.c.  It looks
> like the same goes for the AES tests too.

Okay, what's required is openssl speed -evp <cipher|digest> and it can
only do one at a time.  That's really confusing and non-intuitive, given
that some of the non-evp options end up testing the EVP methods.

So, running:

$ openssl speed -evp aes-128-cbc

I get the kernel spitting out a number of these warnings during its run:

DRBG: could not allocate digest TFM handle: hmac(sha512)

I also notice is that 50% of CPU time is spent in the kernel, presumably
because the lack of hardware sha512 means that we end up having to do
sha512 in software in the kernel.

I _also_ notice that despite having the ARM assembly crypto functions
enabled and built-in to the kernel, they don't appear in /proc/crypto -
and this is because of patch 1 in this series, which blocks out any
crypto driver which has a zero statesize (as the ARM crypto functions
appear to have.)  I found this by rebuilding the ARM crypto stuff as
modules, and then trying to insert them:

# modprobe sha512-arm
modprobe: ERROR: could not insert 'sha512_arm': Invalid argument

So, I think it's best if this patch series is *NOT* merged until someone
who knows the kernel crypto code gets to grips with what's supposed to be
done in various parts of the code.  Yes, I know that Herbert suggested
the approach in patch 1, but that _will_ cause regressions like the above
when if it's merged.

Either the ARM crypto code in arch/arm/crypto is buggy, or the approach
in patch 1 is buggy.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions

2015-10-10 Thread Russell King - ARM Linux
On Sat, Oct 10, 2015 at 12:31:29PM +0200, Arnaud Ebalard wrote:
> Hi Russel,
   ^

> Russell King - ARM Linux <li...@arm.linux.org.uk> writes:
> > Software:
> > The 'numbers' are in 1000s of bytes per second processed.
> > type 16 bytes 64 bytes256 bytes   1024 bytes   8192 
> > bytes
> > md5  13948.89k42477.61k   104619.41k   165140.82k   
> > 199273.13k
> > sha1 13091.91k36463.89k75393.88k   103893.33k   
> > 117104.50k
> > sha256   13573.92k30492.25k52700.33k64247.81k
> > 68722.69k
> >
> > Hardware:
> > The 'numbers' are in 1000s of bytes per second processed.
> > type 16 bytes 64 bytes256 bytes   1024 bytes   8192 
> > bytes
> > md5   3964.55k13782.11k43181.71k   180263.38k  
> > 1446616.18k
> > sha1  4609.16k 8922.35k35422.87k   333575.31k  
> > 2122547.20k
> > sha256   13519.62k30484.10k52547.47k64285.21k
> > 68530.60k

Okay, the reason for the difference in SHA256 speed is because the
"openssl speed" code *totally* *bypasses* the engine support, whereas
the md5 and sha1 do not.  It even bypasses the normal method used to
get hold of the sha256 implementation (EVP_sha256), and goes straight
to using SHA256() directly in openssl/crypto/sha/sha256.c.  It looks
like the same goes for the AES tests too.

> I had a lot of performance results at various levels (tcrypt module on
> variations of the drivers (tasklet, threaded irq, full polling, etc),
> IPsec tunnel and transport mode through to see how it behaves w/ two
> mvneta instances also eating CPU cycles for incoming/outgoing packets)
> but those where done on an encryption use case. Some are provided
> in [2]. In an early (read dirty) polling-based version of the driver,
> the CESA on an Armada 370 (mirabox) was verified to be capable of near
> 100MB/s on buffers of 1500+ bytes for AES CBC encryption. Current
> version of the driver is not as good (say half that value) but it
> behaves better. A Mirabox can easily route 1500 bytes packets at 100MB/s
> between its two interfaces but when you mix both using IPsec in tunnel
> mode on one side, you end up w/ perfs between 10 to 15MB/s, IIRC. I
> think it's interesting to see where it ends up w/ the engine exposed to
> userland consumers (e.g. sth like SSH).
> 
> I cannot promise a huge amount of time but I'll try and find some to
> play w/ AF_ALG using openssl and CESA in the coming weeks.

I think what we draw from my investigation is that "openssl speed" is
utterly crap - you don't actually know what's being tested there.  Some
things test the engine, others bypass the engine infrastructure totally
and test the openssl software implementation instead.

So, if you think "openssl speed" is a good way to measure the speed of
digests and ciphers that openssl supplies to applications, *think again*.
It doesn't.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3b 5/5] crypto: marvell: factor out common import/export functions

2015-10-09 Thread Russell King - ARM Linux
On Sat, Oct 10, 2015 at 12:37:33AM +0200, Arnaud Ebalard wrote:
> Hi Russel,
> 
> Russell King  writes:
> 
> > As all the import functions and export functions are virtually
> > identical, factor out their common parts into a generic
> > mv_cesa_ahash_import() and mv_cesa_ahash_export() respectively.  This
> > performs the actual import or export, and we pass the data pointers and
> > length into these functions.
> >
> > We have to switch a % const operation to do_div() in the common import
> > function to avoid provoking gcc to use the expensive 64-bit by 64-bit
> > modulus operation.
> >
> > Signed-off-by: Russell King 
> 
> Thanks for the refactoring and for the fixes. All patches look good to 
> me. Out of curiosity, can I ask what perf you get w/ openssh or openssl
> using AF_ALG and the CESA?

I would do, but it seems this AF_ALG plugin for openssl isn't
actually using it for encryption.  When I try:

openssl speed -engine af_alg aes-128-cbc

I get results for using openssl's software implementation.  If I do:

openssl speed -engine af_alg md5

then I get results from using the kernel's MD5.  Hence, I think the
only thing that I think openssh is using it for is the digest stuff,
not the crypto itself.  I can't be certain about that though.

I've tried debugging the af_alg engine plugin, but I'm not getting
very far (I'm not an openssl hacker!)  I see it registering the
function to get the ciphers (via ENGINE_set_ciphers), and I see this
called several times, returning a list of NID_xxx values describing
the methods it supports, which includes aes-128-cbc.  However,
unlike the equivalent digest function, I never see it called
requesting any of the ciphers.  Maybe it's an openssl bug, or a
"feature" preventing hardware crypto?  Maybe something is missing
from its initialisation?  I've no idea yet.  It seems I'm not alone
in this - this report from April 2015 is exactly what I'm seeing:

https://mta.openssl.org/pipermail/openssl-users/2015-April/001124.html

However, I'm coming to the conclusion that AF_ALG with openssl is a
dead project, and the only interface that everyone is using for that
is cryptodev - probably contary to Herbert and/or DaveM's wishes.  For
example, the openwrt guys seem to only support cryptodev, according to
their wiki page on the subject of hardware crypto:

http://wiki.openwrt.org/doc/hardware/cryptographic.hardware.accelerators

Here's the references to code for AF_ALG with openssl I've found so far:

Original af_alg plugin (dead):

http://src.carnivore.it/users/common/af_alg/

3rd party "maintained" af_alg openssl plugin, derived from commit
1851bbb010c38878c83729be844f168192059189 in the above repo but with
no history:

https://github.com/RidgeRun/af-alg-rr

and that doesn't contain any changes to the C code originally committed.
Whether this C code contains changes or not is anyone's guess: there's
no way to refer back to the original repository.

Anyway, here's the digest results:

Software:
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes256 bytes   1024 bytes   8192 bytes
md5  13948.89k42477.61k   104619.41k   165140.82k   199273.13k
sha1 13091.91k36463.89k75393.88k   103893.33k   117104.50k
sha256   13573.92k30492.25k52700.33k64247.81k68722.69k

Hardware:
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes256 bytes   1024 bytes   8192 bytes
md5   3964.55k13782.11k43181.71k   180263.38k  1446616.18k
sha1  4609.16k 8922.35k35422.87k   333575.31k  2122547.20k
sha256   13519.62k30484.10k52547.47k64285.21k68530.60k

There's actually something suspicious while running these tests:

Doing md5 for 3s on 16 size blocks: 32212 md5's in 0.13s
Doing md5 for 3s on 64 size blocks: 23688 md5's in 0.11s
Doing md5 for 3s on 256 size blocks: 23615 md5's in 0.14s
Doing md5 for 3s on 1024 size blocks: 22885 md5's in 0.13s
Doing md5 for 3s on 8192 size blocks: 15893 md5's in 0.09s
Doing sha1 for 3s on 16 size blocks: 31688 sha1's in 0.11s
Doing sha1 for 3s on 64 size blocks: 23700 sha1's in 0.17s
Doing sha1 for 3s on 256 size blocks: 23523 sha1's in 0.17s
Doing sha1 for 3s on 1024 size blocks: 22803 sha1's in 0.07s
Doing sha1 for 3s on 8192 size blocks: 15546 sha1's in 0.06s
Doing sha256 for 3s on 16 size blocks: 2518030 sha256's in 2.98s
Doing sha256 for 3s on 64 size blocks: 1419416 sha256's in 2.98s
Doing sha256 for 3s on 256 size blocks: 613738 sha256's in 2.99s
Doing sha256 for 3s on 1024 size blocks: 187080 sha256's in 2.98s
Doing sha256 for 3s on 8192 size blocks: 25013 sha256's in 2.99s

from the hardware - note the "in" figures are rediculously low, yet
they do wait 3s for each test.  Also, the sha256 results are close
enough to being the software version.

No ideas on any of this yet... 

[PATCH 0/3] crypto: fixes for Marvell hash

2015-10-09 Thread Russell King - ARM Linux
This small series of patches addresses oopses seen when trying to use
the AF_ALG interface via openssl with openssh.  This series does not
address all problems, but merely stops the kernel from smashing its
kernel stack and oopsing.

With these fixes in place, the kernel no longer oopses.  However, with
the digests enabled in openssl, openssh refuses to work, producing the
following when attempting to connect to the target system:

Corrupted MAC on input.
Disconnecting: Packet corrupt

It's been hard enough to get this far; the crypto code is not the easiest
code to debug for a new-comer due to the amount of state needed to be
retained to understand the code (all the inline functions masking
multiple levels of containerisation and pointer dereference does not
make it easy to track what is stored where, and once I've been through
one bit of code, I find I'm having to revisit the same piece of code a
bit later to re-understand what it's doing.)

It's been difficult enough to find the engine plugin for openssl - the
original git repo which hosted it is now dead
(http://src.carnivore.it/users/common/af_alg/).  All that seems to be
left is someone's modified version on github, which seems to get some
maintanence.  Debian doesn't seem to carry AF_ALG openssl support, and
seems to only carry one package (strongswan) which supports this
interface.

Hence, I'm leaving further debugging to other parties, especially as
the userspace tooling for the AF_ALG seems rather lacking.  (Are there
any test programs, if so, can their location be documented and placed
in Documentation/crypto please?)

I'm not sure who the maintainer for drivers/crypto/marvell is, so I've
picked Thomas.  It would be nice if there was an entry in MAINTAINERS
for this driver.

The first patch in this series avoids kernel stack smashing if a crypto
driver forgets to set the 'statesize' member, but writes to what seems
to be a valid pointer passed to its export function.  Of course, this
won't completely stop stack smashing if the statesize member is
smaller than the data which the export function writes.  This patch is
optional.

The second patch adds the necessary statesize members to the Marvell
code which were previously missing.  Fixing this uncovered a further
problem, which the third patch addresses.

 crypto/algif_hash.c   | 6 +-
 drivers/crypto/marvell/hash.c | 9 +
 2 files changed, 14 insertions(+), 1 deletion(-)

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/3] crypto: fixes for Marvell hash

2015-10-09 Thread Russell King - ARM Linux
Version 2, same as version 1 but with a different patch 1, thanks to
Herbert for an alternative approach on that one.

 crypto/ahash.c| 3 ++-
 crypto/shash.c| 3 ++-
 drivers/crypto/marvell/hash.c | 9 +
 3 files changed, 13 insertions(+), 2 deletions(-)

On Fri, Oct 09, 2015 at 11:29:04AM +0100, Russell King - ARM Linux wrote:
> This small series of patches addresses oopses seen when trying to use
> the AF_ALG interface via openssl with openssh.  This series does not
> address all problems, but merely stops the kernel from smashing its
> kernel stack and oopsing.
> 
> With these fixes in place, the kernel no longer oopses.  However, with
> the digests enabled in openssl, openssh refuses to work, producing the
> following when attempting to connect to the target system:
> 
> Corrupted MAC on input.
> Disconnecting: Packet corrupt
> 
> It's been hard enough to get this far; the crypto code is not the easiest
> code to debug for a new-comer due to the amount of state needed to be
> retained to understand the code (all the inline functions masking
> multiple levels of containerisation and pointer dereference does not
> make it easy to track what is stored where, and once I've been through
> one bit of code, I find I'm having to revisit the same piece of code a
> bit later to re-understand what it's doing.)
> 
> It's been difficult enough to find the engine plugin for openssl - the
> original git repo which hosted it is now dead
> (http://src.carnivore.it/users/common/af_alg/).  All that seems to be
> left is someone's modified version on github, which seems to get some
> maintanence.  Debian doesn't seem to carry AF_ALG openssl support, and
> seems to only carry one package (strongswan) which supports this
> interface.
> 
> Hence, I'm leaving further debugging to other parties, especially as
> the userspace tooling for the AF_ALG seems rather lacking.  (Are there
> any test programs, if so, can their location be documented and placed
> in Documentation/crypto please?)
> 
> I'm not sure who the maintainer for drivers/crypto/marvell is, so I've
> picked Thomas.  It would be nice if there was an entry in MAINTAINERS
> for this driver.
> 
> The first patch in this series avoids kernel stack smashing if a crypto
> driver forgets to set the 'statesize' member, but writes to what seems
> to be a valid pointer passed to its export function.  Of course, this
> won't completely stop stack smashing if the statesize member is
> smaller than the data which the export function writes.  This patch is
> optional.
> 
> The second patch adds the necessary statesize members to the Marvell
> code which were previously missing.  Fixing this uncovered a further
> problem, which the third patch addresses.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] crypto: fixes for Marvell hash

2015-10-09 Thread Russell King - ARM Linux
On Fri, Oct 09, 2015 at 02:12:43PM +0200, Thomas Petazzoni wrote:
> Hello Russell,
> 
> On Fri, 9 Oct 2015 11:29:05 +0100, Russell King - ARM Linux wrote:
> > This small series of patches addresses oopses seen when trying to use
> > the AF_ALG interface via openssl with openssh.  This series does not
> > address all problems, but merely stops the kernel from smashing its
> > kernel stack and oopsing.
> 
> Thanks for debugging this and for the patches. However, could you Cc
> the main authors of the driver:
> 
> Boris Brezillon <boris.brezil...@free-electrons.com>
> Arnaud Ebalard <a...@natisbad.org>

Hmm.  You know, given how broken this is, I've a good mind to say that
this is more hassle than it's worth to get these simple fixes merged,
especially as there's no MAINTAINERS entry for them.  I'm not sending
the patch series a third time.  I could say if maintainers can't be
bothered to ensure that they're listed in MAINTAINERS then they aren't
maintainers at all.

If this means it won't be merged, so be it, I don't care enough given
that there's very little userspace support for kernel hw crypto.

> > I'm not sure who the maintainer for drivers/crypto/marvell is, so I've
> > picked Thomas.  It would be nice if there was an entry in MAINTAINERS
> > for this driver.
> 
> Indeed. However get_maintainer.pl gets this right:

Try with --no-git, which is the only sane mode for using get_maintainer.pl.
Picking up everyone who's ever touched a driver with as little as a spelling
fix in a git commit is abhorrent.

> $ ./scripts/get_maintainer.pl -f drivers/crypto/marvell/cesa.c 
> Herbert Xu <herb...@gondor.apana.org.au> (maintainer:CRYPTO 
> API,commit_signer:11/12=92%)
> "David S. Miller" <da...@davemloft.net> (maintainer:CRYPTO API)
> Boris BREZILLON <boris.brezil...@free-electrons.com> 
> (commit_signer:11/12=92%,authored:6/12=50%,added_lines:538/558=96%,removed_lines:7/12=58%)
> Arnaud Ebalard <a...@natisbad.org> 
> (commit_signer:7/12=58%,authored:4/12=33%,removed_lines:1/12=8%)
> Vladimir Zapolskiy <vladimir_zapols...@mentor.com> 
> (commit_signer:1/12=8%,authored:1/12=8%,removed_lines:3/12=25%)
> Krzysztof Kozlowski <k.kozlow...@samsung.com> 
> (commit_signer:1/12=8%,authored:1/12=8%,removed_lines:1/12=8%)
> 
> Thomas
> -- 
> Thomas Petazzoni, CTO, Free Electrons
> Embedded Linux, Kernel and Android engineering
> http://free-electrons.com

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/3] crypto: fixes for Marvell hash

2015-10-09 Thread Russell King - ARM Linux
Thomas says you're the maintainers of the marvell crypto driver.  I
picked sending these patches to Thomas because I couldn't work out
who should be receiving them.  If you wish to be copied on patches,
please put yourselves in the MAINTAINERS file.  In the mean time, you
can find the patches on the crypto list if you wish to review them.

Thanks.

On Fri, Oct 09, 2015 at 11:46:37AM +0100, Russell King - ARM Linux wrote:
> Version 2, same as version 1 but with a different patch 1, thanks to
> Herbert for an alternative approach on that one.
> 
>  crypto/ahash.c| 3 ++-
>  crypto/shash.c| 3 ++-
>  drivers/crypto/marvell/hash.c | 9 +
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> On Fri, Oct 09, 2015 at 11:29:04AM +0100, Russell King - ARM Linux wrote:
> > This small series of patches addresses oopses seen when trying to use
> > the AF_ALG interface via openssl with openssh.  This series does not
> > address all problems, but merely stops the kernel from smashing its
> > kernel stack and oopsing.
> > 
> > With these fixes in place, the kernel no longer oopses.  However, with
> > the digests enabled in openssl, openssh refuses to work, producing the
> > following when attempting to connect to the target system:
> > 
> > Corrupted MAC on input.
> > Disconnecting: Packet corrupt
> > 
> > It's been hard enough to get this far; the crypto code is not the easiest
> > code to debug for a new-comer due to the amount of state needed to be
> > retained to understand the code (all the inline functions masking
> > multiple levels of containerisation and pointer dereference does not
> > make it easy to track what is stored where, and once I've been through
> > one bit of code, I find I'm having to revisit the same piece of code a
> > bit later to re-understand what it's doing.)
> > 
> > It's been difficult enough to find the engine plugin for openssl - the
> > original git repo which hosted it is now dead
> > (http://src.carnivore.it/users/common/af_alg/).  All that seems to be
> > left is someone's modified version on github, which seems to get some
> > maintanence.  Debian doesn't seem to carry AF_ALG openssl support, and
> > seems to only carry one package (strongswan) which supports this
> > interface.
> > 
> > Hence, I'm leaving further debugging to other parties, especially as
> > the userspace tooling for the AF_ALG seems rather lacking.  (Are there
> > any test programs, if so, can their location be documented and placed
> > in Documentation/crypto please?)
> > 
> > I'm not sure who the maintainer for drivers/crypto/marvell is, so I've
> > picked Thomas.  It would be nice if there was an entry in MAINTAINERS
> > for this driver.
> > 
> > The first patch in this series avoids kernel stack smashing if a crypto
> > driver forgets to set the 'statesize' member, but writes to what seems
> > to be a valid pointer passed to its export function.  Of course, this
> > won't completely stop stack smashing if the statesize member is
> > smaller than the data which the export function writes.  This patch is
> > optional.
> > 
> > The second patch adds the necessary statesize members to the Marvell
> > code which were previously missing.  Fixing this uncovered a further
> > problem, which the third patch addresses.
> 
> -- 
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] crypto: ensure algif_hash does not pass a zero-sized state

2015-10-09 Thread Russell King - ARM Linux
On Fri, Oct 09, 2015 at 06:34:28PM +0800, Herbert Xu wrote:
> On Fri, Oct 09, 2015 at 11:29:44AM +0100, Russell King wrote:
> > If the algorithm passed a zero statesize, do not pass a valid pointer
> > into the export/import functions.  Passing a valid pointer covers up
> > bugs in driver code which then go on to smash the kernel stack.
> > Instead, pass NULL, which will cause any attempt to write to the
> > pointer to fail.
> > 
> > Signed-off-by: Russell King 
> 
> The state size should never be zero for a hash algorithm.  Having
> a zero state means that the hash output must always be identical.
> Such an algorithm would be quite useless.
> 
> So how about adding a check upon hash registration to verify that
> the state size is greater than zero? The place to do it would be
> shash_prepare_alg and ahash_prepare_alg.

Do you mean something like this?  As statesize is an unsigned int, testing
for zero should be sufficient.

diff --git a/crypto/ahash.c b/crypto/ahash.c
index 8acb886032ae..9c1dc8d6106a 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -544,7 +544,8 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
struct crypto_alg *base = >halg.base;
 
if (alg->halg.digestsize > PAGE_SIZE / 8 ||
-   alg->halg.statesize > PAGE_SIZE / 8)
+   alg->halg.statesize > PAGE_SIZE / 8 ||
+   alg->halg.statesize == 0)
return -EINVAL;
 
base->cra_type = _ahash_type;
diff --git a/crypto/shash.c b/crypto/shash.c
index ecb1e3d39bf0..ab3384b38542 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -585,7 +585,8 @@ static int shash_prepare_alg(struct shash_alg *alg)
 
if (alg->digestsize > PAGE_SIZE / 8 ||
alg->descsize > PAGE_SIZE / 8 ||
-   alg->statesize > PAGE_SIZE / 8)
+   alg->statesize > PAGE_SIZE / 8 ||
+   alg->statesize == 0)
return -EINVAL;
 
base->cra_type = _shash_type;

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 0/5] crypto: fixes for Marvell hash

2015-10-09 Thread Russell King - ARM Linux
After all, here's a version 3, which fixes the other bug I mentioned
below - we now have correct hash results.  A couple of patches have
been added, one to fix that, and a second patch to factor out the
duplication in the import functions.  This gets openssh working with
the digests enabled.

Acks so far received have been added.  Patch 3 has changed, so I didn't
add the ack for the previous version.  Please re-review patch 3.

Thanks.

 crypto/ahash.c|  3 +-
 crypto/shash.c|  3 +-
 drivers/crypto/marvell/hash.c | 80 +--
 3 files changed, 36 insertions(+), 50 deletions(-)

On Fri, Oct 09, 2015 at 11:46:37AM +0100, Russell King - ARM Linux wrote:
> Version 2, same as version 1 but with a different patch 1, thanks to
> Herbert for an alternative approach on that one.
> 
>  crypto/ahash.c| 3 ++-
>  crypto/shash.c| 3 ++-
>  drivers/crypto/marvell/hash.c | 9 +
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> On Fri, Oct 09, 2015 at 11:29:04AM +0100, Russell King - ARM Linux wrote:
> > This small series of patches addresses oopses seen when trying to use
> > the AF_ALG interface via openssl with openssh.  This series does not
> > address all problems, but merely stops the kernel from smashing its
> > kernel stack and oopsing.
> > 
> > With these fixes in place, the kernel no longer oopses.  However, with
> > the digests enabled in openssl, openssh refuses to work, producing the
> > following when attempting to connect to the target system:
> > 
> > Corrupted MAC on input.
> > Disconnecting: Packet corrupt
> > 
> > It's been hard enough to get this far; the crypto code is not the easiest
> > code to debug for a new-comer due to the amount of state needed to be
> > retained to understand the code (all the inline functions masking
> > multiple levels of containerisation and pointer dereference does not
> > make it easy to track what is stored where, and once I've been through
> > one bit of code, I find I'm having to revisit the same piece of code a
> > bit later to re-understand what it's doing.)
> > 
> > It's been difficult enough to find the engine plugin for openssl - the
> > original git repo which hosted it is now dead
> > (http://src.carnivore.it/users/common/af_alg/).  All that seems to be
> > left is someone's modified version on github, which seems to get some
> > maintanence.  Debian doesn't seem to carry AF_ALG openssl support, and
> > seems to only carry one package (strongswan) which supports this
> > interface.
> > 
> > Hence, I'm leaving further debugging to other parties, especially as
> > the userspace tooling for the AF_ALG seems rather lacking.  (Are there
> > any test programs, if so, can their location be documented and placed
> > in Documentation/crypto please?)
> > 
> > I'm not sure who the maintainer for drivers/crypto/marvell is, so I've
> > picked Thomas.  It would be nice if there was an entry in MAINTAINERS
> > for this driver.
> > 
> > The first patch in this series avoids kernel stack smashing if a crypto
> > driver forgets to set the 'statesize' member, but writes to what seems
> > to be a valid pointer passed to its export function.  Of course, this
> > won't completely stop stack smashing if the statesize member is
> > smaller than the data which the export function writes.  This patch is
> > optional.
> > 
> > The second patch adds the necessary statesize members to the Marvell
> > code which were previously missing.  Fixing this uncovered a further
> > problem, which the third patch addresses.
> 
> -- 
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/5] crypto: marvell: initialise struct mv_cesa_ahash_req

2015-10-09 Thread Russell King - ARM Linux
On Fri, Oct 09, 2015 at 09:50:18PM +0200, Boris Brezillon wrote:
> Hi Russel,
> 
> On Fri, 09 Oct 2015 20:43:43 +0100
> Russell King  wrote:
> 
> > When a AF_ALG fd is accepted a second time (hence hash_accept() is
> > used), hash_accept_parent() allocates a new private context using
> > sock_kmalloc().  This context is uninitialised.  After use of the new
> > fd, we eventually end up with the kernel complaining:
> > 
> > marvell-cesa f109.crypto: dma_pool_free cesa_padding, c0627770/0 (bad 
> > dma)
> > 
> > where c0627770 is a random address.  Poisoning the memory allocated by
> > the above sock_kmalloc() produces kernel oopses within the marvell hash
> > code, particularly the interrupt handling.
> > 
> > The following simplfied call sequence occurs:
> > 
> > hash_accept()
> >   crypto_ahash_export()
> > marvell hash export function
> >   af_alg_accept()
> > hash_accept_parent()<== allocates uninitialised struct hash_ctx
> >   crypto_ahash_import()
> > marvell hash import function
> > 
> > hash_ctx contains the struct mv_cesa_ahash_req in its req.__ctx member,
> > and, as the marvell hash import function only partially initialises
> > this structure, we end up with a lot of members which are left with
> > whatever data was in memory prior to sock_kmalloc().
> > 
> > Add zero-initialisation of this structure.
> 
> Maybe you should also change your commit message since this patch no
> longer initializes the req struct to zero, otherwise

Oops.  s/zero-// :)

> Acked-by: Boris Brezillon 
> 
> > 
> > Signed-off-by: Russell King 
> > ---
> >  drivers/crypto/marvell/hash.c | 12 
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
> > index a259aced3b42..458867ce9515 100644
> > --- a/drivers/crypto/marvell/hash.c
> > +++ b/drivers/crypto/marvell/hash.c
> > @@ -831,6 +831,10 @@ static int mv_cesa_md5_import(struct ahash_request 
> > *req, const void *in)
> > unsigned int cache_ptr;
> > int ret;
> >  
> > +   ret = crypto_ahash_init(req);
> > +   if (ret)
> > +   return ret;
> > +
> > creq->len = in_state->byte_count;
> > memcpy(creq->state, in_state->hash, digsize);
> > creq->cache_ptr = 0;
> > @@ -921,6 +925,10 @@ static int mv_cesa_sha1_import(struct ahash_request 
> > *req, const void *in)
> > unsigned int cache_ptr;
> > int ret;
> >  
> > +   ret = crypto_ahash_init(req);
> > +   if (ret)
> > +   return ret;
> > +
> > creq->len = in_state->count;
> > memcpy(creq->state, in_state->state, digsize);
> > creq->cache_ptr = 0;
> > @@ -1022,6 +1030,10 @@ static int mv_cesa_sha256_import(struct 
> > ahash_request *req, const void *in)
> > unsigned int cache_ptr;
> > int ret;
> >  
> > +   ret = crypto_ahash_init(req);
> > +   if (ret)
> > +   return ret;
> > +
> > creq->len = in_state->count;
> > memcpy(creq->state, in_state->state, digsize);
> > creq->cache_ptr = 0;
> 
> 
> 
> -- 
> Boris Brezillon, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] MAINTAINERS: add maintainers for the Marvell Crypto driver

2015-10-09 Thread Russell King - ARM Linux
On Fri, Oct 09, 2015 at 05:39:02PM +0200, Thomas Petazzoni wrote:
> A new crypto driver for Marvell ARM platforms was added in
> drivers/crypto/marvell/ as part of commit f63601fd616ab ("crypto:
> marvell/cesa - add a new driver for Marvell's CESA"). This commit adds
> the relevant developers to the list of maintainers.
> 
> Cc: Boris Brezillon 
> Cc: Arnaud Ebalard 
> Cc: Herbert Xu 
> Cc: Russell King 

Suggested-by: Russell King 

;)

> Signed-off-by: Thomas Petazzoni 
> ---
>  MAINTAINERS | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7ba7ab7..d3f47f5c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6555,6 +6555,13 @@ M: Guenter Roeck 
>  S:   Maintained
>  F:   drivers/net/dsa/mv88e6352.c
>  
> +MARVELL CRYPTO DRIVER
> +M:   Boris Brezillon 
> +M:   Arnaud Ebalard 
> +F:   drivers/crypto/marvell/
> +S:   Maintained
> +L:   linux-crypto@vger.kernel.org
> +
>  MARVELL GIGABIT ETHERNET DRIVERS (skge/sky2)
>  M:   Mirko Lindner 
>  M:   Stephen Hemminger 
> -- 
> 2.6.1
> 

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: crypto: marvell/CESA: Issues with non cache-line aligned buffers

2015-07-03 Thread Russell King - ARM Linux
BTW, off-topic for this thread... but I notice from Mark Brown's builder
that mv_cesa is causing build errors in mainline now:

arm-allmodconfig
../drivers/crypto/mv_cesa.c:1037:2: error: implicit declaration of function 
'of_get_named_gen_pool' [-Werror=implicit-function-declaration]

arm-multi_v5_defconfig
../drivers/crypto/mv_cesa.c:1037:2: error: implicit declaration of function 
'of_get_named_gen_pool' [-Werror=implicit-function-declaration]

It seems it was fine on July 2nd, but the above was introduced today.

A few other things I notice when looking at this code:

/* Not all platforms can gate the clock, so it is not
   an error if the clock does not exists. */
cp-clk = clk_get(pdev-dev, NULL);
if (!IS_ERR(cp-clk))
clk_prepare_enable(cp-clk);

So, if clk_get() returns PTR_ERR(-EPROBE_DEFER) we treat that clock as
missing?  Is that really the behaviour you want there?

ret = request_irq(irq, crypto_int, 0, dev_name(pdev-dev),
cp);

What happens if crypto_int() is called when request_irq() unlocks its
spinlock, before the clock has been found?  Eg, because we're booting
from a kexec'd kernel.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: crypto: marvell/CESA: Issues with non cache-line aligned buffers

2015-07-03 Thread Russell King - ARM Linux
On Fri, Jul 03, 2015 at 11:43:05AM +0200, Boris Brezillon wrote:
 Which led us to think that this could be related to a non cache-line
 aligned buffer problem: if we share the cache line with someone
 modifying its data during the DMA transfer, we could experience data
 loss when the cpu decide to flush the data from this cache line to the
 memory.

There's really only two options when you have an overlapping cache line
and the cache line contains CPU-dirty data, but the memory contains
DMA-dirty data.  One of those dirty data has to win out, and the other
has to be destroyed.  (Either you invalidate the CPU cache line, loosing
the CPU-dirty data, or you write-back the cache line, overwriting the DMA-
dirty data.)

What that means is that it's a _bug_ to end up in this situation, nothing
more, nothing less.  There's nothing that arch code can do about it, it's
a DMA-API user bug.

What you'll find is that this is well documented.  See
Documentation/DMA-API.txt, part Id (Streaming DMA mappings):

  Warnings:  Memory coherency operates at a granularity called the cache
  line width.  In order for memory mapped by this API to operate
  correctly, the mapped region must begin exactly on a cache line
  boundary and end exactly on one (to prevent two separately mapped
  regions from sharing a single cache line).  Since the cache line size
  may not be known at compile time, the API will not enforce this
  requirement.  Therefore, it is recommended that driver writers who
  don't take special care to determine the cache line size at run time
  only map virtual regions that begin and end on page boundaries (which
  are guaranteed also to be cache line boundaries).

The behaviour you're seeing here is not ARM specific - any CPU which
uses cache lines and doesn't have hardware coherency will suffer from
this problem.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 05/14] crypto: marvell/CESA: add TDMA support

2015-06-18 Thread Russell King - ARM Linux
On Wed, Jun 17, 2015 at 05:50:01PM +0800, Herbert Xu wrote:
 On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
 
  +   ret = dma_map_sg(cesa_dev-dev, req-src, creq-src_nents,
  +DMA_TO_DEVICE);
  +   if (!ret)
  +   return -ENOMEM;
  +
  +   creq-src_nents = ret;
 
 DMA-API-HOWTO says that you must retain the original nents and
 use it when you call dma_unmap_sg.  So I'm afraid one more repost
 is needed :)

It's worse than that...  You're right on that point, but there's an
additional point.

If dma_map_sg() coalesces scatterlist entries, then ret will be smaller
than src_nents, and ret indicates how many scatterlist entries to be
walked during DMA - you should not use src_nents for that.  I couldn't
see where the driver used that information.  In fact, the driver seems
to be capable of walking more than src_nents/ret numbers of scatterlist
entries: it just keeps going with sg_next() until it hits the end of
the allocated scatterlist.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 05/14] crypto: marvell/CESA: add TDMA support

2015-06-18 Thread Russell King - ARM Linux
On Thu, Jun 18, 2015 at 11:33:24AM +0200, Boris Brezillon wrote:
 Hi Russel,
 
 On Thu, 18 Jun 2015 10:04:00 +0100
 Russell King - ARM Linux li...@arm.linux.org.uk wrote:
 
  On Wed, Jun 17, 2015 at 05:50:01PM +0800, Herbert Xu wrote:
   On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
   
+   ret = dma_map_sg(cesa_dev-dev, req-src, creq-src_nents,
+DMA_TO_DEVICE);
+   if (!ret)
+   return -ENOMEM;
+
+   creq-src_nents = ret;
   
   DMA-API-HOWTO says that you must retain the original nents and
   use it when you call dma_unmap_sg.  So I'm afraid one more repost
   is needed :)
  
  It's worse than that...  You're right on that point, but there's an
  additional point.
  
  If dma_map_sg() coalesces scatterlist entries, then ret will be smaller
  than src_nents, and ret indicates how many scatterlist entries to be
  walked during DMA - you should not use src_nents for that.  I couldn't
  see where the driver used that information.  In fact, the driver seems
  to be capable of walking more than src_nents/ret numbers of scatterlist
  entries: it just keeps going with sg_next() until it hits the end of
  the allocated scatterlist.
 
 Yes, I realized that, and I never used the value returned by
 dma_map_sg() to walk the scatterlist anyway: I was using the sg_next()
 and sg-length value (which I replaced by sg_dma_len() in v7 as
 suggested by Herbert).
 So the -src_nents assignment to dma_map_sg() return value was just a
 silly mistake caused by an uncareful read of the DMA-API-HOWTO.
 
 Am I missing something else ?

Yes.  'ret' should be used to indicate the number of scatterlist entries
to walk for DMA purposes after the scatterlist has been mapped.  For PIO
purposes, using src_nents is still acceptable.

As Herbert points out, you're stopping after the sum of transferred bytes
matches, so I suppose that's fine.

One other point though: you should use sg_dma_address() rather than
dereferencing sg-dma_address directly.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG?] crypto: caam: little/big endianness on ARM vs PPC

2015-06-15 Thread Russell King - ARM Linux
On Mon, Jun 15, 2015 at 06:33:17PM +0200, Jon Nettleton wrote:
 Funny enough I tackled this problem over the weekend as well.  My
 approach was to switch the driver over to use the *_relaxed() io
 functions and then special case the bits missing from the various
 ARCHs.  Basically adding setbits32 and clrbits32 for !PPC
 architectures and letting PPC and ARM share a writeq/readq set of
 functions.  I left the existing LITTLE_ENDIAN special case until I
 could verify if it was needed, or had been tested.

I'll follow up here with what I've mentioned elsewhere, and some further
thoughts.

I think this shows the dangers of using struct { } to define register
offsets.  Let's start here:

/*
 * caam_job_ring - direct job ring setup
 * 1-4 possible per instantiation, base + 1000/2000/3000/4000
 * Padded out to 0x1000
 */
struct caam_job_ring {
/* Input ring */
u64 inpring_base;   /* IRBAx -  Input desc ring baseaddr */
u32 rsvd1;

Apparently, this is a CPU-endian 64-bit value (it's not defined using
le64 or be64 which would fix it's endian.)

The second question, which comes up in light of the breakage that's
being reported is: is this really a 64-bit register, or is it a pair
of 32-bit registers side-by-side?

The documentation I'm looking at doesn't document the register at
base + 0x1000, but documents the one at base + 0x1004, and the one
at 0x1004 is given the name IRBAR0_LS, which presumably stands
for input ring base address register 0, least significant.

As the code originally stood for PPC, IRBAR0_LS is also at 0x1004,
but appears to be big endian.

On ARM, IRBAR0_LS appears at the same address, but is little endian.

This is *not* a 64-bit register at all, but is a pair of 32-bit
registers side by side.  Moreover, readq() should not be used - no
amount of arch mangling could ever produce a sane readq() which
coped with this.

So, the CAAM code is buggy in this regard: using readq() here when
endian-portability is required is wrong.  It's got to be two 32-bit
reads or two 32-bit writes in the appropriate endian.

Also, note that there's a big difference between __raw_readl() and
readl_relaxed().  readl_relaxed() is always little-endian.  __raw_readl()
is god-knows-what-the-archtecture-decides endian.  Switching PPC
drivers from __raw_readl() to readl_relaxed() is really not a good
idea unless someone from the PPC camp reviews and tests the code.

So, what I'd suggest is just fixing rd_reg64() and wr_reg64() to do
the right thing: keeping the two 32-bit words in the same order
irrespective of the endian-ness, and staying with the __raw_*
accessors until PPC people can look at this.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG?] crypto: caam: little/big endianness on ARM vs PPC

2015-06-15 Thread Russell King - ARM Linux
On Mon, Jun 15, 2015 at 05:59:07PM +0200, Steffen Trumtrar wrote:
 I'm working on CAAM support for the ARM-based i.MX6 SoCs. The current
 drivers/crypto/caam driver only works for PowerPC AFAIK.
 Actually, there isn't that much to do, to get support for the i.MX6 but
 one patch breaks the driver severely:
 
   commit ef94b1d834aace7101de77c3a7c2631b9ae9c5f6
   crypto: caam - Add definition of rd/wr_reg64 for little endian platform

You're not the only one who's hitting problems with this - Jon Nettleton
has been working on it recently.

The way this has been done is fairly yucky to start with: several things
about it are particularly horrid.  The first is the repetitive code
for the BE and LE cases, when all that's actually different is the
register order between the two code cases.

The second thing is the excessive use of masking - I'm pretty sure the
compiler won't complain with the below.

diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index 378ddc17f60e..ba0fa630a25d 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -84,34 +84,28 @@
 #endif
 
 #ifndef CONFIG_64BIT
-#ifdef __BIG_ENDIAN
-static inline void wr_reg64(u64 __iomem *reg, u64 data)
-{
-   wr_reg32((u32 __iomem *)reg, (data  0xull)  32);
-   wr_reg32((u32 __iomem *)reg + 1, data  0xull);
-}
-
-static inline u64 rd_reg64(u64 __iomem *reg)
-{
-   return (((u64)rd_reg32((u32 __iomem *)reg))  32) |
-   ((u64)rd_reg32((u32 __iomem *)reg + 1));
-}
+#if defined(__BIG_ENDIAN)
+#define REG64_HI32(reg) ((u32 __iomem *)(reg))
+#define REG64_LO32(reg) ((u32 __iomem *)(reg) + 1)
+#elif defined(__LITTLE_ENDIAN)
+#define REG64_HI32(reg) ((u32 __iomem *)(reg) + 1)
+#define REG64_LO32(reg) ((u32 __iomem *)(reg))
 #else
-#ifdef __LITTLE_ENDIAN
+#error Broken endian?
+#endif
+
 static inline void wr_reg64(u64 __iomem *reg, u64 data)
 {
-   wr_reg32((u32 __iomem *)reg + 1, (data  0xull)  32);
-   wr_reg32((u32 __iomem *)reg, data  0xull);
+   wr_reg32(REG64_HI32(reg), data  32);
+   wr_reg32(REG64_LO32(reg), data);
 }
 
 static inline u64 rd_reg64(u64 __iomem *reg)
 {
-   return (((u64)rd_reg32((u32 __iomem *)reg + 1))  32) |
-   ((u64)rd_reg32((u32 __iomem *)reg));
+   return ((u64)rd_reg32(REG64_HI32(reg)))  32 |
+   rd_reg32(REG64_LO32(reg));
 }
 #endif
-#endif
-#endif
 
 /*
  * jr_outentry

The second issue is that apparently, the register order doesn't actually
change for LE devices - in other words, the byte order within each register
does change, but they aren't a 64-bit register, they're two separate 32-bit
registers.  So, they should always be written as such.

So, I'd get rid of the #if defined(__BIG_ENDIAN) stuff and instead have:

+/*
+ * The DMA address register is a pair of 32-bit registers, and should
+ * always be accessed as such.
+ */
+#define REG64_HI32(reg) ((u32 __iomem *)(reg))
+#define REG64_LO32(reg) ((u32 __iomem *)(reg) + 1)


-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [V4 PATCH 3/6] pci: Generic function for setting up PCI device DMA coherency

2015-05-20 Thread Russell King - ARM Linux
On Wed, May 20, 2015 at 07:00:25AM -0500, Suravee Suthikulanit wrote:
 On 5/20/2015 4:34 AM, Catalin Marinas wrote:
 We have a dummy of_dma_configure() already when !CONFIG_OF, otherwise
 we would need #ifndef here. I already replied, I think for other
 architectures we need this check to avoid a useless host-of_node test.
 
 It seems that there are several places that have similar check. Would it be
 good to convert this into a macro? Something like:
 
 #define OF_NODE_ENABLED(dev)  (IS_ENABLED(CONFIG_OF)  dev-of_node)

This /could/ be a useful compile-time optimisation: when CONFIG_OF is
disabled, dev-of_node exists but will always be NULL - but the
compiler doesn't know this.  Your suggestion above would tell the
compiler that when CONFIG_OF is disabled, OF_NODE_ENABLED() will
evaluate to a constant false, which means it can eliminate code.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Russell King - ARM Linux
On Tue, Jul 01, 2014 at 05:42:42PM +0100, Måns Rullgård wrote:
 Russell King rmk+ker...@arm.linux.org.uk writes:
 
  ARMv6 and greater introduced a new instruction (bx) which can be used
  to return from function calls.  Recent CPUs perform better when the
  bx lr instruction is used rather than the mov pc, lr instruction,
  and this sequence is strongly recommended to be used by the ARM
  architecture manual (section A.4.1.1).
 
  We provide a new macro ret with all its variants for the condition
  code which will resolve to the appropriate instruction.
 
 When the source register is not lr the name ret is a misnomer since
 only the bx lr instruction is predicted as a function return.  The
 bx instruction with other source registers uses the normal prediction
 mechanisms, leaving the return stack alone, and should not be used for
 function returns.  Any code currently using another register to return
 from a function should probably be modified to use lr instead, unless
 there are special reasons for doing otherwise.  If code jumping to an
 address in a non-lr register is not a return, using the ret name will
 make for some rather confusing reading.
 
 I would suggest either using a more neutral name than ret or adding an
 alias to be used for non-return jumps so as to make the intent clearer.

If you read the patch, the branches which are changed are those which
do indeed return in some way.  There are those which do this having
moved lr to a different register.

As you point out, bx lr /may/ be treated specially (I've actually been
discussing this with Will Deacon over the last couple of days, who has
also been talking to the hardware people in ARM, and Will is happy with
this patch as in its current form.)  This is why I've changed all
mov pc, reg instructions which return in some way to use this macro,
and left others (those which are used to call some function and return
back to the same point) alone.

I have thought about introducing a call macro for those other sites,
but as there are soo few of them, I've left them as-is for the time
being (this patch is already large enough.)

If there are any in the patch which you have specific concerns about,
please specifically point them out those which give you a concern.

Thanks.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/4] crypto: Add Allwinner Security System crypto accelerator

2014-06-22 Thread Russell King - ARM Linux
On Sun, Jun 22, 2014 at 02:23:15PM +0200, Marek Vasut wrote:
 On Sunday, June 22, 2014 at 01:58:08 PM, Corentin LABBE wrote:
 [...]
   + * This program is free software; you can redistribute it and/or modify
   + * it under the terms of the GNU General Public License as published by
   + * the Free Software Foundation version 2 of the License
   
   The license text seems incomplete.
   [...]
  
  I will replace it with a simplier line Licensed under the GPL-2.
 
 I'd suggest you to use the SPDX license identifiers then, but that's not 
 something the kernel crowd agreed upon yet IIRC.Therefore , just make the 
 text 
 complete please.

Marek,

The full text of the first paragraph (in COPYING) is:

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

which is deemed to be entirely sufficient; quoting the full text is
discouraged, especially when it includes the FSF address.  It is also
acceptable to restrict it to version 2 only, in which case something
like this can be used:

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License version 2 as
   published by the Free Software Foundation.

Of course, it's up to the author to make up their own mind at the end
of the day.

 Oh, ok, good question -- dear list, shall one use sg_page()+kmap or sg_virt()?

sg_page() + kmap() is preferred, because sg_virt() fails with highmem.
Using sg_virt() means you restrict the driver to non-highmem memory,
and if the kernel wants to place the data into a highmem page, it will
have to use bounce buffers (so it's inefficient).

What's even better is to use the scatterlist iterator, which will handle
this for you.  See the sg_miter_*() functions.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/4] crypto: Add Allwinner Security System crypto accelerator

2014-06-22 Thread Russell King - ARM Linux
On Tue, Jun 10, 2014 at 02:43:14PM +0200, LABBE Corentin wrote:
 +int sunxi_aes_poll(struct ablkcipher_request *areq)
 +{
...
 + if (areq-src == NULL || areq-dst == NULL) {
 + dev_err(ss-dev, ERROR: Some SGs are NULL %u\n, areq-nbytes);
 + return -1;

You return -1 from here quite frequently.  Have you verified that there
is no way for this return value to get to userspace?

Even if it can't get to userspace now, what if someone modifies the code
so it can be returned?

I detest crap that's written with return -1 in it because it looks like
there is no care with the established error handling semantics for the
kernel (small negative numbers are errno codes.)

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 19/51] DMA-API: media: dt3155v4l: replace dma_set_mask()+dma_set_coherent_mask() with new helper

2013-10-31 Thread Russell King - ARM Linux
On Thu, Oct 31, 2013 at 09:46:40AM -0200, Mauro Carvalho Chehab wrote:
 Hi Russell,
 
 Em Mon, 30 Sep 2013 13:57:47 +0200
 Hans Verkuil hverk...@xs4all.nl escreveu:
 
  On 09/19/2013 11:44 PM, Russell King wrote:
   Replace the following sequence:
   
 dma_set_mask(dev, mask);
 dma_set_coherent_mask(dev, mask);
   
   with a call to the new helper dma_set_mask_and_coherent().
   
   Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
  
  Acked-by: Hans Verkuil hans.verk...@cisco.com
 
 Somehow, I lost your original post (I got unsubscribed on a few days 
 from all vger mailing lists at the end of september).
 
 I suspect that you want to sent this via your tree, right?

Yes please.

 If so:
 
 Acked-by: Mauro Carvalho Chehab m.che...@samsung.com

Added, thanks.

   - err = dma_set_mask(pdev-dev, DMA_BIT_MASK(32));
   - if (err)
   - return -ENODEV;
   - err = dma_set_coherent_mask(pdev-dev, DMA_BIT_MASK(32));
   + err = dma_set_mask_and_coherent(pdev-dev, DMA_BIT_MASK(32));
 if (err)
 return -ENODEV;

One thing I've just noticed is that return should be return err not
return -ENODEV - are you okay for me to change that in this patch?

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/3] ARM: NEON based fast(er) AES in CBC/CTR/XTS modes

2013-10-04 Thread Russell King - ARM Linux
On Fri, Oct 04, 2013 at 08:04:50PM +0200, Ard Biesheuvel wrote:
 First of all, please note that the whole point of working so closely
 with the OpenSSL maintainer on this is that the version I am
 presenting here is the verbatim output of the Perl script that lives
 in the OpenSSL tree. So just shipped, not shipped and hacked.
 
 Personally, I would much prefer merging the .pl file as well, I just
 thought (and I did poll some people informally) that this is not
 something most people are happy about. If I am wrong about this, than
 I am quite happy to respin so the .S is generated on the fly.

While it is desirable to keep the dependencies on external tools to a
minimum, as perl is already on the list of required dependencies, there
is no problem with including a script.

Also, remember that the GPL says:

The source code for a work means the preferred form of the work for
making modifications to it.

So here's the question: is the assembly code the perferred form to make
modifications?  From what you're saying above, the answer to that seems
to be no.  Now, I'm not going to throw toys out of the pram and say
that this is a hard requirement, but just take a moment to think about
how we treat vendors who don't do this, instead supplying non-preferred
forms of source code, and think about whether there's double standards
here.

Now, while I can imagine that people have an ideological objection to
perl (using comments like write only code etc) that's not a good
enough excuse to avoid including the perferred form for future
modification.

Now, we have mechanisms in the kernel build where we can include a
prepared source which can be used to lessen the burden on the toolset
required to build the kernel.  So, including both the perl script and
the pre-generated assembly is entirely acceptable.  This tends to
nullify the excuse that we don't want to add additional tool burden
to kbuild argument.

So, what I'd strongly recommend is that we add both the pre-generated
assembly with a _shipped suffix, the perl script, and include a rule
like this:

quiet_cmd_perl = PERL$@
  cmd_perl = perl $  $@

$(src)/blahblah.S_shipped: $(src)/myperlscript
$(call cmd,perl)

and that should end up running myperlscript whenever it has a date
stamp newer than the _shipped file, or if that file is missing.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/3] ARM: NEON based fast(er) AES in CBC/CTR/XTS modes

2013-10-04 Thread Russell King - ARM Linux
On Fri, Oct 04, 2013 at 02:34:01PM -0400, Nicolas Pitre wrote:
 Do you have an example of something that does require perl to build the 
 kernel on ARM?  I was under the impression that people try to avoid it 
 as much as possible in general.
 
 I'm personally sitting on the fence between effectively adding a new 
 kernel build dependencies or carrying the output of the perl script.  
 But if the kernel build does already require perl in practice then this 
 might tip the balance.

That is really not a concern with the modern kbuild.  It has supported
having shipped versions of generated files included in the source tree
for years.  So, we can have the perl version included (the preferred
form for editing) while avoiding the issue of requiring everyone to
have perl.

In other words, if you want to build a kernel, you don't need perl for
this.  If you want to edit the preferred form, then you do need perl
so that the preferred form can be turned into assembly to update the
shipped version.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/3] ARM: NEON based fast(er) AES in CBC/CTR/XTS modes

2013-10-04 Thread Russell King - ARM Linux
On Fri, Oct 04, 2013 at 08:41:35PM +0200, Ard Biesheuvel wrote:
 On 4 October 2013 20:34, Nicolas Pitre nicolas.pi...@linaro.org wrote:
  On Fri, 4 Oct 2013, Will Deacon wrote:
 [...]
 
  Why do you consider it unsuitable to ship the perl script with the kernel?
  Perl 5 is already documented as a build dependency in Documentation/Changes
 
  Do you have an example of something that does require perl to build the
  kernel on ARM?  I was under the impression that people try to avoid it
  as much as possible in general.
 
  I'm personally sitting on the fence between effectively adding a new
  kernel build dependencies or carrying the output of the perl script.
  But if the kernel build does already require perl in practice then this
  might tip the balance.
 
 
 I like Russell's suggestion the most, in fact. In this case, the build
 time requirement for Perl effectively gets suspended until you start
 making modifications to the perl script, and the relation between the
 .S and the .pl files is made explicit by the make rule.
 
 Should I put the cmd_perl rule in scripts/Makefile.build ? Or can I
 just keep it under arch/arm/crypto ?

Just running through the Makefiles, it seems we have a fair amount
of stuff already using perl in various ways.  So I wouldn't worry too
much about where it's placed.  It's probably something that should
eventually end up in scripts/ at _some_ point.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/51] DMA mask changes

2013-09-27 Thread Russell King - ARM Linux
On Thu, Sep 26, 2013 at 10:23:08PM +0200, Rafał Miłecki wrote:
 2013/9/19 Russell King - ARM Linux li...@arm.linux.org.uk:
  This email is only being sent to the mailing lists in question, not to
  anyone personally.  The list of individuals is far to great to do that.
  I'm hoping no mailing lists reject the patches based on the number of
  recipients.
 
 Huh, I think it was enough to send only 3 patches to the b43-dev@. Like:
 [PATCH 01/51] DMA-API: provide a helper to set both DMA and coherent DMA masks
 [PATCH 14/51] DMA-API: net: b43: (...)
 [PATCH 15/51] DMA-API: net: b43legacy: (...)
 ;)
 
 I believe Joe has some nice script for doing it that way. When fixing
 some coding style / formatting, he sends only related patches to the
 given ML.

If I did that, then the mailing lists would not get the first patch,
because almost none of the lists would be referred to by patch 1.

Moreover, people complain when they don't have access to the full
patch series - they assume patches are missing for some reason, and
they ask for the rest of the series.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [alsa-devel] [PATCH 43/51] DMA-API: dma: edma.c: no need to explicitly initialize DMA masks

2013-09-23 Thread Russell King - ARM Linux
On Mon, Sep 23, 2013 at 03:55:33PM +0530, Vinod Koul wrote:
 On Fri, Sep 20, 2013 at 12:15:39AM +0100, Russell King wrote:
  register_platform_device_full() can setup the DMA mask provided the
  appropriate member is set in struct platform_device_info.  So lets
  make that be the case.  This avoids a direct reference to the DMA
  masks by this driver.
  
  Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
 Acked-by: Vinod Koul vinod.k...@intel.com
 
 This also brings me question that should we force the driver to use the
 dma_set_mask_and_coherent() API or they have below flexiblity too?

There's two issues here:
1. dma_set_mask_and_coherent() will only work if dev-dma_mask points at
   some storage for the mask.  This needs to have .dma_mask in the
   platform_device_info initialised.

2. Yes, this driver should also be calling the appropriate DMA mask setting
   functions in addition to having the mask initialized at device creation
   time.

Here's a replacement patch, though maybe it would be better to roll all
the additions of dma_set_mask_and_coherent() in drivers/dma into one
patch?  In other words, combine the addition of this with these two
patches:

dma: pl330: add dma_set_mask_and_coherent() call
dma: pl08x: add dma_set_mask_and_coherent() call

8=
From: Russell King rmk+ker...@arm.linux.org.uk
Subject: [PATCH] DMA-API: dma: edma.c: no need to explicitly initialize DMA
 masks

register_platform_device_full() can setup the DMA mask provided the
appropriate member is set in struct platform_device_info.  So lets
make that be the case.  This avoids a direct reference to the DMA
masks by this driver.

While here, add the dma_set_mask_and_coherent() call which the DMA API
requires DMA-using drivers to call.

Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
---
 drivers/dma/edma.c |   10 ++
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index ff50ff4..fd5e48c 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -631,6 +631,10 @@ static int edma_probe(struct platform_device *pdev)
struct edma_cc *ecc;
int ret;
 
+   ret = dma_set_mask_and_coherent(pdev-dev, DMA_BIT_MASK(32));
+   if (ret)
+   return ret;
+
ecc = devm_kzalloc(pdev-dev, sizeof(*ecc), GFP_KERNEL);
if (!ecc) {
dev_err(pdev-dev, Can't allocate controller\n);
@@ -702,11 +706,13 @@ static struct platform_device *pdev0, *pdev1;
 static const struct platform_device_info edma_dev_info0 = {
.name = edma-dma-engine,
.id = 0,
+   .dma_mask = DMA_BIT_MASK(32),
 };
 
 static const struct platform_device_info edma_dev_info1 = {
.name = edma-dma-engine,
.id = 1,
+   .dma_mask = DMA_BIT_MASK(32),
 };
 
 static int edma_init(void)
@@ -720,8 +726,6 @@ static int edma_init(void)
ret = PTR_ERR(pdev0);
goto out;
}
-   pdev0-dev.dma_mask = pdev0-dev.coherent_dma_mask;
-   pdev0-dev.coherent_dma_mask = DMA_BIT_MASK(32);
}
 
if (EDMA_CTLRS == 2) {
@@ -731,8 +735,6 @@ static int edma_init(void)
platform_device_unregister(pdev0);
ret = PTR_ERR(pdev1);
}
-   pdev1-dev.dma_mask = pdev1-dev.coherent_dma_mask;
-   pdev1-dev.coherent_dma_mask = DMA_BIT_MASK(32);
}
 
 out:
-- 
1.7.4.4


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 36/51] DMA-API: usb: use dma_set_coherent_mask()

2013-09-23 Thread Russell King - ARM Linux
On Mon, Sep 23, 2013 at 02:27:39PM -0400, Alan Stern wrote:
 On Thu, 19 Sep 2013, Russell King wrote:
 
  The correct way for a driver to specify the coherent DMA mask is
  not to directly access the field in the struct device, but to use
  dma_set_coherent_mask().  Only arch and bus code should access this
  member directly.
  
  Convert all direct write accesses to using the correct API.
  
  Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
 
  diff --git a/drivers/usb/host/ehci-platform.c 
  b/drivers/usb/host/ehci-platform.c
  index f6b790c..5b0cd2d 100644
  --- a/drivers/usb/host/ehci-platform.c
  +++ b/drivers/usb/host/ehci-platform.c
 
  @@ -91,8 +91,9 @@ static int ehci_platform_probe(struct platform_device 
  *dev)
  dev-dev.platform_data = ehci_platform_defaults;
  if (!dev-dev.dma_mask)
  dev-dev.dma_mask = dev-dev.coherent_dma_mask;
  -   if (!dev-dev.coherent_dma_mask)
  -   dev-dev.coherent_dma_mask = DMA_BIT_MASK(32);
  +   err = dma_set_coherent_mask(dev-dev, DMA_BIT_MASK(32));
  +   if (err)
  +   return err;
   
  pdata = dev_get_platdata(dev-dev);
 
 ehci-platform.c is a generic file, intended for use by multiple
 platforms.  Is it not possible that on some platforms, the arch or bus
 code may already have initialized the DMA masks?  Isn't something like 
 this (i.e., DMA bindings) planned for Device Tree?
 
 By eliminating the tests above and calling dma_set_coherent_mask or
 dma_coerce_mask_and_coherent unconditionally, this patch (and the next)
 would overwrite those initial settings.
 
 I don't know to what extent the same may be true for the other,
 platform-specific, drivers changed by this patch.  But it's something 
 to be aware of.

Please check the DMA API documentation:

=
For correct operation, you must interrogate the kernel in your device
probe routine to see if the DMA controller on the machine can properly
support the DMA addressing limitation your device has.  It is good
style to do this even if your device holds the default setting,
because this shows that you did think about these issues wrt. your
device.

The query is performed via a call to dma_set_mask():

int dma_set_mask(struct device *dev, u64 mask);

The query for consistent allocations is performed via a call to
dma_set_coherent_mask():

int dma_set_coherent_mask(struct device *dev, u64 mask);
=

So, All drivers which use DMA are supposed to issue the appropriate
calls to check the DMA masks before they perform DMA, even if the
default is correct.

And yes, this is all part of sorting out the DT mess - we should
start not from the current mess (which is really totally haphazard)
but from the well established position of how the DMA API _should_
be used.  What that means is that all drivers should be issuing
these calls as appropriate today.

The default mask setup when the device is created is just that -
it's a default mask, and it should not be used to decide anything
about the device.  That's something which the driver should compute
on its own accord and then inform the various other layers via the
appropriate call.

Remember, on PCI, even when we have 64-bit, we do not declare PCI
devices with a 64-bit DMA mask: that's up to PCI device drivers to
_try_ to set a 64-bit DMA mask, which when successful _allows_ them
to use 64-bit DMA.  If it fails, they have to only use 32-bit.  If
they want a smaller mask, the _driver_ has to set the smaller mask,
not the device creating code.

The reason we're into this (particularly on ARM) is that we got lazy
because we could get by with declaring a DMA mask at device creation
time, since all devices were statically declared.  Now it's time to
get rid of those old lazy ways and start doing things correctly and
to the requirements of the APIs.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 42/51] DMA-API: usb: musb: use platform_device_register_full() to avoid directly messing with dma masks

2013-09-20 Thread Russell King - ARM Linux
On Fri, Sep 20, 2013 at 08:11:25AM -0500, Felipe Balbi wrote:
 Hi,
 
 On Fri, Sep 20, 2013 at 12:14:38AM +0100, Russell King wrote:
  Use platform_device_register_full() for those drivers which can, to
  avoid messing directly with DMA masks.  This can only be done when
  the driver does not need to access the allocated musb platform device
  from within its callbacks, which may be called during the musb
  device probing.
  
  Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
 
 you want me to carry this one through my tree or you prefer getting my
 Acked-by ? Either way works for me:
 
 Acked-by: Felipe Balbi ba...@ti.com
 
 there's also the third option of me setting up a branch with only this
 patch and we both merge it, that'd also work.

I think this patch is sufficiently stand-alone that it should be fine
if you want to take it through your tree.  That may be better in the
long run to avoid conflicts with this patch and any future work in
this area during this cycle.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 39/51] DMA-API: others: use dma_set_coherent_mask()

2013-09-20 Thread Russell King - ARM Linux
On Fri, Sep 20, 2013 at 07:16:52AM -0500, Tejun Heo wrote:
 On Fri, Sep 20, 2013 at 12:11:38AM +0100, Russell King wrote:
  The correct way for a driver to specify the coherent DMA mask is
  not to directly access the field in the struct device, but to use
  dma_set_coherent_mask().  Only arch and bus code should access this
  member directly.
  
  Convert all direct write accesses to using the correct API.
  
  Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk
 
 Acked-by: Tejun Heo t...@kernel.org
 
 The patch is pretty widely spread.  I don't mind how it gets routed
 but what's the plan?

The plan is... I'm going to try and avoid going through the hell of
re-posting this patch series to all the recipients another time...
(It's taken some 17 hours and lots of hand holding to get this patch
set out without exim jumping off a cliff into deep OOM - soo deep that
even the OOM killer doesn't run and the CPU is 100% idle because
every single process stuck in an uninterruptible sleep waiting for
every other process to free some memory - ouch!)

I know that dealing with this patch set will be a problem due to how
widespread this is, but much of the driver level changes come down to
depending on a couple of patches.  One solution would be if I published
a branch with just the dependencies in, which subsystem maintainers
could pull, and then apply the appropriate patches on top.

Another would be if subsystem maintainers are happy that I carry them,
I can add the acks, and then later on towards the end of the cycle,
provide a branch subsystem maintainers could pull.

Or... if you can think of something easier...
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/51] DMA-API: provide a helper to set both DMA and coherent DMA masks

2013-09-20 Thread Russell King - ARM Linux
On Fri, Sep 20, 2013 at 02:21:37AM +0100, Ben Hutchings wrote:
 On Thu, 2013-09-19 at 22:25 +0100, Russell King wrote:
 [...]
  -dma_set_coherent_mask() will always be able to set the same or a
  -smaller mask as dma_set_mask(). However for the rare case that a
  +The coherent coherent mask will always be able to set the same or a
  +smaller mask as the streaming mask. However for the rare case that a
 [...]
 
 The new wording doesn't make sense; a mask doesn't set itself.  I would
 suggest:
 
 The coherent mask can always be set to the same or a smaller mask than
 the streaming mask.

Yes, the original sentence is not particularly good, but I think even
your modified version can be interpreted as a mask setting itself
for all the same reasons that the original can be (which mask does same
refer to?)

Even so, I prefer your version.  Thanks. :)
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/51] DMA mask changes

2013-09-19 Thread Russell King - ARM Linux
This started out as a request to look at the DMA mask situation, and how
to solve the issues which we have on ARM - notably how the DMA mask
should be setup.

However, I started off reviewing how the dma_mask and coherent_dma_mask
was being used, and what I found was rather messy, and in some cases
rather buggy.  I tried to get some of the bug fixes in before the last
merge window, but it seems that the maintainers preferred to have the
full solution rather than a simple -rc suitable bug fix.

So, this is an attempt to clean things up.

The first point here is that drivers performing DMA should be calling
dma_set_mask()/dma_set_coherent_mask() in their probe function to verify
that DMA can be performed.  Lots of ARM drivers omit this step; please
refer to the DMA API documentation on this subject.

What this means is that the DMA mask provided by bus code is a default
value - nothing more.  It doesn't have to accurately reflect what the
device is actually capable of.  Apart from the storage for dev-dma_mask
being initialised for any device which is DMA capable, there is no other
initialisation which is strictly necessary at device creation time.

Now, these cleanups address two major areas:
1. The setting of DMA masks, particularly when both the coherent and
   streaming DMA masks are set together.

2. The initialisation of DMA masks by drivers - this seems to be becoming
   a popular habbit, one which may not be entirely the right solution.
   Rather than having this scattered throughout the tree, I've pulled
   that into a central location (and called it coercing the DMA mask -
   because it really is about forcing the DMA mask to be that value.)

3. Finally, addressing the long held misbelief that DMA masks somehow
   correspond with physical addresses.  We already have established
   long ago that dma_addr_t values returned from the DMA API are the
   values which you program into the DMA controller, and so are the
   bus addresses.  It is _only_ sane that DMA masks are also bus
   related too, and not related to physical address spaces.

(3) is a very important point for LPAE systems, which may still have
less than 4GB of memory, but this memory is all located above the 4GB
physical boundary.  This means with the current model, any device
using a 32-bit DMA mask fails - even though the DMA controller is
still only a 32-bit DMA controller but the 32-bit bus addresses map
to system memory.  To put it another way, the bus addresses have a
4GB physical offset on them.

This email is only being sent to the mailing lists in question, not to
anyone personally.  The list of individuals is far to great to do that.
I'm hoping no mailing lists reject the patches based on the number of
recipients.

Patches based on v3.12-rc1.

 Documentation/DMA-API-HOWTO.txt   |   37 +--
 Documentation/DMA-API.txt |8 +++
 arch/arm/include/asm/dma-mapping.h|8 +++
 arch/arm/mm/dma-mapping.c |   49 ++--
 arch/arm/mm/init.c|   12 +++---
 arch/arm/mm/mm.h  |2 +
 arch/powerpc/kernel/vio.c |3 +-
 block/blk-settings.c  |8 ++--
 drivers/amba/bus.c|6 +--
 drivers/ata/pata_ixp4xx_cf.c  |5 ++-
 drivers/ata/pata_octeon_cf.c  |5 +-
 drivers/block/nvme-core.c |   10 ++---
 drivers/crypto/ixp4xx_crypto.c|   48 ++--
 drivers/dma/amba-pl08x.c  |5 ++
 drivers/dma/dw/platform.c |8 +--
 drivers/dma/edma.c|6 +--
 drivers/dma/pl330.c   |4 ++
 drivers/firmware/dcdbas.c |   23 +-
 drivers/firmware/google/gsmi.c|   13 +++--
 drivers/gpu/drm/exynos/exynos_drm_drv.c   |6 ++-
 drivers/gpu/drm/omapdrm/omap_dmm_tiler.c  |5 +-
 drivers/media/platform/omap3isp/isp.c |6 +-
 drivers/media/platform/omap3isp/isp.h |3 -
 drivers/mmc/card/queue.c  |3 +-
 drivers/mmc/host/sdhci-acpi.c |5 +-
 drivers/net/ethernet/broadcom/b44.c   |3 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |8 +---
 drivers/net/ethernet/brocade/bna/bnad.c   |   13 ++
 drivers/net/ethernet/emulex/benet/be_main.c   |   12 +
 drivers/net/ethernet/intel/e1000/e1000_main.c |9 +---
 drivers/net/ethernet/intel/e1000e/netdev.c|   18 +++-
 drivers/net/ethernet/intel/igb/igb_main.c |   18 +++-
 drivers/net/ethernet/intel/igbvf/netdev.c |   18 +++-
 drivers/net/ethernet/intel/ixgb/ixgb_main.c   |   16 ++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 

Re: [RFC PATCH 2/2] arm64: add support for AES using ARMv8 Crypto Extensions

2013-09-15 Thread Russell King - ARM Linux
On Sat, Sep 14, 2013 at 05:11:53PM +0300, Jussi Kivilinna wrote:
 On 14.09.2013 16:30, Ard Biesheuvel wrote:
  On 14 September 2013 10:08, Jussi Kivilinna jussi.kivili...@iki.fi wrote:
  On 13.09.2013 18:08, Ard Biesheuvel wrote:
  This adds ARMv8 Crypto Extensions based implemenations of
  AES in CBC, CTR and XTS mode.
 
  Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org
  ---
  ..snip..
  +static int xts_set_key(struct crypto_tfm *tfm, const u8 *in_key,
  +unsigned int key_len)
  +{
  + struct crypto_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
  + u32 *flags = tfm-crt_flags;
  + int ret;
  +
  + ret = crypto_aes_expand_key(ctx-key1, in_key, key_len/2);
  + if (!ret)
  + ret = crypto_aes_expand_key(ctx-key2, in_key[key_len/2],
  + key_len/2);
 
  Use checkpatch.
 
  
  Um, I did get a bunch of errors and warnings from checkpatch.pl tbh,
  put not in this particular location. Care to elaborate?
  
 
 Well, the checkpatch.pl I had stored to brain had become corrupted and kept
 saying that you need spaces around all operators. But apparently spaces are
 just required for assignment operators.

checkpatch is not definitive.  It is merely an implementation of the
coding style.  The coding style is the definitive documentation, and
it says about this:

| Use one space around (on each side of) most binary and ternary operators,
| such as any of these:
|
|=  +  -  *  /  %  |^  =  =  ==  !=  ?  :
|
| but no space after unary operators:
|  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined
|
| no space before the postfix increment  decrement unary operators:
|++  --
|
| no space after the prefix increment  decrement unary operators:
|++  --
|
| and no space around the '.' and - structure member operators.

So, you're quite right that the above is wrong.  It needs spaces around the
/ operators.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] crypto: ux500/cryp - Set DMA configuration though dma_slave_config()

2013-04-26 Thread Russell King - ARM Linux
On Fri, Apr 26, 2013 at 01:46:46PM +0530, Vinod Koul wrote:
 On Fri, Apr 26, 2013 at 10:28:39AM +0200, Linus Walleij wrote:
  On Thu, Apr 25, 2013 at 4:11 PM, Arnd Bergmann a...@arndb.de wrote:
  
   The dma engine driver must know the address in its dma space, while the
   slave driver has it available in physical space. These two are often the
   same, but there is no generic way to convert between the two, especially
   if the dma engine resides behind an IOMMU.
  
   The best assumption we can make is that the dma engine driver knows
   how to convert between the two. Interestingly the documentation for
   dma_slave_config talks about physical address, while the structure
   itself uses a dma_addr_t. Linus Walleij introduced the structure in
   c156d0a5b0 DMAENGINE: generic slave channel control v3, so I assume
   he can shed some light on what he was thinking. I assume the documentation
   is right but the structure is not and should be converted to use
   phys_add_t or resource_size_t.
  
  OK I could cook a patch for that, but I think I need some input from
  Vinod and/or Russell on this.
 the dma_slave_config is physical address that should be passed directly to the
 controller. Obviosuly it should phys_addr_t :)

What you've just said is actually confusing.

physical address is normally the term used to describe the addresses
seen to the RAM.  phys_addr_t describes this.  This is not necessarily
what needs to be programmed into the DMA controller.

For RAM addresses, they must be mapped via the DMA API - and this gives
you a dma_addr_t.

DMA address is the address to be programmed into a DMA controller to
access a particular address in RAM or device, and has type dma_addr_t.
When you're programming a DMA controller to access a device, you are
clearly telling it the address on the _DMA controller's bus_ to access
that register, which may or may not be the same as the physical address.

There are platforms in existence where phys_addr_t can be 32-bit but
dma_addr_t can be 64-bit.  Getting this stuff wrong can cause problems.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] crypto: ux500/cryp - Set DMA configuration though dma_slave_config()

2013-04-26 Thread Russell King - ARM Linux
On Fri, Apr 26, 2013 at 11:39:20AM +0200, Arnd Bergmann wrote:
 On Friday 26 April 2013 13:46:46 Vinod Koul wrote:
  
  The mapping  unmapping of dma buffer (memcpy and memory buffer in this 
  txn) is
  required to be performed by the client driver. The dmanegine or dmaengine 
  driver
  will not do that for you...
 
 I've been wondering about this part: since we need to pass the 'struct 
 device' of
 the dma engine (rather than the slave) into dma_map_single, what is the 
 official
 way to do that? Should the slave driver just rely on dma_chan-device-dev to
 work correctly for the purpose of dma-mapping.h interfaces?

Yes.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: mv_cesa dcache problem since 2.6.37 was: Re: mv_cesa hash functions

2012-05-06 Thread Russell King - ARM Linux
On Sun, May 06, 2012 at 12:49:30AM +0200, Simon Baatz wrote:
 Am 23.02.2012 19:34, schrieb Phil Sutter:
  But you might suffer from another problem, which is only present on
  ARM machines with VIVT cache and linux = 2.6.37: due to commit
  f8b63c1, ARM: 6382/1: Remove superfluous flush_kernel_dcache_page()
  which prevents pages being flushed from inside the scatterlist
  iterator API. This patch seems to introduce problems in other places
  (namely NFS), too but I sadly did not have time to investigate this
  further. I will post a possible (cryptodev-internal) solution to
  cryptodev-linux-de...@gna.org, maybe this fixes the problem with
  openssl. Greetings, Phil 

Well, the reason it has been removed is as follows:

When pages are allocated, they have a special flag bit called PG_arch_1
cleared by the page allocator.

We used to use PG_arch_1 to indicate that the page contained dirty cache
lines - so by default, the page was marked clean.

Normal kernel behaviour upon allocating a page into the page cache is to
read data off the disk into the new page cache page, calling
flush_kernel_dcache_page() in the process.  This would ensure that data
would be pushed out of the cache lines associated with the kernel mapping
of the page into RAM.

This means when we map the page into userspace, we can be sure that the new
mapping will see the data written there.  Moreover, when we place a page
into a user process, we check this PG_arch_1 bit, and if it's set, we
flush the kernel mapping cache lines at this point as well.

Patch 6379/1 changed this behaviour: PG_arch_1 being _clear_ now indicates
that the page contains dirty data.

What this means is that when a page is placed into userspace, we check the
PG_arch_1 bit.  If it is clear, we flush the kernel mapping for the page
to push dirty data out to RAM.  This makes the flush in
flush_kernel_dcache_page() entirely redundant.

 since there has been no reaction on this, I would like to bring this
 issue up again (I sadly don't have the expertise to investigate this
 further...). The issue is not limited to cryptodev, but seems to be
 either a problem with commit f8b63c1 or a problem in mv_cesa that was
 uncovered by this commit.

Can you reproduce it with anything else?

It could be a problem with the way crypto stuff works - I've never had
any dealings with that subsystem, so I really can't comment.  If crypto
uses scatterlists, and walks them with the standard API, and uses
scatterlists with pages which are already mapped into userspace, then
I can see how the above commit would make things go wrong.

So, without knowledge of the crypto subsystem, I'm not sure I can help
sort this out.

If it's possible to reproduce with NFS, and it seems sorted in the latest
kernel, that's probably because something changed with NFS - NFS did for
a time have issues on ARM for various reasons (because of remapping kernel
memory into vmalloc space and expecting it to be DMA coherent...) and that
got fixed.  Whether that's why you're not seeing problems with v3.4-rc5,
I couldn't be sure unless you did a bisection between the bad and good
kernel versions.  That would at least allow us to confirm that that issue
has been properly resolved.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm: new W macro to WORD_ACCESS

2009-09-16 Thread Russell King - ARM Linux
On Wed, Sep 16, 2009 at 07:58:12PM +0200, Sebastian Andrzej Siewior wrote:
 leads to a build error because the crypto/cast6.c defines a function
 which is named W.

W has nothing to do with the access size, so this change makes it _really_
confusing.  What it's about is telling the compiler to use the ARM 32-bit
version of the instruction coding when building for Thumb2 (which
can intermix Thumb and ARM instruction sets.)

So the macro needs to be renamed... NAK on this patch as it stands.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm/orion5x: add sram support for crypto

2009-06-11 Thread Russell King - ARM Linux
On Thu, Jun 11, 2009 at 11:07:34PM +0200, Sebastian Andrzej Siewior wrote:
 If you thing it is too early I can keep hacking in my own git tree until
 I get the dmac_flush_range() hack out or so.

The problem that I percieve with these kinds of hacks is that they
tend to spread into other code, and then we end up with problems
when new architectures come along.

For these interfaces, I am a strong believer in purpose-defined
interfaces to caches and the like.  If what we have doesn't provide
what's required, we need to provide something else.

So, the question is what are you trying to do with this
dmac_flush_range() and your SRAM?  Are you trying to achieve I/D cache
coherency for data written there, so you can execute it later, or are
you trying to ensure DMA coherency from SRAM?
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: fix handling of sg buffers in ixp4xx driver

2009-03-02 Thread Russell King - ARM Linux
On Mon, Mar 02, 2009 at 12:45:12PM +0100, Christian Hohnstaedt wrote:
 
  - keep dma functions away from chained scatterlists.
Use the existing scatterlist iteration inside the driver
to call dma_map_single() for each chunk and avoid dma_map_sg().

Hmm, interesting.  So crypto has its own scatterlist chaining stuff which
is different and incompatible from what is supported by the rest of the
kernel.

That's really confusing.  There's know way to know whether a struct scatterlist
should be walked by using the scatterwalk_* stuff or the generic stuff.
Excellent for reviewing.

So, sorry, I'm not qualified to review this.  Please find someone else
who knows about crypto stuff.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ixp4xx_crypto panic with fragmented packets in scatterlist

2009-02-26 Thread Russell King - ARM Linux
On Wed, Feb 25, 2009 at 04:35:40PM +0100, Karl Hiramoto wrote:
 The attached patch fixes my issue, but am not sure if it is correct or
 will  cause problems else where.

 diff -Naurp linux-2.6.28.7.a/arch/arm/include/asm/scatterlist.h 
 linux-2.6.28.7.b/arch/arm/include/asm/scatterlist.h
 --- linux-2.6.28.7.a/arch/arm/include/asm/scatterlist.h   2009-02-20 
 23:41:27.0 +0100
 +++ linux-2.6.28.7.b/arch/arm/include/asm/scatterlist.h   2009-02-25 
 16:19:59.0 +0100
 @@ -24,4 +24,6 @@ struct scatterlist {
  #define sg_dma_address(sg)  ((sg)-dma_address)
  #define sg_dma_len(sg)  ((sg)-length)
  
 +#define ARCH_HAS_SG_CHAIN
 +

We can't merge this until _all_ of ARM has been fixed for walking
scatterlist chains.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ixp4xx_crypto panic with fragmented packets in scatterlist

2009-02-26 Thread Russell King - ARM Linux
On Thu, Feb 26, 2009 at 08:10:43PM +0800, Herbert Xu wrote:
 Russell King - ARM Linux li...@arm.linux.org.uk wrote:
 
  We can't merge this until _all_ of ARM has been fixed for walking
  scatterlist chains.
 
 Right, this is definitely not the way to fix this bug.  Because
 even if ARM completely supported chaining, you still have to fix
 all the other architectures as well.

I don't think you can use chained scatterlists unless the architecture
supports it.  It's not a case that you have to flatten the chaining
before passing it over to the arch - it seems you're not allowed to
create a chained scatterlist in the first place:

static inline void sg_chain(struct scatterlist *prv, unsigned int prv_nents,
struct scatterlist *sgl)
{
#ifndef ARCH_HAS_SG_CHAIN
BUG();
#endif
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html