Re: [PATCH 2/7] crypto: inside-secure: add SafeXcel EIP197 crypto engine driver

2017-04-18 Thread Antoine Tenart
Hi Robin,

On Wed, Apr 12, 2017 at 02:54:13PM +0100, Robin Murphy wrote:
> 
> Bit of a drive-by, but since I have it in my head that crypto drivers
> are a hotspot for dodgy DMA usage (in part due to the hardware often
> being a bit esoteric with embedded RAMs and such), this caught my eye
> and I thought I'd give this a quick once-over to check for anything
> smelly. Unfortunately, I was not disappointed... ;)

:-)

> On 29/03/17 13:44, Antoine Tenart wrote:
> [...]
> > diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
> > new file mode 100644
> > index 000000000000..00f3f2c85d05
> > --- /dev/null
> > +++ b/drivers/crypto/inside-secure/safexcel.c
> [...]
> > +int safexcel_invalidate_cache(struct crypto_async_request *async,
> > + struct safexcel_context *ctx,
> > + struct safexcel_crypto_priv *priv,
> > + dma_addr_t ctxr_dma, int ring,
> > + struct safexcel_request *request)
> > +{
> > +   struct safexcel_command_desc *cdesc;
> > +   struct safexcel_result_desc *rdesc;
> > +   phys_addr_t ctxr_phys;
> > +   int ret = 0;
> > +
> > +   ctxr_phys = dma_to_phys(priv->dev, ctxr_dma);
> 
> No no no. This is a SWIOTLB-specific (or otherwise arch-private,
> depending on implementation) helper, not a DMA API function, and should
> not be called directly by drivers. There is no guarantee it will give
> the result you expect, or even compile, in all cases (if the kbuild
> robot hasn't already revealed via the COMPILE_TEST dependency).
> 
> That said, AFAICS ctxr_phys ends up in a descriptor which is given to
> the device, so trying to use a physical address is wrong and it should
> still be the DMA address anyway.

I see. The cryptographic engine should always be given DMA addresses. I'll
update the descriptor structure members from "phys" to "dma" as well.
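
Concretely that means dropping the dma_to_phys() call and handing the
dma_addr_t we got back from dma_pool_alloc() straight to the descriptor
helper. A sketch of what I have in mind (using safexcel_add_cdesc() as
it is in this patch):

	/* ctxr_dma is already what the engine has to dereference: the
	 * device only ever sees bus/DMA addresses, so no conversion is
	 * needed and the ctxr_phys local disappears entirely.
	 */
	cdesc = safexcel_add_cdesc(priv, ring, true, true, 0, 0, 0,
				   ctxr_dma);
	if (IS_ERR(cdesc)) {
		ret = PTR_ERR(cdesc);
		goto unlock;
	}
	/* rest of safexcel_invalidate_cache() unchanged */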

> [...]
> > +static int safexcel_probe(struct platform_device *pdev)
> > +{
> > +   struct device *dev = &pdev->dev;
> > +   struct resource *res;
> > +   struct safexcel_crypto_priv *priv;
> > +   int i, ret;
> > +
> > +   priv = devm_kzalloc(dev, sizeof(struct safexcel_crypto_priv),
> > +   GFP_KERNEL);
> > +   if (!priv)
> > +   return -ENOMEM;
> > +
> > +   priv->dev = dev;
> > +
> > +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +   priv->base = devm_ioremap_resource(dev, res);
> > +   if (IS_ERR(priv->base)) {
> > +   dev_err(dev, "failed to get resource\n");
> > +   return PTR_ERR(priv->base);
> > +   }
> > +
> > +   priv->clk = of_clk_get(dev->of_node, 0);
> > +   if (!IS_ERR(priv->clk)) {
> > +   ret = clk_prepare_enable(priv->clk);
> > +   if (ret) {
> > +   dev_err(dev, "unable to enable clk (%d)\n", ret);
> > +   return ret;
> > +   }
> > +   } else {
> > +   /* The clock isn't mandatory */
> > +   if (PTR_ERR(priv->clk) == -EPROBE_DEFER)
> > +   return -EPROBE_DEFER;
> > +   }
> 
> You should call dma_set_mask_and_coherent() before any DMA API calls,
> both to confirm DMA really is going to work at all, and also (since this IP
> apparently supports >32-bit addresses) to describe the full inherent
> addressing capability, not least to avoid wasting time/space with
> unnecessary bounce buffering otherwise.

OK, I'll add a call to this helper before using DMA API calls.
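
Something along these lines at the top of probe, then, before the
dmam_pool_create() call (assuming the EIP197 can really master 64-bit
addresses; I still need to double-check that in the datasheet):

	/* Declare the device's addressing capability before the first
	 * DMA API call; fall back to 32 bits if the platform rejects
	 * the wider mask.
	 */
	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
	if (ret)
		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
	if (ret) {
		dev_err(dev, "no usable DMA configuration\n");
		return ret;
	}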

> > +
> > +err_pool:
> > +   dma_pool_destroy(priv->context_pool);
> 
> You used dmam_pool_create(), so not only is an explicit destroy
> unnecessary, but doing so with the non-managed version is actively
> harmful, since devres would end up double-freeing the pool after you return.

I saw this and fixed it in my local version.
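
For the record, the fix is simply to let devres own the teardown:

	priv->context_pool = dmam_pool_create("safexcel-context", dev,
					      sizeof(struct safexcel_context_record),
					      1, 0);
	if (!priv->context_pool) {
		ret = -ENOMEM;
		goto err_clk;	/* no dma_pool_destroy(): devres frees it */
	}

and the old err_pool label goes away, with later failures jumping
straight to err_clk without touching the pool.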

> > +struct safexcel_token {
> > +   u32 packet_length:17;
> > +   u8 stat:2;
> > +   u16 instructions:9;
> > +   u8 opcode:4;
> > +} __packed;
> 
> Using bitfields in hardware descriptors seems pretty risky, since you've
> no real guarantee two compilers will lay them out the same way (or that
> either would be correct WRT what the hardware expects). I'd be more
> inclined to construct all these fields with explicit masking and shifting.

Hmm, I'm not sure I follow you here. If I use bitfields in a __packed
structure, we should be safe. Why would the compiler behave
differently?
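
Just so I'm sure I understand what you suggest: the token word would
become a plain __le32 built with explicit shifts and masks, along these
lines? (The bit positions below are only my guess at the layout the
current bitfields produce on a little-endian build.)

#include <linux/bitfield.h>

/* Hypothetical explicit encoding of the token word. */
#define EIP197_TOKEN_PACKET_LENGTH	GENMASK(16, 0)
#define EIP197_TOKEN_STAT		GENMASK(18, 17)
#define EIP197_TOKEN_INSTRUCTIONS	GENMASK(27, 19)
#define EIP197_TOKEN_OPCODE		GENMASK(31, 28)

static inline __le32 safexcel_token_word(u32 len, u32 stat, u32 insn,
					 u32 opcode)
{
	u32 w = FIELD_PREP(EIP197_TOKEN_PACKET_LENGTH, len) |
		FIELD_PREP(EIP197_TOKEN_STAT, stat) |
		FIELD_PREP(EIP197_TOKEN_INSTRUCTIONS, insn) |
		FIELD_PREP(EIP197_TOKEN_OPCODE, opcode);

	/* The descriptor layout is then fixed little-endian, whatever
	 * compiler or CPU endianness builds the driver.
	 */
	return cpu_to_le32(w);
}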

> > +static int safexcel_aes_send(struct crypto_async_request *async,
> > +int ring, struct safexcel_request *request,
> > +int *commands, int *results)
> > +{
> > +   struct ablkcipher_request *req = ablkcipher_request_cast(async);
> > +   struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
> > +   struct safexcel_crypto_priv *priv = ctx->priv;
> > +   struct safexcel_command_desc *cdesc;
> > +   struct safexcel_result_desc *rdesc;
> > +   struct scatterlist *sg;
> > +   phys_addr_t ctxr_phys;
> > +   int nr_src, nr_dst, n_cdesc = 0, 

Re: [PATCH 2/7] crypto: inside-secure: add SafeXcel EIP197 crypto engine driver

2017-04-12 Thread Robin Murphy
Hi Antoine,

Bit of a drive-by, but since I have it in my head that crypto drivers
are a hotspot for dodgy DMA usage (in part due to the hardware often
being a bit esoteric with embedded RAMs and such), this caught my eye
and I thought I'd give this a quick once-over to check for anything
smelly. Unfortunately, I was not disappointed... ;)

On 29/03/17 13:44, Antoine Tenart wrote:
[...]
> diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
> new file mode 100644
> index 000000000000..00f3f2c85d05
> --- /dev/null
> +++ b/drivers/crypto/inside-secure/safexcel.c
[...]
> +int safexcel_invalidate_cache(struct crypto_async_request *async,
> +   struct safexcel_context *ctx,
> +   struct safexcel_crypto_priv *priv,
> +   dma_addr_t ctxr_dma, int ring,
> +   struct safexcel_request *request)
> +{
> + struct safexcel_command_desc *cdesc;
> + struct safexcel_result_desc *rdesc;
> + phys_addr_t ctxr_phys;
> + int ret = 0;
> +
> + ctxr_phys = dma_to_phys(priv->dev, ctxr_dma);

No no no. This is a SWIOTLB-specific (or otherwise arch-private,
depending on implementation) helper, not a DMA API function, and should
not be called directly by drivers. There is no guarantee it will give
the result you expect, or even compile, in all cases (if the kbuild
robot hasn't already revealed via the COMPILE_TEST dependency).

That said, AFAICS ctxr_phys ends up in a descriptor which is given to
the device, so trying to use a physical address is wrong and it should
still be the DMA address anyway.

> + spin_lock_bh(&priv->ring[ring].egress_lock);
> +
> + /* prepare command descriptor */
> + cdesc = safexcel_add_cdesc(priv, ring, true, true, 0, 0, 0, ctxr_phys);
> + if (IS_ERR(cdesc)) {
> + ret = PTR_ERR(cdesc);
> + goto unlock;
> + }
> +
> + cdesc->control_data.type = EIP197_TYPE_EXTENDED;
> + cdesc->control_data.options = 0;
> + cdesc->control_data.refresh = 0;
> + cdesc->control_data.control0 = CONTEXT_CONTROL_INV_TR;
> +
> + /* prepare result descriptor */
> + rdesc = safexcel_add_rdesc(priv, ring, true, true, 0, 0);
> +
> + if (IS_ERR(rdesc)) {
> + ret = PTR_ERR(rdesc);
> + goto cdesc_rollback;
> + }
> +
> + request->req = async;
> + goto unlock;
> +
> +cdesc_rollback:
> + safexcel_ring_rollback_wptr(priv, &priv->ring[ring].cdr);
> +
> +unlock:
> + spin_unlock_bh(&priv->ring[ring].egress_lock);
> + return ret;
> +}
[...]
> +static int safexcel_probe(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct resource *res;
> + struct safexcel_crypto_priv *priv;
> + int i, ret;
> +
> + priv = devm_kzalloc(dev, sizeof(struct safexcel_crypto_priv),
> + GFP_KERNEL);
> + if (!priv)
> + return -ENOMEM;
> +
> + priv->dev = dev;
> +
> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + priv->base = devm_ioremap_resource(dev, res);
> + if (IS_ERR(priv->base)) {
> + dev_err(dev, "failed to get resource\n");
> + return PTR_ERR(priv->base);
> + }
> +
> + priv->clk = of_clk_get(dev->of_node, 0);
> + if (!IS_ERR(priv->clk)) {
> + ret = clk_prepare_enable(priv->clk);
> + if (ret) {
> + dev_err(dev, "unable to enable clk (%d)\n", ret);
> + return ret;
> + }
> + } else {
> + /* The clock isn't mandatory */
> + if (PTR_ERR(priv->clk) == -EPROBE_DEFER)
> + return -EPROBE_DEFER;
> + }

You should call dma_set_mask_and_coherent() before any DMA API calls,
both to confirm DMA really is going to work at all, and also (since this IP
apparently supports >32-bit addresses) to describe the full inherent
addressing capability, not least to avoid wasting time/space with
unnecessary bounce buffering otherwise.

> + priv->context_pool = dmam_pool_create("safexcel-context", dev,
> +   sizeof(struct safexcel_context_record),
> +   1, 0);
> + if (!priv->context_pool) {
> + ret = -ENOMEM;
> + goto err_clk;
> + }
> +
> + safexcel_configure(priv);
> +
> + for (i = 0; i < priv->config.rings; i++) {
> + char irq_name[6] = {0}; /* "ringX\0" */
> + char wq_name[9] = {0}; /* "wq_ringX\0" */
> + int irq;
> + struct safexcel_ring_irq_data *ring_irq;
> +
> + ret = safexcel_init_ring_descriptors(priv,
> +  &priv->ring[i].cdr,
> +  &priv->ring[i].rdr);
> + if (ret)
> + goto err_pool;
> +
> + ring_irq = devm_kzalloc(dev, sizeof(struct 
>