Hi Robin,

On Wed, Apr 12, 2017 at 02:54:13PM +0100, Robin Murphy wrote:
> 
> Bit of a drive-by, but since I have it in my head that crypto drivers
> are a hotspot for dodgy DMA usage (in part due to the hardware often
> being a bit esoteric with embedded RAMs and such), this caught my eye
> and I thought I'd give this a quick once-over to check for anything
> smelly. Unfortunately, I was not disappointed... ;)

:-)

> On 29/03/17 13:44, Antoine Tenart wrote:
> [...]
> > diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
> > new file mode 100644
> > index 000000000000..00f3f2c85d05
> > --- /dev/null
> > +++ b/drivers/crypto/inside-secure/safexcel.c
> [...]
> > +int safexcel_invalidate_cache(struct crypto_async_request *async,
> > +                         struct safexcel_context *ctx,
> > +                         struct safexcel_crypto_priv *priv,
> > +                         dma_addr_t ctxr_dma, int ring,
> > +                         struct safexcel_request *request)
> > +{
> > +   struct safexcel_command_desc *cdesc;
> > +   struct safexcel_result_desc *rdesc;
> > +   phys_addr_t ctxr_phys;
> > +   int ret = 0;
> > +
> > +   ctxr_phys = dma_to_phys(priv->dev, ctxr_dma);
> 
> No no no. This is a SWIOTLB-specific (or otherwise arch-private,
> depending on implementation) helper, not a DMA API function, and should
> not be called directly by drivers. There is no guarantee it will give
> the result you expect, or even compile, in all cases (if the kbuild
> robot hasn't already revealed this via the COMPILE_TEST dependency).
> 
> That said, AFAICS ctxr_phys ends up in a descriptor which is given to
> the device, so trying to use a physical address is wrong and it should
> still be the DMA address anyway.

I see. The cryptographic engine should always be given DMA addresses.
I'll rename the descriptor structure members from "phys" to "dma" as
well.
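
Something like this, I suppose, programming the dma_addr_t straight into
the descriptor (the control_data member names below are made up):

	cdesc->control_data.context_lo = lower_32_bits(ctxr_dma);
	cdesc->control_data.context_hi = upper_32_bits(ctxr_dma);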

> [...]
> > +static int safexcel_probe(struct platform_device *pdev)
> > +{
> > +   struct device *dev = &pdev->dev;
> > +   struct resource *res;
> > +   struct safexcel_crypto_priv *priv;
> > +   int i, ret;
> > +
> > +   priv = devm_kzalloc(dev, sizeof(struct safexcel_crypto_priv),
> > +                       GFP_KERNEL);
> > +   if (!priv)
> > +           return -ENOMEM;
> > +
> > +   priv->dev = dev;
> > +
> > +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +   priv->base = devm_ioremap_resource(dev, res);
> > +   if (IS_ERR(priv->base)) {
> > +           dev_err(dev, "failed to get resource\n");
> > +           return PTR_ERR(priv->base);
> > +   }
> > +
> > +   priv->clk = of_clk_get(dev->of_node, 0);
> > +   if (!IS_ERR(priv->clk)) {
> > +           ret = clk_prepare_enable(priv->clk);
> > +           if (ret) {
> > +                   dev_err(dev, "unable to enable clk (%d)\n", ret);
> > +                   return ret;
> > +           }
> > +   } else {
> > +           /* The clock isn't mandatory */
> > +           if (PTR_ERR(priv->clk) == -EPROBE_DEFER)
> > +                   return -EPROBE_DEFER;
> > +   }
> 
> You should call dma_set_mask_and_coherent() before any DMA API calls,
> both to confirm DMA really is going to work at all, and also (since this IP
> apparently supports >32-bit addresses) to describe the full inherent
> addressing capability, not least to avoid wasting time/space with
> unnecessary bounce buffering otherwise.

OK, I'll add a call to this helper before making any DMA API calls.
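
Presumably something like this, early in safexcel_probe() (assuming the
engine's >32-bit capability you mention, with a 32-bit fallback):

	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
	if (ret)
		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
	if (ret)
		return ret;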

> > +
> > +err_pool:
> > +   dma_pool_destroy(priv->context_pool);
> 
> You used dmam_pool_create(), so not only is an explicit destroy
> unnecessary, but doing so with the non-managed version is actively
> harmful, since devres would end up double-freeing the pool after you return.

I saw this and fixed it in my local version.

> > +struct safexcel_token {
> > +   u32 packet_length:17;
> > +   u8 stat:2;
> > +   u16 instructions:9;
> > +   u8 opcode:4;
> > +} __packed;
> 
> Using bitfields in hardware descriptors seems pretty risky, since you've
> no real guarantee two compilers will lay them out the same way (or that
> either would be correct WRT what the hardware expects). I'd be more
> inclined to construct all these fields with explicit masking and shifting.

Hmm, I'm not sure I follow you here. If I use bitfields in a __packed
structure, we should be safe. Why would two compilers lay out the same
packed structure differently?
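
Just to make sure I understand, you mean open-coding the fields like
this (shifts derived from the bitfield layout above, assuming LSB-first
allocation; the macro names are invented)?

#define EIP197_TOKEN_OPCODE(x)		(((x) & 0xf) << 28)
#define EIP197_TOKEN_INSTRUCTIONS(x)	(((x) & 0x1ff) << 19)
#define EIP197_TOKEN_STAT(x)		(((x) & 0x3) << 17)
#define EIP197_TOKEN_PACKET_LENGTH(x)	((x) & 0x1ffff)

	token = EIP197_TOKEN_OPCODE(opcode) |
		EIP197_TOKEN_INSTRUCTIONS(instructions) |
		EIP197_TOKEN_STAT(stat) |
		EIP197_TOKEN_PACKET_LENGTH(len);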

> > +static int safexcel_aes_send(struct crypto_async_request *async,
> > +                        int ring, struct safexcel_request *request,
> > +                        int *commands, int *results)
> > +{
> > +   struct ablkcipher_request *req = ablkcipher_request_cast(async);
> > +   struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
> > +   struct safexcel_crypto_priv *priv = ctx->priv;
> > +   struct safexcel_command_desc *cdesc;
> > +   struct safexcel_result_desc *rdesc;
> > +   struct scatterlist *sg;
> > +   phys_addr_t ctxr_phys;
> > +   int nr_src, nr_dst, n_cdesc = 0, n_rdesc = 0, queued = req->nbytes;
> > +   int i, ret = 0;
> > +
> > +   request->req = &req->base;
> > +
> > +   if (req->src == req->dst) {
> > +           nr_src = sg_nents_for_len(req->src, req->nbytes);
> > +           nr_dst = nr_src;
> > +
> > +           if (dma_map_sg(priv->dev, req->src, nr_src, DMA_BIDIRECTIONAL) <= 0)
> 
> Nit: you only need to check for zero/nonzero to determine failure
> (similarly below) - dma_map_sg() cannot return negative values.

OK.

> Bigger complaint: you should not ignore the successful return value and
> rely on it being equal to nr_src - please see the description of
> dma_map_sg() in Documentation/DMA-API.txt

OK, I'll have a look at it and fix this.
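
Something along these lines, I guess (sketch only, reusing nr_src to
hold the count dma_map_sg() actually returns):

	nents = sg_nents_for_len(req->src, req->nbytes);
	if (nents < 0)
		return nents;

	nr_src = dma_map_sg(priv->dev, req->src, nents, DMA_BIDIRECTIONAL);
	if (!nr_src)
		return -EINVAL;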

> > +   /* command descriptors */
> > +   for_each_sg(req->src, sg, nr_src, i) {
> > +           phys_addr_t sg_phys = dma_to_phys(priv->dev, sg_dma_address(sg));
> > +           int len = sg_dma_len(sg);
> 
> If dma_map_sg() coalesced any entries into the same mapping such that
> count < nents, these could well give bogus values toward the end of the
> list once i >= count (if you're lucky, at least len might be 0).

OK, I'll fix this.
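
So the loop would walk the mapped count from the sketch above instead of
the original nents, and program the DMA address in directly:

	for_each_sg(req->src, sg, nr_src, i) {
		dma_addr_t addr = sg_dma_address(sg);
		int len = sg_dma_len(sg);

		/* build one command descriptor per mapped segment,
		 * using addr (a dma_addr_t) as-is */
	}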

> > +   /* Add a command descriptor for the cached data, if any */
> > +   if (cache_len) {
> > +           ctx->base.cache_dma = dma_map_single(priv->dev, req->cache,
> 
> This is pretty dodgy, since req->cache is inside a live data structure,
> adjoining parts of which are updated whilst still mapped (the update of
> req->processed below). You just about get away without data corruption
> since we *probably* don't invalidate anything when unmapping
> DMA_TO_DEVICE, and the coherency hardware *probably* doesn't do anything
> crazy, but you've no real guarantee of that - any DMA buffer should
> really be separately kmalloced. "That's a nice dirty cache line you've
> got there, it'd be a shame if anything were to happen to it..."

OK, good to know.
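
I'll allocate the cache buffer separately then - something like this, I
imagine (with ctx->base.cache as a new, kmalloc'ed buffer; error paths
trimmed):

	ctx->base.cache = kmalloc(cache_len, GFP_KERNEL);
	if (!ctx->base.cache)
		return -ENOMEM;

	memcpy(ctx->base.cache, req->cache, cache_len);

	ctx->base.cache_dma = dma_map_single(priv->dev, ctx->base.cache,
					     cache_len, DMA_TO_DEVICE);
	if (dma_mapping_error(priv->dev, ctx->base.cache_dma)) {
		kfree(ctx->base.cache);
		return -ENOMEM;
	}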

> > +   rdr->base = dmam_alloc_coherent(priv->dev,
> > +                                   rdr->offset * EIP197_DEFAULT_RING_SIZE,
> > +                                   &rdr->base_dma, GFP_KERNEL);
> > +   if (!rdr->base) {
> > +           dmam_free_coherent(priv->dev,
> > +                              cdr->offset * EIP197_DEFAULT_RING_SIZE,
> > +                              cdr->base, cdr->base_dma);
> 
> Returning an error here will abort your probe routine, so devres will
> clean these up there and then - there's no need to do free anything
> explicitly. That's rather the point of using devm_*() to begin with.

Sure.

> > +   dmam_free_coherent(priv->dev,
> > +                      cdr->offset * EIP197_DEFAULT_RING_SIZE,
> > +                      cdr->base, cdr->base_dma);
> > +   dmam_free_coherent(priv->dev,
> > +                      rdr->offset * EIP197_DEFAULT_RING_SIZE,
> > +                      rdr->base, rdr->base_dma);
> > +}
> 
> Again, this is only called at device teardown, so the whole function is
> redundant.

OK, I'll remove these.

Thanks for the review!
Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
