On Tue, Sep 21, 2021 at 7:24 AM Ben Widawsky <[email protected]> wrote:
>
> On 21-09-14 12:31:22, Dan Williams wrote:
> > The kbuild robot reports:
> >
> >     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
> >     limit (1024) in function 'devm_cxl_add_decoder'
> >
> > It is also the case the devm_cxl_add_decoder() is unwieldy to use for
> > all the different decoder types. Fix the stack usage by splitting the
> > creation into alloc and add steps. This also allows for context
> > specific construction before adding.
> >
> > With the split the caller is responsible for registering a devm callback
> > to trigger device_unregister() for the decoder rather than it being
> > implicit in the decoder registration. I.e. the routine that calls alloc
> > is responsible for calling put_device() if the "add" operation fails.
> >
> > Reported-by: kernel test robot <[email protected]>
> > Reported-by: Nathan Chancellor <[email protected]>
> > Reported-by: Dan Carpenter <[email protected]>
> > Signed-off-by: Dan Williams <[email protected]>
>
> I have some comments inline. You can take them or leave them. Hopefully you 
> can
> pull in my patch to document these after too.
>
> Reviewed-by: Ben Widawsky <[email protected]>
>
> > ---
> > Changes since v4:
> > - hold the device lock over the list_empty(&port->dports) check
> >   (Jonathan)
> > - move the list_empty() check after the check for NULL @target_map in
> >   anticipation of endpoint decoders (Ben)
> >
> >  drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++---------
> >  drivers/cxl/core/bus.c  |  123 
> > +++++++++++++++--------------------------------
> >  drivers/cxl/core/core.h |    5 --
> >  drivers/cxl/core/pmem.c |    7 ++-
> >  drivers/cxl/cxl.h       |   15 ++----
> >  5 files changed, 110 insertions(+), 124 deletions(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index d39cc797a64e..2368a8b67698 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> >       struct cxl_decoder *cxld;
> >       acpi_size len, cur = 0;
> >       void *cedt_subtable;
> > -     unsigned long flags;
> >       int rc;
> >
> >       len = acpi_cedt->length - sizeof(*acpi_cedt);
> > @@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> >               for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
> >                       target_map[i] = cfmws->interleave_targets[i];
> >
> > -             flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > -             cxld = devm_cxl_add_decoder(dev, root_port,
> > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > -                                         cfmws->base_hpa, 
> > cfmws->window_size,
> > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > -                                         
> > CFMWS_INTERLEAVE_GRANULARITY(cfmws),
> > -                                         CXL_DECODER_EXPANDER,
> > -                                         flags, target_map);
> > -
> > -             if (IS_ERR(cxld)) {
> > +             cxld = cxl_decoder_alloc(root_port,
> > +                                      CFMWS_INTERLEAVE_WAYS(cfmws));
> > +             if (IS_ERR(cxld))
> > +                     goto next;
> > +
> > +             cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > +             cxld->target_type = CXL_DECODER_EXPANDER;
> > +             cxld->range = (struct range) {
> > +                     .start = cfmws->base_hpa,
> > +                     .end = cfmws->base_hpa + cfmws->window_size - 1,
> > +             };
> > +             cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> > +             cxld->interleave_granularity =
> > +                     CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> > +
> > +             rc = cxl_decoder_add(cxld, target_map);
> > +             if (rc)
> > +                     put_device(&cxld->dev);
> > +             else
> > +                     rc = cxl_decoder_autoremove(dev, cxld);
>
> For posterity, I'll say I don't love this interface overall, but I don't have 
> a
> better suggestion.
>
> alloc()
> open coded configuration
> add()
> open coded autoremove
>
> I understand some of the background on moving the responsibility of the devm
> callback to the actual aller, it just ends up a fairly weird interface now 
> since
> all 4 steps are needed to actually create a decoder for consumption by the
> driver.
>
> I'd request a new function to configure the decoder before adding except I 
> don't
> think it's worth doing that either.

Certainly this approach was taken for practical reasons, not elegance.
I too don't see how to clean this up further. It's either make the
caller responsible for all steps, or have monster functions for all
the possible ways a decoder might be configured and tempt the stack
window warnings again.

>
> > +             if (rc) {
> >                       dev_err(dev, "Failed to add decoder for 
> > %#llx-%#llx\n",
> >                               cfmws->base_hpa, cfmws->base_hpa +
> >                               cfmws->window_size - 1);
>
> Do you think it makes sense to explain to the user what the consequence is of
> this?

In the log message? No, I think that would be too pedantic.

>
> > -             } else {
> > -                     dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > -                             dev_name(&cxld->dev), cfmws->base_hpa,
> > -                              cfmws->base_hpa + cfmws->window_size - 1);
> > +                     goto next;
> >               }
> > +             dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > +                     dev_name(&cxld->dev), cfmws->base_hpa,
> > +                     cfmws->base_hpa + cfmws->window_size - 1);
> > +next:
> >               cur += c->length;
> >       }
> >  }
> > @@ -266,6 +277,7 @@ static int add_host_bridge_uport(struct device *match, 
> > void *arg)
> >       struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> >       struct acpi_pci_root *pci_root;
> >       struct cxl_walk_context ctx;
> > +     int single_port_map[1], rc;
> >       struct cxl_decoder *cxld;
> >       struct cxl_dport *dport;
> >       struct cxl_port *port;
> > @@ -301,22 +313,46 @@ static int add_host_bridge_uport(struct device 
> > *match, void *arg)
> >               return -ENODEV;
> >       if (ctx.error)
> >               return ctx.error;
> > +     if (ctx.count > 1)
> > +             return 0;
> >
> >       /* TODO: Scan CHBCR for HDM Decoder resources */
> >
> >       /*
> > -      * In the single-port host-bridge case there are no HDM decoders
> > -      * in the CHBCR and a 1:1 passthrough decode is implied.
> > +      * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> > +      * Structure) single ported host-bridges need not publish a decoder
> > +      * capability when a passthrough decode can be assumed, i.e. all
> > +      * transactions that the uport sees are claimed and passed to the 
> > single
> > +      * dport. Default the range a 0-base 0-length until the first CXL 
> > region
> > +      * is activated.
> >        */
> > -     if (ctx.count == 1) {
> > -             cxld = devm_cxl_add_passthrough_decoder(host, port);
> > -             if (IS_ERR(cxld))
> > -                     return PTR_ERR(cxld);
> > +     cxld = cxl_decoder_alloc(port, 1);
> > +     if (IS_ERR(cxld))
> > +             return PTR_ERR(cxld);
> > +
> > +     cxld->interleave_ways = 1;
> > +     cxld->interleave_granularity = PAGE_SIZE;
> > +     cxld->target_type = CXL_DECODER_EXPANDER;
> > +     cxld->range = (struct range) {
> > +             .start = 0,
> > +             .end = -1,
> > +     };
> >
> > -             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > -     }
> > +     device_lock(&port->dev);
> > +     dport = list_first_entry(&port->dports, typeof(*dport), list);
> > +     device_unlock(&port->dev);
> >
> > -     return 0;
> > +     single_port_map[0] = dport->port_id;
> > +
> > +     rc = cxl_decoder_add(cxld, single_port_map);
> > +     if (rc)
> > +             put_device(&cxld->dev);
> > +     else
> > +             rc = cxl_decoder_autoremove(host, cxld);
> > +
> > +     if (rc == 0)
> > +             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > +     return rc;
> >  }
> >
> >  static int add_host_bridge_dport(struct device *match, void *arg)
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 6dfdeaf999f0..396252749477 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -453,10 +453,8 @@ int cxl_add_dport(struct cxl_port *port, struct device 
> > *dport_dev, int port_id,
> >  }
> >  EXPORT_SYMBOL_GPL(cxl_add_dport);
> >
> > -static int decoder_populate_targets(struct device *host,
> > -                                 struct cxl_decoder *cxld,
> > -                                 struct cxl_port *port, int *target_map,
> > -                                 int nr_targets)
> > +static int decoder_populate_targets(struct cxl_decoder *cxld,
> > +                                 struct cxl_port *port, int *target_map)
> >  {
> >       int rc = 0, i;
> >
> > @@ -464,42 +462,36 @@ static int decoder_populate_targets(struct device 
> > *host,
> >               return 0;
> >
> >       device_lock(&port->dev);
> > -     for (i = 0; i < nr_targets; i++) {
> > +     if (list_empty(&port->dports)) {
> > +             rc = -EINVAL;
> > +             goto out_unlock;
> > +     }
>
> Forewarning, I think I'm still going to need to modify this check for 
> endpoints.

I should have expanded the amount of context in the diff. That "return
0;" above is from the !target_map check which should be true for
endpoints, that was one of your earlier feedback items that I heeded.
So I don't think you'll trip over this.

>
> > +
> > +     for (i = 0; i < cxld->nr_targets; i++) {
> >               struct cxl_dport *dport = find_dport(port, target_map[i]);
> >
> >               if (!dport) {
> >                       rc = -ENXIO;
> > -                     break;
> > +                     goto out_unlock;
> >               }
> > -             dev_dbg(host, "%s: target: %d\n", dev_name(dport->dport), i);
> >               cxld->target[i] = dport;
> >       }
> > +
> > +out_unlock:
> >       device_unlock(&port->dev);
> >
> >       return rc;
> >  }
> >
> > -static struct cxl_decoder *
> > -cxl_decoder_alloc(struct device *host, struct cxl_port *port, int 
> > nr_targets,
> > -               resource_size_t base, resource_size_t len,
> > -               int interleave_ways, int interleave_granularity,
> > -               enum cxl_decoder_type type, unsigned long flags,
> > -               int *target_map)
> > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int 
> > nr_targets)
> >  {
> >       struct cxl_decoder *cxld;
> >       struct device *dev;
> >       int rc = 0;
> >
> > -     if (interleave_ways < 1)
> > +     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> >               return ERR_PTR(-EINVAL);
> >
> > -     device_lock(&port->dev);
> > -     if (list_empty(&port->dports))
> > -             rc = -EINVAL;
> > -     device_unlock(&port->dev);
> > -     if (rc)
> > -             return ERR_PTR(rc);
> > -
> >       cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> >       if (!cxld)
> >               return ERR_PTR(-ENOMEM);
> > @@ -508,22 +500,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port 
> > *port, int nr_targets,
> >       if (rc < 0)
> >               goto err;
> >
> > -     *cxld = (struct cxl_decoder) {
> > -             .id = rc,
> > -             .range = {
> > -                     .start = base,
> > -                     .end = base + len - 1,
> > -             },
> > -             .flags = flags,
> > -             .interleave_ways = interleave_ways,
> > -             .interleave_granularity = interleave_granularity,
> > -             .target_type = type,
> > -     };
> > -
> > -     rc = decoder_populate_targets(host, cxld, port, target_map, 
> > nr_targets);
> > -     if (rc)
> > -             goto err;
> > -
> > +     cxld->id = rc;
> > +     cxld->nr_targets = nr_targets;
>
> Would be really nice if cxld->nr_targets could be const...
>

Sure, conversion is a little messy, but not too bad.

> >       dev = &cxld->dev;
> >       device_initialize(dev);
> >       device_set_pm_not_required(dev);
> > @@ -541,72 +519,47 @@ cxl_decoder_alloc(struct device *host, struct 
> > cxl_port *port, int nr_targets,
> >       kfree(cxld);
> >       return ERR_PTR(rc);
> >  }
> > +EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
> >
> > -struct cxl_decoder *
> > -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int 
> > nr_targets,
> > -                  resource_size_t base, resource_size_t len,
> > -                  int interleave_ways, int interleave_granularity,
> > -                  enum cxl_decoder_type type, unsigned long flags,
> > -                  int *target_map)
> > +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> >  {
> > -     struct cxl_decoder *cxld;
> > +     struct cxl_port *port;
> >       struct device *dev;
> >       int rc;
> >
> > -     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> > -             return ERR_PTR(-EINVAL);
> > +     if (!cxld)
> > +             return -EINVAL;
>
> I don't mind, but I think calling this with !cxld is a driver bug, right?
> Perhaps upgrade to WARN_ONCE?

Sure.

>
> >
> > -     cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
> > -                              interleave_ways, interleave_granularity, 
> > type,
> > -                              flags, target_map);
> >       if (IS_ERR(cxld))
> > -             return cxld;
> > +             return PTR_ERR(cxld);
>
> Same as above.

Ok.

>
> >
> > -     dev = &cxld->dev;
> > -     rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> > -     if (rc)
> > -             goto err;
> > +     if (cxld->interleave_ways < 1)
> > +             return -EINVAL;
> >
> > -     rc = device_add(dev);
> > +     port = to_cxl_port(cxld->dev.parent);
> > +     rc = decoder_populate_targets(cxld, port, target_map);
> >       if (rc)
> > -             goto err;
> > +             return rc;
> >
> > -     rc = devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> > +     dev = &cxld->dev;
> > +     rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> >       if (rc)
> > -             return ERR_PTR(rc);
> > -     return cxld;
> > +             return rc;
> >
> > -err:
> > -     put_device(dev);
> > -     return ERR_PTR(rc);
> > +     return device_add(dev);
> >  }
> > -EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
> > +EXPORT_SYMBOL_GPL(cxl_decoder_add);
> >
> > -/*
> > - * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability 
> > Structure)
> > - * single ported host-bridges need not publish a decoder capability when a
> > - * passthrough decode can be assumed, i.e. all transactions that the uport 
> > sees
> > - * are claimed and passed to the single dport. Default the range a 0-base
> > - * 0-length until the first CXL region is activated.
> > - */
> > -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> > -                                                  struct cxl_port *port)
> > +static void cxld_unregister(void *dev)
> >  {
> > -     struct cxl_dport *dport;
> > -     int target_map[1];
> > -
> > -     device_lock(&port->dev);
> > -     dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
> > -     device_unlock(&port->dev);
> > -
> > -     if (!dport)
> > -             return ERR_PTR(-ENXIO);
> > +     device_unregister(dev);
> > +}
> >
> > -     target_map[0] = dport->port_id;
> > -     return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
> > -                                 CXL_DECODER_EXPANDER, 0, target_map);
> > +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
> > +{
> > +     return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev);
> >  }
> > -EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
> > +EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
> >
> >  /**
> >   * __cxl_driver_register - register a driver for the cxl bus
> > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > index c85b7fbad02d..e0c9aacc4e9c 100644
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -9,11 +9,6 @@ extern const struct device_type cxl_nvdimm_type;
> >
> >  extern struct attribute_group cxl_base_attribute_group;
> >
> > -static inline void unregister_cxl_dev(void *dev)
> > -{
> > -     device_unregister(dev);
> > -}
> > -
> >  struct cxl_send_command;
> >  struct cxl_mem_query_commands;
> >  int cxl_query_cmd(struct cxl_memdev *cxlmd,
> > diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> > index 74be5132df1c..5032f4c1c69d 100644
> > --- a/drivers/cxl/core/pmem.c
> > +++ b/drivers/cxl/core/pmem.c
> > @@ -222,6 +222,11 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct 
> > cxl_memdev *cxlmd)
> >       return cxl_nvd;
> >  }
> >
> > +static void cxl_nvd_unregister(void *dev)
> > +{
> > +     device_unregister(dev);
> > +}
> > +
> >  /**
> >   * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
> >   * @host: same host as @cxlmd
> > @@ -251,7 +256,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct 
> > cxl_memdev *cxlmd)
> >       dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
> >               dev_name(dev));
> >
> > -     return devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> > +     return devm_add_action_or_reset(host, cxl_nvd_unregister, dev);
> >
> >  err:
> >       put_device(dev);
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 9af5745ba2c0..7d6b011dd963 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -195,6 +195,7 @@ enum cxl_decoder_type {
> >   * @interleave_granularity: data stride per dport
> >   * @target_type: accelerator vs expander (type2 vs type3) selector
> >   * @flags: memory type capabilities and locking
> > + * @nr_targets: number of elements in @target
> >   * @target: active ordered target list in current decoder configuration
> >   */
> >  struct cxl_decoder {
> > @@ -205,6 +206,7 @@ struct cxl_decoder {
> >       int interleave_granularity;
> >       enum cxl_decoder_type target_type;
> >       unsigned long flags;
> > +     int nr_targets;
> >       struct cxl_dport *target[];
> >  };
> >
> > @@ -286,15 +288,10 @@ int cxl_add_dport(struct cxl_port *port, struct 
> > device *dport, int port_id,
> >
> >  struct cxl_decoder *to_cxl_decoder(struct device *dev);
> >  bool is_root_decoder(struct device *dev);
> > -struct cxl_decoder *
> > -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int 
> > nr_targets,
> > -                  resource_size_t base, resource_size_t len,
> > -                  int interleave_ways, int interleave_granularity,
> > -                  enum cxl_decoder_type type, unsigned long flags,
> > -                  int *target_map);
> > -
> > -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> > -                                                  struct cxl_port *port);
> > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int 
> > nr_targets);
> > +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> > +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> > +
> >  extern struct bus_type cxl_bus_type;
> >
> >  struct cxl_driver {
> >

Reply via email to