On Mon, Sep 14, 2020 at 12:33:11PM +0200, Mateusz Holenko wrote:
> On Fri, Sep 11, 2020 at 2:57 AM Stafford Horne <sho...@gmail.com> wrote:
> >
> > On Wed, Aug 12, 2020 at 02:34:34PM +0200, Mateusz Holenko wrote:
> > > From: Pawel Czarnecki <pczarne...@internships.antmicro.com>
> > >
> > > This commit adds driver for the FPGA-based LiteX SoC
> > > Controller from LiteX SoC builder.
> > >
> > > Co-developed-by: Mateusz Holenko <mhole...@antmicro.com>
> > > Signed-off-by: Mateusz Holenko <mhole...@antmicro.com>
> > > Signed-off-by: Pawel Czarnecki <pczarne...@internships.antmicro.com>
> > > ---
> > > +     node = dev->of_node;
> > > +     if (!node)
> > > +             return -ENODEV;

We return here without BUG() if the setup fails.

> > > +
> > > +     soc_ctrl_dev = devm_kzalloc(dev, sizeof(*soc_ctrl_dev), GFP_KERNEL);
> > > +     if (!soc_ctrl_dev)
> > > +             return -ENOMEM;

We return here without BUG() if we are out of memory.

> > > +
> > > +     soc_ctrl_dev->base = devm_platform_ioremap_resource(pdev, 0);
> > > +     if (IS_ERR(soc_ctrl_dev->base))
> > > +             return PTR_ERR(soc_ctrl_dev->base);

Etc.

> > > +
> > > +     result = litex_check_csr_access(soc_ctrl_dev->base);
> > > +     if (result) {
> > > +             // LiteX CSRs access is broken which means that
> > > +             // none of LiteX drivers will most probably
> > > +             // operate correctly
> > The comment format here with // is not usually used in the kernel, but its 
> > not
> > forbidded.  Could you use the /* */ multiline style?
> 
> Sure, I'll change the commenting style here.
> 
> >
> > > +             BUG();
> > Instead of stopping the system with BUG, could we just do:
> >
> >         return litex_check_csr_access(soc_ctrl_dev->base);
> >
> > We already have failure for NODEV/NOMEM so might as well not call BUG() here
> > too.
> 
> It's true that litex_check_csr_accessors() already generates error
> codes that could be
> returned directly.
> The point of using BUG() macro here, however, is to stop booting the
> system so that it's visible
> (and impossible to miss for the user) that an unresolvable HW issue
> was encountered.
> 
> CSR-accessors - the litex_{g,s}et_reg() functions - are intended to be
> used by other LiteX drivers
> and it's very unlikely that those drivers would work properly after
> the fail of litex_check_csr_accessors().
> Since in such case the UART driver will be affected too (no boot logs
> and error messages visible to the user),
> I thought it'll be easier to spot and debug the problem if the system
> stopped in the BUG loop.
> Perhaps there are other, more linux-friendly, ways of achieving a
> similar goal - I'm open for suggestions.

I see your point, but I thought if failed with an exit status above, we could do
the same here.  But I guess failing here means that something is really wrong as
validation failed.

Some points:
 - If we return here, the system will still boot but there will be no UART
 - If we bail with BUG(), here the system stops, and there is no UART
 - Both cases the user can connect with a debugger and read "dmesg", to see what
   is wrong, but BUG() does not print an error message on all architectures.

We could also use:

 - WARN(1, "Failed to validate CSR registers, the system is probably broken.");

If you want to keep BUG() it may be fine.

I am not an expert on handling these type of bailout's so other input is
appreciated.

-Stafford

Reply via email to