Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-04 Thread Saravana Kannan
On Wed, Mar 3, 2021 at 2:21 AM Michael Walle  wrote:
>
> Am 2021-03-03 10:28, schrieb Saravana Kannan:
> > On Wed, Mar 3, 2021 at 12:59 AM Michael Walle  wrote:
> >>
> >> Am 2021-03-02 23:47, schrieb Saravana Kannan:
> >> > On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan 
> >> > wrote:
> >> >>
> >> >> On Tue, Mar 2, 2021 at 2:24 PM Michael Walle  wrote:
> >> >> >
> >> >> > Am 2021-03-02 22:11, schrieb Saravana Kannan:
> >> >> > > I think Patch 1 should fix [4] without [5]. Can you test the series
> >> >> > > please?
> >> >> >
> >> >> > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 
> >> >> > 3/3
> >> >> > and
> >> >> > reverted commit 7007b745a508 ("PCI: layerscape: Convert to
> >> >> > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be 
> >> >> > working,
> >> >> > right? But it is. Did I miss something?
> >> >>
> >> >> You need to revert [5].
> >> >
> >> > My bad. You did revert it. Ah... I wonder if it was due to
> >> > fw_devlink.strict that I added. To break PCI again, also set
> >> > fw_devlink.strict=1 in the kernel command line.
> >>
> >> Indeed, adding fw_devlink.strict=1 will break PCI again. But if
> >> I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear:
> >> I'm keeping the fw_devlink.strict=1 parameter.
> >
> > Thanks for your testing! I assume you are also setting fw_devlink=on?
>
> I've applied patch 3/3 and added nothing to the commandline, so yes.
>
> > Hmmm... ok. In the working case, does your PCI probe before IOMMU? If
> > yes, then your results make sense.
>
> Yes that was the conclusion last time. That the probe is deferred and
> the __init section is already discarded when there might a second
> try of the probe.

Long response below, but the TL;DR is:
The real fix for your case was the implementation of fw_devlink.strict
and NOT Patch 1 of this series. So, sorry for wasting your test
effort.

During the earlier debugging (for take I), this is what I thought:

With fw_devlink=permissive, your boot sequence was (Case 1):
1. IOMMU probe
2. PCI builtin_platform_driver_probe() attempt
- Driver core sets up PCI with IOMMU
- PCI probe succeeds.
- PCI works with IOMMU. < Remember this point.

And with fw_devlink=on, I thought the IOMMU probe order was
unnecessarily changed and caused this (Case 2):
1. IOMMU probe reordered for some reason to be attempted before its
suppliers. Gets deferred.
2. PCI probe attempt
- fw_devlink + device links defers the probe because IOMMU isn't ready.
- builtin_platform_driver_probe() replaces drv->probe with
platform_probe_fail()
3. IOMMU deferred probe succeeds eventually.
4. PCI deferred probe is attempted
- platform_probe_fail() which is a stub just returns -ENXIO

And if this was the case, patch 1 in this series would have fixed it
by removing unnecessary reordering of probes.

But what was really happening was (after I went through your logs
again and looked at the code):
With fw_devlink=permissive, your boot sequence was really (Case 3):
1. PCI builtin_platform_driver_probe() attempt
- Driver core does NOT set up PCI with IOMMU
- PCI probe succeeds.
- PCI works without IOMMU. < Remember this point.
2. IOMMU probes

And with fw_devlink=on what was happening was (Case 4):
1. PCI builtin_platform_driver_probe() attempt
- fw_devlink + device links defers the probe because it thinks
IOMMU is mandatory and isn't ready.
- builtin_platform_driver_probe() replaces drv->probe with
platform_probe_fail()
2. IOMMU probes.
3. PCI deferred probe is attempted
- platform_probe_fail() which is a stub just returns -ENXIO
4. PCI is broken now.

In your case IOMMU is not mandatory and PCI works without IOMMU even
when fw_devlink=off/permissive. So the real fix for your case is the
addition of fw_devlink.strict and defaulting it to 0. Because of my
misunderstanding of your case, I didn't realize I already fixed your
case and I thought Patch 1 in this series would fix your case.

Patch 1 in this series is still important for other reasons, just not for you.

> So I guess, Patch 1/3 and Patch 2/3 doesn't fix that and the drivers
> still need to be converted to builtin_platform_driver(), right?

So there is no real issue between fw_devlink=on and
builtin_platform_driver_probe() anymore. At least none that I know of
or has been reported.

If you really want your PCI to work _with_ IOMMU, then
builtin_platform_driver_probe() is wrong even with fw_devlink=off. And
if you wanted PCI to work with IOMMU support and fw_devlink wasn't
available, you'll have to play initcall chicken with the IOMMU driver
or implement some IOMMU check + deferred probing in your PCI probe
function.

However, with fw_devlink=on, all you have to do is set fw_devlink=on
and fw_devlink.strict=1 and use builtin_platform_driver() and not have
to care about initcall orders or figure out how to defer when IOMMU
isn't ready yet.

-Saravana


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-03 Thread Saravana Kannan
On Wed, Mar 3, 2021 at 2:03 AM Geert Uytterhoeven  wrote:
>
> Hi Saravana,
>
> On Wed, Mar 3, 2021 at 10:24 AM Saravana Kannan  wrote:
> > On Wed, Mar 3, 2021 at 1:22 AM Geert Uytterhoeven  
> > wrote:
> > > On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan  
> > > wrote:
> > > > This series fixes the last few remaining issues reported when 
> > > > fw_devlink=on
> > > > by default.
> > >
> > > [...]
> > >
> > > Thanks for your series!
> > >
> > > > Geert/Marek,
> > > >
> > > > As far as I know, there shouldn't have any more issues you reported that
> > > > are still left unfixed after this series. Please correct me if I'm 
> > > > wrong or
> > > > if you find new issues.
> > >
> > > While this fixes the core support, there may still be driver fixes left
> > > that were not developed in time for the v5.12-rc1 merge window.
> > > Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode
> > > when PM domain is added", which I have queued for v5.13[1].
> > > There may be other fixes for other platforms.
> >
> > Right, I intended this series for 5.13. Is that what you are trying to say 
> > too?
>
> OK, v5.13 is fine for me.
> It wasn't clear to me if you intended (the last patch of) this series to
> be merged for v5.12-rcX or v5.13.

The entire series is meant for 5.13.

I don't want to land the Patch 1/3 in 5.12 in case it causes some
regression. And 2/3 isn't urgent.

-Saravana

>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-03 Thread Michael Walle

Am 2021-03-03 10:28, schrieb Saravana Kannan:

On Wed, Mar 3, 2021 at 12:59 AM Michael Walle  wrote:


Am 2021-03-02 23:47, schrieb Saravana Kannan:
> On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan 
> wrote:
>>
>> On Tue, Mar 2, 2021 at 2:24 PM Michael Walle  wrote:
>> >
>> > Am 2021-03-02 22:11, schrieb Saravana Kannan:
>> > > I think Patch 1 should fix [4] without [5]. Can you test the series
>> > > please?
>> >
>> > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3
>> > and
>> > reverted commit 7007b745a508 ("PCI: layerscape: Convert to
>> > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working,
>> > right? But it is. Did I miss something?
>>
>> You need to revert [5].
>
> My bad. You did revert it. Ah... I wonder if it was due to
> fw_devlink.strict that I added. To break PCI again, also set
> fw_devlink.strict=1 in the kernel command line.

Indeed, adding fw_devlink.strict=1 will break PCI again. But if
I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear:
I'm keeping the fw_devlink.strict=1 parameter.


Thanks for your testing! I assume you are also setting fw_devlink=on?


I've applied patch 3/3 and added nothing to the commandline, so yes.


Hmmm... ok. In the working case, does your PCI probe before IOMMU? If
yes, then your results make sense.


Yes that was the conclusion last time. That the probe is deferred and
the __init section is already discarded when there might a second
try of the probe.

So I guess, Patch 1/3 and Patch 2/3 doesn't fix that and the drivers
still need to be converted to builtin_platform_driver(), right?


If your PCI does probe after IOMMU and uses IOMMU, then I'm not sure
what else could be changing the order of the device probing. In any
case, glad that the default case works and we have a fix merged even
for .strict=1.

-Saravana


-michael


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-03 Thread Geert Uytterhoeven
Hi Saravana,

On Wed, Mar 3, 2021 at 10:24 AM Saravana Kannan  wrote:
> On Wed, Mar 3, 2021 at 1:22 AM Geert Uytterhoeven  
> wrote:
> > On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan  
> > wrote:
> > > This series fixes the last few remaining issues reported when 
> > > fw_devlink=on
> > > by default.
> >
> > [...]
> >
> > Thanks for your series!
> >
> > > Geert/Marek,
> > >
> > > As far as I know, there shouldn't have any more issues you reported that
> > > are still left unfixed after this series. Please correct me if I'm wrong 
> > > or
> > > if you find new issues.
> >
> > While this fixes the core support, there may still be driver fixes left
> > that were not developed in time for the v5.12-rc1 merge window.
> > Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode
> > when PM domain is added", which I have queued for v5.13[1].
> > There may be other fixes for other platforms.
>
> Right, I intended this series for 5.13. Is that what you are trying to say 
> too?

OK, v5.13 is fine for me.
It wasn't clear to me if you intended (the last patch of) this series to
be merged for v5.12-rcX or v5.13.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-03 Thread Saravana Kannan
On Wed, Mar 3, 2021 at 1:22 AM Geert Uytterhoeven  wrote:
>
> Hi Saravana,
>
> On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan  wrote:
> > This series fixes the last few remaining issues reported when fw_devlink=on
> > by default.
>
> [...]
>
> Thanks for your series!
>
> > Geert/Marek,
> >
> > As far as I know, there shouldn't have any more issues you reported that
> > are still left unfixed after this series. Please correct me if I'm wrong or
> > if you find new issues.
>
> While this fixes the core support, there may still be driver fixes left
> that were not developed in time for the v5.12-rc1 merge window.
> Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode
> when PM domain is added", which I have queued for v5.13[1].
> There may be other fixes for other platforms.

Right, I intended this series for 5.13. Is that what you are trying to say too?

-Saravana


>
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel.git/commit/?h=renesas-drivers-for-v5.13&id=fb13bbd6c90ee4fb983c0e9a341bd2832a3857cf
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-03 Thread Michael Walle

Am 2021-03-02 23:47, schrieb Saravana Kannan:
On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan  
wrote:


On Tue, Mar 2, 2021 at 2:24 PM Michael Walle  wrote:
>
> Am 2021-03-02 22:11, schrieb Saravana Kannan:
> > I think Patch 1 should fix [4] without [5]. Can you test the series
> > please?
>
> Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3
> and
> reverted commit 7007b745a508 ("PCI: layerscape: Convert to
> builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working,
> right? But it is. Did I miss something?

You need to revert [5].


My bad. You did revert it. Ah... I wonder if it was due to
fw_devlink.strict that I added. To break PCI again, also set
fw_devlink.strict=1 in the kernel command line.


Indeed, adding fw_devlink.strict=1 will break PCI again. But if
I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear:
I'm keeping the fw_devlink.strict=1 parameter.

-michael


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-03 Thread Geert Uytterhoeven
Hi Saravana,

On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan  wrote:
> This series fixes the last few remaining issues reported when fw_devlink=on
> by default.

[...]

Thanks for your series!

> Geert/Marek,
>
> As far as I know, there shouldn't have any more issues you reported that
> are still left unfixed after this series. Please correct me if I'm wrong or
> if you find new issues.

While this fixes the core support, there may still be driver fixes left
that were not developed in time for the v5.12-rc1 merge window.
Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode
when PM domain is added", which I have queued for v5.13[1].
There may be other fixes for other platforms.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel.git/commit/?h=renesas-drivers-for-v5.13&id=fb13bbd6c90ee4fb983c0e9a341bd2832a3857cf

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-03 Thread Saravana Kannan
On Wed, Mar 3, 2021 at 12:59 AM Michael Walle  wrote:
>
> Am 2021-03-02 23:47, schrieb Saravana Kannan:
> > On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan 
> > wrote:
> >>
> >> On Tue, Mar 2, 2021 at 2:24 PM Michael Walle  wrote:
> >> >
> >> > Am 2021-03-02 22:11, schrieb Saravana Kannan:
> >> > > I think Patch 1 should fix [4] without [5]. Can you test the series
> >> > > please?
> >> >
> >> > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3
> >> > and
> >> > reverted commit 7007b745a508 ("PCI: layerscape: Convert to
> >> > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working,
> >> > right? But it is. Did I miss something?
> >>
> >> You need to revert [5].
> >
> > My bad. You did revert it. Ah... I wonder if it was due to
> > fw_devlink.strict that I added. To break PCI again, also set
> > fw_devlink.strict=1 in the kernel command line.
>
> Indeed, adding fw_devlink.strict=1 will break PCI again. But if
> I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear:
> I'm keeping the fw_devlink.strict=1 parameter.

Thanks for your testing! I assume you are also setting fw_devlink=on?

Hmmm... ok. In the working case, does your PCI probe before IOMMU? If
yes, then your results make sense.

If your PCI does probe after IOMMU and uses IOMMU, then I'm not sure
what else could be changing the order of the device probing. In any
case, glad that the default case works and we have a fix merged even
for .strict=1.

-Saravana


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-02 Thread Saravana Kannan
On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan  wrote:
>
> On Tue, Mar 2, 2021 at 2:24 PM Michael Walle  wrote:
> >
> > Am 2021-03-02 22:11, schrieb Saravana Kannan:
> > > I think Patch 1 should fix [4] without [5]. Can you test the series
> > > please?
> >
> > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3
> > and
> > reverted commit 7007b745a508 ("PCI: layerscape: Convert to
> > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working,
> > right? But it is. Did I miss something?
>
> You need to revert [5].

My bad. You did revert it. Ah... I wonder if it was due to
fw_devlink.strict that I added. To break PCI again, also set
fw_devlink.strict=1 in the kernel command line.

-Saravana


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-02 Thread Saravana Kannan
On Tue, Mar 2, 2021 at 2:24 PM Michael Walle  wrote:
>
> Am 2021-03-02 22:11, schrieb Saravana Kannan:
> > I think Patch 1 should fix [4] without [5]. Can you test the series
> > please?
>
> Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3
> and
> reverted commit 7007b745a508 ("PCI: layerscape: Convert to
> builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working,
> right? But it is. Did I miss something?

You need to revert [5].

-Saravana

>
> Anyway, I've also applied Patch 1/3 and 2/3 and it still works. But I
> guess that doesn't say much.
>
> -michael


Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-02 Thread Michael Walle

Am 2021-03-02 22:11, schrieb Saravana Kannan:
I think Patch 1 should fix [4] without [5]. Can you test the series 
please?


Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3 
and

reverted commit 7007b745a508 ("PCI: layerscape: Convert to
builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working,
right? But it is. Did I miss something?

Anyway, I've also applied Patch 1/3 and 2/3 and it still works. But I
guess that doesn't say much.

-michael


[PATCH v1 0/3] driver core: Set fw_devlink=on take II

2021-03-02 Thread Saravana Kannan
This series fixes the last few remaining issues reported when fw_devlink=on
by default.

Patch 1 is just [6] pulled in without changes into this series. It reduces
some unnecessary probe reordering caused by a combination of fw_devlink and
existing device link code. This fixes some issue caused by fw_devlink=on
with respect to DMAs and IOMMUs [1].

Patch 2 fixes a warning [2] present in code unrelated to fw_devlink. It was
just exposed by fw_devlink.

Jon,

Patch 2 should address the issues you reported[2] even without [3]. Could
you test this series please?

Michael,

I think Patch 1 should fix [4] without [5]. Can you test the series please?

Geert/Marek,

As far as I know, there shouldn't have any more issues you reported that
are still left unfixed after this series. Please correct me if I'm wrong or
if you find new issues.

[1] - 
https://lore.kernel.org/lkml/camuhmduvvr8jes51_8_ypoicr-nwad_2nklyukwey8mbxx9...@mail.gmail.com/
[2] - 
https://lore.kernel.org/lkml/56f7d032-ba5a-a8c7-23de-2969d98c5...@nvidia.com/
[3] - 
https://lore.kernel.org/lkml/5176f496-facb-d7b0-9f4e-a9e4b8974...@nvidia.com/
[4] - https://lore.kernel.org/lkml/4b9ae679b6f76d2f7e340e2ec229d...@walle.cc/
[5] - https://lore.kernel.org/lkml/20210120105246.23218-1-mich...@walle.cc/
[6] - 
https://lore.kernel.org/lkml/20210217235130.1744843-1-sarava...@google.com/

Cc: Michael Walle 
Cc: Jon Hunter 
Cc: Marek Szyprowski 
Cc: Geert Uytterhoeven 
Cc: Guenter Roeck 

Thanks,
Saravana

Saravana Kannan (3):
  driver core: Avoid pointless deferred probe attempts
  driver core: Update device link status properly for
device_bind_driver()
  Revert "Revert "driver core: Set fw_devlink=on by default""

 drivers/base/base.h|  1 +
 drivers/base/core.c| 37 -
 drivers/base/dd.c  | 10 +-
 include/linux/device.h |  4 
 4 files changed, 50 insertions(+), 2 deletions(-)

-- 
2.30.1.766.gb4fecdf3b7-goog