On Wed, Aug 21, 2019 at 08:42:24PM +0300, Alexander Amelkin wrote: > 21.08.2019 19:32, Guenter Roeck wrote: > > On Wed, Aug 21, 2019 at 06:57:43PM +0300, Ivan Mikhaylov wrote: > >> Set WDT_CLEAR_TIMEOUT_AND_BOOT_CODE_SELECTION into WDT_CLEAR_TIMEOUT_STATUS > >> to clear out boot code source and re-enable access to the primary SPI flash > >> chip while booted via wdt2 from the alternate chip. > >> > >> AST2400 datasheet says: > >> "In the 2nd flash booting mode, all the address mapping to CS0# would be > >> re-directed to CS1#. And CS0# is not accessable under this mode. To access > >> CS0#, firmware should clear the 2nd boot mode register in the WDT2 status > >> register WDT30.bit[1]." > > Is there reason to not do this automatically when loading the module > > in alt-boot mode ? What means does userspace have to determine if CS0 > > or CS1 is active at any given time ? If there is reason to ever have CS1 > > active instead of CS0, what means would userspace have to enable it ? > > Yes, there is. The driver is loaded long before the filesystems are mounted. > The filesystems, in the event of alternate/recovery boot, need to be mounted > from the same chip that the kernel was booted. For one reason because the > main chip at CS0 is most probably corrupt. If you clear that bit when driver > is loaded, your software will not know that and will try to mount the wrong > filesystems. The whole idea of ASPEED's switching chipselects is to have > identical firmware in both chips, without the need to process the alternate > boot state in any way except for indicating a successful boot and restoring > access to CS0 when needed. > > The userspace can read bootstatus sysfs node to determine if an alternate > boot has occured. > > With ASPEED, CS1 is activated automatically by wdt2 when system fails to boot > from the primary flash chip (at CS0) and disable the watchdog to indicate a > successful boot. When that happens, both CS0 and CS1 controls get routed in > hardware to CS1 line, making the primary flash chip inaccessible. Depending > on the architecture of the user-space software, it may choose to re-enable > access to the primary chip via CS0 at different times. There must be a way to > do so. > So by activating cs0, userspace would essentially pull its own root file system from underneath itself ?
> > If userspace can not really determine if CS1 or CS0 is active, all it could > > ever do was to enable CS0 to be in a deterministic state. If so, it doesn't > > make sense to ever have CS1 active, and re-enabling CS0 could be automatic. > > > > Similar, if CS1 can ever be enabled, there is no means for userspace to > > ensure > > that some other application did not re-enable CS0 while it believes that CS1 > > is enabled. If there is no means for userspace to enable CS1, it can never > > be > > sure what is enabled (because some other entity may have enabled CS0 while > > userspace just thought that CS1 is still enabled). Again, the only means > > to guarantee a well defined state would be to explicitly enable CS0 and > > provive > > no means to enable CS1. Again, this could be done during boot, not requiring > > an explicit request from userspace. > > Please understand that activation of CS1 in place of CS0 is NOT a software > choice! > > > >> + if (unlikely(!wdt)) > >> + return -ENODEV; > >> + > > How would this ever happen, and how / where is drvdata set to NULL ? > > This is purely for robustness. Seeing a pointer obtained via a function > accessed without first checking it for validity makes me nervous. > This is not how kernel code is commonly written. Sure, we could add similar checks to each sysfs access code in the kernel, blowing up its size significantly. I do not see a point of this. > This code most probably adds nothing at the assembly level. > That seems quite unlikely. Please demonstrate. > > > >> + writel(WDT_CLEAR_TIMEOUT_AND_BOOT_CODE_SELECTION, > >> + wdt->base + WDT_CLEAR_TIMEOUT_STATUS); > >> + wdt->wdd.bootstatus |= WDIOF_EXTERN1; > > The variable reflects the _boot status_. It should not change after booting. > Is there any documentation that dictates that? All I could find is > > "bootstatus: status of the device after booting". That doesn't look to me > like it absolutely can not change to reflect the updated status (that is, to > reflect that the originally set up alternate CS routing has been reset to > normal). > You choose to interpret "after booting" in a kind of novel way, which I find a bit disturbing. I am not really sure how else to describe "boot status" in a way that does not permit such reinterpratation of the term. On top of that, how specifically would "WDIOF_EXTERN1" reflect what you claim it does ? Not only you are hijacking bootstatus9 (which is supposed to describe the reason for a reboot), you are also hijacking WDIOF_EXTERN1. That seems highly arbitrary to me, and is not really how an API/ABI should be used. Guenter > If you absolutely disallow that, I think we could make 'access_cs0' readable > instead, so it could report the current state of the boot code selection bit. > Reverted, I suppose. That way 'access_cs0' would report 1 after 1 has been > written to it (it wouldn't be possible to write a zero). > > > @@ -223,6 +248,9 @@ static int aspeed_wdt_probe(struct platform_device > > *pdev) > > > > wdt->ctrl = WDT_CTRL_1MHZ_CLK; > > > > + if (of_property_read_bool(np, "aspeed,alt-boot")) > > + wdt->wdd.groups = bswitch_groups; > > + > > Why does this have to be separate to the existing evaluation of > > aspeed,alt-boot, and why does the existing code not work ? > > > > Also, is it guaranteed that this does not interfer with existing > > support for alt-boot ? > > I think Ivan will comment on this. > > With best regards, > Alexander Amelkin, > BIOS/BMC Team Lead, YADRO > https://yadro.com > >

