sumpfralle opened a new pull request, #17224:
URL: https://github.com/apache/nuttx/pull/17224
## Summary
Previously the function `up_cpu_pause` was used during flash write
operations for preventing all other CPUs from executing code from flash.
The above function was removed in d8cb7759.
Thus, writing to flash on `rp2040` was only possible in non-SMP mode.
This PR allows write operations to the flash on rp2040 even in SMP mode. It
blocks all but the current CPU for the duration of the critical function (write
or erase).
The implementation is based on the previous implementation (see d8cb7759).
It is slightly enhanced by supporting more than two CPUs and it avoids
global variables.
Closes: #16203
*BEWARE*: this is just a draft. Please take a look at the "discussion" at
the end.
## Impact
It is possible to write to flash on rp2040 even in SMP mode.
## Testing
1. enable `CONFIG_PIPES` and `CONFIG_RP2040_FLASH_FILE_SYSTEM`
1. apply the following patch (disables "smartfs", which would block the mtd):
```patch
--- a/boards/arm/rp2040/common/src/rp2040_common_bringup.c
+++ b/boards/arm/rp2040/common/src/rp2040_common_bringup.c
@@ -726,7 +726,7 @@ int rp2040_common_bringup(void)
}
#endif
-#ifdef CONFIG_RP2040_FLASH_FILE_SYSTEM
+#ifdef disabled_CONFIG_RP2040_FLASH_FILE_SYSTEM
mtd_dev = rp2040_flash_mtd_initialize();
```
1. run `rp2040_flash_mtd_initialize();` in your startup code
1. run in the NuttX shell: `netcat -l 10000 | dd of=/dev/mtd bs=1k seek=512`
1. run on your host: `nc -N "$nuttx_host" 10000 <filename`
1. watch the `dd` process in the NuttX shell to finish after the connection
of `netcat` was closed
Without the patch, this process would hang (after commenting out the removed
`up_cpu_pause` calls).
Running the `smp` app (`CONFIG_TESTING_SMP`) during the above write
operation does not cause issues.
## Details to be discussed?
1. The proposed implementation of the "pause all other CPUs" feature is
generic enough to be helpful for other architectures, as well. E.g. the
`esp32_spiram.c` and `esp32s3_spiram.c` use a similar implementation for
guarding their flash accesses (see `g_cpu_pause`). `rp2350` will surely needs
this, too, when flash write support is added.
* Should I move this set of function (`*_smp_isolation`) somewhere else
(e.g. below `sched/sched/`)? Which name would be appropriate?
(`*_smp_isolation` is just my silly choice)
* Should I propose a separate pull request with the respective changes
for esp32 in order to allow others to test them? (I do not have such a board)
* Should I revive `sched_note_cpu_pause`? Currently it is probably
unused.
1. The implementation uses "spin_lock_t" (instead of `bool` flags). I picked
this for readability. Is this a burden? Should I use `bool` instead?
1. The implementation moves the "enter_critical_section" call closer to the
flash write operation (inside the "SMP isolation" calls). Previously
`enter_critical_section` was called *before* `up_cpu_pause`. Now it is called
afterwards.
* I am not sure, *why* this is necessary, but the original call location
leads to random hangs after writing around 50 kB.
1. The functions `enter_smp_isolation` and `leave_smp_isolation` call
`sched_lock`/`sched_unlock`. This may feel a bit excessive, but I think, it is
necessary:
- a) `nxsched_smp_call_async` tries to run the "pause handler" directly
on the current CPU, if `this_cpu()` is part of the given cpuset (see
[source](https://github.com/apache/nuttx/blob/master/sched/sched/sched_smp.c#L317)).
This would starve our current task and prevent the execution of
"leave_smp_isolation". This behavior of "nxsched_smp_call_async" is probably a
bug.
- b) We should not allow other SMP-aware functions to run, because they
may rely on SMP being enabled (causing deadlocks?). This is possible, since
`CONFIG_SMP` is used everywhere as the indicator for "multiple processes
running in parallel". This is not true during the "smp isolation" context
created by these functions.
1. At the moment, the blocked CPUs may receive and handle interrupts. At
least I do not see anything preventing this. Maybe we should run
"enter_critical_section" for each busy-locked CPU in order to avoid this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]