Hi all, I ran into a few problems with the RTOS suspended task stack analysis code in src/rtos. I am using Cortex-M4 parts.
The first problem is that rtos_generic_stack_read doesn’t implement alignment properly for downward-growing stacks. The code in rtos.c on lines 492 through 494 works fine for upward-growing stacks. For downward-growing stacks, however, if alignment is 8 and new_stack_ptr is already a multiple of 8, then (new_stack_ptr & ~(alignment - 1)) is equal to new_stack_ptr, and the code then adds alignment, yielding new_stack_ptr + 8. In that case the pointer shouldn’t be adjusted at all.

The second problem is that, for Cortex-M, the stack *doesn’t* actually have to be 8-byte aligned at all times, despite what rtos_standard_Cortex_M3_stacking declares! It (typically, according to the ABI) has to be 8-byte aligned on entry to a function, but that doesn’t mean it’s always 8-byte aligned: it’s allowed to be unaligned in the middle of a function, and a yield could definitely happen in the middle of a function. In actual fact, assuming CCR.STKALIGN=1, the CPU makes the exception frame it pushes on the stack larger or smaller such that the stack is 8-byte aligned *after* the push, on entry to the ISR. It records whether or not it did this in bit 9 of the saved xPSR, so that it can properly *de-align* the stack pointer on exception return if it was unaligned to start with.

So, to summarize: to reconstruct the actual value of SP as it was in the task, you should *not* assume 8-byte alignment of the final task SP, but you *should* check xPSR[9] and, if it is set, add 4 to the SP you would otherwise calculate. (Technically the CPU ORs the SP with 4 rather than adding, but this only matters if the stack was improperly aligned on exception return, which should hopefully not happen.)
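To make the two points above concrete, here is a minimal C sketch of what the corrected calculations might look like. The function names and signatures are hypothetical and are not existing OpenOCD code; the sketch assumes the growth direction is encoded as -1 (downward) or +1 (upward), that alignment is zero or a power of two, and that the saved xPSR has already been read out of the stacked exception frame.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not an existing OpenOCD function).
 * Aligns the stack pointer computed after skipping the saved registers.
 *   growth_direction: -1 for downward-growing stacks, +1 for upward-growing
 *   alignment:        0 for "no alignment", otherwise a power of two
 */
static int64_t align_unwound_sp(int64_t sp, int growth_direction,
                                int64_t alignment)
{
    if (alignment == 0)
        return sp;

    if (growth_direction == -1) {
        /* Round up; an already-aligned pointer is left unchanged. */
        return (sp + alignment - 1) & ~(alignment - 1);
    }

    /* Upward-growing stack: round down. */
    return sp & ~(alignment - 1);
}

/*
 * Hypothetical helper (not an existing OpenOCD function).
 * Undoes the extra 4-byte padding the Cortex-M core may insert on
 * exception entry when CCR.STKALIGN=1. Bit 9 of the stacked xPSR
 * records whether the padding was inserted.
 *   sp_after_frame: SP after skipping the whole saved frame
 *   saved_xpsr:     xPSR value read from the exception frame
 */
static uint32_t cortex_m_apply_xpsr_realign(uint32_t sp_after_frame,
                                            uint32_t saved_xpsr)
{
    if (saved_xpsr & (UINT32_C(1) << 9)) {
        /* The CPU ORs in 4 rather than adding, so do the same. */
        return sp_after_frame | UINT32_C(4);
    }
    return sp_after_frame;
}
```

For a Cortex-M target, the idea would be to skip the alignment step entirely and apply only the xPSR[9] correction, while other targets would keep using the (fixed) alignment helper.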
The third problem is that there appears to be no way to accommodate custom stack layouts other than by hacking OpenOCD itself. I use my own FreeRTOS port and would prefer to be able to define a custom stack layout, including the option to sometimes-but-not-always have the floating-point registers saved. It would be really great if we could describe the stack layout, and maybe even special logic like the xPSR[9] alignment handling, at runtime without rebuilding OpenOCD, say by writing a TCL function or some TCL data structures instead.

I could very easily submit a patch to fix the first bug (and perhaps that should be fixed even if we change Cortex-M parts to use xPSR[9], so that non-Cortex-M parts can take advantage of the fix), but the second and third problems would need rather more involved structural changes to the architecture of OpenOCD. Thus, I would like some comments on what people think before I embark on any of these things. Maybe there’s something I’m missing, perhaps a much easier way to accomplish this.

I also have limited time to work on OpenOCD, so if I’m expected to do the whole TCL conversion myself, it might take a while. It would be great to know whether such a change would be accepted before I start work, and also whether anyone else is interested in doing any of this work.

-- Christopher Head