Hi all,
I ran into a few problems with the stack analysis code for suspended
RTOS tasks in src/rtos. I am using Cortex-M4 parts.

The first problem is that rtos_generic_stack_read doesn’t align the
stack pointer correctly for downward-growing stacks. The code in rtos.c
on lines 492 through 494 works fine for upward-growing stacks. For
downward-growing stacks, however, it rounds new_stack_ptr down with
(new_stack_ptr & ~(alignment - 1)) and then unconditionally adds
alignment. If alignment is 8 and new_stack_ptr is already a multiple of
8, the mask leaves it unchanged and the addition yields
new_stack_ptr + 8, whereas an already-aligned pointer shouldn’t be
adjusted at all.
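
A minimal sketch of the rounding described above, assuming that
downward-growing stacks should round up and upward-growing stacks should
keep rounding down. The helper name align_unwound_sp and its parameters
are hypothetical and are not existing OpenOCD code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of OpenOCD.
 * growth_direction: -1 for a downward-growing stack, +1 for upward.
 * alignment: 0 (no alignment) or a power of two. */
static int64_t align_unwound_sp(int64_t sp, int64_t alignment,
                                int growth_direction)
{
    if (alignment == 0)
        return sp; /* no alignment requested */

    assert((alignment & (alignment - 1)) == 0); /* power of two only */

    if (growth_direction == -1) {
        /* Round up; an already-aligned value is left untouched. */
        return (sp + alignment - 1) & ~(alignment - 1);
    }

    /* Upward-growing: round down, which is what the mask alone does. */
    return sp & ~(alignment - 1);
}
```

With alignment 8 and a downward-growing stack, 0x20001000 stays at
0x20001000, while 0x20001004 becomes 0x20001008.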

The second problem is that, for Cortex-M, the stack *doesn’t* actually
have to be 8-byte aligned at all times, despite what
rtos_standard_Cortex_M3_stacking declares! The ABI typically requires
8-byte alignment on entry to a function, but that doesn’t mean the
stack is always 8-byte aligned: SP is allowed to be only 4-byte aligned
in the middle of a function, and a yield could definitely happen there.
What actually happens, assuming CCR.STKALIGN=1, is that the CPU inserts
4 bytes of padding when it pushes the exception frame if SP is not
already 8-byte aligned, so that SP is 8-byte aligned *after* the push,
on entry to the ISR. It records whether or not it did this in bit 9 of
the saved xPSR, so that it can properly *de-align* the stack pointer on
exception return if it was unaligned to start with. So, to summarize:
to reconstruct the actual value of SP as it was in the task, you should
*not* assume that the final task SP is 8-byte aligned, but you *should*
check xPSR[9] and, if it is set, add 4 to the SP you would otherwise
calculate. (Technically the CPU ORs the SP with 4 rather than adding,
but this only matters if the stack was improperly aligned on exception
return, which should hopefully not happen.)
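
The xPSR[9] reconstruction above could be sketched roughly as follows.
This assumes frame_base is the address of the hardware-pushed exception
frame (the saved R0 word), that any software-saved registers below it
have already been skipped, and that the caller knows whether the frame
includes floating-point state. The function and macro names are
hypothetical, not existing OpenOCD code:

```c
#include <stdint.h>

/* Bit 9 of the stacked xPSR: set if the CPU inserted 4 bytes of padding. */
#define XPSR_STACK_ALIGN_BIT (UINT32_C(1) << 9)
/* Basic frame: R0-R3, R12, LR, PC, xPSR (8 words). */
#define BASIC_FRAME_BYTES    UINT32_C(0x20)
/* Extended frame: basic frame + S0-S15, FPSCR, one reserved word. */
#define FP_FRAME_BYTES       UINT32_C(0x68)

/* Hypothetical helper: recover the task's SP as it was just before the
 * exception frame was pushed. */
static uint32_t task_sp_before_exception(uint32_t frame_base,
                                         uint32_t saved_xpsr,
                                         int has_fp_frame)
{
    uint32_t sp = frame_base +
        (has_fp_frame ? FP_FRAME_BYTES : BASIC_FRAME_BYTES);

    /* Undo the alignment padding the same way the CPU does on
     * exception return: OR with 4 rather than add. */
    if (saved_xpsr & XPSR_STACK_ALIGN_BIT)
        sp |= UINT32_C(4);

    return sp;
}
```

For example, a task interrupted with SP = 0x20001004 gets a basic frame
at 0x20000FE0 with xPSR[9] set; the function maps that back to
0x20001004 rather than to 0x20001000.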

The third problem is that it looks like there’s no way to accommodate
custom stack layouts other than by hacking OpenOCD itself (I use my own
FreeRTOS port and would prefer to be able to define a custom stack
layout, including the option to sometimes-but-not-always have the
floating-point registers saved). It would be really great if we could
describe the stack layout, and maybe even the special logic like
xPSR[9] alignment handling, at runtime without rebuilding OpenOCD, say
by writing a TCL function or some TCL data structures instead.

I could very easily submit a patch to fix the first bug (and perhaps
that should be fixed even if we change Cortex-M parts to incorporate
xPSR[9], so that non-Cortex-M parts can take advantage of the fix), but
the second and third problems would need rather more involved
structural changes to OpenOCD. Thus, I would like some comments on what
people think before I embark on any of this: maybe there’s something
I’m missing, perhaps a much easier way to accomplish it. I also have
limited time to work on OpenOCD, so if I’m expected to do the whole TCL
conversion myself, it might take a while. It would be great to know
whether such a change would be accepted before I start work, and also
whether anyone else is interested in doing any of it.
-- 
Christopher Head

_______________________________________________
OpenOCD-devel mailing list
OpenOCD-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openocd-devel
