Hello,
I would like to solicit a review for some fixes of (mostly) AVR
architecture code. These are some things I noticed when attempting to
create a port for AVR DA/DB family. I am doing my development based on
ATmega family since I am familiar with that and these patches (should)
only apply to it. I tried to avoid any changes that would affect the
other families but I cannot do any testing to confirm these patches
don't break anything for them.
I tested these patches with ATmega1284 - since most of them affect
context switch code, I made a simple stress application that spawns 4
tasks. First reads data from serial port, stores-and-forwards them
through tasks 2 and 3 and finally to a task 4 which writes them back
to
serial port to be verified for integrity by the PC side. As long as no
new data are arriving, currently held data is constantly shuffled
through registers and stack. None of the tasks ever sleep, forcing as
many context switches as possible. I ran this application for a few
hours and no data corruption was observed.
Please note that I do not have a Github account so, if these patches
are
accepted, I will be relying on someone else picking them up and doing
a
PR.
I will attempt to post these patches through the mailing list, there
is
also a git repository nuttx.git available at git.kerogit.eu through
HTTP/S . The branch with the changes is called avr_fixes_rfc1 and is
based on upstream's releases/12.9 . It also rebases cleanly onto
current
master if that is the more preferred location where to put these.
Description of the patches follows:
=== 0001 fixed function name ===
Mistyped function name in Documentation resulting in something not
present in the sources.
=== 0002 fixed typo in Kconfig ===
Missing letter.
=== 0003 save RAMPZ register into thread context ===
RAMPZ is an extension register used when accessing program memory
(flash) from the program itself. It is concatenated with value in
pointer register pair Z. Since user applications can reasonably be
expected to load their data from program memory, this register should
be
saved into the thread context as well.
That said, it's quite unlikely to hit the bug (reading from incorrect
address) fixed by this patch - there are three conditions that need to
be met:
- forced context switch of a task that is currently reading from
program
memory
- any other tasks that receives CPU time also needs to read from
program
memory
- program memory address of those tasks must differ in RAMPZ value,
which is the highest byte
Meeting the last condition is somewhat unlikely because all data
stored
in program memory tends to be placed in a single region by the
compiler.
So to trigger the bug, this region would need to be large enough to
cross 64kB boundary. Considering most supported devices have 128kB of
flash in total and NuttX itself needs close to 64kB with very few
features enabled, it is quite unlikely to have this much const data
and
NuttX and still an useful program.
Still feels worth fixing though, impact on existing users should be
negligible - few bytes of memory spent and few extra instructions. The
fix also applies to ATmega2560 with 256kB of flash where it is more
likely to have a program that can actually hit this.
Note that similar bug still exists in ATmega2560 with its EIND
register.
This register is used to extend address register pair used for making
indirect jumps (making jumps to addresses beyond 128kB boundary
possible.) This is not addressed by this patch as I don't have such
chip
to be able to test the change.
=== 0004 flag microcontrollers that have RAMPZ register ===
Enables code from previous patch on all currently supported chips.
=== 0005 replace AVR __flash qualifier with __memx for some chips ===
This one may need some feedback. This patch changes __flash qualifier
to
__memx in IOBJ macro. These qualifiers instruct the compiler to place
(or, rather, keep) const data in program memory and read it from
there.
The difference is that __flash is only able to address 64kB memory
size
while __memx extends the pointer size to 24 bits. The (possible) issue
here is that this definition of IOBJ is currently restricted to
compilation with AVR toolchain which I don't use and therefore can't
test. However, I believe that their toolchain is based on gcc (which I
use) and therefore behaves the same. At the minimum, the __memx
qualifier is supported there for sure as it is the value used in IPTR
macro already present in the code.
Anyway, with this changes and with corresponding configuration options
enabled, the code compiles and - judging from the assembly - seems to
do
the right thing (ie. does respect the __memx qualifier from IPTR macro
used in parameters for syslog and syslog-related functions.)
Using __flash qualifier as it is used now may cause bugs when const
data
is placed to memory regions above 64kB. As of now, compiler tends to
put
them at the beginning of the .text section so it is again unlikely to
happen - as long as the data isn't large enough. And again, might be
worth fixing since impact on existing users should be negligible. For
starters, this is currently only used for debugging code and the
associated cost of using 24 bit pointers is already paid in current
form
because of IPTR being set to __memx
This change does not apply for devices with 64kB or smaller program
memory (although there is no such device currently supported.)
=== 0006 fixed PC load on chips with more than 128kB program memory
===
The original code uses Y pointer register pair to load 3 bytes from
given memory address. However, using subi (SUBtract Immediate)
instruction here is incorrect as it only affects the register it is
called upon - YL, low byte from the Y register pair. High byte remains
unchanged, resulting in incorrect load if the lower byte value wraps.
The correct instruction for this would be sbiw (SuBtract Immediate
from
Word) which operates on the whole pair. The fix does not do that
though
- instead, the Y register pair is preloaded with value incremented by
1
and loads are executed using pre-decrement in the load instructions.
This both fixes the bug and also saves a bit of CPU time. The value in
Y
register then ends up being different compared to the original code
but
that should not matter since - as far as I can see - it is not used
and
is later overwritten.
All the feedback and help is appreciated and my apologies for this
text
being this long