[issue] Killing a parent pid and all its children pid all together

2021-05-27 Thread Rodrigo Garcia
Hello all,

My issue concerns testing SIGKILL in NuttX: I cannot get it to work
correctly.

Description:
============
I am using an ESP32 DevKitC board with the default NSH configuration.

1) Enabled CONFIG_SIG_DEFAULT=y and CONFIG_SCHED_HAVE_PARENT=y
2) After starting the board, type "sh" + Enter 3 times, then run "ps":

nsh> sh
nsh> sh
nsh> sh
nsh> ps
  PID  PPID PRI POLICY   TYPE    NPX STATE    EVENT     SIGMASK   STACK   COMMAND
    0     0   0 FIFO     Kthread N-- Ready                        003072  Idle Task
    1     0 100 RR       Task    --- Waiting  Signal              002096  init
    4     1 100 RR       Task    --- Waiting  Signal              002096  sh
    5     4 100 RR       Task    --- Waiting  Signal              002096  sh
    6     5 100 RR       Task    --- Running                      002096  sh

3) Kill the parent "sh" and try to run "ps".
The result is a console where it is very hard to type any command, as seen
below:

nsh> kill -9 4
nsh> nsh>
nsh>
nsh> ps
nsh: p: command not found
nsh> sp
nsh: ss: command not found
nsh> ps
nsh: pp: command not found
nsh> ss
nsh: ss: command not found
nsh> pps
nsh: p: command not found
nsh> ps
nsh: spsp: command not found
nsh> sp
nsh: ss: command not found
nsh> sp
  PID  PPID PRI POLICY   TYPE    NPX STATE    EVENT     SIGMASK   STACK   COMMAND
    0     0   0 FIFO     Kthread N-- Ready                        003072  Idle Task
    1     0 100 RR       Task    --- Ready                        002096  init
    5     4 100 RR       Task    --- Waiting  Signal              002096  sh
    6     5 100 RR       Task    --- Running                      002096  sh

4) As shown above, none of the children were killed. I think that init (pid
1) and sh (pid 6) are both reading the console input, which produces the
garbled commands above. I had to type "sp" to run "ps".

Question:
Do I need to enable some other option in menuconfig so that killing the
parent pid also kills all of its children?

Best Regards,
Rodrigo Garcia.
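
For readers following this thread: independent of any menuconfig option, one
possible workaround is to send SIGKILL to every descendant explicitly,
children first and the parent last. Below is a minimal sketch in C, assuming
the PID/PPID pairs are already known (for example, copied from the "ps"
output in this message). The names struct proc_entry, kill_order and kill_all
are hypothetical and are not NuttX APIs; only kill() and SIGKILL are standard
POSIX. It does not answer whether NuttX can do this on its own.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>

/* One row of a process table: a task and its parent (hypothetical type). */

struct proc_entry
{
  pid_t pid;
  pid_t ppid;
};

/* Append every descendant of 'root' to 'out', deepest first, followed by
 * 'root' itself.  'count' is the number of entries already in 'out'; the
 * new count is returned.  The caller must size 'out' for at least 'n'
 * entries.  Assumes the table contains no parent/child cycles.
 */

size_t kill_order(const struct proc_entry *table, size_t n, pid_t root,
                  pid_t *out, size_t count)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Skip entries that are their own parent (e.g. the idle task). */

      if (table[i].ppid == root && table[i].pid != root)
        {
          count = kill_order(table, n, table[i].pid, out, count);
        }
    }

  out[count++] = root;
  return count;
}

/* Send SIGKILL to each PID in 'order'.  Returns the number of failures. */

int kill_all(const pid_t *order, size_t count)
{
  int failures = 0;

  for (size_t i = 0; i < count; i++)
    {
      if (kill(order[i], SIGKILL) < 0)
        {
          failures++;
        }
    }

  return failures;
}
```

With the table from step 2 and root pid 4, kill_order yields 6, 5, 4, so the
shell that currently owns the console is killed before its ancestors.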


Re: Port of project from NuttX 7.30 to 10.1 RC1: Unexpected IRQ

2021-05-27 Thread Alan Carvalho de Assis
I think a benefit of renaming many of those "up_something" functions to
"stm32_something", "esp32_something", etc. is that it is now easy for
software to find the right function.

I think many IDEs cannot handle function search correctly for NuttX
because they have no heuristics to know that if I am searching for a
function inside a board or inside an arch, they should not return a
function with the same name from another board or another arch.

So, at the end of the day, the modifications you are complaining about
will make life better for all users.

BR,

Alan

On 5/27/21, Sebastien Lorquet  wrote:
> I still wonder what the purpose of this variable rename is. Sorry to say,
> but it just looks cosmetic while critically breaking everything that was
> made before, and this kind of thing is a nightmare for migration when
> you can't follow the project day to day. Boards can be external to the
> project, and are a supported feature, so they should continue to work
> reliably even if you change the internal sauce!
>
> At one point there was too much traffic on the mailing list and I just
> stopped reading it; I marked several hundred messages as read
> without having the time to go through them. It seems that this change
> was made during this time.
>
> Sebastien
>
> On 27/05/2021 at 09:38, Sebastien Lorquet wrote:
>> Boom, that was the extrastuff. The board now boots. We're going to run
>> a lot of functional tests to make sure everything is okay, but I don't
>> have this strange hardfault at boot.
>>
>> Thank you.
>>
>> I did not find this page despite searching through a lot of
>> documentation, mainly the "official" ReadTheDocs-like documentation.
>>
>> I suggest you link to this doc in the getting started manuals.
>>
>> Sebastien
>>
>>
>> On 26/05/2021 at 18:42, Abdelatif Guettouche wrote:
>>> Maybe this one could help:
>>> https://cwiki.apache.org/confluence/display/NUTTX/NuttX+9.1#NuttX9.1-CompatibilityConcerns
>>>
>>>
>>>
 I am using the flat (monolithic build) and I see no place that defines
 this flag, at all.
 I don't even see a place in the codebase that defines this flag.
>>> __KERNEL__ is defined in tools/Config.mk (line:100)
>>>
 The fact that mm_initialize only shows one region is weird... where is
 the heap for the main RAM at 0x2000?
>>>
>>> CONFIG_MM_REGIONS needs to be set up correctly if you have multiple
>>> heap regions.
>>>
>>> On Wed, May 26, 2021 at 5:22 PM Sebastien Lorquet wrote:
 Hello,

 Thanks for the remarks.

 I am using the flat (monolithic build) and I see no place that defines
 this flag, at all.

 I don't even see a place in the codebase that defines this flag.

 I see nothing related to mm, nor anything outdated in my Make.defs,
 which is from my old setup, yes, but still similar to a recent one.

 Sebastien

 On 26/05/2021 at 18:08, raiden00pl wrote:
> If you use CONFIG_BUILD_FLAT=y, make sure that __KERNEL__ flag is
> set here:
> https://github.com/apache/incubator-nuttx/blob/master/include/nuttx/mm/mm.h#L85
>
>
> I remember that at some point I had a similar hardfault in mm which
> didn't make sense, and it was due to an outdated board Make.defs.
>
> On Wed, 26 May 2021 at 17:21, Sebastien Lorquet wrote:
>
>> Update: stack dump and register analysis are in fact pointing to a
>> crash in mm_alloc
>>
>> I have enabled memory management debug:
>>
>> mm_initialize: Heap: start=0x1000 size=65536
>> mm_addregion: Region 1: base=0x1154 size=65184
>> stm32_netinitialize: Enabling PHY power
>> stm32_netinitialize: PHY reset...
>> stm32_netinitialize: PHY reset done.
>> stm32_netinitialize: Configuring PHY int
>> F
>> mm_free: Freeing 0x70fb460b
>> irq_unexpected_isr: ERROR irq: 3
>> up_assert: Assertion failed at file:irq/irq_unexpectedisr.c line: 50
>> up_registerdump: R0: 0001 2000737c c0f2 08000101 
>>   200073c8
>> up_registerdump: R8:     
>> 200073c8 080126ad 080126f8
>> up_registerdump: xPSR: 2100 PRIMASK:  CONTROL: 
>> up_registerdump: EXC_RETURN: fff9
>> up_dumpstate: sp: 200072c8
>> up_dumpstate: stack base: 20007078
>> up_dumpstate: stack size: 0400
>>
>> The fact that mm_initialize only shows one region is weird... where is
>> the heap for the main RAM at 0x2000?
>>
>> the mm_free(0x70fb460b) is not what causes the hardfault (it comes
>> later), but what the hell is this invalid address!
>>
>> This is the first call to mm_free, here is the backtrace:
>>
>> Breakpoint 1, mm_free (heap=0x200060b4 , mem=0x70fb460b) at
>> mm_heap/mm_free.c:85
>> 85    if (!mem)
>> (gdb) bt
>> #0  mm_free (heap=0x200060b4 , mem=0x70fb460b) at
>> mm_heap/mm_free.c:85
>> #1  0x0801264a 

Re: Nimble on U-blox Nina B112 (Nrf52832)

2021-05-27 Thread Erik Englund
Yes, I think seamless Nordic BLE support is important for NuttX.

I will try to free up some time for this; I've got nrf52832, nrf52833 and
nrf52840 boards/products at my disposal.

Kind regards
Erik Englund

Innoware Development AB
Hyttvägen 13
73338 SALA

Org.nr. 556790-2977
www.innoware.se


On Wed, 26 May 2021 at 11:42, Alan Carvalho de Assis wrote:

> Thank you Erik and Greg,
>
> I think we need to modify the default "sdc" board config to get nimBLE
> running correctly.
>
> Thank you for these suggestions.
>
> BR,
>
> Alan
>
> On Wednesday, May 26, 2021, Erik Englund  wrote:
>
> > I encountered the same error while trying to run NuttX 10.x / nimble
> > on NRF52832 and tracked it down to insufficient RAM.
> > The nimble nsh-app was present in the builtin-apps internal lists, but
> > when trying to allocate the application stack NuttX returns an error
> > code, and it seems that every allocation error code when trying to start
> > an nsh-app results in that "command not found" error message.
> > I think enabling some memory debugging flags in NuttX will show you the
> > correct error.
> >
> > So this isn't a nimble problem.
> >
> >
> > Kind regards
> > Erik Englund
> >
> > Innoware Development AB
> > Hyttvägen 13
> > 73338 SALA
> >
> >
> > Org.nr. 556790-2977
> > www.innoware.se
> >
> >
> > On Wed, 26 May 2021 at 01:42, Gregory Nutt wrote:
> >
> > > The failure doesn't seem to have anything to do with nimBLE.  The
> > > nimble app is not running at all!
> > >
> > > Put a breakpoint on nimble_main().  I doubt that you ever get there.
> > > But I don't know why.
> > >
> > > The error report is probably misleading too...  I seem to recall
> > > that there is an issue that the NSH error reported is always "command
> > > not found" even if some other error occurs.  It does mean that NSH
> > > could not run the built-in command, but it does not necessarily mean
> > > that the built-in command was not found.
> > >
> > > On 5/25/2021 4:51 PM, Alan Carvalho de Assis wrote:
> > > > Hi Matias and Miguel,
> > > >
> > > > I just tried nimble on nrf52832-mdk board without success:
> > > >
> > > > $ ./tools/configure.sh nrf52832-mdk:sdc
> > > > $ make
> > > >
> > > > It downloaded and compiled nimble for NuttX correctly, the nuttx.bin
> > > > was about 314944 bytes.
> > > >
> > > > When I drop this file inside DAPLINK disk it tries to flash and
> > > > creates the file FAIL.TXT with this content:
> > > >
> > > > "The hex file cannot be decoded. Checksum calculation failure
> > > > occurred."
> > > >
> > > > Then I ran "make menuconfig" and enabled the "Intel HEX binary
> > > > format" and after copying the nuttx.hex to DAPLINK disk the error
> > > > disappeared.
> > > >
> > > > Accessing the nsh terminal I can see the nimble binary, but it is not
> > > > running:
> > > >
> > > > NuttShell (NSH) NuttX-10.1.0-RC1
> > > > nsh> ?
> > > > help usage:  help [-v] [<cmd>]
> > > >
> > > >     .         cd        echo      hexdump   mkdir     ps        source    unset
> > > >     [         cp        exec      ifconfig  mkfatfs   pwd       test      usleep
> > > >     ?         cmp       exit      ifdown    mkrd      rm        time      xd
> > > >     basename  dirname   false     ifup      mount     rmdir     true
> > > >     break     dd        free      kill      mv        set       uname
> > > >     cat       df        help      ls        nslookup  sleep     umount
> > > >
> > > > Builtin Apps:
> > > >nimble  sh  nsh
> > > > nsh> nimble
> > > > nsh: nimble: command not found
> > > > nsh> ifconfig
> > > > bnep0   Link encap:UNSPEC at UP
> > > >
> > > > nsh> nimble -h
> > > > nsh: nimble: command not found
> > > > nsh> nimble
> > > > nsh: nimble: command not found
> > > > nsh>
> > > >
> > > > Initially I thought it was caused by the recent update of the nimble
> > > > stack on NuttX, but I moved to a commit previous to that update and
> > > > am still facing the same error.
> > > >
> > > > Matias, do you think it could be some issue with my cross-compiler?
> > > >
> > > > I'm using the default ARM gcc from the Ubuntu 20.04 gcc-arm-none-eabi
> > > > package:
> > > >
> > > > gcc version 9.2.1 20191025 (release) [ARM/arm-9-branch revision
> > > > 277599] (15:9-2019-q4-0ubuntu1)
> > > >
> > > > Thank you very much!
> > > >
> > > > BR,
> > > >
> > > > Alan
> > > >
> > > > On 5/25/21, Miguel Wisintainer  wrote:
> > > >> Matias
> > > >>
> > > >> Alan and I will investigate!
> > > >>
> > > >> Thank you so much!
> > > >>
> > > >> Sent from Mail for Windows 10
> > > >>
> > > >>
> > >
> >
>
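
A note on the point made in this thread, that the shell prints "command not
found" even when the built-in exists but cannot be started (for example
because the stack allocation failed): a shell could keep the underlying errno
in the message. The following is only a sketch of that idea in C.
format_exec_error is a hypothetical helper and is not taken from the NSH
sources; the errno values and the libc calls are standard.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a shell-style error line that keeps the
 * underlying errno instead of always reporting "command not found".
 * Returns the value of snprintf().
 */

int format_exec_error(char *buf, size_t len, const char *cmd, int err)
{
  if (err == ENOENT)
    {
      /* The command really does not exist. */

      return snprintf(buf, len, "nsh: %s: command not found", cmd);
    }

  /* Any other failure (ENOMEM, EINVAL, ...) is reported as such. */

  return snprintf(buf, len, "nsh: %s: failed to start: %s (errno %d)",
                  cmd, strerror(err), err);
}
```

With a message of this kind, an out-of-memory failure when starting "nimble"
would be distinguishable from a missing built-in at first glance.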


Re: Port of project from NuttX 7.30 to 10.1 RC1: Unexpected IRQ

2021-05-27 Thread Sebastien Lorquet
I still wonder what the purpose of this variable rename is. Sorry to say,
but it just looks cosmetic while critically breaking everything that was
made before, and this kind of thing is a nightmare for migration when
you can't follow the project day to day. Boards can be external to the
project, and are a supported feature, so they should continue to work
reliably even if you change the internal sauce!

At one point there was too much traffic on the mailing list and I just
stopped reading it; I marked several hundred messages as read
without having the time to go through them. It seems that this change
was made during this time.


Sebastien

On 27/05/2021 at 09:38, Sebastien Lorquet wrote:
Boom, that was the extrastuff. The board now boots. We're going to run
a lot of functional tests to make sure everything is okay, but I don't
have this strange hardfault at boot.


Thank you.

I did not find this page despite searching through a lot of 
documentation, mainly the "official" ReadTheDocs-like documentation.


I suggest you link to this doc in the getting started manuals.

Sebastien


On 26/05/2021 at 18:42, Abdelatif Guettouche wrote:

Maybe this one could help:
https://cwiki.apache.org/confluence/display/NUTTX/NuttX+9.1#NuttX9.1-CompatibilityConcerns 




I am using the flat (monolithic build) and I see no place that defines
this flag, at all.
I don't even see a place in the codebase that defines this flag.

__KERNEL__ is defined in tools/Config.mk (line:100)

The fact that mm_initialize only shows one region is weird... where is
the heap for the main RAM at 0x2000?

CONFIG_MM_REGIONS needs to be set up correctly if you have multiple
heap regions.

On Wed, May 26, 2021 at 5:22 PM Sebastien Lorquet wrote:

Hello,

Thanks for the remarks.

I am using the flat (monolithic build) and I see no place that defines
this flag, at all.

I don't even see a place in the codebase that defines this flag.

I see nothing related to mm, nor anything outdated in my Make.defs,
which is from my old setup, yes, but still similar to a recent one.

Sebastien

On 26/05/2021 at 18:08, raiden00pl wrote:
If you use CONFIG_BUILD_FLAT=y, make sure that __KERNEL__ flag is set here:
https://github.com/apache/incubator-nuttx/blob/master/include/nuttx/mm/mm.h#L85

I remember that at some point I had a similar hardfault in mm which
didn't make sense, and it was due to an outdated board Make.defs.

On Wed, 26 May 2021 at 17:21, Sebastien Lorquet wrote:

Update: stack dump and register analysis are in fact pointing to a crash
in mm_alloc

I have enabled memory management debug:

mm_initialize: Heap: start=0x1000 size=65536
mm_addregion: Region 1: base=0x1154 size=65184
stm32_netinitialize: Enabling PHY power
stm32_netinitialize: PHY reset...
stm32_netinitialize: PHY reset done.
stm32_netinitialize: Configuring PHY int
F
mm_free: Freeing 0x70fb460b
irq_unexpected_isr: ERROR irq: 3
up_assert: Assertion failed at file:irq/irq_unexpectedisr.c line: 50
up_registerdump: R0: 0001 2000737c c0f2 08000101 
  200073c8
up_registerdump: R8:     
200073c8 080126ad 080126f8
up_registerdump: xPSR: 2100 PRIMASK:  CONTROL: 
up_registerdump: EXC_RETURN: fff9
up_dumpstate: sp: 200072c8
up_dumpstate: stack base: 20007078
up_dumpstate: stack size: 0400

The fact that mm_initialize only shows one region is weird... where is
the heap for the main RAM at 0x2000?

the mm_free(0x70fb460b) is not what causes the hardfault (it comes
later), but what the hell is this invalid address!

This is the first call to mm_free, here is the backtrace:

Breakpoint 1, mm_free (heap=0x200060b4 , mem=0x70fb460b) at
mm_heap/mm_free.c:85
85    if (!mem)
(gdb) bt
#0  mm_free (heap=0x200060b4 , mem=0x70fb460b) at
mm_heap/mm_free.c:85
#1  0x0801264a in mm_free_delaylist (heap=0x200060b4 ) at
mm_heap/mm_malloc.c:82
#2  0x08012672 in mm_malloc (heap=0x200060b4 , size=24) at
mm_heap/mm_malloc.c:115
#3  0x08012a32 in mm_zalloc (heap=0x200060b4 , size=24) at
mm_heap/mm_zalloc.c:45
#4  0x080123ac in zalloc (size=24) at umm_heap/umm_zalloc.c:68
#5  0x080399fa in inode_alloc (name=0x8059a78 "") at
inode/fs_inodereserve.c:78
#6  0x08039a5c in inode_root_reserve () at inode/fs_inodereserve.c:129
#7  0x080398cc in inode_initialize () at inode/fs_inode.c:92
#8  0x08039284 in fs_initialize () at fs_initialize.c:47
#9  0x08007eb4 in nx_start () at init/nx_start.c:600
#10 0x0800421e in __start () at chip/stm32_start.c:338

As previously analyzed, this happens in fs_initialize through
inode_root_reserve, so I was on the right track.

Caller shows mm_free called with that weird address:

(gdb) f 1
#1  0x0801264a in mm_free_delaylist (heap=0x200060b4 ) at
mm_heap/mm_malloc.c:82
82    mm_free(heap, address);
(gdb) list
77
78    /* The address should always be non-NULL since that was
checked in the
79 * 'while' condition above.

Re: Port of project from NuttX 7.30 to 10.1 RC1: Unexpected IRQ

2021-05-27 Thread Sebastien Lorquet
Boom, that was the extrastuff. The board now boots. We're going to run a
lot of functional tests to make sure everything is okay, but I don't have
this strange hardfault at boot.


Thank you.

I did not find this page despite searching through a lot of 
documentation, mainly the "official" ReadTheDocs-like documentation.


I suggest you link to this doc in the getting started manuals.

Sebastien


On 26/05/2021 at 18:42, Abdelatif Guettouche wrote:

Maybe this one could help:
https://cwiki.apache.org/confluence/display/NUTTX/NuttX+9.1#NuttX9.1-CompatibilityConcerns


I am using the flat (monolithic build) and I see no place that defines
this flag, at all.
I don't even see a place in the codebase that defines this flag.

__KERNEL__ is defined in tools/Config.mk (line:100)

The fact that mm_initialize only shows one region is weird... where is
the heap for the main RAM at 0x2000?

CONFIG_MM_REGIONS needs to be set up correctly if you have multiple
heap regions.
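
To illustrate the remark about CONFIG_MM_REGIONS: as I understand it, the
option caps how many separate memory regions the heap can hold, so a board
with two RAM banks needs room for two entries. Below is a small host-side
model in C, not NuttX code. MAX_REGIONS stands in for CONFIG_MM_REGIONS, and
toy_heap, toy_addregion and toy_total are hypothetical names. The addresses
and sizes in the usage note are illustrative values only and are not taken
from the log in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 2   /* Stands in for CONFIG_MM_REGIONS (assumed value) */

struct toy_region
{
  uintptr_t base;
  size_t size;
};

struct toy_heap
{
  struct toy_region regions[MAX_REGIONS];
  int nregions;
};

/* Register one more region.  Returns 0 on success, or -1 when the region
 * table is already full, which is what a too-small region limit causes.
 */

int toy_addregion(struct toy_heap *heap, uintptr_t base, size_t size)
{
  if (heap->nregions >= MAX_REGIONS)
    {
      return -1;
    }

  heap->regions[heap->nregions].base = base;
  heap->regions[heap->nregions].size = size;
  heap->nregions++;
  return 0;
}

/* Total number of bytes managed by the heap across all regions. */

size_t toy_total(const struct toy_heap *heap)
{
  size_t total = 0;

  for (int i = 0; i < heap->nregions; i++)
    {
      total += heap->regions[i].size;
    }

  return total;
}
```

For example, adding a 64 KiB bank at 0x10000000 and a 192 KiB bank at
0x20000000 succeeds with MAX_REGIONS set to 2, while a third call returns -1.
With a limit of 1, only the first bank would ever show up in the heap.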

On Wed, May 26, 2021 at 5:22 PM Sebastien Lorquet  wrote:

Hello,

Thanks for the remarks.

I am using the flat (monolithic build) and I see no place that defines
this flag, at all.

I don't even see a place in the codebase that defines this flag.

I see nothing related to mm, nor anything outdated in my Make.defs,
which is from my old setup, yes, but still similar to a recent one.

Sebastien

On 26/05/2021 at 18:08, raiden00pl wrote:

If you use CONFIG_BUILD_FLAT=y, make sure that __KERNEL__ flag is set here:
https://github.com/apache/incubator-nuttx/blob/master/include/nuttx/mm/mm.h#L85
I remember that at some point I had a similar hardfault in mm which didn't
make sense, and it was due to an outdated board Make.defs.

On Wed, 26 May 2021 at 17:21, Sebastien Lorquet wrote:


Update: stack dump and register analysis are in fact pointing to a crash
in mm_alloc

I have enabled memory management debug:

mm_initialize: Heap: start=0x1000 size=65536
mm_addregion: Region 1: base=0x1154 size=65184
stm32_netinitialize: Enabling PHY power
stm32_netinitialize: PHY reset...
stm32_netinitialize: PHY reset done.
stm32_netinitialize: Configuring PHY int
F
mm_free: Freeing 0x70fb460b
irq_unexpected_isr: ERROR irq: 3
up_assert: Assertion failed at file:irq/irq_unexpectedisr.c line: 50
up_registerdump: R0: 0001 2000737c c0f2 08000101 
  200073c8
up_registerdump: R8:     
200073c8 080126ad 080126f8
up_registerdump: xPSR: 2100 PRIMASK:  CONTROL: 
up_registerdump: EXC_RETURN: fff9
up_dumpstate: sp: 200072c8
up_dumpstate: stack base: 20007078
up_dumpstate: stack size: 0400

The fact that mm_initialize only shows one region is weird... where is
the heap for the main RAM at 0x2000?

the mm_free(0x70fb460b) is not what causes the hardfault (it comes
later), but what the hell is this invalid address!

This is the first call to mm_free, here is the backtrace:

Breakpoint 1, mm_free (heap=0x200060b4 , mem=0x70fb460b) at
mm_heap/mm_free.c:85
85    if (!mem)
(gdb) bt
#0  mm_free (heap=0x200060b4 , mem=0x70fb460b) at
mm_heap/mm_free.c:85
#1  0x0801264a in mm_free_delaylist (heap=0x200060b4 ) at
mm_heap/mm_malloc.c:82
#2  0x08012672 in mm_malloc (heap=0x200060b4 , size=24) at
mm_heap/mm_malloc.c:115
#3  0x08012a32 in mm_zalloc (heap=0x200060b4 , size=24) at
mm_heap/mm_zalloc.c:45
#4  0x080123ac in zalloc (size=24) at umm_heap/umm_zalloc.c:68
#5  0x080399fa in inode_alloc (name=0x8059a78 "") at
inode/fs_inodereserve.c:78
#6  0x08039a5c in inode_root_reserve () at inode/fs_inodereserve.c:129
#7  0x080398cc in inode_initialize () at inode/fs_inode.c:92
#8  0x08039284 in fs_initialize () at fs_initialize.c:47
#9  0x08007eb4 in nx_start () at init/nx_start.c:600
#10 0x0800421e in __start () at chip/stm32_start.c:338

As previously analyzed, this happens in fs_initialize through
inode_root_reserve, so I was on the right track.

Caller shows mm_free called with that weird address:

(gdb) f 1
#1  0x0801264a in mm_free_delaylist (heap=0x200060b4 ) at
mm_heap/mm_malloc.c:82
82          mm_free(heap, address);
(gdb) list
77
78          /* The address should always be non-NULL since that was checked in the
79           * 'while' condition above.
80           */
81
82          mm_free(heap, address); <-- address == 0x70fb460b
83        }
84      #endif
85      }
86

(gdb) print _mmheap
$3 = (struct mm_heap_s *) 0x200060b4 
(gdb) print g_mmheap
$4 = {mm_impl = 0x0}

this is not good!

This is not a timing or IRQ related issue but a heap issue.

R15 = 080126f8 translates to here:


https://github.com/apache/incubator-nuttx/blob/master/mm/mm_heap/mm_malloc.c#L199

=> this free() has corrupted a badly initialized heap, and the next
malloc fails, giving a hardfault because that address is invalid.

Horrific mess!

==>

I think that my old board code does not initialize the board properly, I
probably have to check for differences between my code and the

Re: Port of project from NuttX 7.30 to 10.1 RC1: Unexpected IRQ

2021-05-27 Thread Sebastien Lorquet
The code was working perfectly before the NuttX upgrade, so I tend to think
that the stack is sufficient.


Sebastien

On 26/05/2021 at 21:49, David Sidrane wrote:

Hi Sebastien,

Stack crashing into heap?

Have you upped the stack sizes across the board?


David
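
One way to check the "stack crashing into heap" theory is stack coloring:
fill the stack with a known pattern at creation and later count how many
words still hold it. NuttX has its own stack coloration support; the code
below is only a host-side model of the idea in C, with hypothetical names
(stack_paint, stack_words_used) and an assumed fill value. It also assumes a
stack that grows downward from the highest index.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_COLOR 0xdeadbeefu   /* Fill pattern (assumed value) */

/* Fill the whole stack area with the known pattern. */

void stack_paint(uint32_t *stack, size_t nwords)
{
  for (size_t i = 0; i < nwords; i++)
    {
      stack[i] = STACK_COLOR;
    }
}

/* Return how many words have been used.  The stack is assumed to grow
 * down from stack[nwords - 1] toward stack[0], so the untouched part is
 * the run of pattern words starting at stack[0].
 */

size_t stack_words_used(const uint32_t *stack, size_t nwords)
{
  size_t untouched = 0;

  while (untouched < nwords && stack[untouched] == STACK_COLOR)
    {
      untouched++;
    }

  return nwords - untouched;
}
```

If the result equals the full stack size, the lowest word was overwritten and
an overflow into the neighbouring memory is at least plausible.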

-----Original Message-----
From: Sebastien Lorquet [mailto:sebast...@lorquet.fr]
Sent: Wednesday, May 26, 2021 9:22 AM
To: dev@nuttx.apache.org
Subject: Re: Port of project from NuttX 7.30 to 10.1 RC1: Unexpected IRQ

Hello,

Thanks for the remarks.

I am using the flat (monolithic build) and I see no place that defines
this flag, at all.

I don't even see a place in the codebase that defines this flag.

I see nothing related to mm, nor anything outdated in my Make.defs,
which is from my old setup, yes, but still similar to a recent one.

Sebastien

On 26/05/2021 at 18:08, raiden00pl wrote:

If you use CONFIG_BUILD_FLAT=y, make sure that __KERNEL__ flag is set here:
https://github.com/apache/incubator-nuttx/blob/master/include/nuttx/mm/mm.h#L85
I remember that at some point I had a similar hardfault in mm which didn't
make sense, and it was due to an outdated board Make.defs.

On Wed, 26 May 2021 at 17:21, Sebastien Lorquet wrote:


Update: stack dump and register analysis are in fact pointing to a crash
in mm_alloc

I have enabled memory management debug:

mm_initialize: Heap: start=0x1000 size=65536
mm_addregion: Region 1: base=0x1154 size=65184
stm32_netinitialize: Enabling PHY power
stm32_netinitialize: PHY reset...
stm32_netinitialize: PHY reset done.
stm32_netinitialize: Configuring PHY int
F
mm_free: Freeing 0x70fb460b
irq_unexpected_isr: ERROR irq: 3
up_assert: Assertion failed at file:irq/irq_unexpectedisr.c line: 50
up_registerdump: R0: 0001 2000737c c0f2 08000101 
  200073c8
up_registerdump: R8:     
200073c8 080126ad 080126f8
up_registerdump: xPSR: 2100 PRIMASK:  CONTROL: 
up_registerdump: EXC_RETURN: fff9
up_dumpstate: sp: 200072c8
up_dumpstate: stack base: 20007078
up_dumpstate: stack size: 0400

The fact that mm_initialize only shows one region is weird... where is
the heap for the main RAM at 0x2000?

the mm_free(0x70fb460b) is not what causes the hardfault (it comes
later), but what the hell is this invalid address!

This is the first call to mm_free, here is the backtrace:

Breakpoint 1, mm_free (heap=0x200060b4 , mem=0x70fb460b) at
mm_heap/mm_free.c:85
85    if (!mem)
(gdb) bt
#0  mm_free (heap=0x200060b4 , mem=0x70fb460b) at
mm_heap/mm_free.c:85
#1  0x0801264a in mm_free_delaylist (heap=0x200060b4 ) at
mm_heap/mm_malloc.c:82
#2  0x08012672 in mm_malloc (heap=0x200060b4 , size=24) at
mm_heap/mm_malloc.c:115
#3  0x08012a32 in mm_zalloc (heap=0x200060b4 , size=24) at
mm_heap/mm_zalloc.c:45
#4  0x080123ac in zalloc (size=24) at umm_heap/umm_zalloc.c:68
#5  0x080399fa in inode_alloc (name=0x8059a78 "") at
inode/fs_inodereserve.c:78
#6  0x08039a5c in inode_root_reserve () at inode/fs_inodereserve.c:129
#7  0x080398cc in inode_initialize () at inode/fs_inode.c:92
#8  0x08039284 in fs_initialize () at fs_initialize.c:47
#9  0x08007eb4 in nx_start () at init/nx_start.c:600
#10 0x0800421e in __start () at chip/stm32_start.c:338

As previously analyzed, this happens in fs_initialize through
inode_root_reserve, so I was on the right track.

Caller shows mm_free called with that weird address:

(gdb) f 1
#1  0x0801264a in mm_free_delaylist (heap=0x200060b4 ) at
mm_heap/mm_malloc.c:82
82          mm_free(heap, address);
(gdb) list
77
78          /* The address should always be non-NULL since that was checked in the
79           * 'while' condition above.
80           */
81
82          mm_free(heap, address); <-- address == 0x70fb460b
83        }
84      #endif
85      }
86

(gdb) print _mmheap
$3 = (struct mm_heap_s *) 0x200060b4 
(gdb) print g_mmheap
$4 = {mm_impl = 0x0}

this is not good!

This is not a timing or IRQ related issue but a heap issue.

R15 = 080126f8 translates to here:


https://github.com/apache/incubator-nuttx/blob/master/mm/mm_heap/mm_malloc.c#L199

=> this free() has corrupted a badly initialized heap, and the next
malloc fails, giving a hardfault because that address is invalid.

Horrific mess!

==>

I think that my old board code does not initialize the board properly, I
probably have to check for differences between my code and the
stm32f429i-disco built-in board (on which I based my board).

Sebastien
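
As a debugging aid for this kind of failure, a range check on the pointer
before it is freed would have flagged 0x70fb460b immediately. The sketch
below is host-side C with hypothetical names (struct heap_region,
addr_in_heap) and is not part of the NuttX mm code. In the usage note, the
region start and size are assumptions; the two addresses are the ones that
appear in the gdb session above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous block of memory managed by the heap (hypothetical). */

struct heap_region
{
  uintptr_t start;
  size_t size;
};

/* Return true when 'addr' lies inside one of the 'n' heap regions. */

bool addr_in_heap(const struct heap_region *regions, size_t n,
                  uintptr_t addr)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Written as a subtraction so that start + size cannot overflow. */

      if (addr >= regions[i].start &&
          addr - regions[i].start < regions[i].size)
        {
          return true;
        }
    }

  return false;
}
```

Assuming a single region starting at 0x20000000 with a size of 0x30000 bytes,
the heap pointer 0x200060b4 passes the check and 0x70fb460b does not, so an
assertion placed before the free would stop at the first bad address instead
of at the later hardfault.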

On 25/05/2021 at 21:26, Nathan Hartman wrote:

On Tue, May 25, 2021 at 12:02 PM Sebastien Lorquet 
Back to the business

After this we managed to recompile our project using the latest NuttX
sources, but it fails when trying to init the PHY irq on our STM32F427
board: We get "unexpected IRQ".

Yes I know that's pretty vague :-)

Is there anything obvious I should have been careful with in this
domain, before I dig the jtag probe to fix it (tomorrow) ?

I would first start by looking