Same problem as https://nuttx.yahoogroups.narkive.com/3hggphCi/problem-related-semaphore-and-priority-inheritance

On 3/31/2023 2:19 PM, Xiang Xiao wrote:
When the priority inheritance is enabled on a semaphore, sem_wait will add
the current thread to the semaphore's holder table and expect that the same
thread will call sem_post later to remove it from the holder table.
If we mess this fundamental assumption by waiting/posting from different
threads, many strange things will happen. For example, let's consider
what's happen when a program send a TCP packet:

    1. The send task call sem_wait to become a holder and get IOB
    2. Network subsystem copy the user buffer into IOB and add IOB to the
    queue
    3. The send task exit and then semphare contain a dangling pointer to
    the sending tcb
    4. After network subsystem send IOB to the wire and return it  the pool,
    sem_post is called and will touch the dangling pointer

Zeng Zhaoxiu provides a patch(https://github.com/apache/nuttx/pull/5171) to
workaround this issue.
But, the semaphore holder tracking can't work as we expect anymore.

On Sat, Apr 1, 2023 at 3:52 AM Petro Karashchenko <
petro.karashche...@gmail.com> wrote:

Xiang Xiao, is that still true for the latest code in master branch?
And by "system will
crash if  the priority inheritance enabled semaphore is waited or posted
from different threads" do you mean at the point of sem_post/sem_wait or
some system instability in general?

Best regards,
Petro

On Fri, Mar 31, 2023, 10:39 PM Xiang Xiao <xiaoxiang781...@gmail.com>
wrote:

BTW, https://github.com/apache/nuttx/pull/5070 report that the system
will
crash if  the priority inheritance enabled semaphore is waited or posted
from different threads.

On Sat, Apr 1, 2023 at 3:20 AM Xiang Xiao <xiaoxiang781...@gmail.com>
wrote:


On Sat, Apr 1, 2023 at 12:34 AM Gregory Nutt <spudan...@gmail.com>
wrote:
On 3/31/2023 8:56 AM, Gregory Nutt wrote:
Even more. In my previous example if semaphore is posted from the
interrupt
we do not know which of TaskA or TaskB is no longer a "holder l"
of a
semaphore.

You are right.  In this usage case, the counting semaphore is not
being used for locking; it is being used for signalling an event per

https://cwiki.apache.org/confluence/display/NUTTX/Signaling+Semaphores+and+Priority+Inheritance
In that case, priority inheritance must be turned off.

You example is really confusing because you are mixing two different
concepts, just subtly enough to be really confusing.  If a counting
semaphore is used for locking a set of multiple resources, the posting
the semaphore does not release the resource.  That is not the way that
it is done.  Here is a more believable example of how this would work:

  1. TaskA with priority 10 allocates DMA0 channel by calling a DMA
     channel allocation function.  If a DMA channel is available, it is
     allocated and the allocation function takes the semaphore. TaskA
     then starts DMA activity.
  2. TaskA waits on another signaling semaphore for the DMA to
complete.
  3. TaskB with priority 20 does the same.
  4. TaskC with priority 30 wants to allocate a DMA channel.  It calls
     the channel allocation function which waits on the sempahore for a
     count to be available.  This boost the priority of TaskA and TaskB
     to 30.
  5. When the DMA started by TaskA completes, it signals TaskA which
     calls the resource deallocation function which increments the
     counting semaphore, giving the count to TaskC and storing the base
     priorities.


Normally, the resource(dma channel here) is allocated from one
thread/task, but may be freed in another thread/task. Please consider
how
we malloc and free memory.


The confusion arises because you are mixing the signaling logic with
the
resource deallocation logic.

The mm/iob logic works basically this way.  The logic more complex
then
you would think from above.  IOBs is an example of a critical system
resource that has multiple copies and utilizes a counting semaphore
with
priority inheritance to achieve good real time performance.   IOB
handling is key logic for the correct real time operation of the
overall
system.  Nothing we do must risk this.


IOB is a very good example to demonstrate why it's a bad and dangerous
idea to enable priority inheritance for the counting semaphore. IOB is
normally allocated in the send thread but free in the work thread. If
we
want the priority inheritance to work as expected instead of crashing
the
system, sem_wait/sem_post must come from the same thread, which is a
kind
of lock.


Other places where this logic is (probably) used:

     arch/arm/src/rp2040/rp2040_i2s.c: nxsem_init(&priv->bufsem, 0,
     CONFIG_RP2040_I2S_MAXINFLIGHT);
     arch/arm/src/rtl8720c/amebaz_depend.c:  if (sem_init(_sema, 0,
     init_val))
     arch/arm/src/sama5/sam_ssc.c: nxsem_init(&priv->bufsem, 0,
     CONFIG_SAMA5_SSC_MAXINFLIGHT);
     arch/arm/src/samv7/sam_ssc.c: nxsem_init(&priv->bufsem, 0,
     CONFIG_SAMV7_SSC_MAXINFLIGHT);
     arch/arm/src/stm32/stm32_i2s.c: nxsem_init(&priv->bufsem, 0,
     CONFIG_STM32_I2S_MAXINFLIGHT);
     drivers/can/mcp2515.c: nxsem_init(&priv->txfsem, 0,
     MCP2515_NUM_TX_BUFFERS);
     drivers/video/vnc/vnc_server.c: nxsem_init(&session->freesem, 0,
     CONFIG_VNCSERVER_NUPDATES);
     sched/pthread/pthread_completejoin.c: nxsem_init(&pjoin->data_sem,
     0, (ntasks_waiting + 1));
     wireless/bluetooth/bt_hcicore.c: nxsem_init(&g_btdev.le_pkts_sem,
0,
     g_btdev.le_pkts);
     wireless/bluetooth/bt_hcicore.c: nxsem_init(&g_btdev.ncmd_sem, 0,
1);
     wireless/ieee802154/mac802154.c: nxsem_init(&priv->txdesc_sem, 0,
     CONFIG_MAC802154_NTXDESC);
     wireless/ieee802154/mac802154.c: nxsem_init(&mac->opsem, 0, 1);

Maybe:

     arch/risc-v/src/bl602/bl602_os_hal.c: ret = nxsem_init(sem, 0,
init);
     arch/risc-v/src/esp32c3/esp32c3_ble_adapter.c: ret =
     sem_init(&bt_sem->sem, 0, init);
     arch/risc-v/src/esp32c3/esp32c3_wifi_adapter.c: ret =
     nxsem_init(sem, 0, init);
     arch/xtensa/src/esp32/esp32_ble_adapter.c: ret = sem_init(sem, 0,
init);
     arch/xtensa/src/esp32/esp32_wifi_adapter.c: ret = nxsem_init(sem,
0,
     init);
     arch/xtensa/src/esp32s3/esp32s3_wifi_adapter.c: ret =
     nxsem_init(sem, 0, init);





Reply via email to