When the priority inheritance is enabled on a semaphore, sem_wait will add
the current thread to the semaphore's holder table and expect that the same
thread will call sem_post later to remove it from the holder table.
If we mess this fundamental assumption by waiting/posting from different
threads, many strange things will happen. For example, let's consider
what's happen when a program send a TCP packet:

   1. The send task call sem_wait to become a holder and get IOB
   2. Network subsystem copy the user buffer into IOB and add IOB to the
   queue
   3. The send task exit and then semphare contain a dangling pointer to
   the sending tcb
   4. After network subsystem send IOB to the wire and return it  the pool,
   sem_post is called and will touch the dangling pointer

Zeng Zhaoxiu provides a patch(https://github.com/apache/nuttx/pull/5171) to
workaround this issue.
But, the semaphore holder tracking can't work as we expect anymore.

On Sat, Apr 1, 2023 at 3:52 AM Petro Karashchenko <
petro.karashche...@gmail.com> wrote:

> Xiang Xiao, is that still true for the latest code in master branch?
> And by "system will
> crash if  the priority inheritance enabled semaphore is waited or posted
> from different threads" do you mean at the point of sem_post/sem_wait or
> some system instability in general?
>
> Best regards,
> Petro
>
> On Fri, Mar 31, 2023, 10:39 PM Xiang Xiao <xiaoxiang781...@gmail.com>
> wrote:
>
> > BTW, https://github.com/apache/nuttx/pull/5070 report that the system
> will
> > crash if  the priority inheritance enabled semaphore is waited or posted
> > from different threads.
> >
> > On Sat, Apr 1, 2023 at 3:20 AM Xiang Xiao <xiaoxiang781...@gmail.com>
> > wrote:
> >
> > >
> > >
> > > On Sat, Apr 1, 2023 at 12:34 AM Gregory Nutt <spudan...@gmail.com>
> > wrote:
> > >
> > >> On 3/31/2023 8:56 AM, Gregory Nutt wrote:
> > >> >
> > >> >> Even more. In my previous example if semaphore is posted from the
> > >> >> interrupt
> > >> >> we do not know which of TaskA or TaskB is no longer a "holder l"
> of a
> > >> >> semaphore.
> > >> >>
> > >> > You are right.  In this usage case, the counting semaphore is not
> > >> > being used for locking; it is being used for signalling an event per
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/NUTTX/Signaling+Semaphores+and+Priority+Inheritance
> > >> >
> > >> > In that case, priority inheritance must be turned off.
> > >> >
> > >> You example is really confusing because you are mixing two different
> > >> concepts, just subtly enough to be really confusing.  If a counting
> > >> semaphore is used for locking a set of multiple resources, the posting
> > >> the semaphore does not release the resource.  That is not the way that
> > >> it is done.  Here is a more believable example of how this would work:
> > >>
> > >>  1. TaskA with priority 10 allocates DMA0 channel by calling a DMA
> > >>     channel allocation function.  If a DMA channel is available, it is
> > >>     allocated and the allocation function takes the semaphore. TaskA
> > >>     then starts DMA activity.
> > >>  2. TaskA waits on another signaling semaphore for the DMA to
> complete.
> > >>  3. TaskB with priority 20 does the same.
> > >>  4. TaskC with priority 30 wants to allocate a DMA channel.  It calls
> > >>     the channel allocation function which waits on the sempahore for a
> > >>     count to be available.  This boost the priority of TaskA and TaskB
> > >>     to 30.
> > >>  5. When the DMA started by TaskA completes, it signals TaskA which
> > >>     calls the resource deallocation function which increments the
> > >>     counting semaphore, giving the count to TaskC and storing the base
> > >>     priorities.
> > >>
> > >>
> > >
> > > Normally, the resource(dma channel here) is allocated from one
> > > thread/task, but may be freed in another thread/task. Please consider
> how
> > > we malloc and free memory.
> > >
> > >
> > >> The confusion arises because you are mixing the signaling logic with
> the
> > >> resource deallocation logic.
> > >>
> > >> The mm/iob logic works basically this way.  The logic more complex
> then
> > >> you would think from above.  IOBs is an example of a critical system
> > >> resource that has multiple copies and utilizes a counting semaphore
> with
> > >> priority inheritance to achieve good real time performance.   IOB
> > >> handling is key logic for the correct real time operation of the
> overall
> > >> system.  Nothing we do must risk this.
> > >>
> > >>
> > > IOB is a very good example to demonstrate why it's a bad and dangerous
> > > idea to enable priority inheritance for the counting semaphore. IOB is
> > > normally allocated in the send thread but free in the work thread. If
> we
> > > want the priority inheritance to work as expected instead of crashing
> the
> > > system, sem_wait/sem_post must come from the same thread, which is a
> kind
> > > of lock.
> > >
> > >
> > >> Other places where this logic is (probably) used:
> > >>
> > >>     arch/arm/src/rp2040/rp2040_i2s.c: nxsem_init(&priv->bufsem, 0,
> > >>     CONFIG_RP2040_I2S_MAXINFLIGHT);
> > >>     arch/arm/src/rtl8720c/amebaz_depend.c:  if (sem_init(_sema, 0,
> > >>     init_val))
> > >>     arch/arm/src/sama5/sam_ssc.c: nxsem_init(&priv->bufsem, 0,
> > >>     CONFIG_SAMA5_SSC_MAXINFLIGHT);
> > >>     arch/arm/src/samv7/sam_ssc.c: nxsem_init(&priv->bufsem, 0,
> > >>     CONFIG_SAMV7_SSC_MAXINFLIGHT);
> > >>     arch/arm/src/stm32/stm32_i2s.c: nxsem_init(&priv->bufsem, 0,
> > >>     CONFIG_STM32_I2S_MAXINFLIGHT);
> > >>     drivers/can/mcp2515.c: nxsem_init(&priv->txfsem, 0,
> > >>     MCP2515_NUM_TX_BUFFERS);
> > >>     drivers/video/vnc/vnc_server.c: nxsem_init(&session->freesem, 0,
> > >>     CONFIG_VNCSERVER_NUPDATES);
> > >>     sched/pthread/pthread_completejoin.c: nxsem_init(&pjoin->data_sem,
> > >>     0, (ntasks_waiting + 1));
> > >>     wireless/bluetooth/bt_hcicore.c: nxsem_init(&g_btdev.le_pkts_sem,
> 0,
> > >>     g_btdev.le_pkts);
> > >>     wireless/bluetooth/bt_hcicore.c: nxsem_init(&g_btdev.ncmd_sem, 0,
> > 1);
> > >>     wireless/ieee802154/mac802154.c: nxsem_init(&priv->txdesc_sem, 0,
> > >>     CONFIG_MAC802154_NTXDESC);
> > >>     wireless/ieee802154/mac802154.c: nxsem_init(&mac->opsem, 0, 1);
> > >>
> > >> Maybe:
> > >>
> > >>     arch/risc-v/src/bl602/bl602_os_hal.c: ret = nxsem_init(sem, 0,
> > init);
> > >>     arch/risc-v/src/esp32c3/esp32c3_ble_adapter.c: ret =
> > >>     sem_init(&bt_sem->sem, 0, init);
> > >>     arch/risc-v/src/esp32c3/esp32c3_wifi_adapter.c: ret =
> > >>     nxsem_init(sem, 0, init);
> > >>     arch/xtensa/src/esp32/esp32_ble_adapter.c: ret = sem_init(sem, 0,
> > >> init);
> > >>     arch/xtensa/src/esp32/esp32_wifi_adapter.c: ret = nxsem_init(sem,
> 0,
> > >>     init);
> > >>     arch/xtensa/src/esp32s3/esp32s3_wifi_adapter.c: ret =
> > >>     nxsem_init(sem, 0, init);
> > >>
> > >>
> > >
> > >
> > >
> >
>

Reply via email to