On 19 April 2023 09:06:28 CEST, Lipeng Zhu via Fortran <[email protected]>
wrote:
>This patch try to introduce the rwlock and split the read/write to
>unit_root tree and unit_cache with rwlock instead of the mutex to
>increase CPU efficiency. In the get_gfc_unit function, the percentage
>to step into the insert_unit function is around 30%, in most instances,
>we can get the unit in the phase of reading the unit_cache or unit_root
>tree. So split the read/write phase by rwlock would be an approach to
>make it more parallel.
>
>BTW, the IPC metrics can gain around 9x in our test
>server with 220 cores. The benchmark we used is
>https://github.com/rwesson/NEAT
>
>+#define RD_TO_WRLOCK(rwlock) \
>+ RWUNLOCK (rwlock);\
>+ WRLOCK (rwlock);
>+#endif
>+
>diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
>index 82664dc5f98..4312c5f36de 100644
>--- a/libgfortran/io/unit.c
>+++ b/libgfortran/io/unit.c
>@@ -329,7 +335,7 @@ get_gfc_unit (int n, int do_create)
> int c, created = 0;
>
> NOTE ("Unit n=%d, do_create = %d", n, do_create);
>- LOCK (&unit_lock);
>+ RDLOCK (&unit_rwlock);
>
> retry:
> for (c = 0; c < CACHE_SIZE; c++)
>@@ -350,6 +356,7 @@ retry:
> if (c == 0)
> break;
> }
>+ RD_TO_WRLOCK (&unit_rwlock);
So I'm trying to convince myself why it's safe to unlock and only then take the
write lock.
Can you please elaborate/confirm why that's ok?
I wouldn't mind a comment like
We can release the unit and cache read lock now. We might have to allocate a
(locked) unit, below in do_create.
Either way, we will manipulate the cache and/or the unit list so we have to
take a write lock now.
We don't take the write bit in *addition* to the read lock because..
(that needlessly complicates releasing the respective locks / it triggers too
much contention when we.. / ...?)
thanks,
>
> if (p == NULL && do_create)
> {
>@@ -368,8 +375,8 @@ retry:
> if (created)
> {
> /* Newly created units have their lock held already
>- from insert_unit. Just unlock UNIT_LOCK and return. */
>- UNLOCK (&unit_lock);
>+ from insert_unit. Just unlock UNIT_RWLOCK and return. */
>+ RWUNLOCK (&unit_rwlock);
> return p;
> }
>
>@@ -380,7 +387,7 @@ found:
> if (! TRYLOCK (&p->lock))
> {
> /* assert (p->closed == 0); */
>- UNLOCK (&unit_lock);
>+ RWUNLOCK (&unit_rwlock);
> return p;
> }
>
>@@ -388,14 +395,14 @@ found:
> }
>
>
>- UNLOCK (&unit_lock);
>+ RWUNLOCK (&unit_rwlock);
>
> if (p != NULL && (p->child_dtio == 0))
> {
> LOCK (&p->lock);
> if (p->closed)
> {
>- LOCK (&unit_lock);
>+ WRLOCK (&unit_rwlock);
> UNLOCK (&p->lock);
> if (predec_waiting_locked (p) == 0)
> destroy_unit_mutex (p);
>@@ -593,8 +600,8 @@ init_units (void)
> #endif
> #endif
>
>-#ifndef __GTHREAD_MUTEX_INIT
>- __GTHREAD_MUTEX_INIT_FUNCTION (&unit_lock);
>+#if (!defined(__GTHREAD_RWLOCK_INIT) && !defined(__GTHREAD_MUTEX_INIT))
>+ __GTHREAD_MUTEX_INIT_FUNCTION (&unit_rwlock);
> #endif
>
> if (sizeof (max_offset) == 8)