https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96327

            Bug ID: 96327
           Summary: Inefficient increment through pointer to volatile on
                    x86
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: paulmckrcu at gmail dot com
  Target Milestone: ---

Although code generation for increment and decrement (++, --) through a
pointer to volatile has improved greatly over the past 15 years, there is
still a case in which the address calculation is needlessly done as a separate
step instead of being folded into the x86 increment instruction itself.  Here
is some example code:

struct task {
    int other;
    int rcu_count;
};

struct task *current;

void rcu_read_lock()
{
    (*(volatile int*)&current->rcu_count)++;
}

As can be seen on godbolt.org (https://godbolt.org/z/fGze8E), GCC splits the
address calculation from the increment. The shorter code sequence generated by
clang/LLVM is preferable.

Fixing this would allow the Linux kernel to use safer code sequences for
certain fastpaths; in this example, rcu_read_lock() and rcu_read_unlock() in
kernels built with CONFIG_PREEMPT=y.
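
For reference, here is a minimal standalone sketch of the lock/unlock pair
mentioned above. It is not the kernel's actual implementation. The names
init_task and rcu_read_lock_explicit() are hypothetical and exist only so that
the example can run outside the kernel; rcu_read_unlock() is assumed to be the
mirror-image decrement. The explicit variant spells out the same operation as
one volatile load and one volatile store, which is the semantics a compiler
has to preserve however it forms the address.

```c
struct task {
    int other;
    int rcu_count;
};

/* Assumption: a statically allocated task so the sketch runs standalone. */
static struct task init_task;
struct task *current = &init_task;

/* Increment through a pointer to volatile, as in the report. */
void rcu_read_lock(void)
{
    (*(volatile int *)&current->rcu_count)++;
}

/* Assumed counterpart: the matching decrement. */
void rcu_read_unlock(void)
{
    (*(volatile int *)&current->rcu_count)--;
}

/*
 * Hypothetical helper with the same effect, written out step by step:
 * form the address, do one volatile load, add one, do one volatile store.
 */
void rcu_read_lock_explicit(void)
{
    volatile int *p = &current->rcu_count;
    int v = *p;

    *p = v + 1;
}
```

Compiling both increment variants at the same optimization level and comparing
the output is one way to check whether the address of rcu_count ends up in the
memory operand of the increment or in a separate instruction.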
