On Tue, Oct 8, 2019 at 1:10 PM <i...@apache.org> wrote:
>
> Author: ivan
> Date: Tue Oct  8 11:10:32 2019
> New Revision: 1868129
>
> URL: http://svn.apache.org/viewvc?rev=1868129&view=rev
> Log:
> apr_atomic_read64(): Fix non-atomic read on 32-bit Windows.
[]
> --- apr/apr/trunk/atomic/win32/apr_atomic64.c (original)
> +++ apr/apr/trunk/atomic/win32/apr_atomic64.c Tue Oct  8 11:10:32 2019
> @@ -51,7 +51,9 @@ APR_DECLARE(void) apr_atomic_set64(volat
>
>  APR_DECLARE(apr_uint64_t) apr_atomic_read64(volatile apr_uint64_t *mem)
>  {
> -    return *mem;
> +    /* 64-bit read is not atomic on 32-bit platform: use 
> InterlockedCompareExchange
> +       to perform atomic read. */
> +    return InterlockedCompareExchange64((volatile LONG64 *)mem, 0, 0);
>  }

Shouldn't we make this #ifdef'd on WIN32/WIN64? Despite the win32
dirname, I think both Windows(es) share the same atomic/ code (same
applies to apr_atomic_set64() I think, where *mem = val should be fine
on WIN64).

Then I don't know if we care enough about WIN32, but if so the
InterlockedCompareExchange64() call here is maybe a bit of a hammer
for usual apr_atomic_read64() usage. For things like lightly check for
the current 64bit shared value and do the heavy CAS work for some
special/unlikely condition only, this change is going to kill
performances. Since we provide _cas64() already, I fell that
read64/set64() should always be lightweight (when/if possible).

I know that [2] says "Simple reads and writes to properly aligned
64-bit variables are atomic on 64-bit Windows. Reads and writes to
64-bit values are not guaranteed to be atomic on 32-bit Windows.", but
it looks like in practice for CPUs that Windows cares about, an
aligned 8-bytes (eg. our apr_uint64_t type) load/store can always be
atomic. The load/store are likely compiled using x87 registers or xmm
or sse or at worse lock cmpxchg8b (i.e. the above), and supposedly
some 64bit or more registers available on ARM too.

I'm no Windows expert, but wonder if we could use inline
ReadAcquire64/WriteRelease64() functions from winnt.h/wdm.h. They seem
to pair with Interlocked* builtins (same API), and have the semantics
we want. I found them by naïvely (and unsuccessfully) searching for
InterlockedRead/InterlockedWrite() functions, and falling into the
discussion in [1]. It seems that there is no such functions in the
windows API because .. there are not needed (well, IMHO, if atomic
load64/store64 should be implemented using cas64, they look needed to
me..).
This seems to be confirmed by the "winnt.h" file I downloaded from the
net, for the CPUs handled (and cared about) there,
ReadAcquire64/WriteRelease64() are implemented using volatile access
(i.e. *(volatile LONG64*)mem [= val]), plus a barrier for arm(s?), so
relying on the native acquire/release semantics from the CPU/cache.

So possibly r1868129 is not necessary, or if
ReadAcquire64/WriteRelease64() can be used easily could we do that
instead? (having a look at the WIN32 assembly code for
apr_atomic_read64/set64() using ReadAcquire64/WriteRelease64() could
be instructive, though)

Regards,
Yann.

[1] https://community.osr.com/discussion/comment/254905/#Comment_254905
[2] 
https://docs.microsoft.com/en-us/windows/win32/sync/interlocked-variable-access

Reply via email to