On Thu, Aug 28, 2014 at 4:39 PM, Jarno Rajahalme <[email protected]> wrote:
>
> On Aug 28, 2014, at 9:57 AM, Gurucharan Shetty <[email protected]> wrote:
>> diff --git a/lib/ovs-atomic-msvc.h b/lib/ovs-atomic-msvc.h
>> new file mode 100644
>> index 0000000..f357545
>> --- /dev/null
>> +++ b/lib/ovs-atomic-msvc.h
>> @@ -0,0 +1,370 @@
>> +/*
>> + * Copyright (c) 2014 Nicira, Inc.
>> + *
>> + * Licensed under the Apache License, Version 2.0 (the "License");
>> + * you may not use this file except in compliance with the License.
>> + * You may obtain a copy of the License at:
>> + *
>> + * http://www.apache.org/licenses/LICENSE-2.0
>> + *
>> + * Unless required by applicable law or agreed to in writing, software
>> + * distributed under the License is distributed on an "AS IS" BASIS,
>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> + * See the License for the specific language governing permissions and
>> + * limitations under the License.
>> + */
>> +
>> +/* This header implements atomic operation primitives using MSVC
>> +intrinsics. */
>> +#ifndef IN_OVS_ATOMIC_H
>> +#error "This header should only be included indirectly via ovs-atomic.h."
>> +#endif
>> +
>> +/* From MSDN documentation: With Visual Studio 2003, volatile to volatile
>> +references are ordered; the compiler will not re-order volatile variable
>> +access. With Visual Studio 2005, the compiler also uses acquire semantics
>> +for read operations on volatile variables and release semantics for write
>> +operations on volatile variables (when supported by the CPU). */
>
> Is this still the same for MSVC 2012 or 2013?
There is no explicit documentation stating that, but looking at MSVC's
C++11 atomic implementation, it seems clear that this is still the
case. I will add an explicit comment about it.
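As an illustration (not part of the patch; producer()/consumer() are
made-up names), my understanding is that under the default /volatile:ms
mode in VS2012/2013 on x86, a volatile write is a release store and a
volatile read is an acquire load, so a flag-and-data handoff is ordered
without any explicit fence:

#include <stdint.h>

static volatile int32_t data;
static volatile int32_t ready;

void
producer(void)
{
    data = 42;    /* Must become visible no later than 'ready'. */
    ready = 1;    /* Release store. */
}

int32_t
consumer(void)
{
    if (ready) {        /* Acquire load. */
        return data;    /* Guaranteed to observe 42. */
    }
    return -1;          /* Not published yet. */
}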
>> +
>> +static inline void
>> +atomic_compiler_barrier(memory_order order OVS_UNUSED)
>> +{
>> + _ReadWriteBarrier();
>> +}
>
> You could do “if (order > memory_order_consume) {“ to avoid unnecessary
> barriers.
Okay
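Something like this, then (just a sketch of what I plan for v2; note
that 'order' is now used, so OVS_UNUSED goes away):

static inline void
atomic_compiler_barrier(memory_order order)
{
    /* Orders weaker than consume need no compiler barrier. */
    if (order > memory_order_consume) {
        _ReadWriteBarrier();
    }
}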
>
>> +
>> +static inline void
>> +atomic_thread_fence(memory_order order)
>> +{
>> + if (order == memory_order_seq_cst) {
>> + MemoryBarrier();
>> + }
>> +}
>
> This needs an “else { atomic_compiler_barrier(order); }"
Yes. I will fix it up.
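i.e. roughly (sketch, not the final patch):

static inline void
atomic_thread_fence(memory_order order)
{
    if (order == memory_order_seq_cst) {
        MemoryBarrier();                   /* Full hardware fence. */
    } else {
        atomic_compiler_barrier(order);    /* Compiler fence suffices. */
    }
}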
>
>> +
>> +static inline void
>> +atomic_signal_fence(memory_order order)
>> +{
>> + atomic_compiler_barrier(order);
>> +}
>> +
>> +/* 1-, 2- and 4-byte loads and stores are atomic on aligned memory. In
>> +addition, since the compiler automatically takes acquire and release
>> +semantics on volatile variables, for any order weaker than
>> +'memory_order_seq_cst', we can directly assign or read values. */
>> +
>> +#define atomic_store32(DST, SRC, ORDER) \
>> + if (ORDER == memory_order_seq_cst) { \
>> + InterlockedExchange((int32_t volatile *) DST, \
>> + (int32_t ) SRC); \
>> + } else { \
>> + *DST = SRC; \
>> + }
>> +
>> +/* 64 bit reads and writes are not atomic on x86.
>
> Just noticed this:
>
> “8.1.1 Guaranteed Atomic Operations
>
> The Intel486 processor (and newer processors since) guarantees that the
> following
> basic memory operations will always be carried out atomically:
> - Reading or writing a byte
> - Reading or writing a word aligned on a 16-bit boundary
> - Reading or writing a doubleword aligned on a 32-bit boundary
>
> The Pentium processor (and newer processors since) guarantees that the
> following
> additional memory operations will always be carried out atomically:
> - Reading or writing a quadword aligned on a 64-bit boundary
> - 16-bit accesses to uncached memory locations that fit within a 32-bit data
> bus
>
> The P6 family processors (and newer processors since) guarantee that the
> following
> additional memory operation will always be carried out atomically:
> - Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within a
> cache line”
>
> So, it might be worth it to limit the support to i586 and newer, and check
> alignment at run time to avoid the lock on 64-bit loads and stores. Or it
> may be that InterlockedExchange64() already does that?
Okay. I will include this file only for i586 and newer; it is unlikely
that anyone still uses machines older than that, so it shouldn't be a
problem. I also see with sample programs that 64-bit variables are
64-bit aligned even in 32-bit builds, but to make sure that is always
the case, I will add an abort() if a variable is not 64-bit aligned.
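The 64-bit read would then look roughly like this (a sketch; the exact
macro may differ in v2):

#define atomic_read64(SRC, DST, ORDER)                       \
    if (((uintptr_t) (SRC)) % 8 == 0) {                      \
        /* Aligned quadword access is atomic on i586+. */    \
        atomic_compiler_barrier(ORDER);                      \
        *(DST) = *(SRC);                                     \
        atomic_compiler_barrier(ORDER);                      \
    } else {                                                 \
        abort();  /* Misaligned 64-bit atomic variable. */   \
    }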
>> +
>> +#define atomic_read32(SRC, DST, ORDER) \
>> + if (ORDER == memory_order_seq_cst) { \
>> + *DST = InterlockedOr((int32_t volatile *) SRC, 0); \
>> + } else { \
>> + *DST = *SRC; \
>> + }
>> +
>
> On x86, since seq_cst stores are locked, the corresponding reads don’t need
> to be:
>
> “Locked operations are atomic with respect to all other memory operations and
> all externally visible events. Only instruction fetch and page table accesses
> can pass locked instructions. Locked instructions can be used to synchronize
> data written by one processor and read by another processor. For the P6
> family processors, locked operations serialize all outstanding load and store
> operations (that is, wait for them to complete). This rule is also true for
> the Pentium 4 and Intel Xeon processors, with one exception. Load operations
> that reference weakly ordered memory types (such as the WC memory type) may
> not be serialized."
Okay. I will use this.
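So on x86 the read side can become a plain (volatile) access for every
memory order, e.g. (sketch; the 8- and 16-bit atomic_readX variants
below simplify the same way):

#define atomic_read32(SRC, DST, ORDER) \
    *(DST) = *(SRC);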
>
>> +/* 64 bit reads and writes are not atomic on x86. */
>> +#define atomic_read64(SRC, DST, ORDER) \
>> + *DST = InterlockedOr64((int64_t volatile *) SRC, 0);
>> +
>
> On i586 this could be just an assignment, if SRC is quad-word (64-bit) aligned.
Okay. I changed it.
>
>> +/* For 8 and 16 bit variations. */
>> +#define atomic_readX(X, SRC, DST, ORDER) \
>> + if (ORDER == memory_order_seq_cst) { \
>> + *DST = InterlockedOr##X((int##X##_t volatile *) SRC, 0); \
>> + } else { \
>> + *DST = *SRC; \
>> + }
>> +
>
> Same here.
>
Okay.
Thanks for the review. I will send in a v2.