From: EXT Ola Liljedahl [mailto:[email protected]]
Sent: Tuesday, October 20, 2015 11:49 AM
To: Savolainen, Petri (Nokia - FI/Espoo)
Cc: LNG ODP Mailman List
Subject: Re: [lng-odp] [API-NEXT PATCH 5/6] api: atomic: added cas operations
On 16 October 2015 at 09:56, Savolainen, Petri (Nokia - FI/Espoo) <[email protected]> wrote:

From: EXT Ola Liljedahl [mailto:[email protected]]
Sent: Thursday, October 15, 2015 11:00 PM
To: Savolainen, Petri (Nokia - FI/Espoo)
Cc: LNG ODP Mailman List
Subject: Re: [lng-odp] [API-NEXT PATCH 5/6] api: atomic: added cas operations

On 15 October 2015 at 10:45, Petri Savolainen <[email protected]> wrote:

Added cas operations for 32 and 64 bit atomic variables. These use relaxed memory order (as all other operations).

Do you have actual use cases for CAS where *only* relaxed memory order is required? Or you need CAS with acquire ordering per the other patch? But you don't need CAS with release ordering?

CAS with relaxed order is for implementing e.g. min / max of a counter. User inc / dec a count and update atomic variables min/max with CAS. This is what Barry needed in one of the calls.

I don't know exactly how this is used by Barry but I think we should investigate the need for some new atomic operations. ARMv8.1 introduces atomic min and max operations that might be relevant in this context. It would be good if these could be utilised by the TM implementation (if indeed this is the function we are looking for).

http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development

The atomic instructions can be used as an alternative to Load-exclusive/Store-exclusive instructions, for example to ease the implementation of atomic memory updates in very large systems. This could be in a closely coupled cache, sometimes referred to as near atomics, or further out in the memory system as far atomics. The instructions provide atomic update of register content with memory for a range of conditions:

· Compare and swap of 8-, 16-, 32-, 64- or a pair of 32- or 64-bit registers as a conditional update of a value in memory.
· ADD, BitClear, ExclusiveOR, BitSet, signed and unsigned MAXimum or MINimum value data processing operations on 8-, 16-, 32- or 64-bit values in memory. These can occur with or without copying the original value in memory to a register.
· Swap of an 8-, 16-, 32- or 64-bit value between a register and value in memory.
· The instructions also include controls associated with influencing the order properties, based on acquire and release semantics.

I think the use case is this. A number of threads increment and decrement a counter (fetch_add, fetch_sub) and compare the returned value to an atomic max variable. If the new count is larger than max, the thread tries to update the max. I don't know whether he needs to know if the new value was actually a new max.

These could be added for simple min/max …

void odp_atomic_max_u32(odp_atomic_u32_t *atom, uint32_t new_max);
void odp_atomic_min_u32(odp_atomic_u32_t *atom, uint32_t new_min);

… and these for fetch and min/max.

uint32_t odp_atomic_fetch_max_u32(odp_atomic_u32_t *atom, uint32_t new_max);
uint32_t odp_atomic_fetch_min_u32(odp_atomic_u32_t *atom, uint32_t new_min);

I guess relaxed CAS is still needed for more advanced algorithms, like updating an average of multiple counters.

-Petri
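
For illustration, here is a minimal sketch of the max-tracking use case described above. It assumes plain C11 <stdatomic.h> rather than the ODP atomic API, so that it stands alone. The helper names fetch_max_u32 and counter_inc_track_max are hypothetical and not part of any patch. All operations use relaxed ordering, since only the counter and max values themselves need to be consistent.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: raise *max to new_val if new_val is larger.
 * Returns the value of *max observed before any update, so the caller
 * can tell whether new_val was actually a new maximum. */
static uint32_t fetch_max_u32(_Atomic uint32_t *max, uint32_t new_val)
{
    uint32_t old = atomic_load_explicit(max, memory_order_relaxed);

    /* A failed CAS writes the current value of *max back into 'old',
     * so each retry re-checks whether new_val is still larger. */
    while (new_val > old &&
           !atomic_compare_exchange_weak_explicit(max, &old, new_val,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;

    return old;
}

/* Hypothetical usage: increment a shared counter and keep a running max. */
static uint32_t counter_inc_track_max(_Atomic uint32_t *count,
                                      _Atomic uint32_t *max)
{
    /* fetch_add returns the old value; add 1 to get the new count. */
    uint32_t now = atomic_fetch_add_explicit(count, 1,
                                             memory_order_relaxed) + 1;

    fetch_max_u32(max, now);
    return now;
}
```

On hardware with a native atomic max instruction, the CAS loop in fetch_max_u32 could presumably be replaced by that single instruction, which is the motivation for a dedicated API call instead of open-coding the loop.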
_______________________________________________ lng-odp mailing list [email protected] https://lists.linaro.org/mailman/listinfo/lng-odp
