Re: [lng-odp] [PATCHv3] configure.ac check for atomic operations support

Bill Fischofer Wed, 24 Dec 2014 07:06:58 -0800

I suspect the long-term answer will be to use the autotools to construct
the actual include hierarchy (and possibly the files themelves) needed for
a given --with-platform specification.  That would mean the canonical ODP
API files may need to be meta-files that are used as input to ./configure
to generate the actual ones used.


On Wed, Dec 24, 2014 at 9:03 AM, Bill Fischofer <[email protected]>
wrote:

> That makes sense.  The only problem is GCC's problems with forward
> declarations and inlining.  IMO a design issue with the compiler, but it is
> what it is.  But we do need a general solution to this problem in 2015
> since proper inlining support will be critical for production-grade
> implementations.
>
> On Wed, Dec 24, 2014 at 8:59 AM, Mike Holmes <[email protected]>
> wrote:
>
>>
>>
>> On 24 December 2014 at 06:19, Maxim Uvarov <[email protected]>
>> wrote:
>>
>>> On 12/23/2014 10:40 PM, Mike Holmes wrote:
>>>
>>>>
>>>>
>>>> On 23 December 2014 at 13:49, Bill Fischofer <[email protected]
>>>> <mailto:[email protected]>> wrote:
>>>>
>>>>     That's good to know and certainly the code is cleaner that way.  I
>>>>     think we should adopt the practice that any architecture-specific
>>>>     changes to linux-generic code (adds, updates, or removals) must
>>>>     either originate from or have a Reviewed-by a representative of
>>>>     that architecture (in this case, Bala).  That way there's no
>>>>     confusion as to whether the known experts agree with the change,
>>>>     since these sort of things can have subtle implications that are
>>>>     not obvious to the layman.
>>>>
>>>>
>>>> I think linux-generic should be devoid of any per core optimizations in
>>>> its default state.
>>>> There could be in an arch directory and it would be supported as you
>>>> say Bill by someone who will care for that that variant, this also avoids
>>>> filling linux-generic with a possibly large list of preprocessor flags that
>>>> we can't test.
>>>>
>>>> In this case we don't have public access to the octeon to even know if
>>>> this code works so separating or deletion feels right to me, although
>>>> Cavium are adding their board in this case.
>>>>
>>>>  To have optimizations we should understand which per arch optimization
>>> we need. I see that if newer gcc supports good optimizations then we need
>>> to use that gcc. Most of people use Open Embedded and Linaro gcc which is
>>> almost latest version. So if there is no optimization in standard
>>> environment (gcc, kernel, libc) then arch implementation should go to
>>> platform specific git variant. If it's covered by standard environment then
>>> we simple updating minimal version requirement for required component.
>>>
>>>
>> Lets take an example ODP API file
>>
>> static inline void odp_sync_stores(void)
>> {
>> #if defined __x86_64__ || defined __i386__
>>
>> >-------__asm__  __volatile__ ("sfence\n" : : : "memory");
>>
>> #elif defined(__arm__)
>> #if __ARM_ARCH == 6
>> >-------__asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 5" \
>> >------->------->-------: : "r" (0) : "memory");
>> #elif __ARM_ARCH >= 7 || defined __aarch64__
>>
>> >-------__asm__ __volatile__ ("dmb st" : : : "memory");
>> #else
>> >-------__asm__ __volatile__ ("" : : : "memory");
>> #endif
>>
>> #elif defined __OCTEON__
>>
>> >-------__asm__  __volatile__ ("syncws\n" : : : "memory");
>>
>> #else
>> >-------__sync_synchronize();
>> #endif
>>
>>
>> How do we accommodate this without resorting to architecture specific
>> processor flags / assembly?
>>
>> I would prefer to see platform/linux-generic/include/api/odp_sync.h
>> changed to
>>
>> platform/linux-generic/include/api/x86/odp_sync.h
>> platform/linux-generic/include/api/arm/odp_sync.h
>> platform/linux-generic/include/api/octeon/odp_sync.h
>>
>> This assumes that there is currently no pure c code that can be used that
>> is acceptable on all three platforms.
>>
>> Or since linux-generic is not a performance platform but rather a
>> reference whenever we need to make a trade off between simplicity and
>> performance, we alter this to be a regular function and move to:-
>>
>> platform/linux-generic/include/api/odp_sync.h <- now a pure forward
>> declaration of the API with doxygen documentation
>> and implementations correctly broken down by arch
>>
>> arch/x86/odp_sync.h
>> arch/arm/odp_sync.h
>> arch/octeon/odp_sync.h
>>
>>
>>
>>> Maxim.
>>>
>>>
>>>
>>>>     On Tue, Dec 23, 2014 at 12:38 PM, Ola Liljedahl
>>>>     <[email protected] <mailto:[email protected]>> wrote:
>>>>
>>>>         I had some discussions with Bala and we think the GCC
>>>>         generates the
>>>>         appropriate code for OCTEON so there should be no need for any
>>>>         OCTEON-specific inline assembler in odp_atomic.h. I just
>>>>         didn't want
>>>>         to remove this piece of code when I redesigned the atomic
>>>> support
>>>>         because I didn't understand why there is a SYNCW instruction
>>>>         *before*
>>>>         the load. Load-acquire type of loads may need sync or barrier
>>>>         instructions but they come after the load, not before.
>>>>
>>>>         I think we should trust GCC to generate better code for OCTEON
>>>>         than us
>>>>         mere humans, at least until proven wrong. So please remove this
>>>>         OCTEON-specific snippet.
>>>>
>>>>         -- Ola
>>>>
>>>>
>>>>         On 23 December 2014 at 18:01, Mike Holmes
>>>>         <[email protected] <mailto:[email protected]>> wrote:
>>>>         >
>>>>         >
>>>>         > On 23 December 2014 at 11:15, Taras Kondratiuk
>>>>         <[email protected] <mailto:taras.kondratiuk@
>>>> linaro.org>>
>>>>
>>>>         > wrote:
>>>>         >>
>>>>         >> On 12/23/2014 05:28 PM, Mike Holmes wrote:
>>>>         >> > how will this work, Octeon does not care about
>>>>          __atomic_fetch_add(&v,
>>>>         >> > 1, __ATOMIC_RELAXED); and yet you will fail configure I
>>>>         assume
>>>>         >> >
>>>>         >> > static inline uint32_t
>>>>         odp_atomic_fetch_inc_u32(odp_atomic_u32_t *atom)
>>>>         >> > {
>>>>         >> > #if defined __OCTEON__
>>>>         >> >>-------uint32_t ret;
>>>>         >> >>-------__asm__ __volatile__ ("syncws");
>>>>         >> >>-------__asm__ __volatile__ ("lai %0,(%2)" : "=r" (ret),
>>>>         "+m" (atom) :
>>>>         >> >>------->------->------- "r" (atom));
>>>>         >> >>-------return ret;
>>>>         >> > #else
>>>>         >> >>-------return __atomic_fetch_add(&atom->v, 1,
>>>>         __ATOMIC_RELAXED);
>>>>         >> > #endif
>>>>         >> > }
>>>>         >>
>>>>         >> Linux-generic shouldn't have arch specific parts, because it
>>>> is
>>>>         >> *generic*. Instead it can fall back to strong __sync
>>>>         functions if
>>>>         >> __atomic are not available.
>>>>         >>
>>>>         >
>>>>         > Linux kernel is generic but has core specifics - is this not
>>>>         the same ?
>>>>         >
>>>>         > What about these cases ?
>>>>         >
>>>>         > platform/linux-generic/odp_system_info.c:static int
>>>>         cpuinfo_octeon(FILE
>>>>         > *file, odp_system_info_t *sysinfo)
>>>>         > platform/linux-generic/odp_system_info.c:    .cpu_arch_str =
>>>>         "octeon",
>>>>         > platform/linux-generic/odp_system_info.c:    .cpuinfo_parser
>>>> =
>>>>         > cpuinfo_octeon
>>>>         > platform/linux-generic/include/api/odp_align.h:#if defined
>>>>         __x86_64__ ||
>>>>         > defined __i386__
>>>>         > platform/linux-generic/include/api/odp_sync.h:#if defined
>>>>         __x86_64__ ||
>>>>         > defined __i386__
>>>>         > platform/linux-generic/include/odp_spin_internal.h:#if
>>>>         defined __x86_64__ ||
>>>>         > defined __i386__
>>>>         > platform/linux-generic/odp_system_info.c:#if defined
>>>>         __x86_64__ || defined
>>>>         > __i386__ || defined __OCTEON__ || \
>>>>         > platform/linux-generic/odp_system_info.c:#if defined
>>>>         __x86_64__ || defined
>>>>         > __i386__
>>>>         > platform/linux-generic/odp_system_info.c:static int
>>>>         cpuinfo_x86(FILE *file,
>>>>         > odp_system_info_t *sysinfo)
>>>>         > platform/linux-generic/odp_system_info.c:    #if defined
>>>>         __x86_64__ ||
>>>>         > defined __i386__
>>>>         > platform/linux-generic/odp_system_info.c:    .cpu_arch_str =
>>>>         "x86",
>>>>         > platform/linux-generic/odp_system_info.c:    .cpuinfo_parser
>>>> =
>>>>         > cpuinfo_x86
>>>>         > platform/linux-generic/odp_system_info.c:#if defined
>>>>         __x86_64__ || defined
>>>>         > __i386__ || defined __OCTEON__ || \
>>>>         > platform/linux-generic/odp_time.c:#if defined __x86_64__ ||
>>>>         defined __i386__
>>>>         > platform/linux-generic/include/api/odp_align.h:#elif defined
>>>>         __arm__ ||
>>>>         > defined __aarch64__
>>>>         > platform/linux-generic/include/api/odp_sync.h:#elif
>>>>         defined(__arm__)
>>>>         > platform/linux-generic/include/odp_spin_internal.h:#elif
>>>>         defined __arm__
>>>>         > platform/linux-generic/odp_system_info.c:#elif defined
>>>>         __arm__ || defined
>>>>         > __aarch64__
>>>>         > platform/linux-generic/odp_system_info.c:static int
>>>>         cpuinfo_arm(FILE *file
>>>>         > ODP_UNUSED,
>>>>         > platform/linux-generic/odp_system_info.c:    #elif defined
>>>>         __arm__ ||
>>>>         > defined __aarch64__
>>>>         > platform/linux-generic/odp_system_info.c:    .cpu_arch_str =
>>>>         "arm",
>>>>         > platform/linux-generic/odp_system_info.c:    .cpuinfo_parser
>>>> =
>>>>         > cpuinfo_arm
>>>>         >
>>>>         >
>>>>         >>
>>>>         >>
>>>>         >> Taras Kondratiuk
>>>>         >
>>>>         >
>>>>         >
>>>>         >
>>>>         > --
>>>>         > Mike Holmes
>>>>         > Linaro  Sr Technical Manager
>>>>         > LNG - ODP
>>>>
>>>>         _______________________________________________
>>>>         lng-odp mailing list
>>>>         [email protected] <mailto:[email protected]>
>>>>         http://lists.linaro.org/mailman/listinfo/lng-odp
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Mike Holmes*
>>>> Linaro  Sr Technical Manager
>>>> LNG - ODP
>>>>
>>>>
>>>> _______________________________________________
>>>> lng-odp mailing list
>>>> [email protected]
>>>> http://lists.linaro.org/mailman/listinfo/lng-odp
>>>>
>>>
>>>
>>> _______________________________________________
>>> lng-odp mailing list
>>> [email protected]
>>> http://lists.linaro.org/mailman/listinfo/lng-odp
>>>
>>
>>
>>
>> --
>> *Mike Holmes*
>> Linaro  Sr Technical Manager
>> LNG - ODP
>>
>> _______________________________________________
>> lng-odp mailing list
>> [email protected]
>> http://lists.linaro.org/mailman/listinfo/lng-odp
>>
>>
>

_______________________________________________
lng-odp mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/lng-odp

Re: [lng-odp] [PATCHv3] configure.ac check for atomic operations support

Reply via email to