That makes sense. The only problem is GCC's problems with forward declarations and inlining. IMO a design issue with the compiler, but it is what it is. But we do need a general solution to this problem in 2015 since proper inlining support will be critical for production-grade implementations.
On Wed, Dec 24, 2014 at 8:59 AM, Mike Holmes <[email protected]> wrote: > > > On 24 December 2014 at 06:19, Maxim Uvarov <[email protected]> > wrote: > >> On 12/23/2014 10:40 PM, Mike Holmes wrote: >> >>> >>> >>> On 23 December 2014 at 13:49, Bill Fischofer <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> That's good to know and certainly the code is cleaner that way. I >>> think we should adopt the practice that any architecture-specific >>> changes to linux-generic code (adds, updates, or removals) must >>> either originate from or have a Reviewed-by a representative of >>> that architecture (in this case, Bala). That way there's no >>> confusion as to whether the known experts agree with the change, >>> since these sort of things can have subtle implications that are >>> not obvious to the layman. >>> >>> >>> I think linux-generic should be devoid of any per core optimizations in >>> its default state. >>> There could be in an arch directory and it would be supported as you say >>> Bill by someone who will care for that that variant, this also avoids >>> filling linux-generic with a possibly large list of preprocessor flags that >>> we can't test. >>> >>> In this case we don't have public access to the octeon to even know if >>> this code works so separating or deletion feels right to me, although >>> Cavium are adding their board in this case. >>> >>> To have optimizations we should understand which per arch optimization >> we need. I see that if newer gcc supports good optimizations then we need >> to use that gcc. Most of people use Open Embedded and Linaro gcc which is >> almost latest version. So if there is no optimization in standard >> environment (gcc, kernel, libc) then arch implementation should go to >> platform specific git variant. If it's covered by standard environment then >> we simple updating minimal version requirement for required component. >> >> > Lets take an example ODP API file > > static inline void odp_sync_stores(void) > { > #if defined __x86_64__ || defined __i386__ > > >-------__asm__ __volatile__ ("sfence\n" : : : "memory"); > > #elif defined(__arm__) > #if __ARM_ARCH == 6 > >-------__asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 5" \ > >------->------->-------: : "r" (0) : "memory"); > #elif __ARM_ARCH >= 7 || defined __aarch64__ > > >-------__asm__ __volatile__ ("dmb st" : : : "memory"); > #else > >-------__asm__ __volatile__ ("" : : : "memory"); > #endif > > #elif defined __OCTEON__ > > >-------__asm__ __volatile__ ("syncws\n" : : : "memory"); > > #else > >-------__sync_synchronize(); > #endif > > > How do we accommodate this without resorting to architecture specific > processor flags / assembly? > > I would prefer to see platform/linux-generic/include/api/odp_sync.h > changed to > > platform/linux-generic/include/api/x86/odp_sync.h > platform/linux-generic/include/api/arm/odp_sync.h > platform/linux-generic/include/api/octeon/odp_sync.h > > This assumes that there is currently no pure c code that can be used that > is acceptable on all three platforms. > > Or since linux-generic is not a performance platform but rather a > reference whenever we need to make a trade off between simplicity and > performance, we alter this to be a regular function and move to:- > > platform/linux-generic/include/api/odp_sync.h <- now a pure forward > declaration of the API with doxygen documentation > and implementations correctly broken down by arch > > arch/x86/odp_sync.h > arch/arm/odp_sync.h > arch/octeon/odp_sync.h > > > >> Maxim. >> >> >> >>> On Tue, Dec 23, 2014 at 12:38 PM, Ola Liljedahl >>> <[email protected] <mailto:[email protected]>> wrote: >>> >>> I had some discussions with Bala and we think the GCC >>> generates the >>> appropriate code for OCTEON so there should be no need for any >>> OCTEON-specific inline assembler in odp_atomic.h. I just >>> didn't want >>> to remove this piece of code when I redesigned the atomic support >>> because I didn't understand why there is a SYNCW instruction >>> *before* >>> the load. Load-acquire type of loads may need sync or barrier >>> instructions but they come after the load, not before. >>> >>> I think we should trust GCC to generate better code for OCTEON >>> than us >>> mere humans, at least until proven wrong. So please remove this >>> OCTEON-specific snippet. >>> >>> -- Ola >>> >>> >>> On 23 December 2014 at 18:01, Mike Holmes >>> <[email protected] <mailto:[email protected]>> wrote: >>> > >>> > >>> > On 23 December 2014 at 11:15, Taras Kondratiuk >>> <[email protected] <mailto:[email protected] >>> >> >>> >>> > wrote: >>> >> >>> >> On 12/23/2014 05:28 PM, Mike Holmes wrote: >>> >> > how will this work, Octeon does not care about >>> __atomic_fetch_add(&v, >>> >> > 1, __ATOMIC_RELAXED); and yet you will fail configure I >>> assume >>> >> > >>> >> > static inline uint32_t >>> odp_atomic_fetch_inc_u32(odp_atomic_u32_t *atom) >>> >> > { >>> >> > #if defined __OCTEON__ >>> >> >>-------uint32_t ret; >>> >> >>-------__asm__ __volatile__ ("syncws"); >>> >> >>-------__asm__ __volatile__ ("lai %0,(%2)" : "=r" (ret), >>> "+m" (atom) : >>> >> >>------->------->------- "r" (atom)); >>> >> >>-------return ret; >>> >> > #else >>> >> >>-------return __atomic_fetch_add(&atom->v, 1, >>> __ATOMIC_RELAXED); >>> >> > #endif >>> >> > } >>> >> >>> >> Linux-generic shouldn't have arch specific parts, because it >>> is >>> >> *generic*. Instead it can fall back to strong __sync >>> functions if >>> >> __atomic are not available. >>> >> >>> > >>> > Linux kernel is generic but has core specifics - is this not >>> the same ? >>> > >>> > What about these cases ? >>> > >>> > platform/linux-generic/odp_system_info.c:static int >>> cpuinfo_octeon(FILE >>> > *file, odp_system_info_t *sysinfo) >>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>> "octeon", >>> > platform/linux-generic/odp_system_info.c: .cpuinfo_parser = >>> > cpuinfo_octeon >>> > platform/linux-generic/include/api/odp_align.h:#if defined >>> __x86_64__ || >>> > defined __i386__ >>> > platform/linux-generic/include/api/odp_sync.h:#if defined >>> __x86_64__ || >>> > defined __i386__ >>> > platform/linux-generic/include/odp_spin_internal.h:#if >>> defined __x86_64__ || >>> > defined __i386__ >>> > platform/linux-generic/odp_system_info.c:#if defined >>> __x86_64__ || defined >>> > __i386__ || defined __OCTEON__ || \ >>> > platform/linux-generic/odp_system_info.c:#if defined >>> __x86_64__ || defined >>> > __i386__ >>> > platform/linux-generic/odp_system_info.c:static int >>> cpuinfo_x86(FILE *file, >>> > odp_system_info_t *sysinfo) >>> > platform/linux-generic/odp_system_info.c: #if defined >>> __x86_64__ || >>> > defined __i386__ >>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>> "x86", >>> > platform/linux-generic/odp_system_info.c: .cpuinfo_parser = >>> > cpuinfo_x86 >>> > platform/linux-generic/odp_system_info.c:#if defined >>> __x86_64__ || defined >>> > __i386__ || defined __OCTEON__ || \ >>> > platform/linux-generic/odp_time.c:#if defined __x86_64__ || >>> defined __i386__ >>> > platform/linux-generic/include/api/odp_align.h:#elif defined >>> __arm__ || >>> > defined __aarch64__ >>> > platform/linux-generic/include/api/odp_sync.h:#elif >>> defined(__arm__) >>> > platform/linux-generic/include/odp_spin_internal.h:#elif >>> defined __arm__ >>> > platform/linux-generic/odp_system_info.c:#elif defined >>> __arm__ || defined >>> > __aarch64__ >>> > platform/linux-generic/odp_system_info.c:static int >>> cpuinfo_arm(FILE *file >>> > ODP_UNUSED, >>> > platform/linux-generic/odp_system_info.c: #elif defined >>> __arm__ || >>> > defined __aarch64__ >>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>> "arm", >>> > platform/linux-generic/odp_system_info.c: .cpuinfo_parser = >>> > cpuinfo_arm >>> > >>> > >>> >> >>> >> >>> >> Taras Kondratiuk >>> > >>> > >>> > >>> > >>> > -- >>> > Mike Holmes >>> > Linaro Sr Technical Manager >>> > LNG - ODP >>> >>> _______________________________________________ >>> lng-odp mailing list >>> [email protected] <mailto:[email protected]> >>> http://lists.linaro.org/mailman/listinfo/lng-odp >>> >>> >>> >>> >>> -- >>> *Mike Holmes* >>> Linaro Sr Technical Manager >>> LNG - ODP >>> >>> >>> _______________________________________________ >>> lng-odp mailing list >>> [email protected] >>> http://lists.linaro.org/mailman/listinfo/lng-odp >>> >> >> >> _______________________________________________ >> lng-odp mailing list >> [email protected] >> http://lists.linaro.org/mailman/listinfo/lng-odp >> > > > > -- > *Mike Holmes* > Linaro Sr Technical Manager > LNG - ODP > > _______________________________________________ > lng-odp mailing list > [email protected] > http://lists.linaro.org/mailman/listinfo/lng-odp > >
_______________________________________________ lng-odp mailing list [email protected] http://lists.linaro.org/mailman/listinfo/lng-odp
