I suspect the long-term answer will be to use the autotools to construct the actual include hierarchy (and possibly the files themelves) needed for a given --with-platform specification. That would mean the canonical ODP API files may need to be meta-files that are used as input to ./configure to generate the actual ones used.
On Wed, Dec 24, 2014 at 9:03 AM, Bill Fischofer <[email protected]> wrote: > That makes sense. The only problem is GCC's problems with forward > declarations and inlining. IMO a design issue with the compiler, but it is > what it is. But we do need a general solution to this problem in 2015 > since proper inlining support will be critical for production-grade > implementations. > > On Wed, Dec 24, 2014 at 8:59 AM, Mike Holmes <[email protected]> > wrote: > >> >> >> On 24 December 2014 at 06:19, Maxim Uvarov <[email protected]> >> wrote: >> >>> On 12/23/2014 10:40 PM, Mike Holmes wrote: >>> >>>> >>>> >>>> On 23 December 2014 at 13:49, Bill Fischofer <[email protected] >>>> <mailto:[email protected]>> wrote: >>>> >>>> That's good to know and certainly the code is cleaner that way. I >>>> think we should adopt the practice that any architecture-specific >>>> changes to linux-generic code (adds, updates, or removals) must >>>> either originate from or have a Reviewed-by a representative of >>>> that architecture (in this case, Bala). That way there's no >>>> confusion as to whether the known experts agree with the change, >>>> since these sort of things can have subtle implications that are >>>> not obvious to the layman. >>>> >>>> >>>> I think linux-generic should be devoid of any per core optimizations in >>>> its default state. >>>> There could be in an arch directory and it would be supported as you >>>> say Bill by someone who will care for that that variant, this also avoids >>>> filling linux-generic with a possibly large list of preprocessor flags that >>>> we can't test. >>>> >>>> In this case we don't have public access to the octeon to even know if >>>> this code works so separating or deletion feels right to me, although >>>> Cavium are adding their board in this case. >>>> >>>> To have optimizations we should understand which per arch optimization >>> we need. I see that if newer gcc supports good optimizations then we need >>> to use that gcc. Most of people use Open Embedded and Linaro gcc which is >>> almost latest version. So if there is no optimization in standard >>> environment (gcc, kernel, libc) then arch implementation should go to >>> platform specific git variant. If it's covered by standard environment then >>> we simple updating minimal version requirement for required component. >>> >>> >> Lets take an example ODP API file >> >> static inline void odp_sync_stores(void) >> { >> #if defined __x86_64__ || defined __i386__ >> >> >-------__asm__ __volatile__ ("sfence\n" : : : "memory"); >> >> #elif defined(__arm__) >> #if __ARM_ARCH == 6 >> >-------__asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 5" \ >> >------->------->-------: : "r" (0) : "memory"); >> #elif __ARM_ARCH >= 7 || defined __aarch64__ >> >> >-------__asm__ __volatile__ ("dmb st" : : : "memory"); >> #else >> >-------__asm__ __volatile__ ("" : : : "memory"); >> #endif >> >> #elif defined __OCTEON__ >> >> >-------__asm__ __volatile__ ("syncws\n" : : : "memory"); >> >> #else >> >-------__sync_synchronize(); >> #endif >> >> >> How do we accommodate this without resorting to architecture specific >> processor flags / assembly? >> >> I would prefer to see platform/linux-generic/include/api/odp_sync.h >> changed to >> >> platform/linux-generic/include/api/x86/odp_sync.h >> platform/linux-generic/include/api/arm/odp_sync.h >> platform/linux-generic/include/api/octeon/odp_sync.h >> >> This assumes that there is currently no pure c code that can be used that >> is acceptable on all three platforms. >> >> Or since linux-generic is not a performance platform but rather a >> reference whenever we need to make a trade off between simplicity and >> performance, we alter this to be a regular function and move to:- >> >> platform/linux-generic/include/api/odp_sync.h <- now a pure forward >> declaration of the API with doxygen documentation >> and implementations correctly broken down by arch >> >> arch/x86/odp_sync.h >> arch/arm/odp_sync.h >> arch/octeon/odp_sync.h >> >> >> >>> Maxim. >>> >>> >>> >>>> On Tue, Dec 23, 2014 at 12:38 PM, Ola Liljedahl >>>> <[email protected] <mailto:[email protected]>> wrote: >>>> >>>> I had some discussions with Bala and we think the GCC >>>> generates the >>>> appropriate code for OCTEON so there should be no need for any >>>> OCTEON-specific inline assembler in odp_atomic.h. I just >>>> didn't want >>>> to remove this piece of code when I redesigned the atomic >>>> support >>>> because I didn't understand why there is a SYNCW instruction >>>> *before* >>>> the load. Load-acquire type of loads may need sync or barrier >>>> instructions but they come after the load, not before. >>>> >>>> I think we should trust GCC to generate better code for OCTEON >>>> than us >>>> mere humans, at least until proven wrong. So please remove this >>>> OCTEON-specific snippet. >>>> >>>> -- Ola >>>> >>>> >>>> On 23 December 2014 at 18:01, Mike Holmes >>>> <[email protected] <mailto:[email protected]>> wrote: >>>> > >>>> > >>>> > On 23 December 2014 at 11:15, Taras Kondratiuk >>>> <[email protected] <mailto:taras.kondratiuk@ >>>> linaro.org>> >>>> >>>> > wrote: >>>> >> >>>> >> On 12/23/2014 05:28 PM, Mike Holmes wrote: >>>> >> > how will this work, Octeon does not care about >>>> __atomic_fetch_add(&v, >>>> >> > 1, __ATOMIC_RELAXED); and yet you will fail configure I >>>> assume >>>> >> > >>>> >> > static inline uint32_t >>>> odp_atomic_fetch_inc_u32(odp_atomic_u32_t *atom) >>>> >> > { >>>> >> > #if defined __OCTEON__ >>>> >> >>-------uint32_t ret; >>>> >> >>-------__asm__ __volatile__ ("syncws"); >>>> >> >>-------__asm__ __volatile__ ("lai %0,(%2)" : "=r" (ret), >>>> "+m" (atom) : >>>> >> >>------->------->------- "r" (atom)); >>>> >> >>-------return ret; >>>> >> > #else >>>> >> >>-------return __atomic_fetch_add(&atom->v, 1, >>>> __ATOMIC_RELAXED); >>>> >> > #endif >>>> >> > } >>>> >> >>>> >> Linux-generic shouldn't have arch specific parts, because it >>>> is >>>> >> *generic*. Instead it can fall back to strong __sync >>>> functions if >>>> >> __atomic are not available. >>>> >> >>>> > >>>> > Linux kernel is generic but has core specifics - is this not >>>> the same ? >>>> > >>>> > What about these cases ? >>>> > >>>> > platform/linux-generic/odp_system_info.c:static int >>>> cpuinfo_octeon(FILE >>>> > *file, odp_system_info_t *sysinfo) >>>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>>> "octeon", >>>> > platform/linux-generic/odp_system_info.c: .cpuinfo_parser >>>> = >>>> > cpuinfo_octeon >>>> > platform/linux-generic/include/api/odp_align.h:#if defined >>>> __x86_64__ || >>>> > defined __i386__ >>>> > platform/linux-generic/include/api/odp_sync.h:#if defined >>>> __x86_64__ || >>>> > defined __i386__ >>>> > platform/linux-generic/include/odp_spin_internal.h:#if >>>> defined __x86_64__ || >>>> > defined __i386__ >>>> > platform/linux-generic/odp_system_info.c:#if defined >>>> __x86_64__ || defined >>>> > __i386__ || defined __OCTEON__ || \ >>>> > platform/linux-generic/odp_system_info.c:#if defined >>>> __x86_64__ || defined >>>> > __i386__ >>>> > platform/linux-generic/odp_system_info.c:static int >>>> cpuinfo_x86(FILE *file, >>>> > odp_system_info_t *sysinfo) >>>> > platform/linux-generic/odp_system_info.c: #if defined >>>> __x86_64__ || >>>> > defined __i386__ >>>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>>> "x86", >>>> > platform/linux-generic/odp_system_info.c: .cpuinfo_parser >>>> = >>>> > cpuinfo_x86 >>>> > platform/linux-generic/odp_system_info.c:#if defined >>>> __x86_64__ || defined >>>> > __i386__ || defined __OCTEON__ || \ >>>> > platform/linux-generic/odp_time.c:#if defined __x86_64__ || >>>> defined __i386__ >>>> > platform/linux-generic/include/api/odp_align.h:#elif defined >>>> __arm__ || >>>> > defined __aarch64__ >>>> > platform/linux-generic/include/api/odp_sync.h:#elif >>>> defined(__arm__) >>>> > platform/linux-generic/include/odp_spin_internal.h:#elif >>>> defined __arm__ >>>> > platform/linux-generic/odp_system_info.c:#elif defined >>>> __arm__ || defined >>>> > __aarch64__ >>>> > platform/linux-generic/odp_system_info.c:static int >>>> cpuinfo_arm(FILE *file >>>> > ODP_UNUSED, >>>> > platform/linux-generic/odp_system_info.c: #elif defined >>>> __arm__ || >>>> > defined __aarch64__ >>>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>>> "arm", >>>> > platform/linux-generic/odp_system_info.c: .cpuinfo_parser >>>> = >>>> > cpuinfo_arm >>>> > >>>> > >>>> >> >>>> >> >>>> >> Taras Kondratiuk >>>> > >>>> > >>>> > >>>> > >>>> > -- >>>> > Mike Holmes >>>> > Linaro Sr Technical Manager >>>> > LNG - ODP >>>> >>>> _______________________________________________ >>>> lng-odp mailing list >>>> [email protected] <mailto:[email protected]> >>>> http://lists.linaro.org/mailman/listinfo/lng-odp >>>> >>>> >>>> >>>> >>>> -- >>>> *Mike Holmes* >>>> Linaro Sr Technical Manager >>>> LNG - ODP >>>> >>>> >>>> _______________________________________________ >>>> lng-odp mailing list >>>> [email protected] >>>> http://lists.linaro.org/mailman/listinfo/lng-odp >>>> >>> >>> >>> _______________________________________________ >>> lng-odp mailing list >>> [email protected] >>> http://lists.linaro.org/mailman/listinfo/lng-odp >>> >> >> >> >> -- >> *Mike Holmes* >> Linaro Sr Technical Manager >> LNG - ODP >> >> _______________________________________________ >> lng-odp mailing list >> [email protected] >> http://lists.linaro.org/mailman/listinfo/lng-odp >> >> >
_______________________________________________ lng-odp mailing list [email protected] http://lists.linaro.org/mailman/listinfo/lng-odp
