On 24 December 2014 at 10:05, Bill Fischofer <[email protected]> wrote:
> I suspect the long-term answer will be to use the autotools to construct > the actual include hierarchy (and possibly the files themelves) needed for > a given --with-platform specification. > Let me check I have the same terminology in my head - we should have this defined in the arch doc but I did not look. platform = a physical board with interfaces etc and a core of some architecture as a complete solution arch = a cpu core for example octeon, arm, x86 In this case I then see a need to pick up arch independently from --with-platform so that the optimizations can be reused by all platforms with that architecture core. If we dont I would see a need for platform/linux-x86-generic platform/linux-arm-generic platform/linux-octeon-generic > That would mean the canonical ODP API files may need to be meta-files that > are used as input to ./configure to generate the actual ones used. > If the work Taras and Anders are doing to clean out the API and move it back to the root succeeds, and it looks good to me, then it opens the door to some more wriggle room in solving this issue I think. But I could see the configure step helping take a single API definition and attaching it to the required optimizations as a solution if it cannot be done within C / GCC > > On Wed, Dec 24, 2014 at 9:03 AM, Bill Fischofer <[email protected] > > wrote: > >> That makes sense. The only problem is GCC's problems with forward >> declarations and inlining. IMO a design issue with the compiler, but it is >> what it is. But we do need a general solution to this problem in 2015 >> since proper inlining support will be critical for production-grade >> implementations. >> >> On Wed, Dec 24, 2014 at 8:59 AM, Mike Holmes <[email protected]> >> wrote: >> >>> >>> >>> On 24 December 2014 at 06:19, Maxim Uvarov <[email protected]> >>> wrote: >>> >>>> On 12/23/2014 10:40 PM, Mike Holmes wrote: >>>> >>>>> >>>>> >>>>> On 23 December 2014 at 13:49, Bill Fischofer < >>>>> [email protected] <mailto:[email protected]>> wrote: >>>>> >>>>> That's good to know and certainly the code is cleaner that way. I >>>>> think we should adopt the practice that any architecture-specific >>>>> changes to linux-generic code (adds, updates, or removals) must >>>>> either originate from or have a Reviewed-by a representative of >>>>> that architecture (in this case, Bala). That way there's no >>>>> confusion as to whether the known experts agree with the change, >>>>> since these sort of things can have subtle implications that are >>>>> not obvious to the layman. >>>>> >>>>> >>>>> I think linux-generic should be devoid of any per core optimizations >>>>> in its default state. >>>>> There could be in an arch directory and it would be supported as you >>>>> say Bill by someone who will care for that that variant, this also avoids >>>>> filling linux-generic with a possibly large list of preprocessor flags >>>>> that >>>>> we can't test. >>>>> >>>>> In this case we don't have public access to the octeon to even know if >>>>> this code works so separating or deletion feels right to me, although >>>>> Cavium are adding their board in this case. >>>>> >>>>> To have optimizations we should understand which per arch >>>> optimization we need. I see that if newer gcc supports good optimizations >>>> then we need to use that gcc. Most of people use Open Embedded and Linaro >>>> gcc which is almost latest version. So if there is no optimization in >>>> standard environment (gcc, kernel, libc) then arch implementation should go >>>> to platform specific git variant. If it's covered by standard environment >>>> then we simple updating minimal version requirement for required component. >>>> >>>> >>> Lets take an example ODP API file >>> >>> static inline void odp_sync_stores(void) >>> { >>> #if defined __x86_64__ || defined __i386__ >>> >>> >-------__asm__ __volatile__ ("sfence\n" : : : "memory"); >>> >>> #elif defined(__arm__) >>> #if __ARM_ARCH == 6 >>> >-------__asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 5" \ >>> >------->------->-------: : "r" (0) : "memory"); >>> #elif __ARM_ARCH >= 7 || defined __aarch64__ >>> >>> >-------__asm__ __volatile__ ("dmb st" : : : "memory"); >>> #else >>> >-------__asm__ __volatile__ ("" : : : "memory"); >>> #endif >>> >>> #elif defined __OCTEON__ >>> >>> >-------__asm__ __volatile__ ("syncws\n" : : : "memory"); >>> >>> #else >>> >-------__sync_synchronize(); >>> #endif >>> >>> >>> How do we accommodate this without resorting to architecture specific >>> processor flags / assembly? >>> >>> I would prefer to see platform/linux-generic/include/api/odp_sync.h >>> changed to >>> >>> platform/linux-generic/include/api/x86/odp_sync.h >>> platform/linux-generic/include/api/arm/odp_sync.h >>> platform/linux-generic/include/api/octeon/odp_sync.h >>> >>> This assumes that there is currently no pure c code that can be used >>> that is acceptable on all three platforms. >>> >>> Or since linux-generic is not a performance platform but rather a >>> reference whenever we need to make a trade off between simplicity and >>> performance, we alter this to be a regular function and move to:- >>> >>> platform/linux-generic/include/api/odp_sync.h <- now a pure forward >>> declaration of the API with doxygen documentation >>> and implementations correctly broken down by arch >>> >>> arch/x86/odp_sync.h >>> arch/arm/odp_sync.h >>> arch/octeon/odp_sync.h >>> >>> >>> >>>> Maxim. >>>> >>>> >>>> >>>>> On Tue, Dec 23, 2014 at 12:38 PM, Ola Liljedahl >>>>> <[email protected] <mailto:[email protected]>> >>>>> wrote: >>>>> >>>>> I had some discussions with Bala and we think the GCC >>>>> generates the >>>>> appropriate code for OCTEON so there should be no need for any >>>>> OCTEON-specific inline assembler in odp_atomic.h. I just >>>>> didn't want >>>>> to remove this piece of code when I redesigned the atomic >>>>> support >>>>> because I didn't understand why there is a SYNCW instruction >>>>> *before* >>>>> the load. Load-acquire type of loads may need sync or barrier >>>>> instructions but they come after the load, not before. >>>>> >>>>> I think we should trust GCC to generate better code for OCTEON >>>>> than us >>>>> mere humans, at least until proven wrong. So please remove this >>>>> OCTEON-specific snippet. >>>>> >>>>> -- Ola >>>>> >>>>> >>>>> On 23 December 2014 at 18:01, Mike Holmes >>>>> <[email protected] <mailto:[email protected]>> >>>>> wrote: >>>>> > >>>>> > >>>>> > On 23 December 2014 at 11:15, Taras Kondratiuk >>>>> <[email protected] <mailto:taras.kondratiuk@ >>>>> linaro.org>> >>>>> >>>>> > wrote: >>>>> >> >>>>> >> On 12/23/2014 05:28 PM, Mike Holmes wrote: >>>>> >> > how will this work, Octeon does not care about >>>>> __atomic_fetch_add(&v, >>>>> >> > 1, __ATOMIC_RELAXED); and yet you will fail configure I >>>>> assume >>>>> >> > >>>>> >> > static inline uint32_t >>>>> odp_atomic_fetch_inc_u32(odp_atomic_u32_t *atom) >>>>> >> > { >>>>> >> > #if defined __OCTEON__ >>>>> >> >>-------uint32_t ret; >>>>> >> >>-------__asm__ __volatile__ ("syncws"); >>>>> >> >>-------__asm__ __volatile__ ("lai %0,(%2)" : "=r" (ret), >>>>> "+m" (atom) : >>>>> >> >>------->------->------- "r" (atom)); >>>>> >> >>-------return ret; >>>>> >> > #else >>>>> >> >>-------return __atomic_fetch_add(&atom->v, 1, >>>>> __ATOMIC_RELAXED); >>>>> >> > #endif >>>>> >> > } >>>>> >> >>>>> >> Linux-generic shouldn't have arch specific parts, because >>>>> it is >>>>> >> *generic*. Instead it can fall back to strong __sync >>>>> functions if >>>>> >> __atomic are not available. >>>>> >> >>>>> > >>>>> > Linux kernel is generic but has core specifics - is this not >>>>> the same ? >>>>> > >>>>> > What about these cases ? >>>>> > >>>>> > platform/linux-generic/odp_system_info.c:static int >>>>> cpuinfo_octeon(FILE >>>>> > *file, odp_system_info_t *sysinfo) >>>>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>>>> "octeon", >>>>> > platform/linux-generic/odp_system_info.c: >>>>> .cpuinfo_parser = >>>>> > cpuinfo_octeon >>>>> > platform/linux-generic/include/api/odp_align.h:#if defined >>>>> __x86_64__ || >>>>> > defined __i386__ >>>>> > platform/linux-generic/include/api/odp_sync.h:#if defined >>>>> __x86_64__ || >>>>> > defined __i386__ >>>>> > platform/linux-generic/include/odp_spin_internal.h:#if >>>>> defined __x86_64__ || >>>>> > defined __i386__ >>>>> > platform/linux-generic/odp_system_info.c:#if defined >>>>> __x86_64__ || defined >>>>> > __i386__ || defined __OCTEON__ || \ >>>>> > platform/linux-generic/odp_system_info.c:#if defined >>>>> __x86_64__ || defined >>>>> > __i386__ >>>>> > platform/linux-generic/odp_system_info.c:static int >>>>> cpuinfo_x86(FILE *file, >>>>> > odp_system_info_t *sysinfo) >>>>> > platform/linux-generic/odp_system_info.c: #if defined >>>>> __x86_64__ || >>>>> > defined __i386__ >>>>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>>>> "x86", >>>>> > platform/linux-generic/odp_system_info.c: >>>>> .cpuinfo_parser = >>>>> > cpuinfo_x86 >>>>> > platform/linux-generic/odp_system_info.c:#if defined >>>>> __x86_64__ || defined >>>>> > __i386__ || defined __OCTEON__ || \ >>>>> > platform/linux-generic/odp_time.c:#if defined __x86_64__ || >>>>> defined __i386__ >>>>> > platform/linux-generic/include/api/odp_align.h:#elif defined >>>>> __arm__ || >>>>> > defined __aarch64__ >>>>> > platform/linux-generic/include/api/odp_sync.h:#elif >>>>> defined(__arm__) >>>>> > platform/linux-generic/include/odp_spin_internal.h:#elif >>>>> defined __arm__ >>>>> > platform/linux-generic/odp_system_info.c:#elif defined >>>>> __arm__ || defined >>>>> > __aarch64__ >>>>> > platform/linux-generic/odp_system_info.c:static int >>>>> cpuinfo_arm(FILE *file >>>>> > ODP_UNUSED, >>>>> > platform/linux-generic/odp_system_info.c: #elif defined >>>>> __arm__ || >>>>> > defined __aarch64__ >>>>> > platform/linux-generic/odp_system_info.c: .cpu_arch_str = >>>>> "arm", >>>>> > platform/linux-generic/odp_system_info.c: >>>>> .cpuinfo_parser = >>>>> > cpuinfo_arm >>>>> > >>>>> > >>>>> >> >>>>> >> >>>>> >> Taras Kondratiuk >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > Mike Holmes >>>>> > Linaro Sr Technical Manager >>>>> > LNG - ODP >>>>> >>>>> _______________________________________________ >>>>> lng-odp mailing list >>>>> [email protected] <mailto:[email protected]> >>>>> http://lists.linaro.org/mailman/listinfo/lng-odp >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> *Mike Holmes* >>>>> Linaro Sr Technical Manager >>>>> LNG - ODP >>>>> >>>>> >>>>> _______________________________________________ >>>>> lng-odp mailing list >>>>> [email protected] >>>>> http://lists.linaro.org/mailman/listinfo/lng-odp >>>>> >>>> >>>> >>>> _______________________________________________ >>>> lng-odp mailing list >>>> [email protected] >>>> http://lists.linaro.org/mailman/listinfo/lng-odp >>>> >>> >>> >>> >>> -- >>> *Mike Holmes* >>> Linaro Sr Technical Manager >>> LNG - ODP >>> >>> _______________________________________________ >>> lng-odp mailing list >>> [email protected] >>> http://lists.linaro.org/mailman/listinfo/lng-odp >>> >>> >> > -- *Mike Holmes* Linaro Sr Technical Manager LNG - ODP
_______________________________________________ lng-odp mailing list [email protected] http://lists.linaro.org/mailman/listinfo/lng-odp
