Package: src:atlas
Version: 3.10.2-4
Tags: patch
User: debian-powe...@lists.debian.org
Usertags: ppc64el

Hi atlas maintainers,

This patch adds support for the ppc64el port. It contains:

1) the patch set authored by Michel Normand (submitted upstream, and also
   documented in [1]);
2) a packaging change that restricts the patch set to ppc64el builds only.
   The patches touch common powerpc code, which is certainly not desirable
   for the other powerpc-based ports at this moment (risk of bugs, given
   the jessie freeze);
3) an archdef tarball (attached separately).

Could you please consider it for an upload, especially so that it can make
it into jessie?

Thank you,

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

GENERIC64LE.tar.bz2
Description: application/bzip
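
For illustration only (this is not part of the patch): the name of the attached tarball presumably follows the ARCH string that the patched CONFIG/src/SpewMakeInc.c writes, that is, the machine name, then the pointer width, then 'LE' on little-endian PowerPC64. A minimal C sketch of that composition is below. The helpers build_arch_name() and is_ppc64le() are hypothetical names, not ATLAS functions, and the sketch leaves out the ISA extension suffix that the real code appends when ISAX is set.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when compiled for little-endian 64-bit
 * PowerPC, using the same preprocessor test as the patch (with an extra
 * defined() guard so it also compiles where __BYTE_ORDER__ is missing).
 */
int is_ppc64le(void)
{
#if defined(__powerpc64__) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    return 1;
#else
    return 0;
#endif
}

/*
 * Hypothetical helper: writes "<mach><ptrbits>" into buf, followed by "LE"
 * when little_endian is nonzero. Returns the number of characters written,
 * or -1 if buf is too small.
 */
int build_arch_name(char *buf, size_t len, const char *mach,
                    int ptrbits, int little_endian)
{
    int n;

    assert(buf != NULL && mach != NULL);
    n = snprintf(buf, len, "%s%d%s", mach, ptrbits,
                 little_endian ? "LE" : "");
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

With mach "GENERIC" and 64-bit pointers this yields "GENERIC64" on big-endian and "GENERIC64LE" on little-endian, which matches the attachment name. Passing is_ppc64le() as the last argument reproduces the compile-time behaviour of the patch.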
diff -Nru atlas-3.10.2/debian/archdefs/README atlas-3.10.2/debian/archdefs/README
--- atlas-3.10.2/debian/archdefs/README 2014-07-12 07:23:26.000000000 -0300
+++ atlas-3.10.2/debian/archdefs/README 2014-10-24 19:45:37.000000000 -0200
@@ -16,5 +16,6 @@
 - mips: ATLAS 3.10.1 / gabrielli.debian.org / sid / 2013-07-27
 - mipsel: ATLAS 3.10.1 / eder.debian.org / sid / 2013-06-07
 - powerpc: ATLAS 3.10.1 / partch.debian.org / sid / 2013-06-06
+- ppc64el: ATLAS 3.10.2 / pastel.debian.net / sid / 2014-10-24
 - s390x: ATLAS 3.10.1 / zelenka.debian.org / sid / 2013-06-06
 - sparc: ATLAS 3.10.1 / smetana.debian.org / wheezy / 2013-06-06
diff -Nru atlas-3.10.2/debian/changelog atlas-3.10.2/debian/changelog
--- atlas-3.10.2/debian/changelog 2014-10-15 16:35:41.000000000 -0300
+++ atlas-3.10.2/debian/changelog 2014-10-24 19:45:37.000000000 -0200
@@ -1,3 +1,15 @@
+atlas (3.10.2-4ppc64el1) UNRELEASED; urgency=medium
+
+ * Add ppc64el support (work in progress)
+ - debian/patches/ppc64el/ (thanks, Michael Normand et al).
+ - debian/rules: restrict ppc64el patches to ppc64el builds.
+ - debian/rules: different 'GENERIC' first number in ARCHs due to POWER8.
+ - debian/archdefs/ppc64el/GENERIC64LE.tar.bz2: archdefs/timings,
+ currently the same file for POWER7, POWER7+ and POWER8 systems.
+ - debian/archdefs/README: updated accordingly.
+
+ -- Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Thu, 24 Oct 2014 20:02:00 -0200
+
 atlas (3.10.2-4) unstable; urgency=medium
 
 [ Alastair McKinstry ]
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch 1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas-new_archdef_for_ppc64le.patch 2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,38 @@
+Origin: https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c43
+Forwarded: http://sourceforge.net/p/math-atlas/patches/66/
+Description: Append 'LE' to archdef on little-endian PowerPC64
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+Subject: atlas new archdef for ppc64le
+From: Michel Normand <norm...@linux.vnet.ibm.com>
+Date: Sun, 13 Jun 2014 18:02:47 +0200
+
+Need to define different archdef names
+for ppc64 (that is Big Endian) and ppc64le (that is Little Endian).
+This is already done upstream in atlas 3.11.30 with issue
+https://sourceforge.net/p/math-atlas/patches/66/
+
+Required at least as long as I need the bypass of
+atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
+
+Signed-off-by: Michel Normand <norm...@linux.vnet.ibm.com>
+---
+ CONFIG/src/SpewMakeInc.c | 4 ++++
+ 1 file changed, 4 insertions(+)
+
+Index: ATLAS/CONFIG/src/SpewMakeInc.c
+===================================================================
+--- ATLAS.orig/CONFIG/src/SpewMakeInc.c
++++ ATLAS/CONFIG/src/SpewMakeInc.c
+@@ -542,6 +542,10 @@ int main(int nargs, char **args)
+ fprintf(fpout, "# -------------------------------------------------\n");
+ fprintf(fpout, " ARCH = %s", machnam[mach]);
+ fprintf(fpout, "%d", ptrbits);
++ /* for ppc64le archi add 'LE' characters */
++ #if defined(__powerpc64__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
++ fprintf(fpout, "%s", "LE");
++ #endif
+ if (ISAX)
+ fprintf(fpout, "%s", ISAXNAM[ISAX]);
+ if (!USEIEEE)
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch 1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-add_power8_cpu.patch 2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,138 @@
+Origin: https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c37
+Forwarded: http://sourceforge.net/p/math-atlas/patches/67/
+Description: Add IBM POWER8 pieces
+ The original patch for 3.10.2 was backported to apply on top
+ of 'debian/patches/generic.diff' - trivial changes to hunks
+ of 'ATLAS/CONFIG/include/atlconf.h'.
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <norm...@linux.vnet.ibm.com>
+Subject: atlas.3.10.2 add power8 cpu
+Date: Thu, 18 Sep 2014 15:13:24 +0200
+
+atlas.3.10.2 add Power8 cpu
+
+Signed-off-by: Michel Normand <norm...@linux.vnet.ibm.com>
+---
+ CONFIG/ARCHS/Make.ext | 7 +++++++
+ CONFIG/include/atlconf.h | 6 +++---
+ CONFIG/src/atlcomp.txt | 6 ++++++
+ CONFIG/src/backend/archinfo_aix.c | 2 ++
+ CONFIG/src/backend/archinfo_linux.c | 1 +
+ include/atlas_pca.h | 2 +-
+ 6 files changed, 20 insertions(+), 4 deletions(-)
+
+Index: ATLAS/CONFIG/ARCHS/Make.ext
+===================================================================
+--- ATLAS.orig/CONFIG/ARCHS/Make.ext
++++ ATLAS/CONFIG/ARCHS/Make.ext
+@@ -33,6 +33,7 @@ files = AMD64K10h32SSE3.tar.bz2 AMD64K10
+ MIPSR1xK64.tar.bz2 Makefile P432SSE2.tar.bz2 P4E32SSE3.tar.bz2 \
+ P4E64SSE3.tar.bz2 PIII32SSE1.tar.bz2 POWER432.tar.bz2 \
+ POWER464.tar.bz2 POWER564.tar.bz2 POWER764VSX.tar.bz2 \
++ POWER864VSX.tar.bz2 \
+ PPCG432AltiVec.tar.bz2 PPCG532AltiVec.tar.bz2 PPCG564AltiVec.tar.bz2 \
+ PPRO32.tar.bz2 USIII32.tar.bz2 USIII64.tar.bz2 USIV32.tar.bz2 \
+ USIV64.tar.bz2 UST232.tar.bz2 UST264.tar.bz2 atlas_test1.1.3.tar.bz2 \
+@@ -308,6 +309,12 @@ POWER764VSX.tar.bz2 : $(basdr)/POWER764V
+ /tmp/POWER764VSX.tar POWER764VSX
+ bzip2 /tmp/POWER764VSX.tar
+ mv /tmp/POWER764VSX.tar.bz2 ./.
++POWER864VSX.tar.bz2 : $(basdr)/POWER864VSX
++ - rm -f /tmp/POWER864VSX.tar /tmp/POWER864VSX.tar.bz2
++ cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
++ /tmp/POWER864VSX.tar POWER864VSX
++ bzip2 /tmp/POWER864VSX.tar
++ mv /tmp/POWER864VSX.tar.bz2 ./.
+ IBMz1032.tar.bz2 : $(basdr)/IBMz1032
+ - rm -f /tmp/IBMz1032.tar /tmp/IBMz1032.tar.bz2
+ cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
+Index: ATLAS/CONFIG/include/atlconf.h
+===================================================================
+--- ATLAS.orig/CONFIG/include/atlconf.h
++++ ATLAS/CONFIG/include/atlconf.h
+@@ -18,10 +18,10 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
+ enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS,
+ AFARM, AFS390};
+ 
+-#define NMACH 53
++#define NMACH 54
+ static char *machnam[NMACH] =
+ {"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
+- "POWER6", "POWER7", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
++ "POWER6", "POWER7", "POWER8", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
+ "x86x87", "x86SSE1", "x86SSE2", "x86SSE3",
+ "P5", "P5MMX", "PPRO", "PII", "PIII", "PM", "CoreSolo",
+ "CoreDuo", "Core2Solo", "Core2", "Corei1", "Corei2", "Corei3",
+@@ -31,7 +31,7 @@ static char *machnam[NMACH] =
+ "USI", "USII", "USIII", "USIV", "UST1", "UST2", "UnknownUS",
+ "MIPSR1xK", "MIPSICE9", "ARMv7", "GENERIC"};
+ enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
+- IbmPwr6, IbmPwr7, Pwre6500,
++ IbmPwr6, IbmPwr7, IbmPwr8, Pwre6500,
+ IbmZ9, IbmZ10, IbmZ196, /* s390(x) in Linux */
+ x86x87, x86SSE1, x86SSE2, x86SSE3, /* generic targets */
+ IntP5, IntP5MMX, IntPPRO, IntPII, IntPIII, IntPM, IntCoreS,
+Index: ATLAS/CONFIG/src/atlcomp.txt
+===================================================================
+--- ATLAS.orig/CONFIG/src/atlcomp.txt
++++ ATLAS/CONFIG/src/atlcomp.txt
+@@ -190,6 +190,10 @@ MACH=PPCG5 OS=ALL LVL=1000 COMPS=dmc,icc
+ 'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2'
+ MACH=PPCG5 OS=ALL LVL=1000 COMPS=skc
+ 'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2 -mvrsave'
++MACH=POWER8 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
++ 'gcc' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
++MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
++ 'gfortran' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
+ MACH=POWER7 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
+ 'gcc' '-O2 -mvsx -mcpu=power7 -mtune=power7 -m64 -mvrsave -funroll-all-loops'
+ MACH=POWER7 OS=ALL LVL=1010 COMPS=f77
+@@ -210,6 +214,8 @@ MACH=POWER4 OS=ALL LVL=1010 COMPS=icc,dm
+ 'gcc' '-mcpu=power4 -mtune=power4 -O3 -fno-schedule-insns -fno-rerun-loop-opt'
+ MACH=POWER4 OS=ALL LVL=1010 COMPS=f77
+ 'xlf' '-qtune=pwr4 -qarch=pwr4 -O3 -qmaxmem=-1 -qfloat=hsflt'
++MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
++ 'xlf' '-qtune=pwr8 -qarch=pwr8 -O3 -qmaxmem=-1 -qfloat=hsflt'
+ #
+ # IBM System z or zEnterprise.
+ # These compiler flags given by IBM; -O3 -funroll-loops are chosen because
+Index: ATLAS/CONFIG/src/backend/archinfo_linux.c
+===================================================================
+--- ATLAS.orig/CONFIG/src/backend/archinfo_linux.c
++++ ATLAS/CONFIG/src/backend/archinfo_linux.c
+@@ -77,6 +77,7 @@ enum MACHTYPE ProbeArch()
+ else if (strstr(res, "7455")) mach = PPCG4;
+ else if (strstr(res, "PPC970FX")) mach = PPCG5;
+ else if (strstr(res, "PPC970MP")) mach = PPCG5;
++ else if (strstr(res, "POWER8")) mach = IbmPwr8;
+ else if (strstr(res, "POWER7")) mach = IbmPwr7;
+ else if (strstr(res, "POWER6")) mach = IbmPwr6;
+ else if (strstr(res, "POWER5")) mach = IbmPwr5;
+Index: ATLAS/include/atlas_pca.h
+===================================================================
+--- ATLAS.orig/include/atlas_pca.h
++++ ATLAS/include/atlas_pca.h
+@@ -26,7 +26,7 @@
+ #endif
+ #elif defined(ATL_ARCH_POWER3) || defined(ATL_ARCH_POWER4) || \
+ defined(ATL_ARCH_POWER5) || defined(ATL_ARCH_POWER6) || \
+- defined(ATL_ARCH_POWER7)
++ defined(ATL_ARCH_POWER7) || defined(ATL_ARCH_POWER8)
+ #ifdef __GNUC__
+ #define ATL_membarrier __asm__ __volatile__ ("dcs")
+ /* #define ATL_USEPCA 1 */
+Index: ATLAS/CONFIG/src/backend/archinfo_aix.c
+===================================================================
+--- ATLAS.orig/CONFIG/src/backend/archinfo_aix.c
++++ ATLAS/CONFIG/src/backend/archinfo_aix.c
+@@ -67,6 +67,8 @@ enum MACHTYPE ProbeArch()
+ {
+ if (strstr(res, "PowerPC_POWER5"))
+ mach = IbmPwr5;
++ else if (strstr(res, "PowerPC_POWER8"))
++ mach = IbmPwr8;
+ else if (strstr(res, "PowerPC_POWER7"))
+ mach = IbmPwr7;
+ else if (strstr(res, "PowerPC_POWER6"))
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch 1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch 2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,70 @@
+Origin: http://sourceforge.net/p/math-atlas/patches/65/#3cb1
+Forwarded: http://sourceforge.net/p/math-atlas/patches/65/
+Description: ELFv2 ABI changes (2/3)
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <norm...@linux.vnet.ibm.com>
+Subject: atlas.3.10.2 ppc64le abiv2 step2 patch
+Date: Mon, 28 Jul 2014 04:29:05 -0400
+
+atlas.ppc64le abiv2 step2 complete the changes already present in atlas 3.10.2
+* still some files with opd ABI V1 to be disabled for ABI V2
+ tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+ tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+ tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+
+Signed-off-by: Michel Normand <norm...@linux.vnet.ibm.com>
+---
+ tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c | 2 +-
+ tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c | 3 ++-
+ tune/blas/gemm/CASES/ATL_smm4x4x128_av.c | 2 +-
+ 3 files changed, 4 insertions(+), 3 deletions(-)
+
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+@@ -268,7 +268,7 @@ Mjoin(.,ATL_USERMM):
+ .globl Mjoin(_,ATL_USERMM)
+ Mjoin(_,ATL_USERMM):
+ #else
+- #if defined(ATL_USE64BITS)
++ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
+ /*
+ * Official Program Descripter section, seg fault w/o it on Linux/PPC64
+ */
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+@@ -202,7 +202,7 @@ Mjoin(.,ATL_USERMM):
+ .globl Mjoin(_,ATL_USERMM)
+ Mjoin(_,ATL_USERMM):
+ #else
+- #if defined(ATL_USE64BITS)
++ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
+ /*
+ * Official Program Descripter section, seg fault w/o it on Linux/PPC64
+ */
+@@ -257,6 +257,7 @@ ATL_USERMM:
+ #endif
+ #endif
+ 
++
+ #if defined (ATL_USE64BITS)
+ ld pC0, 120(r1)
+ ld incCn, 128(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+@@ -196,7 +196,7 @@ void ATL_USERMM(const int M, const int N
+ .globl Mjoin(_,ATL_USERMM)
+ Mjoin(_,ATL_USERMM):
+ #else
+- #if defined(ATL_USE64BITS)
++ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
+ /*
+ * Official Program Descripter section, seg fault w/o it on Linux/PPC64
+ */
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch 1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch 2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,191 @@
+Origin: http://sourceforge.net/p/math-atlas/patches/65/#3cb1
+Forwarded: http://sourceforge.net/p/math-atlas/patches/65/
+Description: ELFv2 ABI changes (3/3)
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <norm...@linux.vnet.ibm.com>
+Subject: atlas.3.10.2 ppc64le abiv2 step3
+Date: Tue, 29 Jul 2014 15:33:18 +0200
+
+atlas.3.10.2 ppc64le abiv2 step3
+* change offsets of parameters read from stack to avoid some segfaults.
+ (values changes 120 => 104 and 128 => 112 identified by gdb investigation)
+
+Despite this step3 patch there are two Remaining problems for ppc64le archi:
+* TODO: still have seg-faults in console during build/check
+but is not critical (without make check) and rpm are generated on fedora.
+unable to investigate because of problem tracked by issue 950
+https://sourceforge.net/p/math-atlas/support-requests/950/
+
+* TODO: make check failure because xsslvtst execution failure
+related to vector assembly code that assumes big-endian env
+as written in ATL_cmm4x4x128_av.c and ATL_smm4x4x128_av.c.
+Would need significant work to support little-endian as per
+endianess comments of all PowerPC vector instructions in:
+https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/FBFA164F824370F987256D6A006F424D/$file/vector_simd_pem.ppc.2005AUG23.pdf
+
+Signed-off-by: Michel Normand <norm...@linux.vnet.ibm.com>
+---
+ tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c | 7 +++++++
+ tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c | 7 +++++++
+ tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c | 7 +++++++
+ tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c | 17 ++++++++++++++++-
+ tune/blas/gemm/CASES/ATL_smm4x4x128_av.c | 21 +++++++++++++++++++++
+ 5 files changed, 58 insertions(+), 1 deletion(-)
+
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
+@@ -405,8 +405,15 @@ Mjoin(_,ATL_USERMM):
+ */
+ #ifdef ATL_GAS_LINUX_PPC
+ #ifdef ATL_USE64BITS
++ #if _CALL_ELF == 2
++ /* ABIv2 */
++ ld pC0, 104(r1)
++ ld incCn, 112(r1)
++ #else
++ /* ABIv1 */
+ ld pC0, 120(r1)
+ ld incCn, 128(r1)
++ #endif
+ #else
+ lwz incCn, FSIZE+8(r1)
+ #endif
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+@@ -324,8 +324,15 @@ ATL_USERMM:
+ #endif
+ 
+ #ifdef ATL_USE64BITS
++#if _CALL_ELF == 2
++/* ABIv2 */
++ ld pC0, 104(r1)
++ ld incCn, 112(r1)
++#else
++/* ABIv1 */
+ ld pC0, 120(r1)
+ ld incCn, 128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
+ lwz pC0, 68(r1)
+ lwz incCn, 72(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+@@ -170,13 +170,21 @@ void ATL_USERMM(const int M, const int N
+ const TYPE beta, TYPE *C, const int ldc)
+ (r10) 8(r1)
+ *******************************************************************************
+-64 bit ABIs:
++64 bit ABIv1s:
+ r3 r4 r5 r6/f1
+ void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
+ r7 r8 r9 r10
+ const TYPE *A, const int lda, const TYPE *B, const int ldb,
+ f2 120(r1) 128(r1)
+ const TYPE beta, TYPE *C, const int ldc)
++
++64 bit ABIv2s:
++ r3 r4 r5 r6/f1
++void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
++ r7 r8 r9 r10
++ const TYPE *A, const int lda, const TYPE *B, const int ldb,
++ f2 104(r1) 112(r1)
++ const TYPE beta, TYPE *C, const int ldc)
+ #endif
+ #ifdef ATL_AS_AIX_PPC
+ .csect .text[PR]
+@@ -259,8 +267,15 @@ ATL_USERMM:
+ 
+ 
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++/* ABIv2 */
++ ld pC0, 104(r1)
++ ld incCn, 112(r1)
++#else
++/* ABIv1 */
+ ld pC0, 120(r1)
+ ld incCn, 128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
+ lwz pC0, 68(r1)
+ lwz incCn, 72(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+@@ -221,8 +221,15 @@ ATL_USERMM:
+ * kernel instead
+ */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++/* ABIv2 */
++ ld r10, 104(r1)
++ ld r5, 112(r1)
++#else
++/* ABIv1 */
+ ld r10, 120(r1)
+ ld r5, 128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+ lwz r10, 60(r1)
+ lwz r5, 64(r1)
+@@ -285,8 +292,15 @@ ATL_USERMM:
+ eqv r0, r0, r0 /* all 1s */
+ ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++ /* ABIv2 */
++ ld pC0, FSIZE+104(r1)
++ ld ldc, FSIZE+112(r1)
++#else
++ /* ABIv1 */
+ ld pC0, FSIZE+120(r1)
+ ld ldc, FSIZE+128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+ lwz pC0, FSIZE+60(r1)
+ lwz ldc, FSIZE+64(r1)
+@@ -4258,8 +4272,15 @@ UNALIGNED_C:
+ eqv r0, r0, r0 /* all 1s */
+ ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++ /* ABIv2 */
++ ld pC0, FSIZE+104(r1)
++ ld ldc, FSIZE+112(r1)
++#else
++ /* ABIv1 */
+ ld pC0, FSIZE+120(r1)
+ ld ldc, FSIZE+128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+ lwz pC0, FSIZE+60(r1)
+ lwz ldc, FSIZE+64(r1)
+Index: ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
++++ ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
+@@ -258,8 +258,15 @@ ATL_USERMM:
+ eqv r0, r0, r0 /* all 1s */
+ ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
+ #if defined (ATL_USE64BITS)
++#if _CALL_ELF == 2
++/* ABIv2 */
++ ld pC0, FSIZE+104(r1)
++ ld ldc, FSIZE+112(r1)
++#else
++/* ABIv1 */
+ ld pC0, FSIZE+120(r1)
+ ld ldc, FSIZE+128(r1)
++#endif
+ #elif defined(ATL_AS_OSX_PPC)
+ lwz pC0, FSIZE+60(r1)
+ lwz ldc, FSIZE+64(r1)
diff -Nru atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
--- atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch 1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch 2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,163 @@
+Origin: https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c42
+Forwarded: http://sourceforge.net/p/math-atlas/patches/65/
+Description: Skip optimizations for big-endian PowerPC.
+ Some of the existing optimized files/cases for PowerPC
+ contain assembly instructions with implicit big-endian
+ behavior - thus incorrect for the little-endian mode -
+ incurring tests failures during the build (i.e., FTBFS).
+ This is being worked on; this is the workaround for now.
+ Author's comments in patch 'abiv2 step3'.
+ For more details, see:
+ https://bugzilla.redhat.com/show_bug.cgi?id=1080073#c40
+Last-Update: 2014-10-24
+From: Michel Normand <norm...@linux.vnet.ibm.com>
+Subject: atlas.3.10.2 ppc64le do not use files with lvx
+Date: Tue, 12 Aug 2014 16:07:06 +0200
+
+ppc64le do not use files with lvx
+This is a temporary patch as long as the related files
+are not ported yet to ppc64 little-endian.
+
+Warning: patch to be applied only for ppc64le architecture
+and will also need atlas-new_archdef_for_ppc64le.patch
+
+Signed-off-by: Michel Normand <norm...@linux.vnet.ibm.com>
+---
+ tune/blas/gemm/CASES/ccases.flg | 6 +-----
+ tune/blas/gemm/CASES/dcases.flg | 8 +-------
+ tune/blas/gemm/CASES/dcases.vnb | 4 ----
+ tune/blas/gemm/CASES/scases.flg | 9 +--------
+ tune/blas/gemm/CASES/scases.vnb | 3 ---
+ tune/blas/gemm/CASES/zcases.flg | 8 +-------
+ 6 files changed, 4 insertions(+), 34 deletions(-)
+
+Index: ATLAS/tune/blas/gemm/CASES/ccases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/ccases.flg
++++ ATLAS/tune/blas/gemm/CASES/ccases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-24
++22
+ 304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
+@@ -48,13 +48,9 @@ gcc
+ 328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c "R. Clint Whaley" \
+ gcc
+ -fomit-frame-pointer -O2 -fno-tree-loop-optimize
+-329 192 4 4 4 1 16 4 4 4 ATL_cmm4x4x128_av.c "R. Clint Whaley" \
+-gcc
+--x assembler-with-cpp
+ 331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c "IBM"
+ 333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
+Index: ATLAS/tune/blas/gemm/CASES/scases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/scases.flg
++++ ATLAS/tune/blas/gemm/CASES/scases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-25
++22
+ 304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
+@@ -48,16 +48,9 @@ gcc
+ 328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c "R. Clint Whaley" \
+ gcc
+ -fomit-frame-pointer -O2 -fno-tree-loop-optimize
+-329 192 4 4 4 1 16 4 4 4 ATL_smm4x4x128_av.c "R. Clint Whaley" \
+-gcc
+--x assembler-with-cpp
+-330 200 92 92 92 1 16 92 92 92 ATL_smm4x4x128_av.c "R. Clint Whaley" \
+-gcc
+--x assembler-with-cpp
+ 331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c "IBM"
+ 333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
+Index: ATLAS/tune/blas/gemm/CASES/scases.vnb
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/scases.vnb
++++ ATLAS/tune/blas/gemm/CASES/scases.vnb
+@@ -31,9 +31,6 @@
+ # Defaults: TA='t', TB='n', SSE=0, X87=0, LDBOT=1, RTKU=0, AOUTER=0,
+ # KBMAX=KU, KBMIN=KU, BETAN1=0, RTMN=1
+ #
+-ID=1 ROUT='ATL_smm4x4x128_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=4 \
+- LDKB=1 LDBOT=1 KBMIN=4 KBMAX=128 ASM=GAS_PPC \
+- COMP='gcc' FLAGS='-x assembler-with-cpp'
+ ID=2 ROUT='ATL_smm4x4x16_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=16 \
+ LDKB=1 LDBOT=0 KBMIN=16 KBMAX=2048 ASM=GAS_SPARC \
+ COMP='gcc' FLAGS='-x assembler-with-cpp'
+Index: ATLAS/tune/blas/gemm/CASES/dcases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/dcases.flg
++++ ATLAS/tune/blas/gemm/CASES/dcases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-32
++30
+ 306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
+@@ -79,12 +79,6 @@ gcc
+ 336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
+-gcc
+--x assembler-with-cpp
+-338 192 8 4 2 1 0 8 4 2 ATL_dmm8x4x2_vsx.c "IBM" \
+-gcc
+--O3 -mvsx
+ 339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
+Index: ATLAS/tune/blas/gemm/CASES/dcases.vnb
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/dcases.vnb
++++ ATLAS/tune/blas/gemm/CASES/dcases.vnb
+@@ -53,10 +53,6 @@ ID=6 ROUT='ATL_dmm4x1x90_x87.c' AUTH='R
+ ID=7 ROUT='ATL_dmm8x1x120_sse2.c' AUTH='R. Clint Whaley' \
+ MU=8 NU=1 KU=1 KBMAX=512 ASM=GAS_x8664 BETAN1=1 \
+ COMP='gcc' FLAGS='-m64 -x assembler-with-cpp'
+-ID=70 ROUT='ATL_dmm4x4x80_ppc.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
+- MU=4 NU=4 KU=1 KBMIN=1 KBMAX=80 ASM=GAS_PPC BETAN1=0 LDBOT=0 \
+- LDAB=0 LDISKB=1 RTN=1 RTM=1 RTK=0 \
+- COMP='gcc' FLAGS='-x assembler-with-cpp'
+ ID=80 ROUT='ATL_dmm4x4x16r8_US.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
+ MU=4 NU=4 KU=24 KBMIN=24 KBMAX=512 ASM=GAS_SPARC BETAN1=0 \
+ LDAB=0 RTK=1 RTN=1 RTM=1 LDBOT=0 LDISKB=1 LDAB=1 \
+Index: ATLAS/tune/blas/gemm/CASES/zcases.flg
+===================================================================
+--- ATLAS.orig/tune/blas/gemm/CASES/zcases.flg
++++ ATLAS/tune/blas/gemm/CASES/zcases.flg
+@@ -1,5 +1,5 @@
+ <ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
+-31
++29
+ 306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
+ gcc
+ -mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
+@@ -76,12 +76,6 @@ gcc
+ 336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mips4
+-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
+-gcc
+--x assembler-with-cpp
+-338 192 8 4 2 1 0 8 4 2 ATL_dmm8x4x2_vsx.c "IBM" \
+-gcc
+--O3 -mvsx
+ 339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
+ gcc
+ -x assembler-with-cpp -mfpu=vfpv3
diff -Nru atlas-3.10.2/debian/patches/series.ppc64el atlas-3.10.2/debian/patches/series.ppc64el
--- atlas-3.10.2/debian/patches/series.ppc64el 1969-12-31 21:00:00.000000000 -0300
+++ atlas-3.10.2/debian/patches/series.ppc64el 2014-10-24 19:45:37.000000000 -0200
@@ -0,0 +1,5 @@
+ppc64el/atlas.3.10.2-ppc64le_abiv2_step2.patch
+ppc64el/atlas.3.10.2-ppc64le_abiv2_step3.patch
+ppc64el/atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
+ppc64el/atlas-new_archdef_for_ppc64le.patch
+ppc64el/atlas.3.10.2-add_power8_cpu.patch
diff -Nru atlas-3.10.2/debian/rules atlas-3.10.2/debian/rules
--- atlas-3.10.2/debian/rules 2014-10-15 16:36:14.000000000 -0300
+++ atlas-3.10.2/debian/rules 2014-10-24 19:45:37.000000000 -0200
@@ -17,6 +17,8 @@
 # - 51 means ARMv7: for armhf (but not for armel, which is ARM >= v4)
 # - 52 means GENERIC: the same than 0 (UNKNOWN), except that it does not try autodetection
 # See debian/patches/generic.diff
+# [ppc64el] *53* means GENERIC, as POWER8 processor is introduced.
+# See debian/patches/ppc64el/
 # Second number in ARCHS:
 # - 1 means no instruction set extension
 # - 384 means SSE1+SSE2 (always available on amd64)
@@ -31,8 +33,12 @@
 else ifeq ($(DEB_HOST_ARCH),armhf)
 ARCHS=base_51_1
 else
+ifeq ($(DEB_HOST_ARCH),ppc64el)
+ARCHS=base_53_1
+else
 ARCHS=base_52_1
 endif
+endif
 
 # Pointer bitwidth
 MODE_BITWIDTH = $(shell dpkg-architecture -qDEB_HOST_ARCH_BITS)
@@ -86,6 +92,37 @@
 (test -f CONFIG/ARCHS/ARMv732.tar.bz2.old && mv CONFIG/ARCHS/ARMv732.tar.bz2.old CONFIG/ARCHS/ARMv732.tar.bz2) || true
 (test -f CONFIG/ARCHS/ARMv732NEON.tar.bz2.old && mv CONFIG/ARCHS/ARMv732NEON.tar.bz2.old CONFIG/ARCHS/ARMv732NEON.tar.bz2) || true
 
+# The ppc64el patches affect general powerpc files, and are work in progress.
+# So, for now, restrict them to the ppc64el builds. (debian/patches/ppc64el/)
+PATCH_SERIES_CURRENT := debian/patches/series
+PATCH_SERIES_PPC64EL := debian/patches/series.ppc64el
+PATCH_SERIES_ORIG := debian/patches/series.ppc64el-orig
+
+patch-ppc64el:
+ifeq ($(DEB_HOST_ARCH),ppc64el)
+ dh_testdir
+ # Store original patch series, append ppc64el patches, and apply them.
+ if ! test -f $(PATCH_SERIES_ORIG); then \
+ cp -a $(PATCH_SERIES_CURRENT) $(PATCH_SERIES_ORIG); \
+ cat $(PATCH_SERIES_PPC64EL) >> $(PATCH_SERIES_CURRENT); \
+ while quilt --quiltrc /dev/null next | grep '^ppc64el/'; do \
+ quilt --quiltrc /dev/null push; \
+ done; \
+ fi
+endif
+
+unpatch-ppc64el:
+ifeq ($(DEB_HOST_ARCH),ppc64el)
+ dh_testdir
+ # Unapply ppc64el patches, and restore original patch series.
+ if test -f $(PATCH_SERIES_ORIG); then \
+ while quilt --quiltrc /dev/null top | grep '^ppc64el/'; do \
+ quilt --quiltrc /dev/null pop -R; \
+ done; \
+ mv $(PATCH_SERIES_ORIG) $(PATCH_SERIES_CURRENT); \
+ fi
+endif
+
 # Build a custom package optimized for the current arch
 custom: custom-stamp
 .PHONY: custom
@@ -122,7 +159,7 @@
 touch $@
 
 common-configure-arch common-configure-indep:: configure-stamp
-configure-stamp:
+configure-stamp: patch-ppc64el
 dh_testdir
 
 set -e; \
@@ -167,7 +204,7 @@
 touch $@
 
 clean:: clean-work
-clean-work: restore-armhf-archdef
+clean-work: restore-armhf-archdef unpatch-ppc64el
 dh_testdir
 dh_testroot
 rm -rf build check
diff -Nru atlas-3.10.2/debian/source/include-binaries atlas-3.10.2/debian/source/include-binaries
--- atlas-3.10.2/debian/source/include-binaries 2014-07-12 07:23:26.000000000 -0300
+++ atlas-3.10.2/debian/source/include-binaries 2014-10-24 19:45:37.000000000 -0200
@@ -6,5 +6,6 @@
 debian/archdefs/mips/GENERIC32.tar.bz2
 debian/archdefs/mipsel/GENERIC32.tar.bz2
 debian/archdefs/powerpc/GENERIC32.tar.bz2
+debian/archdefs/ppc64el/GENERIC64LE.tar.bz2
 debian/archdefs/s390x/IBMz964.tar.bz2
 debian/archdefs/sparc/USI32.tar.bz2
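
A side note on the 120 => 104 and 128 => 112 offsets in the 'abiv2 step3' patch: its description says they were identified with gdb. They also appear to follow from the stack frame layouts of the two ABIs. The C sketch below assumes (this is my reading of the ABI documents, not something stated in the patch) a 48-byte frame header under ELFv1 and a 32-byte one under ELFv2, each followed by 8-byte parameter save slots. The names param_slot_offset() and current_elf_abi() are hypothetical, used only for this illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: byte offset from r1, at function entry, of the
 * idx-th (0-based) doubleword slot in the parameter save area.
 * Assumption: the frame header is 48 bytes in ELFv1 and 32 bytes in ELFv2.
 */
int param_slot_offset(int elf_abi, int idx)
{
    int header;

    assert(elf_abi == 1 || elf_abi == 2);
    assert(idx >= 0);
    header = (elf_abi == 2) ? 32 : 48;
    return header + 8 * idx;
}

/*
 * Hypothetical helper: ABI selected at compile time. Like the "#else"
 * branches in the patch, anything other than _CALL_ELF == 2 is treated
 * as ABIv1 (this includes compilers that do not define _CALL_ELF).
 */
int current_elf_abi(void)
{
#if defined(_CALL_ELF) && _CALL_ELF == 2
    return 2;
#else
    return 1;
#endif
}
```

In ATL_USERMM(M, N, K, alpha, A, lda, B, ldb, beta, C, ldc), C is parameter index 9 and ldc is index 10, so the sketch gives 120 and 128 for ABIv1 and 104 and 112 for ABIv2, the same values the patch uses. The FSIZE+104 and FSIZE+112 forms in the AltiVec kernels would be the same slots addressed after the kernel has allocated its own FSIZE-byte frame.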