On Tue, Jul 13, 2021 at 7:21 PM Sébastien Villemot <[email protected]> wrote:
>
> Hi Mathieu,
>
> On Tuesday 13 July 2021 at 18:56 +0200, Mathieu Malaterre wrote:
> >
> > On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot <[email protected]>
> > wrote:
> > >
> > > The wiki page that synthesizes architecture specificities indicates
> > > that Altivec is included in the baseline for the ppc64 port:
> > > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> > >
> > > However my understanding is that this port supports any powerpc64 CPU,
> > > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> > > also what the main wiki page for PPC64 says:
> > > https://wiki.debian.org/PPC64
> > >
> > > Can someone please clarify the situation?
> > >
> > > (I’m asking because I’m the maintainer of the openblas package, and
> > > knowing whether Altivec is available or not, and more generally what is
> > > in the baseline, is essential for proper packaging).
> >
> > I do not believe that you can do much as a packager. You cannot assume
> > anything on the target arch. You need to do the same thing as ffmpeg
> > is doing for avx2/sse4 on amd64, you need to do runtime detection. So
> > unless upstream is doing something very clever you cannot compile blas
> > using any of the fancy altivec instructions :(
> >
> > The man page for ld.so mentions something about optimized libraries
> > (search for "/usr/lib/sse2/"), but this is currently not in use in
> > Debian (AFAIK).
>
> Actually OpenBLAS has its own runtime detection mechanism, which is
> used to select the best linear algebra kernel for the current CPU
> (those kernels are mainly written in assembly, and take advantage of
> available ISA extensions). This mechanism is used on several archs,
> including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
> a POWER9 kernel; there is even a POWER10 kernel already available).
>
> However, I cannot enable this mechanism on ppc64 and powerpc, because
> the runtime detection only works for POWER6 and above, and my
> understanding is that for these two ports the baseline is lower. Hence
> on these two archs, only one kernel is included in the package binaries
> (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
> performance, users should recompile OpenBLAS locally (as indicated in
> the package description and in README.Debian).
There are plenty of people on this mailing list who could test/verify
that. Is there a quick way to check that your openblas package is
compiled correctly for ppc32 and ppc64 (like a verbose mode)? Did you
run any experiments on perotto.debian.net?

> I am however not sure that my current choices for the ppc64 and powerpc
> baselines are optimal, hence this thread.
>
> --
> ⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
> ⣾⠁⢠⠒⠀⣿⡁  Debian Developer
> ⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
> ⠈⠳⣄⠀⠀⠀⠀  https://www.debian.org
>
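
For anyone who wants to test on real hardware: as far as I know, on
powerpc the Linux kernel advertises AltiVec and VSX support through bits
in the AT_HWCAP word of the auxiliary vector. Below is a minimal Python
sketch of reading it, assuming a Linux system with /proc mounted. The
helper names decode_hwcap and read_hwcap are made up for illustration,
and this is not how OpenBLAS itself detects the CPU.

```python
import struct

# Bit values as defined in the Linux kernel's powerpc <asm/cputable.h>.
# They are only meaningful when AT_HWCAP comes from a powerpc machine.
PPC_FEATURE_HAS_ALTIVEC = 0x10000000
PPC_FEATURE_HAS_VSX = 0x00000080
AT_HWCAP = 16  # auxiliary-vector key, from <elf.h>


def decode_hwcap(hwcap):
    """Return which vector extensions an AT_HWCAP word advertises."""
    return {
        "altivec": bool(hwcap & PPC_FEATURE_HAS_ALTIVEC),
        "vsx": bool(hwcap & PPC_FEATURE_HAS_VSX),
    }


def read_hwcap(path="/proc/self/auxv"):
    """Read AT_HWCAP from the auxiliary vector; None if unavailable."""
    # Each entry is a (key, value) pair of native unsigned longs.
    entry = struct.Struct("@LL")
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return None
    for off in range(0, len(data) - entry.size + 1, entry.size):
        key, value = entry.unpack_from(data, off)
        if key == AT_HWCAP:
            return value
    return None
```

On a ppc64 or powerpc box, decode_hwcap(read_hwcap()) should then say
whether the CPU has AltiVec (after checking that read_hwcap() did not
return None). On other architectures the same bits mean something else,
so the result should be ignored there.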

