[replying on the debian-med list with permission. Please keep Martin and
Milot CC'd as they do not subscribe]

On Fri, May 8, 2020 at 7:36 PM Milot Mirdita <mi...@mirdita.de> wrote:

> Hi Michael,
>
> I am a developer on the MMseqs2 team and I saw your tweet regarding the
> AWS ARM64 machines earlier and checked on Debian Salsa if it would be a lot
> of work enabling ARM64 support with the next release as we worked on that
> recently.
>

Hey Milot, thanks for your email!

> I saw that Debian's MMseqs2 now uses SIMDe to abstract away different
> architectures. While this is a very cool technical achievement, I am very
> uncomfortable with it without being properly integrated into and monitored
> by our CI regression testing.
>
> During ARM64 development I found that there are a lot of subtle issues
> that can result in differing sensitivity between architectures (e.g.
> ARM64's default unsigned char type causes issues, there are many crashes on
> 32-bit ARM). I am also worried that our two most important platforms
> (SSE4.1 and AVX2) might suffer from performance regressions.
>

Interesting! On Debian we have to provide binaries that respect the
architecture baseline. That means no SSE-only or SSE2-only binaries on
i386/i686 and no SSE3+-only binaries on AMD64. That's why we compile
mmseqs2 multiple times: one version that doesn't violate the baseline,
along with versions built for higher SIMD levels, so that the one matching
the highest level supported by the user's CPU can be selected at run time.

https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/rules#L22

https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/bin/simd-dispatch
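
To illustrate the dispatch idea, here is a minimal sketch. It is not the
actual simd-dispatch script linked above; the function names, the list of
levels, and the mmseqs-<level> naming are assumptions made up for
illustration. Note that Linux reports SSE3 as "pni" in /proc/cpuinfo.

```shell
#!/bin/sh
# Hypothetical sketch only; NOT the real debian/bin/simd-dispatch script.

# Print the highest SIMD level present in a space-separated CPU flag list.
# Each entry is "<cpuinfo flag>:<level name>", ordered from most to least
# capable. "plain" stands for the baseline build.
pick_simd_level() {
    flags=" $1 "
    for pair in avx2:avx2 sse4_1:sse4.1 ssse3:ssse3 pni:sse3 sse2:sse2; do
        flag=${pair%%:*}
        level=${pair#*:}
        case "$flags" in
            *" $flag "*) echo "$level"; return 0 ;;
        esac
    done
    echo plain
}

# Print the flag list of the first CPU (Linux on x86 only).
cpu_flags() {
    sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' /proc/cpuinfo 2>/dev/null | head -n 1
}

# A wrapper could then run the matching build, assuming binaries named
# mmseqs-<level> live in some private directory (hypothetical path):
#   exec "/usr/lib/mmseqs2/mmseqs-$(pick_simd_level "$(cpu_flags)")" "$@"
```

For example, pick_simd_level 'fpu sse sse2 pni' prints sse3, and a CPU with
none of the listed flags gets the plain baseline build.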


>
> We will have ARM64 and hopefully also PPC64LE support in the next release.
> I would suggest to either wait and use our upstream code, or submit a PR
> with your changes to us and see how we can integrate everything correctly.
>

Sure, happy to send the patches! I meant to, but hadn't gotten around to it
yet.

https://wiki.debian.org/SIMDEverywhere#Packages_Status


>
> Also I would be very glad if you could integrate the full regression suite
> to spot if all architectures produce consistent results. You can run the
> regression by calling from the repository:
> git submodule update --init
> ./util/regression/run_regression.sh ./path-to-mmseqs-binary scratch-directory
>

Oh yeah, would love to! Except we need all the upstream sources in a single
tarball, which the combination of git submodules and GitHub releases makes
difficult. So if you can add a complete source tarball (including all git
submodules) to
https://github.com/soedinglab/MMseqs2/releases that would be appreciated!
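
In case it helps, here is a rough sketch of one way to produce such a
tarball from a checkout whose submodules are already populated. The
function name is made up, and this is only an assumption about what would
work for you, not a description of your release process.

```shell
#!/bin/sh
# Hypothetical helper: pack a checked-out source tree, submodule contents
# included, into one tarball while leaving out git metadata.
# Usage: make_source_tarball <checkout-dir> <output.tar.gz>
make_source_tarball() {
    src=$1
    out=$2
    parent=$(dirname "$src")
    name=$(basename "$src")
    # --exclude=.git drops the top-level .git directory as well as the
    # .git entries that git places inside each submodule.
    tar -C "$parent" --exclude=.git -czf "$out" "$name"
}

# Typical use (directory and tarball names are placeholders):
#   git clone https://github.com/soedinglab/MMseqs2.git
#   git -C MMseqs2 submodule update --init --recursive
#   make_source_tarball MMseqs2 "$PWD/MMseqs2-with-submodules.tar.gz"
```

The top-level directory in the archive is simply the name of the checkout
directory, so renaming the checkout before packing sets the prefix.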


>
> We had refactored this test suite to make it as easy as possible to use
> for Shayan who initially had proposed to package MMseqs2 for Debian. The
> test subfolder is badly named and contains scratch scripts for feature
> development. They don't do anything useful for testing such as finding
> performance or sensitivity drops.
>

Noted.


> Thanks for your work and best regards,
>

Thank you for sharing your work under a F/OSS license and for your
contributions to Open Science!


> Milot
>


-- 
Michael R. Crusoe
