Sometime between 6 and 18 November, BiocNeighbors’ BioC-devel builds began 
failing on Windows 64-bit, and have continued to fail since:

http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/ 

The most interesting part is the nature of the failures. They are not 
segmentation faults but rather “incorrect” output in the unit tests:

- BiocNeighbors uses the Annoy algorithm for approximate nearest neighbor 
search, which is provided as a header-only C++ library in the RcppAnnoy package.

- I have compiled the BiocNeighbors C++ code with an “#include” for this 
library to use the Annoy routines. For testing, I compared the output of my 
C++ code to the output of the code in the RcppAnnoy package.

- It is these tests that are failing (i.e., the output does not match up) 
during CHECK on Windows 64-bit only, despite the fact that the same library is 
being “#include”d in both the BiocNeighbors and RcppAnnoy sources!

What makes this particularly intriguing is that the differences between 
BiocNeighbors and RcppAnnoy are very minor. Less than 1% of the neighbor 
identities differ, and only for some of the scenarios, so it’s not an obvious 
bug that would be changing the output en masse. Now, the package also 
uses/tests Annoy in BioC-release but builds fine on tokay1:

http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/ 

The major difference between the BioC-release and BioC-devel builds is the 
compilation flags, which have changed from “-O2 -mtune=generic” to “-O3 
-march=native -mtune=native” on tokay2. I am told (thanks Val) that the timing 
of this change is consistent with the start of the BiocNeighbors build failures 
on tokay2. I would guess that RcppAnnoy is still compiled with “-O2 
-mtune=generic” on the CRAN build systems, so the BiocNeighbors and RcppAnnoy 
binaries are built with different optimization settings. These could be 
responsible for the discrepancies in the search results.

I was able to reproduce this on my Unix cluster (gcc 6.5.0) where setting 
“-march=native” with either “-O3” or “-O2” caused a difference in the 
calculations. After much trial and error, I eventually narrowed this down to 
the “-mfma” flag, which lets the compiler emit fused multiply-add instructions. 
These round once rather than after both the multiply and the add, which seems 
to change the precision of the result and thus the search results. This occurs 
even when AVX support is turned off; I guess the compiler tries to be smart if 
it detects a multiply followed by an addition, which is a pretty common thing 
to do when computing Euclidean distances.
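
To illustrate the kind of discrepancy I mean, here is a minimal sketch. It is 
not the Annoy code; the function names are made up for this example, and I use 
std::fma and a volatile temporary to force the fused and unfused behaviour 
explicitly rather than relying on any particular compiler flags.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from Annoy): squared Euclidean distance
// where each product is rounded to float before being added to the sum.
// The volatile store prevents the compiler from contracting the multiply
// and the add into one fused instruction, whatever flags are in use.
float sq_dist_unfused(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const float diff = x[i] - y[i];
        volatile float product = diff * diff; // first rounding
        sum += product;                       // second rounding
    }
    return sum;
}

// Hypothetical helper: the same loop with an explicit fused multiply-add.
// std::fma computes diff * diff + sum with a single rounding at the end.
float sq_dist_fused(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const float diff = x[i] - y[i];
        sum = std::fma(diff, diff, sum);
    }
    return sum;
}
```

For example, with x = {2^-12, 1 + 2^-12} and y = {0, 0} in single precision, 
the unfused version returns 1 + 2^-11 while the fused version returns 
1 + 2^-11 + 2^-23. A difference of one unit in the last place like this is 
enough to swap the order of two near-tied neighbors.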

In summary: can we not use “-march=native” on tokay2? (Val, I know we discussed 
this, but whatever changes you made to the compilation flags don’t seem to have 
propagated to the build machines.) As the case study with BiocNeighbors shows, 
this leads to inconsistencies between the CRAN and BioC-devel binaries for the 
same code, which unnecessarily complicates downstream usage and unit tests. I 
also wonder how binaries specialized for tokay2’s architecture would behave on 
other CPUs with different instruction sets, if they would run at all.

Cheers,

Aaron

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
