On Sun, Jan 8, 2023 at 12:24 AM Hervé Pagès <hpages.on.git...@gmail.com> wrote:
> On 05/01/2023 18:52, Vincent Carey wrote:
> >
> > On Thu, Jan 5, 2023 at 7:08 PM Vincent Carey <st...@channing.harvard.edu>
> > wrote:
> >
> >> On Thu, Jan 5, 2023 at 1:44 PM Hervé Pagès <hpages.on.git...@gmail.com>
> >> wrote:
> >>
> >>> Hi Martin,
> >>>
> >>> Linux runs on many architectures, ARM64 is just one of them.
> >>>
> >>> Our daily builds have traditionally focused on 3 platforms: Intel-based
> >>> Linux (Ubuntu 22.04), Windows, and Intel-based Mac. Note that we
> >>> recently added ARM64-based Mac to our daily builds.
> >>>
> >>> One big difference between Linux and the other platforms is that we only
> >>> produce binary packages for the latter. More precisely:
> >>>
> >>> - on the Linux builders: the daily builds only run 'R CMD INSTALL', 'R
> >>> CMD build', and 'R CMD check', on each Bioconductor package,
> >>>
> >>> - on the Windows and Mac builders: the daily builds run all the above
> >>> plus an additional step that we call the BUILD BIN step that produces a
> >>> binary for each Bioconductor package.
> >>>
> >>> This means that on Linux, as well as on any other Unix-like OS that is
> >>> not macOS (e.g. FreeBSD, OpenBSD, Solaris, HP-UX, etc...), users will
> >>> install all their packages (Bioconductor and CRAN) **from source**. This
> >>> should work as long as they are on a platform where R is supported and
> >>> have the required compilers (C, C++, and Fortran).
> >>>
> >>> Note that if officially supporting a given platform means running the
> >>> daily builds on that particular platform, then there's no way for us to
> >>> do that because platform == OS + architecture, and the list of
> >>> combinations of Unix-like OS's (Linux, FreeBSD, Solaris, etc...) +
> >>> architectures (Intel, ARM64, Sparc, powerpc) is endless. Even if we
> >>> narrow this list to Intel-based Linux, there are hundreds of Linux
> >>> distributions around that use different kernels, compilers, package
> >>> managers, etc...
> >>>
> >>> All this to say that, as far as the daily builds are concerned, we had
> >>> to make choices, and those choices are based on the most commonly used
> >>> platforms. Since all Bioconductor packages are tested daily on
> >>> Intel-based Linux (Ubuntu 22.04), Windows, Intel-based Mac, and
> >>> ARM64-based Mac, we have some reasonable confidence that they will work
> >>> properly on these 4 platforms (still not a 100% guarantee of course,
> >>> there's nothing like that).
> >>>
> >>> My understanding is that ARM64-based Linux is still a marginally used
> >>> platform so probably not worth for us to allocate resources on adding it
> >>> to our daily builds at the moment. If it ever becomes more mainstream in
> >>> the future, then we will certainly reconsider. That does not mean that
> >>> you can't use Bioconductor on an ARM64-based Linux machine **now**. I see
> >>> no reason a priori why you couldn't install (from source) Bioconductor
> >>> packages on this platform, and use them, as long as:
> >>>
> >>
> >> Thanks Hervé for a good overview of the issues. I think there are a
> >> couple of reasons to keep this dialogue going (and there is now a
> >> community slack channel for further discussion: #arm-linux at
> >> community-bioc.slack.com.)
> >>
> >> The first reason is Martin's offer of resources to accomplish the support
> >> aim. What exactly that support aim is remains to be made precise. As you
> >> note, a properly configured system with R can use BiocManager::install to
> >> build from source, but there are a few additional things that can be done
> >> to produce binaries, and perhaps some of our software in BBS or some of
> >> the binary repo generation tools could be useful for Martin's group to
> >> make a relevant binary repo. The package-management oriented process of
> >> Dirk Eddelbuettel's r2u <https://github.com/eddelbuettel/r2u> also seems
> >> potentially relevant. We also have tooling to build all the CRAN
> >> dependencies that Bioc packages declare. This is all in the open and it
> >> would be interesting to see how much work is needed to get solutions for
> >> ARM64 linux. It could lead to some robustification of the existing build
> >> machinery. I am not offering to do it, but the fact that all the tooling
> >> is out in the open may not be fully clear and I am just mentioning this.
> >>
> >> The second reason to stay engaged is the nature of the ARM platform,
> >> which is said to require lower power consumption for equivalent
> >> throughput. It may be environmentally beneficial to be ahead of the curve
> >> in being able to work with this platform. Earlier I linked to a github
> >> issue indicating that rocker now has a dual platform container image
> >> including arm64 support but I don't know if that really addresses the
> >> issue at hand. Maybe I need to go onto a graviton machine to find out.
> >>
> >
> > So I did this, and here are some notes:
> >
> > 1) it is easy to get such a machine in AWS, a1.2xlarge
> >    Linux 10a568f32a1c 4.14.296-222.539.amzn2.aarch64 #1 SMP Wed Oct 26
> >    20:36:51 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
> > 2) using the rocker/rstudio:latest-daily I could get DESeq2 installed in
> >    about 20 minutes of compilation of dependent packages
> > 3) to get a checkable version of DESeq2 I needed to enhance the rocker
> >    environment
> >      apt-get install libxml2-dev
> >      apt install libpng-dev
> >      apt install libgit2-dev
> >      apt install -y libmagick++-dev
> >      apt install -y libharfbuzz-dev libfribidi-dev
> > 4) DESeq2 check in release version (1.38.2) failed (but it passes on intel
> >    linux):
> >
> > Running examples in ‘DESeq2-Ex.R’ failed
> > The error most likely occurred in:
> >
> > > ### Name: unmix
> > > ### Title: Unmix samples using loss in a variance stabilized space
> > > ### Aliases: unmix
> > >
> > > ### ** Examples
> > >
> > >
> > > # some artificial data
> > > cts <- matrix(c(80,50,1,100,
> > +                 1,1,60,100,
> > +                 0,50,60,100), ncol=4, byrow=TRUE)
> > > # make a DESeqDataSet
> > > dds <- DESeqDataSetFromMatrix(cts,
> > +   data.frame(row.names=seq_len(ncol(cts))), ~1)
> > converting counts to integer mode
> > > colnames(dds) <- paste0("sample",1:4)
> > >
> > > # note! here you would instead use
> > > # estimateSizeFactors() to do actual normalization
> > > sizeFactors(dds) <- rep(1, ncol(dds))
> > >
> > > norm.cts <- counts(dds, normalized=TRUE)
> > >
> > > # 'pure' should also have normalized counts...
> > > pure <- matrix(c(10,0,0,
> > +                  0,0,10,
> > +                  0,10,0), ncol=3, byrow=TRUE)
> > > colnames(pure) <- letters[1:3]
> > >
> > > # for real data, you need to find alpha after fitting estimateDispersions()
> > > mix <- unmix(norm.cts, pure, alpha=0.01)
> > Warning in sqrt(alpha * q) : NaNs produced
> > Error in optim(par = rep(1, ncol(pure)), fn = sumLossVST, gr = NULL, i, :
> >   L-BFGS-B needs finite values of 'fn'
> > Calls: unmix -> lapply -> lapply -> FUN -> optim
>
> Hmm.. this ain't good :-(
>
> > Is there bugged/nonportable code somewhere in the stack underlying this
> > example?
>
> Probably.
>
> > That could take some time to figure out.
> >
> > I conclude that the mechanics of working with ARM64 and R to process
> > Bioconductor packages are very tractable, but the work needed to get the
> > whole ecosystem to a favorable state, as usable as it is for intel linux
> > or mac or windows, may be laborious.
>
> OK so maybe a good start would be to try to set up the daily builds (BBS)
> on one of those AWS a1.2xlarge or a1.4xlarge instances, or, even better, on
> one of the VMs that Martin is offering? If we use ARM64 Ubuntu on it,
> setting up the builds there should be very similar to what we do for our
> current Intel Ubuntu build machines, which is easy and well-documented.
>

I've sent you privately the SSH details for an Ubuntu 22.04 ARM64 VM!
Please let me know if I can help in any way with the setup and testing!

Kind regards,
Martin

> H.
>
> >> In any case it is not so often that we get a request for enhancements
> >> that includes an offer of VMs and person power so I want to be sure we
> >> don't lose the thread prematurely.
> >>
> >>> - R is supported on your ARM64-based Linux machine
> >>>
> >>> - you have compilers that are supported by R
> >>>
> >>> - you have the external libraries that are required by some CRAN and/or
> >>> Bioconductor packages.
> >>>
> >>> Hope this helps,
> >>>
> >>> H.
> >>>
> >>> On 05/01/2023 02:01, Martin Grigorov wrote:
> >>> > Dear community,
> >>> >
> >>> > Happy and successful new year!
> >>> >
> >>> > Apologies if this has been discussed before but
> >>> > https://stat.ethz.ch/pipermail/bioc-devel/ does not provide search
> >>> > facilities and my googling didn't help much!
> >>> >
> >>> > I'd like to ask whether Linux ARM64 is officially supported?
> >>> > I know that Mac ARM64 is supported since 3.16 [1] [2].
> >>> > I cannot find such test results for Linux ARM64 and the site search [3]
> >>> > also mentions "arm64" only in context of "macOS".
> >>> > In addition the Docker images are also single-platform [4]
> >>> > (linux/amd64).
> >>> >
> >>> > How can we help to add support for Linux ARM64?
> >>> > My employer is willing to donate VMs and man power if the community is
> >>> > interested in adding support for Linux ARM64!
> >>> >
> >>> >
> >>> > Regards,
> >>> > Martin
> >>> >
> >>> > 1. https://bioconductor.org/news/bioc_3_16_release/
> >>> > 2. https://bioconductor.org/checkResults/3.17/bioc-mac-arm64-LATEST/
> >>> > 3. https://bioconductor.org/help/search/index.html?q=arm64/
> >>> > 4. https://hub.docker.com/r/bioconductor/bioconductor_docker/tags
> >>> >
> >>> > _______________________________________________
> >>> > Bioc-devel@r-project.org mailing list
> >>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >>>
> >>> --
> >>> Hervé Pagès
> >>>
> >>> Bioconductor Core Team
> >>> hpages.on.git...@gmail.com
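
For anyone repeating the experiment from the notes in this thread, the system libraries listed in step 3 can be gathered into one small script. This is only a sketch: the function names is_arm64 and sysdeps_command are made up for illustration, the package list is copied from the notes and may not be complete for other Bioconductor packages, and the script prints the install command instead of running it, so it is safe to try on any machine.

```shell
#!/bin/sh
# System libraries that the notes in this thread report as missing from the
# rocker image when checking DESeq2 (copied from the thread, not exhaustive).
BIOC_SYSDEPS="libxml2-dev libpng-dev libgit2-dev libmagick++-dev libharfbuzz-dev libfribidi-dev"

# is_arm64 (hypothetical helper): succeeds when the given machine string,
# or the output of `uname -m` if none is given, names a 64-bit ARM platform.
is_arm64() {
    case "${1:-$(uname -m)}" in
        aarch64|arm64) return 0 ;;
        *) return 1 ;;
    esac
}

# sysdeps_command (hypothetical helper): prints the apt-get command that
# would install the libraries; it does not execute it.
sysdeps_command() {
    printf 'apt-get install -y %s\n' "$BIOC_SYSDEPS"
}

# Report the platform, then show the command to run as root in the container.
if is_arm64; then
    echo "ARM64 host detected"
else
    echo "non-ARM64 host: $(uname -m)"
fi
sysdeps_command
```

On the a1.2xlarge instance described above, `uname -m` reports aarch64, so the first branch would be taken. The printed command still has to be run by hand (or piped to sh) inside the container.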