Re: [Bioc-devel] Support for Linux ARM64

2023-01-05 Thread Martin Grigorov
Hello Hervé,

Thank you for your detailed response!

I very much agree with you that the number of combinations (OSes, OS
flavors, CPU architectures, ...) is endless and it is impossible to test
them all!

I work for one of the cloud providers and I can say that the demand for
Linux ARM64 deployments has increased steadily over the last few years!
All of the major cloud providers started offering Linux ARM64 compute
instances in addition to the "standard" Linux x86_64 ([1], [2], [3], [4],
[5], [6]). Only IBM offers its own s390x [7] in addition to x86_64.

Microsoft is pushing hard for Windows on ARM64 [8] [9] too! I guess sooner
or later someone will contact the Bioconductor community with a request for
Windows on ARM64.

I hope the Bioconductor community will recognize Linux ARM64 as an
important platform to be supported!

1. https://aws.amazon.com/ec2/instance-types/ (A1)
2. https://cloud.google.com/compute/docs/instances/arm-on-compute (Tau T2A)
3. https://azure.microsoft.com/en-us/blog/azure-virtual-machines-with-ampere-altra-arm-based-processors-generally-available/
4. https://www.oracle.com/cloud/compute/arm/
5. https://www.alibabacloud.com/product/ecs?spm=a3c0i.7938564.8215766810.5.20ff441eQxRIsn
6. https://support.huaweicloud.com/intl/en-us/productdesc-ecs/en-us_topic_0035470096.html (Kunpeng)
7. https://cloud.ibm.com/vpc-ext/provision/vs
8. https://www.microsoft.com/en-us/search/explore?q=arm64+windows
9. https://devblogs.microsoft.com/visualstudio/arm64-visual-studio-is-officially-here/

Regards,
Martin

P.S. I am not a native English speaker! I recognize that "man power" is not
the most appropriate wording in my previous message. I meant "people"!


On Thu, Jan 5, 2023 at 8:43 PM Hervé Pagès wrote:

> Hi Martin,
>
> Linux runs on many architectures, ARM64 is just one of them.
>
> Our daily builds have traditionally focused on 3 platforms: Intel-based
> Linux (Ubuntu 22.04), Windows, and Intel-based Mac. Note that we
> recently added ARM64-based Mac to our daily builds.
>
> One big difference between Linux and the other platforms is that we only
> produce binary packages for the latter. More precisely:
>
> - on the Linux builders: the daily builds only run 'R CMD INSTALL', 'R
> CMD build', and 'R CMD check', on each Bioconductor package,
>
> - on the Windows and Mac builders: the daily builds run all the above
> plus an additional step that we call the BUILD BIN step that produces a
> binary for each Bioconductor package.
>
> This means that on Linux, as well as on any other Unix-like OS that is
> not macOS (e.g. FreeBSD, OpenBSD, Solaris, HP-UX, etc...), users will
> install all their packages (Bioconductor and CRAN) **from source**. This
> should work as long as they are on a platform where R is supported and
> have the required compilers (C, C++, and Fortran).
>
> Note that if officially supporting a given platform means running the
> daily builds on that particular platform, then there's no way for us to
> do that because platform == OS + architecture, and the list of
> combinations of Unix-like OS's (Linux, FreeBSD, Solaris, etc...) +
> architectures (Intel, ARM64, Sparc, powerpc) is endless. Even if we
> narrow this list to Intel-based Linux, there are hundreds of Linux
> distributions around that use different kernel, compilers, package
> managers, etc...
>
> All this to say that, as far as the daily builds are concerned, we had
> to make choices, and those choices are based on the most commonly used
> platforms. Since all Bioconductor packages are tested daily on
> Intel-based Linux (Ubuntu 22.04), Windows, Intel-based Mac, and
> ARM64-based Mac, we have some reasonable confidence that they will work
> properly on these 4 platforms (still not a 100% guarantee of course,
> there's nothing like that).
>
> My understanding is that ARM64-based Linux is still a marginally used
> platform, so it is probably not worth allocating resources to add it
> to our daily builds at the moment. If it ever becomes more mainstream in
> the future, then we will certainly reconsider. That does not mean that
> you can't use Bioconductor on an ARM64-based Linux machine **now**. I see
> no reason a priori why you couldn't install (from source) Bioconductor
> packages on this platform, and use them, as long as:
>
> - R is supported on your ARM64-based Linux machine
>
> - you have compilers that are supported by R
>
> - you have the external libraries that are required by some CRAN and/or
> Bioconductor packages.
>
> Hope this helps,
>
> H.
>
> On 05/01/2023 02:01, Martin Grigorov wrote:
> > Dear community,
> >
> > Happy and successful new year!
> >
> > Apologies if this has been discussed before but
> > https://stat.ethz.ch/pipermail/bioc-devel/ does not provide search
> > facilities and my googling didn't help much!
> >
> > I'd like to ask whether Linux ARM64 is officially supported ?
> > I know that Mac ARM64 is supported since 3.16 [1] [2].
> > I cannot find such test results for Linux ARM64 and the site 

Re: [Bioc-devel] Support for Linux ARM64

2023-01-05 Thread Vincent Carey
On Thu, Jan 5, 2023 at 7:08 PM Vincent Carey wrote:

>
>
> On Thu, Jan 5, 2023 at 1:44 PM Hervé Pagès wrote:
>
>> Hi Martin,
>>
>> Linux runs on many architectures, ARM64 is just one of them.
>>
>> Our daily builds have traditionally focused on 3 platforms: Intel-based
>> Linux (Ubuntu 22.04), Windows, and Intel-based Mac. Note that we
>> recently added ARM64-based Mac to our daily builds.
>>
>> One big difference between Linux and the other platforms is that we only
>> produce binary packages for the latter. More precisely:
>>
>> - on the Linux builders: the daily builds only run 'R CMD INSTALL', 'R
>> CMD build', and 'R CMD check', on each Bioconductor package,
>>
>> - on the Windows and Mac builders: the daily builds run all the above
>> plus an additional step that we call the BUILD BIN step that produces a
>> binary for each Bioconductor package.
>>
>> This means that on Linux, as well as on any other Unix-like OS that is
>> not macOS (e.g. FreeBSD, OpenBSD, Solaris, HP-UX, etc...), users will
>> install all their packages (Bioconductor and CRAN) **from source**. This
>> should work as long as they are on a platform where R is supported and
>> have the required compilers (C, C++, and Fortran).
>>
>> Note that if officially supporting a given platform means running the
>> daily builds on that particular platform, then there's no way for us to
>> do that because platform == OS + architecture, and the list of
>> combinations of Unix-like OS's (Linux, FreeBSD, Solaris, etc...) +
>> architectures (Intel, ARM64, Sparc, powerpc) is endless. Even if we
>> narrow this list to Intel-based Linux, there are hundreds of Linux
>> distributions around that use different kernel, compilers, package
>> managers, etc...
>>
>> All this to say that, as far as the daily builds are concerned, we had
>> to make choices, and those choices are based on the most commonly used
>> platforms. Since all Bioconductor packages are tested daily on
>> Intel-based Linux (Ubuntu 22.04), Windows, Intel-based Mac, and
>> ARM64-based Mac, we have some reasonable confidence that they will work
>> properly on these 4 platforms (still not a 100% guarantee of course,
>> there's nothing like that).
>>
>> My understanding is that ARM64-based Linux is still a marginally used
>> platform, so it is probably not worth allocating resources to add it
>> to our daily builds at the moment. If it ever becomes more mainstream in
>> the future, then we will certainly reconsider. That does not mean that
>> you can't use Bioconductor on an ARM64-based Linux machine **now**. I see
>> no reason a priori why you couldn't install (from source) Bioconductor
>> packages on this platform, and use them, as long as:
>>
>>
> Thanks Hervé for a good overview of the issues. I think there are a couple
> of reasons to keep this dialogue going (and there is now a community slack
> channel for further discussion: #arm-linux at community-bioc.slack.com.)
>
> The first reason is Martin's offer of resources to accomplish the support
> aim. What exactly that support aim is remains to be made precise. As you
> note, a properly configured system with R can use BiocManager::install to
> build from source, but there are a few additional things that can be done
> to produce binaries, and perhaps some of our software in BBS or some of the
> binary repo generation tools could be useful for Martin's group to make a
> relevant binary repo. The package-management oriented process of Dirk
> Eddelbuettel's r2u also seems potentially relevant. We also have tooling to
> build all the CRAN dependencies that Bioc packages declare. This is all in
> the open and it would be interesting to see how much work is needed to get
> solutions for ARM64 linux. It could lead to some robustification of the
> existing build machinery. I am not offering to do it, but the fact that all
> the tooling is out in the open may not be fully clear and I am just
> mentioning this.
>
> The second reason to stay engaged is the nature of the ARM platform, which
> is said to require lower power consumption for equivalent throughput. It
> may be environmentally beneficial to be ahead of the curve in being able to
> work with this platform. Earlier I linked to a github issue indicating that
> rocker now has a dual platform container image including arm64 support but
> I don't know if that really addresses the issue at hand. Maybe I need to go
> onto a graviton machine to find out.
>

So I did this, and here are some notes:

1) it is easy to get such a machine in AWS, a1.2xlarge
Linux 10a568f32a1c 4.14.296-222.539.amzn2.aarch64 #1 SMP Wed Oct 26 20:36:51 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
2) using the rocker/rstudio:latest-daily I could get DESeq2 installed in
about 20 minutes of compilation of dependent packages
3) to get a checkable version of DESeq2 I needed to enhance the rocker
environment
4) apt-get install libxml2-dev
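
For anyone who wants to repeat the steps in these notes, here is a minimal
sketch. It only assumes what the notes mention (the rocker/rstudio:latest-daily
image, libxml2-dev, DESeq2). The function names, the `apt-get update` / `-y`
additions for non-interactive use, and the CRAN mirror URL are illustrative
assumptions, not part of the original notes.

```shell
#!/bin/sh
# Map a `uname -m` machine name to a Docker platform string.
# (Hypothetical helper, for illustration only.)
docker_platform() {
    case "$1" in
        aarch64|arm64) echo "linux/arm64" ;;
        x86_64|amd64)  echo "linux/amd64" ;;
        *)             echo "unknown" ;;
    esac
}

# Print the commands to run *inside* the container. Nothing is executed
# here; the script is only written to stdout.
container_script() {
    cat <<'EOF'
set -e
apt-get update
apt-get install -y libxml2-dev
Rscript -e 'install.packages("BiocManager", repos = "https://cloud.r-project.org")'
Rscript -e 'BiocManager::install("DESeq2", ask = FALSE)'
EOF
}

# Possible usage on an ARM64 host (assumes Docker is installed):
#   container_script | docker run --rm -i \
#       --platform "$(docker_platform "$(uname -m)")" \
#       rocker/rstudio:latest-daily sh
```

Further system libraries may be needed for other packages; libxml2-dev is
simply the one named above.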
  

Re: [Bioc-devel] Support for Linux ARM64

2023-01-05 Thread Vincent Carey
On Thu, Jan 5, 2023 at 1:44 PM Hervé Pagès wrote:

> Hi Martin,
>
> Linux runs on many architectures, ARM64 is just one of them.
>
> Our daily builds have traditionally focused on 3 platforms: Intel-based
> Linux (Ubuntu 22.04), Windows, and Intel-based Mac. Note that we
> recently added ARM64-based Mac to our daily builds.
>
> One big difference between Linux and the other platforms is that we only
> produce binary packages for the latter. More precisely:
>
> - on the Linux builders: the daily builds only run 'R CMD INSTALL', 'R
> CMD build', and 'R CMD check', on each Bioconductor package,
>
> - on the Windows and Mac builders: the daily builds run all the above
> plus an additional step that we call the BUILD BIN step that produces a
> binary for each Bioconductor package.
>
> This means that on Linux, as well as on any other Unix-like OS that is
> not macOS (e.g. FreeBSD, OpenBSD, Solaris, HP-UX, etc...), users will
> install all their packages (Bioconductor and CRAN) **from source**. This
> should work as long as they are on a platform where R is supported and
> have the required compilers (C, C++, and Fortran).
>
> Note that if officially supporting a given platform means running the
> daily builds on that particular platform, then there's no way for us to
> do that because platform == OS + architecture, and the list of
> combinations of Unix-like OS's (Linux, FreeBSD, Solaris, etc...) +
> architectures (Intel, ARM64, Sparc, powerpc) is endless. Even if we
> narrow this list to Intel-based Linux, there are hundreds of Linux
> distributions around that use different kernel, compilers, package
> managers, etc...
>
> All this to say that, as far as the daily builds are concerned, we had
> to make choices, and those choices are based on the most commonly used
> platforms. Since all Bioconductor packages are tested daily on
> Intel-based Linux (Ubuntu 22.04), Windows, Intel-based Mac, and
> ARM64-based Mac, we have some reasonable confidence that they will work
> properly on these 4 platforms (still not a 100% guarantee of course,
> there's nothing like that).
>
> My understanding is that ARM64-based Linux is still a marginally used
> platform, so it is probably not worth allocating resources to add it
> to our daily builds at the moment. If it ever becomes more mainstream in
> the future, then we will certainly reconsider. That does not mean that
> you can't use Bioconductor on an ARM64-based Linux machine **now**. I see
> no reason a priori why you couldn't install (from source) Bioconductor
> packages on this platform, and use them, as long as:
>
>
Thanks Hervé for a good overview of the issues. I think there are a couple
of reasons to keep this dialogue going (and there is now a community slack
channel for further discussion: #arm-linux at community-bioc.slack.com.)

The first reason is Martin's offer of resources to accomplish the support
aim. What exactly that support aim is remains to be made precise. As you
note, a properly configured system with R can use BiocManager::install to
build from source, but there are a few additional things that can be done
to produce binaries, and perhaps some of our software in BBS or some of the
binary repo generation tools could be useful for Martin's group to make a
relevant binary repo. The package-management oriented process of Dirk
Eddelbuettel's r2u also seems potentially relevant. We also have tooling to
build all the CRAN dependencies that Bioc packages declare. This is all in
the open and it would be interesting to see how much work is needed to get
solutions for ARM64 linux. It could lead to some robustification of the
existing build machinery. I am not offering to do it, but the fact that all
the tooling is out in the open may not be fully clear and I am just
mentioning this.

The second reason to stay engaged is the nature of the ARM platform, which
is said to require lower power consumption for equivalent throughput. It
may be environmentally beneficial to be ahead of the curve in being able to
work with this platform. Earlier I linked to a github issue indicating that
rocker now has a dual platform container image including arm64 support but
I don't know if that really addresses the issue at hand. Maybe I need to go
onto a graviton machine to find out.

In any case it is not so often that we get a request for enhancements that
includes an offer of VMs and person power so I want to be sure we don't
lose the thread prematurely.







> - R is supported on your ARM64-based Linux machine
>
> - you have compilers that are supported by R
>
> - you have the external libraries that are required by some CRAN and/or
> Bioconductor packages.
>
> Hope this helps,
>
> H.
>
> On 05/01/2023 02:01, Martin Grigorov wrote:
> > Dear community,
> >
> > Happy and successful new year!
> >
> > Apologies if this has been discussed before but
> > https://stat.ethz.ch/pipermail/bioc-devel/ does not provide search
> > 

Re: [Bioc-devel] Support for Linux ARM64

2023-01-05 Thread Hervé Pagès

Hi Martin,

Linux runs on many architectures, ARM64 is just one of them.

Our daily builds have traditionally focused on 3 platforms: Intel-based 
Linux (Ubuntu 22.04), Windows, and Intel-based Mac. Note that we 
recently added ARM64-based Mac to our daily builds.


One big difference between Linux and the other platforms is that we only 
produce binary packages for the latter. More precisely:


- on the Linux builders: the daily builds only run 'R CMD INSTALL', 'R 
CMD build', and 'R CMD check', on each Bioconductor package,


- on the Windows and Mac builders: the daily builds run all the above 
plus an additional step that we call the BUILD BIN step that produces a 
binary for each Bioconductor package.
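
As a rough illustration of the two points above (a sketch, not the actual
build system code): the function below prints, as a dry run, the commands
that correspond to each step for one package. Using `R CMD INSTALL --build`
as a stand-in for the BUILD BIN step is an assumption made for illustration;
the function name and the platform labels are hypothetical.

```shell
#!/bin/sh
# Print (do not run) the per-package steps for a given builder type.
#   $1: platform label, one of: linux, windows, mac  (hypothetical labels)
#   $2: package name, e.g. DESeq2
daily_build_steps() {
    platform="$1"
    pkg="$2"
    echo "R CMD INSTALL $pkg"
    echo "R CMD build $pkg"
    echo "R CMD check ${pkg}_*.tar.gz"
    case "$platform" in
        # Assumption: a binary is produced from the source tarball.
        windows|mac) echo "R CMD INSTALL --build ${pkg}_*.tar.gz" ;;
    esac
}

# Example:
#   daily_build_steps linux DESeq2   # three steps, no binary
#   daily_build_steps mac DESeq2     # same three steps plus the binary step
```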


This means that on Linux, as well as on any other Unix-like OS that is 
not macOS (e.g. FreeBSD, OpenBSD, Solaris, HP-UX, etc...), users will 
install all their packages (Bioconductor and CRAN) **from source**. This 
should work as long as they are on a platform where R is supported and 
have the required compilers (C, C++, and Fortran).


Note that if officially supporting a given platform means running the
daily builds on that particular platform, then there's no way for us to
do that because platform == OS + architecture, and the list of
combinations of Unix-like OSes (Linux, FreeBSD, Solaris, etc.) +
architectures (Intel, ARM64, Sparc, powerpc) is endless. Even if we
narrow this list to Intel-based Linux, there are hundreds of Linux
distributions around that use different kernels, compilers, package
managers, etc.


All this to say that, as far as the daily builds are concerned, we had 
to make choices, and those choices are based on the most commonly used 
platforms. Since all Bioconductor packages are tested daily on 
Intel-based Linux (Ubuntu 22.04), Windows, Intel-based Mac, and 
ARM64-based Mac, we have some reasonable confidence that they will work 
properly on these 4 platforms (still not a 100% guarantee of course, 
there's nothing like that).


My understanding is that ARM64-based Linux is still a marginally used
platform, so it is probably not worth allocating resources to add it
to our daily builds at the moment. If it ever becomes more mainstream in
the future, then we will certainly reconsider. That does not mean that
you can't use Bioconductor on an ARM64-based Linux machine **now**. I see
no reason a priori why you couldn't install (from source) Bioconductor
packages on this platform, and use them, as long as:


- R is supported on your ARM64-based Linux machine

- you have compilers that are supported by R

- you have the external libraries that are required by some CRAN and/or 
Bioconductor packages.
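
A quick way to check the first two conditions on a given machine might look
like the sketch below. The helper name is hypothetical, and gcc, g++ and
gfortran are only an assumed example of a C, C++ and Fortran toolchain; R
may be configured with other compilers.

```shell
#!/bin/sh
# Print each named command that is NOT found on PATH, one per line.
# Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (assumed toolchain names):
#   missing_tools R gcc g++ gfortran
# To see which compilers an installed R was actually configured with:
#   R CMD config CC
#   R CMD config CXX
#   R CMD config FC
```

External libraries (the third condition) vary by package, so they are not
covered by this check.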


Hope this helps,

H.

On 05/01/2023 02:01, Martin Grigorov wrote:

Dear community,

Happy and successful new year!

Apologies if this has been discussed before, but
https://stat.ethz.ch/pipermail/bioc-devel/ does not provide search
facilities and my googling didn't help much!

I'd like to ask whether Linux ARM64 is officially supported.
I know that Mac ARM64 has been supported since 3.16 [1] [2].
I cannot find such test results for Linux ARM64 and the site search [3]
also mentions "arm64" only in the context of "macOS".
In addition, the Docker images are also single-platform [4] (linux/amd64).

How can we help to add support for Linux ARM64?
My employer is willing to donate VMs and man power if the community is
interested in adding support for Linux ARM64!


Regards,
Martin

1. https://bioconductor.org/news/bioc_3_16_release/
2. https://bioconductor.org/checkResults/3.17/bioc-mac-arm64-LATEST/
3. https://bioconductor.org/help/search/index.html?q=arm64/
4. https://hub.docker.com/r/bioconductor/bioconductor_docker/tags

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


--
Hervé Pagès

Bioconductor Core Team
hpages.on.git...@gmail.com



Re: [Bioc-devel] Bioconductor data packages containing very large files

2023-01-05 Thread Vincent Carey
I would like us to discuss this in the context of the HDF Scalable Data
Service that we have running on the NSF cloud in jetstream2. Let's discuss
offline and then report back to the list.


On Thu, Jan 5, 2023 at 7:11 AM Heery Richard wrote:

> Thank you Lori.
>
> I was wondering if there is someone I could ask if what I am working on
> could be of interest and suitable for Bioconductor before investing more
> time developing it and uploading the data? What I have is an HDF5 file
> constructed using the Bioconductor methrix package for all of the
> methylation data for TCGA downloaded from GDC. It allows rapid querying of
> methylation data for regions of interest (e.g. enhancers or promoters)
> provided as a GRanges object across the ~10,000 samples in TCGA, completing
> a pan-cancer analysis in minutes. Otherwise, downloading and querying data
> for regions of interest in several cancer types could take several days or
> longer.
>
> There are already packages for downloading TCGA data on Bioconductor, but
> what I think is novel here is the speed and ease with which methylation
> data can be retrieved for a large number of samples.
>
> Please let me know if this is something that could be useful for
> Bioconductor.
>
> Best wishes,
>
> Richard
> 
> From: Kern, Lori 
> Sent: Wednesday, January 4, 2023 3:22 PM
> To: Heery Richard; bioc-devel@r-project.org
> Subject: Re: Bioconductor data packages containing very large files
>
> A package cannot be that large directly.  Please see information on
> creating an ExperimentHub package to provide the data for use in the
> package:
>
>
> https://bioconductor.org/packages/release/bioc/vignettes/HubPub/inst/doc/CreateAHubPackage.html
>
> Cheers,
>
>
>
>
> Lori Shepherd - Kern
>
> Bioconductor Core Team
>
> Roswell Park Comprehensive Cancer Center
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
>
> 
> From: Bioc-devel on behalf of Heery Richard
> Sent: Wednesday, January 4, 2023 9:07 AM
> To: bioc-devel@r-project.org 
> Subject: [Bioc-devel] Bioconductor data packages containing very large
> files
>
> Hi Bioconductor,
>
> I have made a database that is 27 GB that I would like to share as part of
> a Bioconductor package and I was just wondering if it is possible to submit
> very large files like this to Bioconductor or if there may be any
> alternative ways of sharing the file as part of a package?
>
> Best wishes,
>
> Richard Heery
>
> IEO, Milan, Italy
>
>
>
> This email message may contain legally privileged and/or confidential
> information. If you are not the intended recipient(s), or the employee or
> agent responsible 

Re: [Bioc-devel] Bioconductor data packages containing very large files

2023-01-05 Thread Heery Richard
Thank you Lori.

I was wondering if there is someone I could ask if what I am working on could
be of interest and suitable for Bioconductor before investing more time
developing it and uploading the data? What I have is an HDF5 file constructed
using the Bioconductor methrix package for all of the methylation data for TCGA
downloaded from GDC. It allows rapid querying of methylation data for regions
of interest (e.g. enhancers or promoters) provided as a GRanges object across
the ~10,000 samples in TCGA, completing a pan-cancer analysis in minutes.
Otherwise, downloading and querying data for regions of interest in several
cancer types could take several days or longer.

There are already packages for downloading TCGA data on Bioconductor, but what 
I think is novel here is the speed and ease with which methylation data can be 
retrieved for a large number of samples.

Please let me know if this is something that could be useful for Bioconductor.

Best wishes,

Richard

From: Kern, Lori
Sent: Wednesday, January 4, 2023 3:22 PM
To: Heery Richard; bioc-devel@r-project.org
Subject: Re: Bioconductor data packages containing very large files

A package cannot be that large directly.  Please see information on creating an 
ExperimentHub package to provide the data for use in the package:

 
https://bioconductor.org/packages/release/bioc/vignettes/HubPub/inst/doc/CreateAHubPackage.html

Cheers,




Lori Shepherd - Kern

Bioconductor Core Team

Roswell Park Comprehensive Cancer Center

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel on behalf of Heery Richard
Sent: Wednesday, January 4, 2023 9:07 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] Bioconductor data packages containing very large files

Hi Bioconductor,

I have made a database that is 27 GB that I would like to share as part of a 
Bioconductor package and I was just wondering if it is possible to submit very 
large files like this to Bioconductor or if there may be any alternative ways 
of sharing the file as part of a package?

Best wishes,

Richard Heery

IEO, Milan, Italy


This email message may contain legally privileged and/or confidential 
information. If you are not the intended recipient(s), or the employee or agent 
responsible for the delivery of this message to the intended recipient(s), you 
are hereby notified that any disclosure, copying, distribution, or use of this 
email message is prohibited. If you have received this message in error, please 
notify the sender immediately by e-mail and delete this email message from your 
computer. Thank you.


Re: [Bioc-devel] Support for Linux ARM64

2023-01-05 Thread Vincent Carey
On Thu, Jan 5, 2023 at 5:02 AM Martin Grigorov wrote:

> Dear community,
>
> Happy and successful new year!
>
> Apologies if this has been discussed before, but
> https://stat.ethz.ch/pipermail/bioc-devel/ does not provide search
> facilities and my googling didn't help much!
>
> I'd like to ask whether Linux ARM64 is officially supported.
> I know that Mac ARM64 has been supported since 3.16 [1] [2].
> I cannot find such test results for Linux ARM64 and the site search [3]
> also mentions "arm64" only in the context of "macOS".
> In addition, the Docker images are also single-platform [4] (linux/amd64).
>
> How can we help to add support for Linux ARM64?
> My employer is willing to donate VMs and man power if the community is
> interested in adding support for Linux ARM64!
>

Thanks Martin, very interesting question. Can you tell us a little more
about basic support for R for this environment? I would have to remark
that even expanding our support to mac ARM was out of scope for our core
project, but the demand was significant.

Superficial searching in response to your question led to

https://github.com/rocker-org/rocker-versioned2/issues/144#issuecomment-1363791740

which suggests to me that we (or you!) could get a dual-platform container
out without too much hassle. Would that help meet your need? Once the
container is produced there is then the implied requirement that a
compatible set of package binaries is made and maintained. That's a place
where your offer of machinery and staff time could be very beneficial. We
have tooling for binary repository production that should be re-usable on
that platform.

(Is arm64 linux a build target supported on github actions, by the way?)

This is a pretty specialized topic so I think I will make a slack channel
for it in the community bioconductor slack and we can carry on the
discussion there, although I am fine with further discussion here as well.







>
>
> Regards,
> Martin
>
> 1. https://bioconductor.org/news/bioc_3_16_release/
> 2. https://bioconductor.org/checkResults/3.17/bioc-mac-arm64-LATEST/
> 3. https://bioconductor.org/help/search/index.html?q=arm64/
> 4. https://hub.docker.com/r/bioconductor/bioconductor_docker/tags
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>



[Bioc-devel] Support for Linux ARM64

2023-01-05 Thread Martin Grigorov
Dear community,

Happy and successful new year!

Apologies if this has been discussed before, but
https://stat.ethz.ch/pipermail/bioc-devel/ does not provide search
facilities and my googling didn't help much!

I'd like to ask whether Linux ARM64 is officially supported.
I know that Mac ARM64 has been supported since 3.16 [1] [2].
I cannot find such test results for Linux ARM64 and the site search [3]
also mentions "arm64" only in the context of "macOS".
In addition, the Docker images are also single-platform [4] (linux/amd64).

How can we help to add support for Linux ARM64?
My employer is willing to donate VMs and man power if the community is
interested in adding support for Linux ARM64!


Regards,
Martin

1. https://bioconductor.org/news/bioc_3_16_release/
2. https://bioconductor.org/checkResults/3.17/bioc-mac-arm64-LATEST/
3. https://bioconductor.org/help/search/index.html?q=arm64/
4. https://hub.docker.com/r/bioconductor/bioconductor_docker/tags
