Re: [Bioc-devel] data extension preference

2023-01-04 Thread Hervé Pagès
Agree with Lori. If you have tabular data, csv is better than rda 
because it's plain text and can easily be imported by many programs, not 
just R. People can also use an editor or a command-line tool like grep or 
head to inspect the file. OTOH rda/rds files are R-specific, so they can 
only be used within the R ecosystem. Furthermore, these are binary formats, 
so they are not as practical as plain text. If size is an issue, you can 
always compress the csv file (-> csv.gz).
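
To illustrate the "not just R" point and the csv.gz suggestion, here is a 
minimal sketch in Python (standard library only). The table contents and 
column names are made up for illustration; nothing here is specific to any 
Bioconductor package. It writes a small table as csv, gzip-compresses it, 
and reads it back.

```python
import csv
import gzip
import io

# Hypothetical tabular data; the column names and values are made up.
header = ["gene", "sample", "count"]
rows = [[f"gene{i % 50}", f"sample{i % 8}", str(i % 100)] for i in range(2000)]

# Serialise the table as plain-text csv (in memory, to keep the sketch self-contained).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_bytes = buffer.getvalue().encode("utf-8")

# Compress the csv: this is the "-> csv.gz" step.
gz_bytes = gzip.compress(csv_bytes)

# Read the compressed csv back, without involving R at all.
with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8", newline="") as handle:
    reader = csv.reader(handle)
    read_header = next(reader)
    read_rows = list(reader)

print(len(csv_bytes), "bytes as csv,", len(gz_bytes), "bytes as csv.gz")
print(read_header, len(read_rows), "rows read back")
```

How much the file shrinks depends entirely on the data; this toy table is 
very repetitive, so it compresses well. On the R side, read.csv() can read a 
gzip-compressed csv file directly.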


H.

On 03/01/2023 08:24, Kern, Lori wrote:

Personally, I think it depends on the use case, the type of data being 
provided, and its size.  Raw data is probably more adaptable to various 
uses, even beyond the expected one, but it can be large;  processed data 
can be saved compressed, which greatly reduces package/data size.  Again, it 
will largely depend on what you are doing with the data provided.



Lori Shepherd - Kern
Bioconductor Core Team
Roswell Park Comprehensive Cancer Center
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263


From: Bioc-devel on behalf of Park, Adam Keebum
Sent: Wednesday, December 21, 2022 11:42 PM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] data extension preference

To whom it may concern,

I want to check if reviewers have a preference in data formats for data 
included in a package.
Is rda strongly recommended over csv?

Sincerely,
Adam.

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel




--
Hervé Pagès

Bioconductor Core Team
hpages.on.git...@gmail.com


Re: [Bioc-devel] Large vignettes

2023-01-04 Thread Vincent Carey
I am glad you brought this up here, and I welcome further discussion on
this mailing list.  It is important to understand the constraints on
development that arise from Bioconductor's package guidelines.

I don't think we want to change the limits on package payload size without
understanding the consequences for users and our build system.  The split
approach mentioned by Lambda seems sensible to me, and I hope it is
not too burdensome.  Additional commentary and details from the community
are welcome.

On Wed, Jan 4, 2023 at 3:21 PM Lambda Moses  wrote:

> Hi Adam,
>
> I also ran into this problem, and I would like some input from the Bioc
> Core Team. I worked around it by writing a minimal vignette in the main
> branch. Then I made a documentation branch, which has the same code as
> the main branch but more elaborate vignettes that are used to build a
> pkgdown website. I made a rule for myself that I only merge from the
> main or devel branch into the documentation branch, never the other way
> round. I switch branches when I find a bug or want a new feature
> while writing the vignettes. You can see the main branch here:
> https://github.com/pachterlab/voyager/tree/main and the documentation
> branch here: https://github.com/pachterlab/voyager/tree/documentation
>
> I kind of wonder if the 5 MB rule is outdated in the age of increasing
> computing power and internet speed. A jpeg photo can easily exceed 5 MB.
> I also wonder if this rule is deliberately kept for good reasons, like
> making R more accessible to disadvantaged people with limited internet
> access.
>
> Regards,
>
> Lambda


Re: [Bioc-devel] Large vignettes

2023-01-04 Thread Lambda Moses

Hi Adam,

I also ran into this problem, and I would like some input from the Bioc 
Core Team. I worked around it by writing a minimal vignette in the main 
branch. Then I made a documentation branch, which has the same code as 
the main branch but more elaborate vignettes that are used to build a 
pkgdown website. I made a rule for myself that I only merge from the 
main or devel branch into the documentation branch, never the other way 
round. I switch branches when I find a bug or want a new feature 
while writing the vignettes. You can see the main branch here: 
https://github.com/pachterlab/voyager/tree/main and the documentation 
branch here: https://github.com/pachterlab/voyager/tree/documentation


I kind of wonder if the 5 MB rule is outdated in the age of increasing 
computing power and internet speed. A jpeg photo can easily exceed 5 MB. 
I also wonder if this rule is deliberately kept for good reasons, like 
making R more accessible to disadvantaged people with limited internet 
access.


Regards,

Lambda


Re: [Bioc-devel] Bioconductor data packages containing very large files

2023-01-04 Thread Kern, Lori
A package cannot include a file that large directly.  Please see the 
information on creating an ExperimentHub package to provide the data for use 
in your package:

https://bioconductor.org/packages/release/bioc/vignettes/HubPub/inst/doc/CreateAHubPackage.html

Cheers,




Lori Shepherd - Kern
Bioconductor Core Team
Roswell Park Comprehensive Cancer Center
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263


From: Bioc-devel on behalf of Heery Richard
Sent: Wednesday, January 4, 2023 9:07 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] Bioconductor data packages containing very large files

Hi Bioconductor,

I have made a database that is 27 GB that I would like to share as part of a 
Bioconductor package and I was just wondering if it is possible to submit very 
large files like this to Bioconductor or if there may be any alternative ways 
of sharing the file as part of a package?

Best wishes,

Richard Heery

IEO, Milan, Italy


[Bioc-devel] Bioconductor data packages containing very large files

2023-01-04 Thread Heery Richard
Hi Bioconductor,

I have made a database that is 27 GB that I would like to share as part of a 
Bioconductor package and I was just wondering if it is possible to submit very 
large files like this to Bioconductor or if there may be any alternative ways 
of sharing the file as part of a package?

Best wishes,

Richard Heery

IEO, Milan, Italy