Not an answer, but a request from someone who often works behind firewalls and/or on machines not connected to the internet: please provide a way for the package to search for the data at a user-specified location, such as a local directory.
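To make the request concrete, here is a minimal sketch of how such a lookup might work. It is only an illustration: the function name find_dataset_dir, the option "MyPackage.data_dir" and the environment variable MYPACKAGE_DATA_DIR are all made up for this example, and it assumes one sub-directory per dataset version. A user-specified location is checked first; only if none is given does it look in the per-user cache that tools::R_user_dir() provides (R >= 4.0.0).

```r
# Hypothetical helper: locate the directory that holds one dataset version.
# "MyPackage.data_dir" and "MYPACKAGE_DATA_DIR" are invented names for this
# sketch; the layout <dir>/<version>/ is an assumption.
find_dataset_dir <- function(version,
                             local_dir = getOption(
                               "MyPackage.data_dir",
                               Sys.getenv("MYPACKAGE_DATA_DIR", unset = NA)
                             )) {
  # 1. A user-specified location takes priority, so no network is needed.
  if (!is.null(local_dir) && !is.na(local_dir) && nzchar(local_dir)) {
    candidate <- file.path(local_dir, version)
    if (dir.exists(candidate)) {
      return(normalizePath(candidate))
    }
    stop("No directory for version ", version, " under ", local_dir)
  }

  # 2. Otherwise look in the per-user cache directory.
  cached <- file.path(tools::R_user_dir("MyPackage", which = "cache"), version)
  if (dir.exists(cached)) {
    return(normalizePath(cached))
  }

  # 3. Nothing found; a real package might offer to download here instead.
  stop("Dataset version ", version, " not found; set a local directory ",
       "or download it first.")
}
```

On an offline machine the user would copy the csv files to, say, /data/mypackage/2.1.0 and then either call find_dataset_dir("2.1.0", local_dir = "/data/mypackage") or set options(MyPackage.data_dir = "/data/mypackage") once per session.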
Best,
Jan

On 14-02-2025 15:54, John Clarke wrote:
Hi folks,

I've looked around for this particular question, but haven't found a good answer. I have a versioned dataset that includes about 6 csv files that total about 15MB for each version. The versions get updated every few years or so and are used to drive the model, which was written in C++ but is now inside an Rcpp wrapper. Apart from the fact that CRAN does not permit large files, I want to have a better way for users to access particular versions of the dataset.

Usage idea:

# The following would hopefully also download default/most recent version of the csv files from CRAN (if allowed) or GitHub or some other repository for academic open source data.
install.packages("MyPackage")
mypackage = new(MyPackage)

Then, if necessary, the user could change the dataset used with something like:

mypackage.dataset("2.1.0")

which would retrieve new csv files if they haven't already been downloaded and update the data_folder path internally to point to the 2.1.0 directory.

Requirements:
- The dataset is csv (not an R data object) and the Rcpp MyPackage expects this format
- Would be nice to properly include citations for the data as they will likely be initially released through a journal publication

What is the best practice for this sort of dataset management for a package in R? Is it okay to use GitHub to store and version the data? Or is it preferred to use an R package (ignoring the file size limit)? Or some other open source data hosting? I see https://r-universe.dev/ as an option as well. In any case, what is the proper mechanism for retrieving/caching the data?

Thanks,
-John

John Clarke | Senior Technical Advisor | Cornerstone Systems Northwest | john.cla...@cornerstonenw.com

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel