There is a README.txt in the pkgs folder. I will attach it. I think this is accurate, but there may be something else on the site.
On Thu, Oct 23, 2014 at 10:23 PM, Henrik Bengtsson <h...@biostat.ucsf.edu> wrote:
> It's been a while since I worked with experimental packages.  Where
> can I find documentation on how to (Subversion) update our
> AffymetrixDataTestFiles package with additional data files?  All I
> know is that the SVN repository only contains a stub of the package
> and
> http://www.bioconductor.org/developers/package-guidelines/#package-types
> provides little information and basically only point to the devel
> mailing list.
>
> Thanks,
>
> Henrik
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

========================================
 BioC Experiment Data Package SVN Repos
========================================

:Date: 2006-08-10
:Author: S. Falcon
:svn URL: https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs

Background
==========

This svn directory contains BioC experiment data packages.  Data
packages contain potentially large binary files that do not change
often.  Most updates to these packages involve the package
infrastructure files.  Obtaining a working copy of a data package over
a slow connection can be frustrating, especially when all that is
needed is the infrastructure files and not the actual data.

We have implemented a scheme that allows separate checkout of
infrastructure files and data files.  This document describes the
scheme and provides instructions for checkout and update of existing
data packages as well as for adding new packages.

How to Create an Infrastructure-Only Workingcopy
================================================

You can obtain a checkout of all experiment data package
infrastructure files as follows::

    svn checkout \
      https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs

To obtain the files for a particular package, say ``ALL``::

    svn checkout \
      https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs/ALL

If you want to preview what is available, you might try the
following::

    # get the top-level scripts, but don't recurse into subdirs
    svn checkout -N \
      https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs

    # see what is there
    svn ls https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs

    # get a particular package's infrastructure files
    cd pkgs
    svn up ALL

    # see next section for getting complete working copy w/ data

How to Create a Complete Workingcopy
====================================

First create a workingcopy of the infrastructure files as described
above.  Next use the helper script ``add_data.py`` (you will need
Python).
It is located here:
https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs/add_data.py

Here's a complete example for ``ALL``::

    svn checkout \
      https://hedgehog.fhcrc.org/bioc-data/trunk/experiment/pkgs/ALL
    python add_data.py ALL

This will add the big data directories (usually data/, but sometimes
also dirs under inst/) to your working copy.  Usually, the svn:ignore
property has been set so that you won't accidentally add these dirs
when working with the package, but please take care anyway.

A note about committing changes to the data
-------------------------------------------

If you want to modify the actual data, cd into the appropriate dir
after having run add_data.py and do your commit from there.  The
script adds a full working copy inside the infrastructure working
copy.

How to Add a New Data Package
=============================

1. Add the infrastructure files under ``pkgs``.

2. Add any large data directories to ../data_store/PKGNAME/.  For
   example, if there is large data in PKGNAME/data and
   PKGNAME/inst/extdata, you would add PKGNAME/data and
   PKGNAME/inst/extdata to ../data_store.

3. Create a file 'external_data_store.txt' listing each dir that is
   stored externally (each on a separate line).  Continuing the
   example above, the file would contain::

     data
     inst/extdata

   This should go in the top-level of the package dir.

4. Add svn ignore properties.  Continuing the example::

     cd PKGNAME
     svn propset svn:ignore '*' ./data/
     ## property 'svn:ignore' set on '.'
     ## in the data folder
     svn propset svn:ignore '*' ./inst/

   or (this might not work anymore)::

     svn propedit svn:ignore .     ## add 'data' here
     svn propedit svn:ignore inst  ## add 'extdata' here

5. Commit.

Details of Storage Scheme
=========================

Experiment data package infrastructure files live in
``experiment/pkgs``.  Package subdirectories that contain large files
are stored under ``experiment/data_store``.  There is no mechanism to
support separate storage of individual files.

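The split between the two trees can be sketched in Python as follows.
This is only an illustration of the layout: the helper names
``infrastructure_path`` and ``data_store_path`` are hypothetical and
are not part of the repository's scripts.  Only the two directory
names come from the description above.

```python
import posixpath

# Repository-relative roots, as named in this document.
PKGS_ROOT = "experiment/pkgs"
DATA_STORE_ROOT = "experiment/data_store"


def infrastructure_path(package):
    """Repository path holding a package's infrastructure files."""
    return posixpath.join(PKGS_ROOT, package)


def data_store_path(package, subdir):
    """Repository path holding one externally stored subdirectory.

    Only whole directories can be stored externally, so `subdir` is a
    directory path relative to the package root (e.g. "inst/extdata").
    """
    return posixpath.join(DATA_STORE_ROOT, package, subdir.strip("/"))


if __name__ == "__main__":
    print(infrastructure_path("PKGNAME"))
    print(data_store_path("PKGNAME", "data/"))
    print(data_store_path("PKGNAME", "inst/extdata"))
```

The relative path of each data directory is kept unchanged under
``data_store``, which is what lets it be checked out back into the
same place inside the package.
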
Here is an example of how data for the ``davidTiling`` package is
stored::

    experiment/pkgs/davidTiling/
        DESCRIPTION
        NAMESPACE
        R/
        external_data_store.txt
        inst/
        man/

    experiment/data_store/davidTiling/
        data/
        inst/
            celfiles/
            website/

The ``davidTiling`` package contains large data in the ``data/``,
``inst/celfiles/``, and ``inst/website/`` subdirectories.  As you can
see, each of these is stored separately from the package
infrastructure files.  The file ``external_data_store.txt`` lists the
location of the externally stored data.  Here are the contents for
``davidTiling``::

    data
    inst/website
    inst/celfiles

To create a complete directory containing both infrastructure and
data, one first does a checkout of the infrastructure and then does a
checkout of each individual externally stored subdir.  This can be
done inside the infrastructure working copy.  There is a helper script
to automate the required svn commands.  One option that might be worth
adding to the script is to do an export instead of checkout.

Additionally, the ``svn:ignore`` property has been set in the
infrastructure dir to help prevent folks from accidentally adding the
external data to the infrastructure dir itself.
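
As a rough sketch of the per-directory checkouts described above, the
following Python reads the contents of ``external_data_store.txt`` and
builds one ``svn checkout`` command per listed directory.  It is not
the actual ``add_data.py``, whose source is not shown here.  The
function names are hypothetical, and the ``data_store`` URL is an
assumption: it is taken to be a sibling of the ``pkgs`` URL given at
the top of this document.  The commands are only built, not executed.

```python
# Assumed base URL: the parent of the documented .../experiment/pkgs URL.
BASE_URL = "https://hedgehog.fhcrc.org/bioc-data/trunk/experiment"


def parse_external_dirs(manifest_text):
    """Return the directories listed in external_data_store.txt.

    The file lists one directory per line; blank lines are skipped.
    """
    dirs = []
    for line in manifest_text.splitlines():
        entry = line.strip().strip("/")
        if entry:
            dirs.append(entry)
    return dirs


def checkout_commands(package, manifest_text, base_url=BASE_URL):
    """Build one `svn checkout URL PATH` command per external directory.

    Each data directory is checked out into the matching location
    inside the infrastructure working copy of `package`.
    """
    commands = []
    for subdir in parse_external_dirs(manifest_text):
        url = f"{base_url}/data_store/{package}/{subdir}"
        dest = f"{package}/{subdir}"
        commands.append(["svn", "checkout", url, dest])
    return commands


if __name__ == "__main__":
    manifest = "data\ninst/website\ninst/celfiles\n"
    for cmd in checkout_commands("davidTiling", manifest):
        print(" ".join(cmd))
```

Replacing ``checkout`` with ``export`` in the built command would give
the export variant mentioned above, at the cost of not being able to
commit data changes from the resulting directories.
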