Yes, Hervé has made this point too: rewriting the history of the package can 
break historical checkouts (for example, when large files are deleted from 
the history).

It's relevant because when a package is added to our repository we do a full 
clone of the master branch. An alternative would be to do a --depth 1 clone of 
the master branch, but to me this doesn't seem ideal at all: from the 
Bioconductor perspective the git.bioconductor.org copy of the package is 
definitive, and all we would have for early package development would be 'and 
then a miracle occurred'. I'm also nervous about side-effects of maintaining 
Bioconductor and non-Bioconductor repositories whose histories start at 
different points.

My own feeling is that most of these cases involve packages that are still 
'new' and seldom have clones or forks.

One could take a hybrid approach: if a maintainer insists on the integrity of 
their git repository, we do a --depth 1 clone instead of a full clone. We could 
even switch strategy automatically when we detect large files in their 
history.
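
A minimal sketch of how that automatic switch might look, in Python. This is 
an illustration under assumptions, not what our infrastructure does: the 
function names, the 5MB threshold (borrowed from the BiocCheck limit), and the 
idea of feeding in the output of 
`git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'` 
are all hypothetical choices made for the example.

```python
# Hypothetical helpers; none of these names exist in Bioconductor tooling.
SIZE_LIMIT = 5 * 1024 * 1024  # assumed threshold, mirroring the 5MB check


def large_blobs(batch_check_output, limit=SIZE_LIMIT):
    """Return (path, size) for every blob in the history larger than `limit`.

    Expects lines of the form '<type> <sha> <size> <path>'.
    """
    found = []
    for line in batch_check_output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees, tags, and malformed lines
        size = int(parts[2])
        if size > limit:
            path = parts[3] if len(parts) == 4 else ""
            found.append((path, size))
    return found


def clone_command(url, dest, shallow):
    """Build the argv for a full or a --depth 1 clone of the master branch."""
    cmd = ["git", "clone", "--branch", "master"]
    if shallow:
        cmd += ["--depth", "1"]
    return cmd + [url, dest]


def choose_clone(url, dest, batch_check_output, maintainer_insists=False):
    """Go shallow if the maintainer asks, or if the history has large blobs."""
    shallow = maintainer_insists or bool(large_blobs(batch_check_output))
    return clone_command(url, dest, shallow)
```

The returned list could be passed to something like subprocess.run(); the 
sketch only builds the command, so it does not touch any repository.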

Martin

On 10/1/20, 12:17 PM, "Bioc-devel on behalf of Henrik Bengtsson" 
<bioc-devel-boun...@r-project.org on behalf of henrik.bengts...@gmail.com> 
wrote:

    I understood that it's a submission. Just wanted to make sure that it's
    clear there might be side effects, e.g. people clone and collaborate also
    before submitting to Bioc and a rewrite might surprise existing
    collaborators etc.

    /H

    On Thu, Oct 1, 2020, 09:04 Nitesh Turaga <nturaga.b...@gmail.com> wrote:

    > This package isn’t yet a Bioconductor package Henrik. It will break other
    > forks most likely. This package hasn’t been submitted to the Contributions
    > either to be reviewed. So this is the time to break what needs to be
    > broken
    > before it’s submitted to Bioconductor and gets into the Bioconductor git
    > repository.
    >
    > Nitesh
    >
    > On Oct 1, 2020, at 11:57 AM, Henrik Bengtsson <henrik.bengts...@gmail.com>
    > wrote:
    >
    > Doesn't a git rewrite break all existing clones, forks out there? I'm
    > happy to be corrected, if this is not the case.
    >
    > /Henrik
    >
    > On Thu, Oct 1, 2020, 08:16 Nitesh Turaga <nturaga.b...@gmail.com> wrote:
    >
    >> Hi,
    >>
    >> The BiocCheck will complain on the build system about the > 5MB package
    >> size.
    >>
    >> The rewrite of the history with BFG cleaner (
    >> https://rtyley.github.io/bfg-repo-cleaner/) is not as severe as you
    >> think it is to be honest. It is just removing these pack files which
    >> don’t have a place in the tree structure. These are more often than
    >> not, orphan files.
    >>
    >> If you are suspicious of this solution, I would suggest you make a
    >> backup clone of your repo and try it on that first before you touch the
    >> main repo. Check the history (git log) to see if anything important is
    >> missing.
    >>
    >> But usually a software package has to be below 5MB. If you have some data
    >> in there which is needed for the package, consider ExperimentHub.
    >>
    >> Best,
    >>
    >> Nitesh
    >>
    >> > On Sep 30, 2020, at 12:46 PM, McGrath, Max <max.mcgr...@ucdenver.edu>
    >> > wrote:
    >> >
    >> > Hi all,
    >> >
    >> > We have a package that is ready for submission, but when running
    >> > BiocCheck a warning is generated noting that "The following files are
    >> > over 5MB in size: '.git/objects/pack/pack-xxx...". I've pruned,
    >> > repacked, and run git gc, which reduced the file size from 5.2 to
    >> > 5.1MB, but I have been unable to reduce it further.
    >> >
    >> > I'm reaching out to determine if this is an issue, and if so to ask for
    >> > recommendations for solving it. Currently, the only solution I've come
    >> > up with is to rewrite the repository's history using a tool like
    >> > "git-filter-repo", but this is a more drastic action than I would
    >> > prefer to take. I would greatly appreciate any advice on the matter.
    >> >
    >> > Thank you,
    >> > Max McGrath
    >> >
    >> > _______________________________________________
    >> > Bioc-devel@r-project.org mailing list
    >> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
