Hi all,
Thank you all for taking the time to discuss this issue. While the package does
have a fairly long history with multiple authors, I am currently the only
active developer on the project. So, if it is preferable from Bioconductor’s
perspective, I will rewrite the repository's history.
I do have one remaining question regarding the extent of the rewrite. Currently
the package (not including the pack file) sits at ~1.6MB. After a test run of
the rewrite I was able to reduce the pack file to ~4MB. So, the total package
is still over 5MB, but each individual file is under the threshold. Is this
acceptable? Or will I need to delete more from the history? I ask because I
imagine it is preferable to limit the extent of the rewrite to a minimal
acceptable standard.
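
The difference between the per-file check and the total size can be checked with a short script. The following is a minimal Python sketch, not BiocCheck's actual implementation: the helper names `files_over_limit` and `total_size` are hypothetical, and the default limit of 5 * 10**6 bytes is an assumption about how "5MB" is counted.

```python
import os


def files_over_limit(root, limit_bytes=5 * 10**6):
    """Return sorted (relative_path, size) pairs for files larger than limit_bytes.

    The default limit is an assumption; the exact byte count used by
    BiocCheck may differ.
    """
    oversized = []
    # os.walk also descends into hidden directories such as .git
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            size = os.path.getsize(path)
            if size > limit_bytes:
                oversized.append((os.path.relpath(path, root), size))
    return sorted(oversized)


def total_size(root):
    """Return the summed size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Report on the current directory, e.g. the root of a package checkout.
    print("total bytes:", total_size("."))
    for rel_path, size in files_over_limit("."):
        print(rel_path, size)
```

Run from the root of a checkout, this lists any single file above the limit (such as a pack file under .git/objects/pack) separately from the overall total, which is the distinction the question above turns on.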
Thanks,
Max
From: Bioc-devel on behalf of Martin Morgan
Sent: Thursday, October 1, 2020 10:32 AM
To: Henrik Bengtsson ; Nitesh Turaga
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] Git pack file greater than 5MB
yes Hervé has made this point too -- mucking with the history of the package,
potentially breaking historical checkouts (when large files are deleted from
the history, too).
It's relevant because when a package is added to our repository we do a full
clone of the master branch; an alternative would be to do a --depth 1 clone of
the master branch, but to me this doesn't seem ideal at all -- from the
Bioconductor perspective the git.bioconductor.org package is definitive, and
all we would have would be 'and then a miracle occurred' for early package
development. I'm also nervous about side-effects associated with maintaining
the Bioconductor and non-Bioconductor repositories that have different
historical starts.
My own feel is that most of these cases are packages that are still 'new' and
seldom have clones / forks.
One could take a hybrid approach, where if a maintainer insists on the
integrity of their git repository (or even automatically, if they do have large
files in their history we automatically change strategy) then we do a --depth 1
clone.
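
The hybrid rule described above could be sketched roughly as follows. This is only an illustration in Python: the function name `clone_command`, its parameters, and the exact form of the full-clone command are assumptions, not the commands Bioconductor actually runs.

```python
def clone_command(url, dest, maintainer_keeps_history=False, has_large_files=False):
    """Build a git clone command for adding a package to the repository.

    Hypothetical policy: clone the master branch in full by default, but
    fall back to a --depth 1 clone when the maintainer wants their history
    left untouched or the history is known to contain large files.
    """
    cmd = ["git", "clone", "--branch", "master"]
    if maintainer_keeps_history or has_large_files:
        # Shallow clone: only the latest commit, so old large blobs are skipped.
        cmd += ["--depth", "1"]
    return cmd + [url, dest]


if __name__ == "__main__":
    # The URL and destination are placeholders.
    print(" ".join(clone_command("https://example.org/pkg.git", "pkg",
                                 has_large_files=True)))
```

The returned list can be passed to `subprocess.run` as-is; building a list rather than a string avoids shell quoting problems with unusual repository URLs.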
Martin
On 10/1/20, 12:17 PM, "Bioc-devel on behalf of Henrik Bengtsson" wrote:
I understood that it's a submission. Just wanted to make sure that it's
clear there might be side effects, e.g. people clone and collaborate also
before submitting to Bioc and a rewrite might surprise existing
collaborators etc.
/H
On Thu, Oct 1, 2020, 09:04 Nitesh Turaga wrote:
> This package isn’t yet a Bioconductor package, Henrik. It will most likely
> break other forks. This package hasn’t been submitted to Contributions
> for review either. So this is the time to break what needs to be broken,
> before it’s submitted to Bioconductor and gets into the Bioconductor git
> repository.
>
> Nitesh
>
> On Oct 1, 2020, at 11:57 AM, Henrik Bengtsson wrote:
>
> Doesn't a git rewrite break all existing clones and forks out there? I'm
> happy to be corrected, if this is not the case.
>
> /Henrik
>
> On Thu, Oct 1, 2020, 08:16 Nitesh Turaga wrote:
>
>> Hi,
>>
>> The BiocCheck will complain on the build system about the > 5MB package
>> size.
>>
>> The rewrite of the history with BFG Repo-Cleaner
>> (https://rtyley.github.io/bfg-repo-cleaner/) is not as severe as you think
>> it is, to be honest. It is just removing these pack files which don’t have
>> a place in the tree structure. These are, more often than not, orphan
>> files.
>>
>> If you are suspicious of this solution, I would suggest you make a backup
>> clone of your repo and try it on that first before you touch the main
>> repo. Check the history (git log) to see if anything important is missing.
>>
>> But usually a software package has to be below 5MB. If you have some data
>> in there which is needed for the package, consider Experiment Hub.
>>
>> Best,
>>
>> Nitesh
>>
>> > On Sep 30, 2020, at 12:46 PM, McGrath, Max wrote:
>> >
>> > Hi all,
>> >
>> > We have a package that is ready for submission, but when running
>> > BiocCheck a warning is generated noting that "The following files are
>> > over 5MB in size: '.git/objects/pack/pack-xxx...". I've pruned,
>> > repacked, and run git gc, which reduced the file size from 5.2 to 5.1MB,
>> > but I have been unable to reduce it further.
>> >
>> > I'm reaching out to determine if this is an issue, and if so to ask for
>> > recommendation