Thanks Sean,

This looks awesome! Many thanks for storing this. I'll see how I could
process the data and might contact you off-list or via the issues on
the repo.

Going just by the numbers reported, I'm a bit surprised by the daily
increment of the summary table. Bioconductor software has around 2000
packages, checked on 5 different machines, with 5 outputs each
(install, build, check, bin, propagate), which gives that order of
magnitude. However, not all builds and checks are run every day (right
now I cannot find the page where the frequency is reported).
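
As a rough check of that estimate, the arithmetic can be sketched in
Python. The counts are the approximate figures quoted in this thread
(about 2000 packages, 5 machines, 5 outputs, about 20k rows added per
day), and the variable names are only illustrative.

```python
# Back-of-the-envelope upper bound on daily rows in the summary table.
# All figures are the approximate ones quoted in this thread, not measured values.
packages = 2000      # approximate number of Bioconductor software packages
machines = 5         # build machines
outputs = 5          # install, build, check, bin, propagate

max_rows_per_day = packages * machines * outputs
reported_rows_per_day = 20_000   # "about 20k rows per day" from Sean's message

# Share of the theoretical maximum that the reported increment represents.
share = reported_rows_per_day / max_rows_per_day

print(max_rows_per_day)   # 50000
print(f"{share:.0%}")     # 40%
```

Under these assumptions the reported increment is about 40% of the
maximum, which would be consistent with not every build and check
running every day.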

At the moment I won't use the build and check reports, but I might be
interested in them later (I also collect general check results from
CRAN, without the log files).
In any case, I'll get in touch.
Ideally, I would like to export/use this data from a package, as I have
done for CRAN via the repo.data package I'm building.
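
To check the daily increment directly, something like the following
could work once the CSV files are available. This is only a sketch: the
column names (`package`, `node`, `stage`, `status`, `startedat`) and the
sample rows are made up for illustration and may not match the actual
BiocBuildDB files.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for one build_summary CSV; the real column
# names and values in BiocBuildDB may differ.
SAMPLE = """package,node,stage,status,startedat
pkgA,machine1,install,OK,2025-03-18 10:00:00
pkgA,machine1,buildsrc,OK,2025-03-18 10:05:00
pkgB,machine1,install,ERROR,2025-03-18 10:10:00
pkgA,machine1,install,OK,2025-03-19 10:00:00
"""

def rows_per_day(handle, date_column="startedat"):
    """Count rows per calendar day, using the date part of a timestamp column."""
    counts = Counter()
    for row in csv.DictReader(handle):
        day = row[date_column][:10]  # keep only YYYY-MM-DD
        counts[day] += 1
    return dict(counts)

daily = rows_per_day(io.StringIO(SAMPLE))
print(daily)  # {'2025-03-18': 3, '2025-03-19': 1}
```

The same function could be pointed at an open file handle instead of
the in-memory sample, with `date_column` set to whatever the real
timestamp column is called.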

Best wishes and many thanks,

Lluís

On Thu, 20 Mar 2025 at 02:56, Sean Davis <seand...@gmail.com> wrote:
>
> Hi, all.
>
> Perhaps a bit tangential, but I capture the results of all build reports for 
> all packages daily (that is the intent, anyway) going back a year or so (a 
> couple of years if we dig into archives). The reports are processed using 
> code in this repo: https://github.com/seandavi/BiocBuildDB using a github 
> action that runs daily. This might not be exactly the format you are looking 
> for, Lluis, but it does have a complete history of every build for every 
> package for every day for all Bioc builds.
>
> The result is a set of three CSV files (one set for every build, about 3.5k 
> CSV files right now) with rows for each package/machine/build step and the 
> results of the build, including propagation status (whether the package gets 
> pushed to release). Version numbers, git hashes, dates, Bioconductor 
> versions, build commands, error logs, etc. are all captured. Thus, things
> like full-text search over captured log output are possible over time, across
> branches, and across machines or packages. The point at which a package
> enters the system is also captured. The build_summary table currently checks
> in at about 6M rows (again, without going into archive data) and adds about
> 20k rows per day.
>
> I have pending issues to expose the data but just haven’t prioritized the 
> work. I’m happy to discuss access and use cases either in a new thread here, 
> on Slack, or via github issues.
>
> Sean
>
> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Lluís 
> Revilla <lluis.revi...@gmail.com>
> Date: Wednesday, March 19, 2025 at 6:21 PM
> To: Kern, Lori <lori.sheph...@roswellpark.org>
> Cc: bioc-devel <bioc-devel@r-project.org>
> Subject: Re: [Bioc-devel] Bioconductor archive?
>
> Hi Lori,
>
> Many thanks for your answer. I have a couple of follow-up questions.
>
> > It looks like the Date/Publication field is only present when there was a
> > change on the branch post release (i.e. any package that has a version
> > x.y.(z+n) instead of x.y.0).
> > After a release is frozen and a new release occurs, Bioconductor does not
> > allow any changes or fixes, even for bugs. A release is frozen, so there
> > are no changes after the new release occurs.
>
> Thanks for reminding me of this. I'm interested in all the x.y.(z+n)
> package versions that were published within each release, not just the
> last one or the initial one. Is this historical information available?
> The file at https://bioconductor.org/packages/3.20/bioc/VIEWS only
> includes the latest date for a given release, but there could have been
> an earlier publication within the same Bioconductor version.
>
> > I would have to dig in the history but my guess is 3.7 might be when we 
> > either switched to git or started having archived versions so likely not 
> > available before this date.
>
> I thought it would be difficult, if not impossible, to check this, but
> even for the current release I can't find this data. Does Bioconductor
> have an internal archive with this information? On CRAN, even when a
> package is removed, the archive activity is stored internally: each
> date-time of publication, archival and removal. Does something similar
> happen in Bioconductor? Even if a given package is not available,
> knowing that there was a release could be helpful for reproducibility
> (as it could be compared with the git log).
>
> With that information finding which package versions were used for a
> script with only a date could become easier.
>
> Best,
>
> Lluís
>
>
> >
> > Lori Shepherd - Kern
> > Bioconductor Core Team
> > Roswell Park Comprehensive Cancer Center
> > Department of Biostatistics & Bioinformatics
> > Elm & Carlton Streets
> > Buffalo, New York 14263
> >
> >
> > ________________________________
> > From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Lluís 
> > Revilla <lluis.revi...@gmail.com>
> > Sent: Saturday, March 15, 2025 5:20 AM
> > To: bioc-devel <bioc-devel@r-project.org>
> > Subject: [Bioc-devel] Bioconductor archive?
> >
> > Hi,
> >
> > Recently I learned thanks to Martin Morgan that there are some files with
> > the Date/Publication fields for Bioconductor packages:
> > https://bioconductor.org/packages/3.7/bioc/VIEWS
> > I'm trying to reconstruct which packages from CRAN and Bioconductor were
> > available at any moment, and these files were very helpful.
> >
> > However, these files have the latest version published by a package on a
> > given Bioconductor release.
> > Is there a way to know if there were more updates after a release?
> > I thought about searching the git log for each package. But that wouldn't
> > be enough, as they might have increased their version but not passed
> > Bioconductor checks, and thus not be released.
> >
> > Related to this, this field is present from Bioconductor version 3.7 or
> > later but I couldn't find it on previous releases. Is there a way to know
> > previous packages' releases and their dates?
> >
> > Package updates on the release branch should only contain bug fixes, but
> > for reproducibility purposes it might be necessary to get the same bugs
> > again.
> >
> > Many thanks in advance,
> >
> > Lluís
> >
> > _______________________________________________
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel