Hi David, lots of great stuff here, I'll try to keep my replies short though...
On Sat, Nov 13, 2021 at 7:47 PM David Jencks <david.a.jen...@gmail.com> wrote:
>
> The Antora build part of the website is getting better at detecting problems
> and failing the build, and the website build seems to me to be failing more
> often. Perhaps we can find ways to improve our process so there are fewer
> problematic commits and it’s easier to detect and fix problems earlier.

My intent with the website was that we should fail to publish as often as
necessary to do our best in not publishing a broken website. I think we can
tolerate the website not being up to date with the latest for a day or two.

> There are a few problems caused by interactions between near-in-time commits
> and commits that bring in stuff that is obsolete due to recent website build
> changes. Let’s ignore those :-)… especially the second kind will iron
> themselves out over time.

+1

> So, people keep merging PRs that change the documentation without checking
> that it doesn’t break the website build, either locally or as a CI check on
> the PR.

I think the xref syntax is difficult for folk to wrap their heads around.
Everyone should be familiar with the documentation we have here:

https://github.com/apache/camel-website/#links-between-pages-in-antora-content

> They theoretically could do a local website build that incorporates their
> changes, but right now it’s way too hard and time consuming. (I’ll discuss
> the problems with the projects that attempt to do a partial local build later)
> So one good step would be to make local website builds to check doc changes
> easy and quick. I’ve made some progress on this.

+1 I think so too, local build and then the CI build against the git
repository should be our first lines of defense. We should make those take
seconds, not minutes, and then make them mandatory.

> Another step would be for CI to check the website build on each PR, either
> the whole site or a partial build. I think GH actions can trigger each
> other, but I’ve never set it up.
> Do we have enough GH action time to do a
> full website build on every PR to any camel subproject? Is it practical to
> trigger the website build only when something documentation-related changes?
> (this detection would need to be carefully set up in each subproject) If
> these are possible I think we should just do this. It’s probably possible to
> set up quicker partial builds, but it’s decidedly more complicated.

Most issues I've seen have been with xref linking, so I think we should focus
on that first. Other checks, I think, fail less often.

Perhaps not building at all could be a good solution. For example, in Camel
main, Guillaume built a Maven plugin that checks for broken xrefs in seconds.
But that (currently) works only within the main Camel repository. What if we
could expand that, or build comparable tooling that checks xrefs in the adocs
of the git repository the developer is working on but also takes into account
xrefs to other subprojects? Two ideas I have here would be to clone other
subprojects and build an index of xrefs against them as well; or to use the
sitemap XMLs (could be fairly quick!) from the live website and reverse them
back to xrefs for checking.

> Another step would be to make it extremely visible when the jenkins website
> build fails. I try to follow the dev list pretty closely, and see a lot of
> GH PR CI build failures reported, but apparently the jenkins build has been
> failing for several days and I had no idea.

+1 I think this is an area I can focus on next. I think we're in agreement
that we don't want to send emails to the dev@ mailing list. One idea is to
create GitHub issues; I just thought of a "status" channel on Zulip. Or
perhaps both.

> In principle, what other steps could we take?
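
A rough sketch of the sitemap idea mentioned above, in Python. It is not what
Guillaume's plugin does and it has not been run against the real site. It
reads one sitemap XML, collects the page paths, and checks fully qualified
xrefs against them. The xref-to-URL mapping
(`/component/version/module/page.html`, with the `ROOT` module left out of the
path) is a simplifying assumption, and the function names are made up for
illustration.

```python
from __future__ import annotations

import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Matches only fully qualified xrefs: xref:version@component:module:page.adoc[...]
# Shorter xref forms are skipped in this sketch.
XREF = re.compile(
    r"xref:(?P<version>[^@:\s\[]+)@(?P<component>[^:\s\[]+):"
    r"(?P<module>[^:\s\[]+):(?P<page>[^\s\[#]+)\.adoc(?:#[^\s\[]*)?\["
)


def sitemap_paths(sitemap_xml: str) -> set[str]:
    """Return the URL paths of all <loc> entries in one sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        urlparse(loc.text.strip()).path
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def xref_to_path(version: str, component: str, module: str, page: str) -> str:
    """Assumed mapping from xref coordinates to a site path (ROOT module omitted)."""
    parts = [component, version]
    if module != "ROOT":
        parts.append(module)
    parts.append(page + ".html")
    return "/" + "/".join(parts)


def broken_xrefs(adoc_text: str, known_paths: set[str]) -> list[str]:
    """List fully qualified xref targets whose assumed path is not in the sitemap."""
    broken = []
    for m in XREF.finditer(adoc_text):
        path = xref_to_path(m["version"], m["component"], m["module"], m["page"])
        if path not in known_paths:
            broken.append(
                f'{m["version"]}@{m["component"]}:{m["module"]}:{m["page"]}.adoc'
            )
    return broken
```

A real check would also have to resolve the shorter xref forms relative to the
page they appear in, and handle whatever version aliases the site uses, which
this sketch ignores.
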
>
> ——
>
> Comments on the existing attempts to have subproject-specific partial builds:
>
> Dan Allen (of Antora) has repeatedly said that subsidiary builds such as
> local or partial builds should be done from (clones) of the repo containing
> the playbook for the actual site. For a long time I disagreed and thought
> approaches like that of camel-quarkus to have a local build in the subproject
> were workable but I’m now convinced that they are totally unmaintainable.
> They rely on updating each such subproject every time the main playbook
> changes, and in a way that requires deep understanding of the entire site
> build. It just isn’t going to work, ever.

I wonder if we could keep the approach of Camel Quarkus and solve the issue of
outdated playbooks by having a git submodule of the website in every project.
Be warned though, the website git repository is very large (3.6GB).

> ——
> Maybe there’s hope…
>
> If we’re going to encourage or require local builds of the website, there
> needs to be a defined file system relationship between the camel-website
> clone and the subproject(s) clone(s). I have a “global” directory (named
> camel) into which I’ve cloned all the subprojects next to one another
> (together with some extra git work trees). I think this is the simplest
> arrangement and I think we could require it.
>
> Next, there needs to be an easy way (preferably automated) to modify the
> playbook to take account of building against one (or possibly more) local
> clones. E.g, if I’m working on camel-quarkus, I should only need to have
> camel-quarkus cloned, and still be able to do a build. Doing this is much
> more plausible if we can assume that every branch participating in the
> website is present and up to date locally. Does anyone know if it’s possible
> to write a git script that can update branches without switching to them?
> If we can assume this, then the local build just involves changing the
> playbook source url from GitHub….<project>.git to `./../<project>` and
> adjusting the checked out branch name.

I think this could help, but I'm a bit worried that if this is not automated,
folk will skip it. My workflow is somewhat similar: I have all subprojects
checked out in the same (parent) directory, so I just change the playbook to
use the HEAD branch and ../camel-$subproject to build.

> Then there’s the problem that the full Antora build takes something like 6
> minutes now, which is too long for anyone to wait for. So, we need an
> effective way of doing quick partial builds. I’ve been working on this with
> some progress. Dan has an idea he calls a site manifest, which means that
> the site build writes out the content catalog with information about the
> Antora coordinates and the site location of every page. Then a partial build
> can read this in to populate the partial build content catalog, so that xrefs
> can be properly resolved. This was originally developed to enable a
> “subsidiary site” to have xrefs to a “main site”. I’ve adapted this to be an
> Antora pipeline extension, and it can be used in a couple of ways.

Here's where it dawned on me that we already have a manifest of sorts in the
sitemap XML files. But the idea of each subproject building its bit of the
website is also interesting to me.

> - A site manifest could be published as part of the actual site. In this
> case the partial build would fetch it, and only pages actually present
> locally would get local links. You’d find out whether there are any
> problems, but it might be hard to locate the local pages through navigation.
>
> - If you do a full build locally to generate a local site manifest, a partial
> build using that site manifest will only overwrite the rebuilt local files,
> leaving you with a functional local site.
>
> - Possibly the full Jenkins build could also package the Antora site as a zip
> archive, and local builds could fetch and unpack it rather than doing a full
> local build.

I think INFRA might not look too kindly on us taking up too much disk space on
ci-builds.a.o. We _could/perhaps_ push to repository.a.o. as a -SNAPSHOT.

> With the site manifest, there’s still the problem of modifying the playbook
> to only build a little bit. I’ve written another extension that you
> configure with the part you want to build, and it applies appropriate
> filters. You can configure it down to one page. It also watches for changes
> and rebuilds when it detects a change: I think I’ll need to make that
> configurable since it’s great to see your changes quickly but not what you
> want for a build step.
> I have not yet tried to make it easy to select which subproject you want to
> build: so far it requires knowing how to configure the extensions. I’ve
> started having some ideas on how this might be done.

This would bring super fast previews, and could be part of the preview
functionality we already have for the website...

> What I’m envisioning and hoping for is a pre-PR process that involves
> running, in a local camel-website clone, something like
> `yarn partial-build-camel-quarkus` that will in less than a minute detect any
> errors and produce a local site you can look at with the local changes.

This would be really cool.

> Thoughts?

If we agree that most issues are broken xrefs (that's how it seems to me),
perhaps focusing on not building the Antora bits at all, but doing something
along the lines of what Guillaume built, with information (say from XML
sitemaps) about other Antora components in the mix, feels like it would bring
some quick wins.

Sorry, I don't think that was short...

zoran

--
Zoran Regvart
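
P.S. On the question of updating branches without switching to them: as far as
I know, `git fetch <remote> <branch>:<branch>` fast-forwards a local branch
without checking it out. Git refuses this for the branch that is currently
checked out, and for updates that are not fast-forwards. A small Python wrapper
could look like the sketch below. The function names are made up, and the
branch names in the usage line are placeholders, not the real list of branches
the playbook uses.

```python
from __future__ import annotations

import subprocess


def fetch_command(remote: str, branches: list[str]) -> list[str]:
    """Build `git fetch <remote> <b>:<b> ...` for the given local branches."""
    return ["git", "fetch", remote] + [f"{b}:{b}" for b in branches]


def update_branches(repo_dir: str, remote: str, branches: list[str]) -> None:
    """Fast-forward the given local branches in repo_dir without a checkout."""
    # check=True raises CalledProcessError if git rejects any of the updates.
    subprocess.run(fetch_command(remote, branches), cwd=repo_dir, check=True)
```

For example, `update_branches("../camel-quarkus", "origin", ["main", "2.x"])`
would run `git fetch origin main:main 2.x:2.x` in that clone.
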