Robert, this is super cool! Looks great, and I'm looking forward to the separate PRs to get us into shape for getting the CIs working.
Personally, I really like that we don't have to create separate CI scripts and can use the Markdown file directly. This will help ensure that the CI scripts and the guide don't accidentally drift.

Best,
Adnan Hemani

On Wed, Jan 28, 2026 at 4:24 AM Robert Stupp wrote:

> Hi all,
>
> And here's a working version for "CI on the guides":
> https://github.com/apache/polaris/pull/3553. It builds upon PR #3550.
>
> A CI run takes about 10 minutes for all guides and looks like this one:
> https://github.com/apache/polaris/actions/runs/21437618287?pr=3553
>
> The TL;DR of how it works:
> - It exercises all getting-started guide Markdown files in CI
> - Testing can be run locally as well, either for all guides or a single
>   guide
> - For each Markdown file, the `shell` and `sql` code blocks are extracted
> - Supports "docker compose" and Spark SQL Shell invocations
> - Custom assertions can be added (as a `shell` code block in an HTML
>   comment, so the assertions aren't rendered on the web site)
>
> While working on this, I had to fix a couple of things in the
> docker-compose files. Some are related to docker-compose service
> dependencies and timing, others are due to the guides just not working
> anymore. I'll come up with separate PRs to address the findings
> individually.
>
> Robert
>
> On Mon, Jan 26, 2026 at 3:29 PM Robert Stupp wrote:
>
> > Hi all,
> >
> > Here's a prototype as a PR https://github.com/apache/polaris/pull/3550 -
> > please try it out and let me know what you think.
> >
> > On Tue, Jan 20, 2026 at 9:12 PM Dmitri Bourlatchkov wrote:
> >
> >> Hi All,
> >>
> >> Building CI for getting-started guides sounds useful to me. I suppose
> >> we'd have to formalize the format of the related `.md` files somehow
> >> to make automated execution possible.
> >>
> >> I wonder about the reliability of these tests too. If CI is flaky
> >> (e.g. containers not starting properly), it might be an irritation
> >> more than an aid. It's worth a try in any case.
> >>
> >> Cheers,
> >> Dmitri.
> >>
> >> On Tue, Jan 20, 2026 at 2:48 PM Yong Zheng wrote:
> >>
> >> > 100%. There are so many open source projects with outdated
> >> > getting-started examples and it will be nice to have these in our
> >> > CI pipelines. The only concern on my end is how do we define
> >> > coverage for the getting-started examples? Currently most of them
> >> > have simple examples that do the following:
> >> > 1. use catalog
> >> > 2. create namespace
> >> > 3. create table under namespace
> >> > 4. create some dummy data
> >> >
> >> > Will these be sufficient for CI? With these, we will only know that
> >> > the basic stuff works, but if users try more complex things, we
> >> > can't really guarantee it will still work. But will this be
> >> > sufficient?
> >> >
> >> > Thanks,
> >> > Yong Zheng
> >> >
> >> > On 2026/01/20 10:55:30 Robert Stupp wrote:
> >> > > Hi all,
> >> > >
> >> > > We have a nice collection of getting started guides in the source
> >> > > repository [1].
> >> > > The user-targeting description of each guide is in a README.md file.
> >> > >
> >> > > I would like to start a discussion and gather feedback about two
> >> > > topics regarding the getting-started guides:
> >> > >
> >> > > 1. Website:
> >> > > The user facing getting-started guides are well written but not very
> >> > > visible to users, because those are not on the web site.
> >> > > What are your thoughts on moving the getting-started guides to the
> >> > > website?
> >> > >
> >> > > 2. CI coverage:
> >> > > Most, actually all, getting-started guides include code snippets
> >> > > referencing Docker compose files.
> >> > > Manually verifying these code snippets and Docker compose files,
> >> > > during initial contribution or when those are being updated, is
> >> > > quite some work.
> >> > > I _think_ we can automate the verification of the code snippets, and
> >> > > with those the Docker compose files, in CI.
> >> > > The overall idea is to parse the getting-started guide markdown and
> >> > > let a workflow execute the code blocks for shell/bash.
> >> > > I am not sure whether all guides can actually be verified, because
> >> > > some of those Docker compose files start a couple of containers,
> >> > > which can be a resource (RAM/CPU) issue in GitHub's hosted runners.
> >> > > The alternatives would be:
> >> > > - Never update the getting-started guides with the risk that those
> >> > > become stale and outdated.
> >> > > - Keep the manual verification process.
> >> > > Any thoughts on this?
> >> > >
> >> > > Robert
> >> > >
> >> > >
> >> > > [1] https://github.com/apache/polaris/tree/main/getting-started
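
A note on the extraction step described in the thread (pulling the `shell` and `sql` code blocks out of each guide's Markdown, with assertions hidden in HTML comments): the idea could be sketched in Python roughly as below. This is an illustrative sketch only, not the code from PR #3553. The names `CodeBlock` and `extract_blocks` and the regex-based parsing are assumptions made for the example; a real implementation might well use a proper Markdown parser instead.

```python
import re
from typing import List, NamedTuple, Sequence


class CodeBlock(NamedTuple):
    """One fenced code block found in a guide (hypothetical type)."""
    lang: str     # info string of the fence, e.g. "shell" or "sql"
    body: str     # text between the opening and closing fence
    hidden: bool  # True if the fence sits inside an HTML comment


# A fence is three backticks at the start of a line, followed by a language tag.
FENCE_RE = re.compile(r"^`{3}(\w+)[ \t]*\n(.*?)^`{3}[ \t]*$", re.DOTALL | re.MULTILINE)
# HTML comments are not rendered on the web site, so blocks inside them
# can carry assertions that readers never see.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def extract_blocks(markdown: str, langs: Sequence[str] = ("shell", "sql")) -> List[CodeBlock]:
    """Return the fenced blocks whose language is in `langs`, in document order."""
    comment_spans = [m.span() for m in COMMENT_RE.finditer(markdown)]
    blocks = []
    for m in FENCE_RE.finditer(markdown):
        lang = m.group(1)
        if lang not in langs:
            continue  # e.g. a yaml or json block that is only shown, not executed
        hidden = any(start <= m.start() < end for start, end in comment_spans)
        blocks.append(CodeBlock(lang, m.group(2), hidden))
    return blocks


if __name__ == "__main__":
    fence = "`" * 3  # built at runtime so this example contains no literal fence
    guide = "\n".join([
        "# Example guide",
        fence + "shell",
        "docker compose up -d",
        fence,
        "<!--",
        fence + "shell",
        "test -f out.txt",
        fence,
        "-->",
    ])
    for block in extract_blocks(guide):
        print(block.lang, "hidden" if block.hidden else "visible", repr(block.body))
```

A runner built on top of this would then execute the visible blocks in order and treat the hidden ones as checks, which keeps the guide itself as the single source of truth.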
