Re: [DISCUSS] Visibility of and CI for getting-started guides

Adnan Hemani via dev Wed, 28 Jan 2026 14:57:06 -0800

Robert, this is super cool! Looks great, and I'm looking forward to the
separate PRs to get us into shape for getting the CIs working.


Personally, I really like that we don't have to create separate CI
scripts and can just use the Markdown files directly. This helps ensure
that the CI checks and the guides don't accidentally drift apart.
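
For anyone skimming the thread, here is roughly how I picture the
extraction step from the TL;DR quoted below. To be clear, this is only a
sketch and not the code from PR #3553: the function name `extract_blocks`
and the regular expression are my own assumptions, and it simply scans the
raw Markdown text, so a `shell` block placed inside an HTML comment is
picked up as well.

```python
import re

# Assumption: a block counts only if its fence info string is exactly
# `shell` or `sql`. The pattern is illustrative, not taken from the PR.
FENCE = re.compile(
    r"^```(shell|sql)[ \t]*\n(.*?)^```[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_blocks(markdown: str) -> list[tuple[str, str]]:
    """Return (language, body) pairs for shell/sql blocks, in document order."""
    return [(m.group(1), m.group(2)) for m in FENCE.finditer(markdown)]


if __name__ == "__main__":
    tick = "`" * 3  # build the fences indirectly so this sample nests cleanly
    sample = "\n".join([
        "Start the stack:",
        tick + "shell",
        "docker compose up -d",
        tick,
        "<!--",
        tick + "shell",
        "test -f out.txt",
        tick,
        "-->",
        tick + "sql",
        "SHOW NAMESPACES;",
        tick,
        "",
    ])
    for lang, body in extract_blocks(sample):
        print(lang, "->", body.strip())
```

A runner would then hand each `shell` body to a shell and each `sql` body
to the SQL shell, failing the job on the first non-zero exit code.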

Best,
Adnan Hemani

On Wed, Jan 28, 2026 at 4:24 AM Robert Stupp <[email protected]> wrote:

> Hi all,
>
> And here's a working version for "CI on the guides":
> https://github.com/apache/polaris/pull/3553. It builds upon PR #3550.
>
> A CI run, which takes about 10 minutes for all guides, looks like this one:
> https://github.com/apache/polaris/actions/runs/21437618287?pr=3553
>
> The TL;DR of how it works:
> - It exercises the Markdown files of all getting-started guides in CI
> - Testing can be run locally as well, either for all guides or a single
> guide
> - For each Markdown file, the `shell` and `sql` code blocks are extracted
> - Supports "docker compose" and Spark SQL Shell invocations
> - Custom assertions can be added (as a `shell` code block in an HTML
> comment, so the assertions aren't rendered on the web site)
>
> While working on this, I had to fix a couple of things in the
> docker-compose files. Some are related to docker-compose service
> dependencies and timing, others due to the guides just not working anymore.
> I'll come up with separate PRs to address the findings individually.
>
> Robert
>
>
> On Mon, Jan 26, 2026 at 3:29 PM Robert Stupp <[email protected]> wrote:
>
> > Hi all,
> >
> > Here's a prototype as a PR https://github.com/apache/polaris/pull/3550 -
> > please try it out and let me know what you think.
> >
> > On Tue, Jan 20, 2026 at 9:12 PM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> >> Hi All,
> >>
> >> Building CI for getting-started guides sounds useful to me. I suppose
> >> we'd have to formalize the format of the related `.md` files somehow to
> >> make automated execution possible.
> >>
> >> I wonder about the reliability of these tests too. If CI is flaky (e.g.
> >> containers not starting properly), it might be an irritation more than
> >> an aid. It's worth a try in any case.
> >>
> >> Cheers,
> >> Dmitri.
> >>
> >> On Tue, Jan 20, 2026 at 2:48 PM Yong Zheng <[email protected]> wrote:
> >>
> >> > 100%. There are so many open source projects with outdated
> >> > getting-started examples and it will be nice to have these in our CI
> >> > pipelines. The only concern on my end is how do we define coverage for
> >> > getting-started examples? Currently most of them have simple examples
> >> > that do the following:
> >> > 1. use catalog
> >> > 2. create namespace
> >> > 3. create table under namespace
> >> > 4. create some dummy data
> >> >
> >> > Will these be sufficient for CI? With these, we will only know that the
> >> > basic stuff works, but if users try more complex things, we can't
> >> > really guarantee it will still work. But will this be sufficient?
> >> >
> >> > Thanks,
> >> > Yong Zheng
> >> >
> >> > On 2026/01/20 10:55:30 Robert Stupp wrote:
> >> > > Hi all,
> >> > >
> >> > > We have a nice collection of getting started guides in the source
> >> > > repository [1].
> >> > > The user-targeting description of each guide is in a README.md file.
> >> > >
> >> > > I would like to start a discussion and gather feedback about two
> >> > > topics regarding the getting-started guides:
> >> > >
> >> > > 1. Website:
> >> > > The user-facing getting-started guides are well written but not very
> >> > > visible to users, because those are not on the web site.
> >> > > What are your thoughts on moving the getting-started guides to the
> >> > > website?
> >> > >
> >> > > 2. CI coverage:
> >> > > Most, actually all, getting-started guides include code snippets
> >> > > referencing Docker compose files.
> >> > > Manually verifying these code snippets and Docker compose files,
> >> > > during initial contribution or when those are being updated, is
> >> > > quite some work.
> >> > > I _think_ we can automate the verification of the code snippets, and
> >> > > with those the Docker compose files, in CI.
> >> > > The overall idea is to parse the getting-started guide markdown and
> >> > > let a workflow execute the code blocks for shell/bash.
> >> > > I am not sure whether all guides can actually be verified, because
> >> > > some of those Docker compose files start a couple of containers,
> >> > > which can be a resource (RAM/CPU) issue in GitHub's hosted runners.
> >> > > The alternatives would be:
> >> > > - Never update the getting-started guides with the risk that those
> >> > > become stale and outdated.
> >> > > - Keep the manual verification process.
> >> > > Any thoughts on this?
> >> > >
> >> > > Robert
> >> > >
> >> > >
> >> > > [1] https://github.com/apache/polaris/tree/main/getting-started
> >> > >
> >> >
> >>
> >
>
