Re: [DISCUSS] Visibility of and CI for getting-started guides

Robert Stupp Thu, 29 Jan 2026 02:39:53 -0800

Some PRs are already up. While digging deeper into docker-compose and "all
the things" around it, I found a couple more issues.
I've summarized the findings and written up some recommendations here:
https://github.com/snazy/polaris/blob/guides-ci/site/content/guides/_index.md#authoring-guides
(on the WIP branch for "guides-CI").
It covers Docker Compose, the usage of 'curl', health-checks, service
dependencies and "final setup services".
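
For illustration only (this is not copied from the linked recommendations):
a minimal Python sketch of the health-check idea, i.e. polling a service
until it reports healthy instead of sleeping for a fixed time. The names
`wait_until_healthy` and `http_probe`, and the URL in the usage comment, are
hypothetical.

```python
import time
import urllib.request
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` has elapsed."""
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Typical while a container is still starting: connection
            # refused or reset. urllib's URLError is an OSError subclass.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)


def http_probe(url: str) -> Callable[[], bool]:
    """Build a probe that succeeds when `url` answers with a 2xx status."""
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return 200 <= resp.status < 300
    return probe


# Hypothetical usage; the URL is a placeholder, not a documented endpoint:
# ok = wait_until_healthy(http_probe("http://localhost:8080/healthz"))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without
real waiting.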


I'll open a few more PRs to fix the existing guides once the current PRs
are merged, so that the fixes for the new findings don't conflict with them.
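
For readers new to the thread: the quoted TL;DR below describes extracting
the `shell` and `sql` code blocks from each guide's Markdown. A rough Python
sketch of that extraction step follows. It is an illustration under
simplifying assumptions (only triple-backtick fences, no nesting) and not
the implementation in the PR; `extract_code_blocks` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def extract_code_blocks(
    markdown: str, languages: Sequence[str] = ("shell", "sql")
) -> List[Tuple[str, str]]:
    """Return (language, body) pairs for fenced blocks in `languages`."""
    blocks: List[Tuple[str, str]] = []
    lang = None  # None means "currently outside a fenced block"
    body: List[str] = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if lang is None:
            if stripped.startswith("```"):
                # The first word after the fence is taken as the language.
                info = stripped[3:].strip()
                lang = info.split()[0] if info else ""
                body = []
        elif stripped == "```":
            # Closing fence: keep the block only if its language is wanted.
            if lang in languages:
                blocks.append((lang, "\n".join(body)))
            lang = None
        else:
            body.append(line)
    return blocks
```

Because an HTML comment is plain text to this scanner, a `shell` block
placed inside `<!-- ... -->` is returned as well, which is how hidden
assertion blocks could be picked up in such a sketch.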


On Wed, Jan 28, 2026 at 11:57 PM Adnan Hemani via dev <
[email protected]> wrote:

> Robert, this is super cool! Looks great, and I'm looking forward to the
> separate PRs to get us into shape for getting the CIs working.
>
> Personally, I really like that we don't have to create separate CI
> scripts and can instead use the Markdown files directly - this will help
> ensure that the CI scripts and the guides don't accidentally drift.
>
> Best,
> Adnan Hemani
>
> On Wed, Jan 28, 2026 at 4:24 AM Robert Stupp <[email protected]> wrote:
>
> > Hi all,
> >
> > And here's a working version for "CI on the guides":
> > https://github.com/apache/polaris/pull/3553. It builds upon PR #3550.
> >
> > A CI run, which takes about 10 minutes for all guides, looks like this one:
> > https://github.com/apache/polaris/actions/runs/21437618287?pr=3553
> >
> > The TL;DR of how it works:
> > - It exercises the Markdown files of all getting-started guides in CI
> > - Testing can be run locally as well, either for all guides or a single
> > guide
> > - For each Markdown file, the `shell` and `sql` code blocks are extracted
> > - Supports "docker compose" and Spark SQL Shell invocations
> > - Custom assertions can be added (as a `shell` code block in an HTML
> > comment, so the assertions aren't rendered on the web site)
> >
> > While working on this, I had to fix a couple of things in the
> > docker-compose files. Some are related to docker-compose service
> > dependencies and timing, others due to the guides just not working
> > anymore.
> > I'll come up with separate PRs to address the findings individually.
> >
> > Robert
> >
> >
> > On Mon, Jan 26, 2026 at 3:29 PM Robert Stupp <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > Here's a prototype as a PR https://github.com/apache/polaris/pull/3550 -
> > > please try it out and let me know what you think.
> > >
> > > On Tue, Jan 20, 2026 at 9:12 PM Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > >
> > >> Hi All,
> > >>
> > >> Building CI for getting-started guides sounds useful to me. I suppose
> > >> we'd have to formalize the format of the related `.md` files somehow to
> > >> make automated execution possible.
> > >>
> > >> I wonder about the reliability of these tests too. If CI is flaky (e.g.
> > >> containers not starting properly), it might be an irritation more than
> > >> an aid. It's worth a try in any case.
> > >>
> > >> Cheers,
> > >> Dmitri.
> > >>
> > >> On Tue, Jan 20, 2026 at 2:48 PM Yong Zheng <[email protected]> wrote:
> > >>
> > >> > 100%. There are so many open source projects with outdated
> > >> > getting-started examples and it will be nice to have these in our CI
> > >> > pipelines. The only concern on my end is how we define coverage for
> > >> > getting-started examples. Currently most of them have simple examples
> > >> > that do the following:
> > >> > 1. use catalog
> > >> > 2. create namespace
> > >> > 3. create table under namespace
> > >> > 4. create some dummy data
> > >> >
> > >> > Will these be sufficient for CI? With these, we will only know that the
> > >> > basic stuff works, but if users try more complex things, we can't
> > >> > really guarantee it will still work. But will this be sufficient?
> > >> >
> > >> > Thanks,
> > >> > Yong Zheng
> > >> >
> > >> > On 2026/01/20 10:55:30 Robert Stupp wrote:
> > >> > > Hi all,
> > >> > >
> > >> > > We have a nice collection of getting started guides in the source
> > >> > > repository [1].
> > >> > > The user-targeting description of each guide is in a README.md file.
> > >> > >
> > >> > > I would like to start a discussion and gather feedback about two
> > >> > > topics regarding the getting-started guides:
> > >> > >
> > >> > > 1. Website:
> > >> > > The user-facing getting-started guides are well written but not very
> > >> > > visible to users, because they are not on the website.
> > >> > > What are your thoughts on moving the getting-started guides to the
> > >> > > website?
> > >> > >
> > >> > > 2. CI coverage:
> > >> > > Most, actually all, getting-started guides include code snippets
> > >> > > referencing Docker compose files.
> > >> > > Manually verifying these code snippets and Docker compose files,
> > >> > > during initial contribution or when those are being updated, is
> > >> > > quite some work.
> > >> > > I _think_ we can automate the verification of the code snippets,
> > >> > > and with those the Docker compose files, in CI.
> > >> > > The overall idea is to parse the getting-started guide markdown
> > >> > > and let a workflow execute the code blocks for shell/bash.
> > >> > > I am not sure whether all guides can actually be verified, because
> > >> > > some of those Docker compose files start a couple of containers,
> > >> > > which can be a resource (RAM/CPU) issue in GitHub's hosted runners.
> > >> > > The alternatives would be:
> > >> > > - Never update the getting-started guides with the risk that those
> > >> > > become stale and outdated.
> > >> > > - Keep the manual verification process.
> > >> > > Any thoughts on this?
> > >> > >
> > >> > > Robert
> > >> > >
> > >> > >
> > >> > > [1] https://github.com/apache/polaris/tree/main/getting-started
> > >> > >
> > >> >
> > >>
> > >
> >
>
