I agree in general.

But if you dig into the issue I've encountered, the cause of the failure is 
elusive, the tests implicated in triggering the failure are unrelated to the 
actual cause, and those tests still provide value where they succeed. I don't 
want to disable them universally and reduce coverage. I also wish to unblock 
merging the changes, which are stable in other environments (buildbots, 
CPython GitHub workers, and upstream in importlib_metadata).

I've spent many hours this past weekend just trying to isolate the issue by 
swapping out changes and triggering buildbot runs, but that approach was slow, 
tedious, and ultimately ineffective (I failed to isolate the pertinent 
factors). I'm exhausted and need to find a way to replicate the failure in 
some developer environment (help I was hoping to solicit from others) so I 
can experiment more rapidly.

My intention was to tackle things in this order:

  1.  Suppress the test failures on buildbots. Ideally, only the implicated 
buildbots, but I was willing to hit all buildbots for this small exclusion. 
If buildbot execution could be detected by buildbot name, that might provide 
finer-grained control.
  2.  Merge the PR so those changes can gain soak time in the alpha builds 
(and close out this effort rather than making it dependent on this emergent 
issue).
  3.  File an issue about the buildbot failures to solicit help fixing their 
root cause.

I guess I could just replace (1) with a broad exclusion, but it would be safer 
to limit the scope of the test skip, both to retain coverage and to continue 
demonstrating that the test passes in non-buildbot environments (sketched 
below).
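
For concreteness, here is a minimal sketch of what such a narrow skip might 
look like. Note that BUILDBOT_WORKERNAME and the worker names are entirely 
hypothetical; nothing like them is currently set by the buildbot workers, 
which is exactly the gap I'm describing:

    # Hypothetical sketch only: buildbot workers do not currently set
    # BUILDBOT_WORKERNAME (or any equivalent); worker names are placeholders.
    import os
    import unittest

    IMPLICATED_WORKERS = {"example-worker-1", "example-worker-2"}

    def on_implicated_buildbot():
        return os.environ.get("BUILDBOT_WORKERNAME") in IMPLICATED_WORKERS

    class DiscoveryTests(unittest.TestCase):
        @unittest.skipIf(
            on_implicated_buildbot(),
            "elusive failure on specific buildbot workers; see tracking issue",
        )
        def test_discovery(self):
            ...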

My preference would be for buildbots to expose functionality similar to the 
environment indicators exposed by other CI environments like Travis CI, 
AppVeyor, and GitHub Actions, all of which set both a generic `CI` indicator 
and more specific environment details (action name, runner details). Indeed, 
when I asked an AI how to detect a buildbot environment, it suggested a 
couple of environment variables I could use for that purpose 
(assuming/hallucinating them by analogy with other runners).
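
For comparison, detecting those other services really is this simple, because 
their variables are documented; only the buildbot case has no real equivalent 
(the BUILDBOT_* name below is the kind of thing the AI invented, not a 
variable that actually exists):

    import os

    def running_under_ci():
        # Travis CI, AppVeyor, and GitHub Actions all document setting CI=true.
        return os.environ.get("CI", "").lower() == "true"

    def github_runner_os():
        # GitHub Actions additionally documents GITHUB_ACTIONS and RUNNER_OS.
        if os.environ.get("GITHUB_ACTIONS") == "true":
            return os.environ.get("RUNNER_OS")
        return None

    # There is no documented buildbot analogue; e.g. BUILDBOT_WORKERNAME is
    # a hypothetical name, not something workers actually set.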

If we instead want, as a matter of principle, to disallow detection of the 
buildbot environment, it would make sense to note that explicitly in the 
buildbot documentation, so we can reference the guidance and help future 
affected use cases settle on an approach quickly.

That said, if we go with the latter approach, this thread could serve as the 
pertinent guidance.

How can we reconcile these tradeoffs?

On Mon, Aug 25, 2025 at 5:14 PM, Zachary Ware <z...@python.org> wrote:

On Mon, Aug 25, 2025 at 11:39 AM Gregory P. Smith <g...@krypto.org> wrote:

I think an environment variable could be set globally for all workers via the 
code in
https://github.com/python/buildmaster-config/tree/main/master/custom

It could, but I personally really don't think we want to. There should not be 
any difference between a buildbot run and a regular test run on a machine of 
similar setup, so I don't think we want to be trying to skip things 
specifically on buildbots. If it *must* be skipped on buildbots, skip it 
everywhere.
