Hi all,
I have been thinking about our build bots and how to deal with them. It takes
us a long time to get the bots fixed once they break. Currently they are
broken because something seems to have changed on the machines.
I see two issues: we do not monitor or fix the bots, and they are changing
without our notice (that is not meant as criticism, just an observation).
So I think we should address this, since we do not have enough people
for the process to work as it is.
I am thinking about the following measures:
1) I would like to be notified when a bot fails, so that we can
take action without having to monitor the bots ourselves.
1a) I know from Gavin that we could activate an email notification. I would
like to receive these failure notifications in future. What would be the right
channel for such an email?
- dev: imho this could be too much noise for the list overall
- sysadmin: an option, but I am not sure this is the right place
- a separate list for technical messages
2) The system seems to fail often because something changed underneath
and our old build environment breaks.
a) I would like to have a build environment health check that tells us whether
the system is fit for the build and, if not, what is wrong. We could extend
the autoconfig with more tests. Would that be the right way?
b) I would like to have more control over how the buildbots are
configured. Maybe have the configuration documented as code, in Puppet or
another system. Maybe even set up the build system at each build.
c) As an alternative, I could imagine that we provide images like our
Linux image, adapted for use by the build bot. Such an image could be
created for Windows too, and maybe even for Mac. It would make
things easier for us.
This is a completely different strategy. I am not sure whether infra would
support us in this dockerization.
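To make 2a) a bit more concrete, here is a minimal sketch of what such a health check could look like, written in Python. It only checks whether the tools a build needs can be found on the PATH. The tool list and the function names are my assumptions for illustration, not our real requirements; the actual list depends on the platform and the build configuration.

```python
import shutil

# Hypothetical tool list, for illustration only. The real set depends on
# the platform and on how the build is configured.
REQUIRED_TOOLS = ["perl", "make", "gcc", "ant", "java"]


def check_tools(tools):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(tools):
    """Return the tools that could not be found on PATH, in input order."""
    return [tool for tool, path in check_tools(tools).items() if path is None]


def report(tools):
    """Build a human-readable report, one line per tool."""
    lines = []
    for tool, path in check_tools(tools).items():
        lines.append(f"{tool}: {path if path else 'MISSING'}")
    return "\n".join(lines)
```

A bot could run `print(report(REQUIRED_TOOLS))` as its first build step and stop early if `missing_tools(REQUIRED_TOOLS)` is not empty. That would at least tell us what is missing on the machine, instead of the build failing somewhere in the middle.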
3) Timeline
I am very slow in working through my tasks. I want to look into the bots
and I am volunteering, but I need to look at what is still open. My
situation is as follows:
1) Finishing OpenGrok (mostly documentation and infra tasks are still open)
2) Pootle migration + getting the translation CI / CD going
- this involves the Linux build machine
3) Extension Page update / rewrite
- I still need to research the best strategy. That is the next step.
4) Python 3 update
5) There is a security topic; it seems I am the only one currently
looking into it.
6) MediaWiki dockerization and update to the latest version.
We are on a newer version than before, but it is still out of date. We
should try to dockerize it imho. (The db stays local, so the effort
is not that big.)
7) Fixing the Windows bots
That is my current prioritization. Since I have put the Windows bots
down at 7, it may make sense, if no one else looks into them, that we delete
the bots for now, as Matthias has suggested. I can then implement the
strategy for the Linux bot, and we will learn there whether the situation
improves.
Of course, if there is a volunteer for any of the topics, I am willing to
support the volunteer in getting it done.
Please share your insights or questions. That will create more
clarity.
Thanks, all the best
peter
On 07.12.2025 at 16:57, Gavin McDonald wrote:
https://ci2.apache.org/#/workers/13
Does anyone want to help fix these or shall we delete the jobs?