Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Kent Fredric
On Wed, 16 Sep 2020 16:05:49 -0600
Tim Harder wrote:

> Speaking for myself, I avoid hosting most of my Gentoo-related work
> (outside of gentoo repo ebuild mangling) on gentoo.org since I prefer
> the services offered elsewhere in terms of usability, visibility, and
> project maintenance. Take this as constructive criticism of how Gentoo
> currently operates as an upstream host and see it as a call for putting
> more emphasis towards deploying GitLab, Gitea, or other similar service
> for Gentoo.

100%. Rich's suggestions with regards to documenting a "here's how you
build a container that can be air-dropped onto gentoo infra and booted,
and incrementally updated on request" process would possibly go a long
way with all of this.

Random devs can band together, build some GitFoo container, get it
working Gud(TM), and petition Infra to deploy it (probably in some
semi-official "this is just an 'speriment" namespace till it ossifies)

Making the Infra side DeadEasy(TM), and the contributor side
DeadEasy(TM) reduces all the real friction points beyond politics.




Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Kent Fredric
On Wed, 16 Sep 2020 19:47:35 -0400
Rich Freeman wrote:

> Seems like a way to improve this would be better documentation and a
> DIY infra testing platform.
> 
> First, document how to prepare a service for infra hosting.  Maybe
> provide an example service.
> 
> Second, publish a tarball of a container/chroot that basically
> simulates an infra host for testing purposes.  Provide instructions on
> how to configure/run it.  Make it easy - edit this config file to
> point to your repo, put the name of your service/host here, etc.
> 
> Anybody developing a service could then just follow the instructions
> and then test out their service in a similar environment to what infra
> uses.  Nobody needs to be trusted with any credentials.  Nobody needs
> access to any special repos.  They can run the container on their own
> host, and point it at their favorite git repo hosting service.
> 
> Once it is working all they need to do is give the link to their
> repo/etc to infra.  Infra can then fork it and host it.  The
> maintainer can submit pull requests as needed to infra.

I'm in favour of all of this.




Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Kent Fredric
On Wed, 16 Sep 2020 07:11:12 -0400
Rich Freeman wrote:

> I realize this is a bit more tangential.  I just think that infra is
> already a huge failure point, so having more stuff on infra actually
> makes that failure point more critical.  A Gentoo where little is
> hosted on stuff we own is much more resilient in the face of
> legal/money/etc issues.  If Gentoo just becomes some blessed config
> files, a website, and SAML then anybody could host the core from their
> basement.

I agree that "infra is a big SPOF"; it's just that, in my experience,
"single developers" are a much larger/more volatile SPOF.

Like, I can't even keep my own stuff running, so I'm using myself as an
example.

But I know too many people who fall into this camp.

Individuals are much less likely to have the finances and ability to
delegate others to work on their platforms, and are even less likely
to have the delegation itself delegated.

So as fragile as gentoo infra is, ... it's still less fragile in the
long term view of things than random 3rd parties.

( This is where we cry about the loss of the original gentoo wiki, and
how long it took us to replace it, where I'd imagine if it had started
out with the full backing of gentoo infra, we'd have lost much less,
and we'd have not lost so much of our google juice and 'fountain of
arcane knowledge' to Arch )




Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Rich Freeman
On Wed, Sep 16, 2020 at 11:44 AM Alec Warner wrote:
>
>  - repomirror-ci and all the CI stuff is on infra because mgorny is also on
> infra! It's not like we set his stuff up for him; instead we gave him access
> to all the infra repos and he had to write his own puppet configs and
> whatnot. The benefit of course is that anyone on infra can bump the stuff and
> log in to the machines and debug...but it's not exactly a low bar.
> ..
> I think traditionally it's been a slog for non-infra people to get infra to
> host much of anything, and the difficulties with the all-or-nothing
> approach we take with infra credentials can really set a high bar to host
> much of anything these days.

Seems like a way to improve this would be better documentation and a
DIY infra testing platform.

First, document how to prepare a service for infra hosting.  Maybe
provide an example service.

Second, publish a tarball of a container/chroot that basically
simulates an infra host for testing purposes.  Provide instructions on
how to configure/run it.  Make it easy - edit this config file to
point to your repo, put the name of your service/host here, etc.

Anybody developing a service could then just follow the instructions
and then test out their service in a similar environment to what infra
uses.  Nobody needs to be trusted with any credentials.  Nobody needs
access to any special repos.  They can run the container on their own
host, and point it at their favorite git repo hosting service.

Once it is working all they need to do is give the link to their
repo/etc to infra.  Infra can then fork it and host it.  The
maintainer can submit pull requests as needed to infra.

-- 
Rich



Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Tim Harder
On 2020-09-16 Wed 09:36, Jonas Stein wrote:
> The heart of a distribution is basically its infrastructure and the
> tools to test, maintain and distribute packages.
> 
> If a distribution relies on external sources, which are not maintained
> by the distribution, but a single person, it has been forked.
> 
> A healthy distribution needs to maintain its own tools.

Gentoo is quite free to mirror or fork various tools it deems critical
to itself. All this should require is poking your favorite infra contact
until they set it up. Beyond that, forcing recalcitrant upstreams to
move is futile since Gentoo has no leverage besides asking nicely.

Speaking for myself, I avoid hosting most of my Gentoo-related work
(outside of gentoo repo ebuild mangling) on gentoo.org since I prefer
the services offered elsewhere in terms of usability, visibility, and
project maintenance. Take this as constructive criticism of how Gentoo
currently operates as an upstream host and see it as a call for putting
more emphasis towards deploying GitLab, Gitea, or other similar service
for Gentoo.

Tim



Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Alec Warner
On Wed, Sep 16, 2020 at 1:17 AM Kent Fredric wrote:

> On Mon, 14 Sep 2020 10:15:31 -0400
> Rich Freeman wrote:
>
> > It might be easier to take smaller steps, such as having a policy that
> > "any call for devs to use/test a new tool/service, or any service that
> > automatically performs transactions on bugzilla, must be FOSS, and the
> > link to the source must be included in the initial communication, and
> > it must be clear what version of the code is operating at any time."
> > That is a pretty low barrier to those creating tools, though it
> > doesn't address the infra concern.  However, it does mean that infra
> > is now free to fork the service at any time, and reduces the bus
> > factor greatly.
>
> For the situation of things that take life before being part of infra,
> I think the least we can do is recognize their utility and importance,
> and at least, have infra *offer* some sort of shared location to run a
> deployment.
>
> That I think helps everyone, gives people a place to remove their own
> bus factor, but without mandatory strongarming.
>

The main challenge is always getting stuff onto infra. The past few things
we have launched have been because the developers in question joined infra
to launch their work.

 - repomirror-ci and all the CI stuff is on infra because mgorny is also on
infra! It's not like we set his stuff up for him; instead we gave him
access to all the infra repos and he had to write his own puppet configs
and whatnot. The benefit of course is that anyone on infra can bump the
stuff and log in to the machines and debug...but it's not exactly a low bar.
 - packages.gentoo.org was also originally arzano pushing patches to me
(which I would merge and then release by hand.) Just like with ebuilds this
eventually became too tedious and he was onboarded as a dev and infra
member to eliminate the middleman.

I think traditionally it's been a slog for non-infra people to get infra to
host much of anything, and the difficulties with the all-or-nothing
approach we take with infra credentials can really set a high bar to host
much of anything these days.

-A


Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Jonas Stein
Hi,

> However, we are still facing the same problem: Only one person is
> involved in development and knows how to run it. In case something will
> break again and Michał will be unavailable, we can’t just push a fix and
> watch a CI pipeline picking up and deploying new nattka. Instead someone
> will have to fork repository from Michał’s private repository at GitHub,
> make the changes and hope that anyone within infrastructure team can
> help to deploy fixed nattka.

The heart of a distribution is basically its infrastructure and the
tools to test, maintain and distribute packages.

If a distribution relies on external sources, which are not maintained
by the distribution, but a single person, it has been forked.

A healthy distribution needs to maintain its own tools.

-- 
Best,
Jonas



Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Rich Freeman
On Wed, Sep 16, 2020 at 4:17 AM Kent Fredric wrote:
>
> On Mon, 14 Sep 2020 10:15:31 -0400
> Rich Freeman wrote:
>
> > It might be easier to take smaller steps, such as having a policy that
> > "any call for devs to use/test a new tool/service, or any service that
> > automatically performs transactions on bugzilla, must be FOSS, and the
> > link to the source must be included in the initial communication, and
> > it must be clear what version of the code is operating at any time."
> > That is a pretty low barrier to those creating tools, though it
> > doesn't address the infra concern.  However, it does mean that infra
> > is now free to fork the service at any time, and reduces the bus
> > factor greatly.
>
> For the situation of things that take life before being part of infra,
> I think the least we can do is recognize their utility and importance,
> and at least, have infra *offer* some sort of shared location to run a
> deployment.
>
> That I think helps everyone, gives people a place to remove their own
> bus factor, but without mandatory strongarming.

This might be one option.  Another might be to try to have a more
"open infra" approach where core services like authentication are
provided by Gentoo, but available to 3rd parties.  This could allow
more services to be externally hosted, ideally with redundancy (not
just at the host level, but at the maintainer/software level as well).
The obvious downside to this would be chaos - we might have 3 list
servers, 5 bug trackers, 10 git repos, and so on.  However, if
anything did go down we'd have half a dozen potential replacements so
all anybody needs to do is migrate their own stuff to their chosen
copy.

The value add of Gentoo would be in central services like
identity/reputation and curation.  There might be 14 git repos out
there but Gentoo would control which ones end up on the master rsync
server and which rsync mirrors get advertised as being genuine, and
which list servers are official.

I realize this is a bit more tangential.  I just think that infra is
already a huge failure point, so having more stuff on infra actually
makes that failure point more critical.  A Gentoo where little is
hosted on stuff we own is much more resilient in the face of
legal/money/etc issues.  If Gentoo just becomes some blessed config
files, a website, and SAML then anybody could host the core from their
basement.

Maybe it is a good thing that core services aren't always hosted by infra?

-- 
Rich



Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-16 Thread Kent Fredric
On Mon, 14 Sep 2020 10:15:31 -0400
Rich Freeman wrote:

> It might be easier to take smaller steps, such as having a policy that
> "any call for devs to use/test a new tool/service, or any service that
> automatically performs transactions on bugzilla, must be FOSS, and the
> link to the source must be included in the initial communication, and
> it must be clear what version of the code is operating at any time."
> That is a pretty low barrier to those creating tools, though it
> doesn't address the infra concern.  However, it does mean that infra
> is now free to fork the service at any time, and reduces the bus
> factor greatly.

For the situation of things that take life before being part of infra,
I think the least we can do is recognize their utility and importance,
and at least, have infra *offer* some sort of shared location to run a
deployment.

That I think helps everyone, gives people a place to remove their own
bus factor, but without mandatory strongarming.




Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-14 Thread Rich Freeman
On Sun, Sep 13, 2020 at 11:52 PM Kent Fredric wrote:
>
> But when you file a bug, you rely on bugzilla being maintained by
> Gentoo Infra, not some 3rd party.
>

I think the Council will need to consider where it wants to draw the
lines on something like this.  Here is my sense of how these sorts of
things come about:

1.  Somebody sees an opportunity for improvement and writes some code.
They interface it on their own with git/bugzilla/whatever and host it
on their own systems.  They use it for a while and improve it.

2.  They start to advertise it and call for testers.  This is nothing
more than a list post at the start.  It is completely optional.  A few
people start using it and find that it is helpful.

3.  It is still optional, but since it is helpful the 10% of the devs
who do 90% of the work in the relevant area (like arch teams/etc)
adopt it, which means that 90% of the work is using the new tool,
still self-hosted by the dev.  It might or might not have any source
published.

4.  The devs who are using the tool are also the ones maintaining all
the documentation for the official workflows, and they update it to
reflect what they're actually doing.  It might still be optional.  (In
fact, as far as I can tell from reading the docs nattka is still
optional - you could still just CC arch teams and so on yourself -
heck, arch teams can stabilize things even if you don't file a bug
though this is unlikely to happen much.)

5.  At some point somebody notices that 80% of the problems come from
the 10% of the work that isn't doing things the new way, and the new
way stops being optional.

Maybe somebody closer to these tools might want to correct something
above.  However, as an observer this is how these things seem to
evolve - it is a very bazaar-like methodology.

Keep in mind that rules don't make things happen - they prevent things
from happening.  The hope behind a rule is that if you dam off
something suboptimal the enthusiasm travels down some other path and
doesn't just die off.  So, where do you build the dam above?  Do you
let steps 1-4 happen and draw the line at step 5, which might just
mean that we accept the 80% of the problems that come from it being
optional until infra hosts it?  Do we draw the line all the way up at
1 and block any use of APIs in ways that are not explicitly approved?
Do we block it at step 4, so the arch team is using nattka for 90% of
the cases, and they just trade notes via email and nobody else knows
what they're doing because the wiki reflects a process nobody actually
follows?

I realize that I'm mostly pointing out things that can go wrong.

I don't think anybody would say that it is better not having infra
maintaining critical infra.  The problem is that the infra team
probably isn't going to officially host stuff way back at step 1.  A
random dev can't write a script and ask infra to start running it and
bug them 3 times a day to do a pull from their git repo.  Infra is
probably going to wait until something closer to step 3-4 before they
get involved, which means the tool is already being used for a
substantial amount of work.  I'm not sure if we even have a defined
process for getting new tools like these onto infra, or how we do
config/change management in these cases.

The council can say "don't use non-infra-hosted services as part of
essential processes" but what does that actually mean?  Does that mean
going up to step 3, so 90% of your arch testing bugs are going through
nattka, but it just isn't documented on the wiki?  Does it mean going
up to step 2, so some portion of them are - if so how do you prevent
it from going from 10% to 90% if the new tool works better?  Does it
mean not interfering at all with 1-5 but imploring infra and the
service maintainers to figure something out?  If the service isn't
expensive to run, those maintaining the service might not see much
benefit in moving it over, and infra is of course always
manpower-constrained.

It might be easier to take smaller steps, such as having a policy that
"any call for devs to use/test a new tool/service, or any service that
automatically performs transactions on bugzilla, must be FOSS, and the
link to the source must be included in the initial communication, and
it must be clear what version of the code is operating at any time."
That is a pretty low barrier to those creating tools, though it
doesn't address the infra concern.  However, it does mean that infra
is now free to fork the service at any time, and reduces the bus
factor greatly.

-- 
Rich



Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-13 Thread Kent Fredric
On Sun, 13 Sep 2020 12:04:39 -0700
Alec Warner wrote:

> Is openrc critical to Gentoo? it doesn't live on our infra.
> Is pkgcore critical to Gentoo? it doesn't live on our infra.

Both those things are "things employed by users for their systems".

Neither of those things are integral to any workflow, and are entirely
optional.

But when you file a bug, you rely on bugzilla being maintained by
Gentoo Infra, not some 3rd party.

And when you file a keyword/stable request, you rely on a
bugzilla-integrated functionality through nattka to check and verify
keywords.

Subsequently, whether or not you opted into this, nattka is now
critical to the workflow of everyone doing keywording/stable requests.

You can't really say that about pkgcore or openrc.

But you can say that about the QA Automated Testing service, which
mangles gentoo.git and creates sync/gentoo.git, and reports when people
broke the tree.

And that *uses* pkgcore.

But I'm pretty sure that infrastructure lives on gentoo "somewhere",
and if it doesn't, it should.





Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-13 Thread Michał Górny
On Sun, 2020-09-13 at 20:31 +0200, Thomas Deutschmann wrote:
> You maybe all remember what happened to stable-bot: Years ago,
> kensington created stable-bot on his own as PoC which revolutionized the
> way how we do package stabilization in bugzilla. The service ran on his
> own infrastructure. Because of the benefit of the service the bot
> provided, arch team’s workflow became dependent on stable-bot. We were
> lucky that stable-bot just worked most of the time until the service was
> down for a while. Nobody was able to help here: Kensington himself was
> unavailable, nobody had the sources… the end of the story: mgorny
> created nattka which replaced stable-bot.
> 
> However, we are still facing the same problem:

No, we're not.  Don't you see the huge difference between proprietary
closed-source software and free software?

>  Only one person is involved in development

How is that a problem?  It is quite normal that simple tools are
developed primarily by one person, and nattka is not exactly rocket
science.  That said, nobody is stopping others from working on it, there
are surely some improvements to be done.

> and knows how to run it.

Everything is documented and fully contained in puppet.  Deploying it to
a new server is as trivial as enabling it on the host in question.

> In case something will
> break again and Michał will be unavailable, we can’t just push a fix and
> watch a CI pipeline picking up and deploying new nattka.

Who is 'we' here?  Our Infra does not use 'CI pipelines', it uses Gentoo
packages.  Deploying a new version means bumping the package, waiting
for rsync to distribute it (meh!) and telling puppet to upgrade.  No
buzz words involved, sorry.

>  Instead someone will have to fork repository from Michał’s private
> repository at GitHub, make the changes

It's not private, it's public.  And to be honest, forking it on GitHub
will certainly take less time and effort than dealing with
git.gentoo.org (which needs to happen via Infra).

> and hope that anyone within infrastructure team can
> help to deploy fixed nattka.

Are you suggesting there is a problem with Gentoo Infrastructure?
I really don't see where this is going.  You're concerned that Infra
can't handle it, yet you actually ask to make it more reliant
on Infra...

> This is what the motion is about: This is not about that Gentoo depends
> on single persons or things like that. It’s about the idea to
> *formalize* the requirement that any service and software which is
> critical for Gentoo (think about pkgcore) should live within Gentoo
> namespace (https://gitweb.gentoo.org/), i.e. be accessible for *any*
> Gentoo developer and deployments should be based on these repositories.

You should weigh your words more carefully.  Otherwise, we're going to
end up forking the whole base-system, kernel...

> Or in other words: Make sure that we adhere to social contract even for
> critical software and services Gentoo depends on. So that we will never
> ever face the situation that something we depend on doesn’t work
> anymore.

I fail to see how this is going to actually accomplish the goal.  Having
a different pipeline for committing does not make stuff not break, nor
does it make it more likely for people to fix it.  In fact, I dare say
having nattka on GitHub increases the chances of someone (esp.
non-developer) submitting a fix, compared to obscure git.gentoo.org with
no clear contribution pipeline.

>  Taking care of working pipelines before something is broken
> should also help us in case something stops working so we don’t have to
> figure out how to fix and re-deploy when house is already burning (like
> portage: In case Zac can't do a release for some reason, in theory,
> every Gentoo developer would be able to roll a new release).

Please tell me how exactly rolling a new release of nattka is harder
than rolling a release of Portage?  I'm pretty sure most of Gentoo
developers can figure out how to use GitHub, how to fork a repository,
and if they use Python they can probably deal with pretty standard
setup.py.  Deployment can't be done without Infra's help anyway.

-- 
Best regards,
Michał Górny





Re: [gentoo-dev] [RFC] Services and software which is critical for Gentoo should be developed/run in Gentoo namespace

2020-09-13 Thread Alec Warner
On Sun, Sep 13, 2020 at 11:31 AM Thomas Deutschmann wrote:

> Hi,
>
> TL;DR: jstein asked council [Bug 729062] for a motion that any service
> and software which is critical for Gentoo should be developed/run in
> Gentoo namespace. Because any request to council must be discussed I
> volunteered to bring this topic to the mailing list (sorry for the huge
> delay!).
>
>
> Problem
> ===
> You maybe all remember what happened to stable-bot: Years ago,
> kensington created stable-bot on his own as PoC which revolutionized the
> way how we do package stabilization in bugzilla. The service ran on his
> own infrastructure. Because of the benefit of the service the bot
> provided, arch team’s workflow became dependent on stable-bot. We were
> lucky that stable-bot just worked most of the time until the service was
> down for a while. Nobody was able to help here: Kensington himself was
> unavailable, nobody had the sources… the end of the story: mgorny
> created nattka which replaced stable-bot.
>
> However, we are still facing the same problem: Only one person is
> involved in development and knows how to run it. In case something will
> break again and Michał will be unavailable, we can’t just push a fix and
> watch a CI pipeline picking up and deploying new nattka. Instead someone
> will have to fork repository from Michał’s private repository at GitHub,
> make the changes and hope that anyone within infrastructure team can
> help to deploy fixed nattka.
>
> This is what the motion is about: This is not about that Gentoo depends
> on single persons or things like that. It’s about the idea to
> *formalize* the requirement that any service and software which is
> critical for Gentoo (think about pkgcore) should live within Gentoo
> namespace (https://gitweb.gentoo.org/), i.e. be accessible for *any*
> Gentoo developer and deployments should be based on these repositories.
> Or in other words: Make sure that we adhere to social contract even for
> critical software and services Gentoo depends on. So that we will never
> ever face the situation that something we depend on doesn’t work
> anymore. Taking care of working pipelines before something is broken
> should also help us in case something stops working so we don’t have to
> figure out how to fix and re-deploy when house is already burning (like
> portage: In case Zac can't do a release for some reason, in theory,
> every Gentoo developer would be able to roll a new release).
>

I think your examples are a bit weird.

Is openrc critical to Gentoo? it doesn't live on our infra.
Is pkgcore critical to Gentoo? it doesn't live on our infra.

Note that these are just packages, not services and the social contract
just says
"""However, Gentoo will never depend upon a piece of software or metadata
unless it conforms to the GNU General Public License, the GNU Lesser
General Public License, the Creative Commons - Attribution/Share Alike or
some other license approved by the Open Source Initiative."""

It says nothing about where things are hosted or how services are provided.

I'd consider splitting the two here. For packages I don't think it matters
as much where they are hosted. Most things can be mirrored into gentoo (if
we want a copy of the src tree) and we also have tarballs of the source
code much of the time on the mirror network.

For services, I tend to agree more with your comments; we need
visibility and operational capability for services. When we rely on service
components where the source is not available, it's bad. But we rely on
numerous services now. E.g. p.g.o relies on repology. Does that mean we
need the source code to repology? I assume not. Does that mean we need to
run our own repology? Also I assume not.

-A


>
> See also:
> =
> Bug 729062: https://bugs.gentoo.org/729062
>
>
> --
> Regards,
> Thomas Deutschmann / Gentoo Linux Developer
> C4DD 695F A713 8F24 2AA1 5638 5849 7EE5 1D5D 74A5
>
>