Re: [gentoo-dev] [RFC] Should NATTkA reject keywordreqs for packages with -arch (-*) keywords?

2020-05-05 Thread Thomas Deutschmann
On 2020-05-06 00:52, James Le Cuirot wrote:
> On Tue, 05 May 2020 22:19:59 +0200
> Michał Górny  wrote:
>> 
>> WDYT?
> 
> Play it safe. -* is frequently used for binary packages where an arch
> will simply either work or it won't, with little likelihood of the
> situation changing. -arch is so rare that I don't recall ever seeing
> it. In either case, restoring an arch should be an explicit action.

+1


-- 
Regards,
Thomas Deutschmann / Gentoo Linux Developer
C4DD 695F A713 8F24 2AA1 5638 5849 7EE5 1D5D 74A5





Re: [gentoo-dev] user.eclass ignores ROOT/SYSROOT

2020-05-05 Thread Bertrand Jacquin

On 2020-05-05 21:22, Peter Stuge wrote:
> Hi,
>
> I'm trying something out over here and I'm surprised to find that
> acct-group/* do not work with ROOT+SYSROOT != "/".
>
> Should I file yet another bug about this?


Correct, https://bugs.gentoo.org/show_bug.cgi?id=541406 is tracking this
and has some background on it.


--
Bertrand



Re: [gentoo-dev] [RFC] Should NATTkA reject keywordreqs for packages with -arch (-*) keywords?

2020-05-05 Thread James Le Cuirot
On Tue, 05 May 2020 22:19:59 +0200
Michał Górny  wrote:

> Hi,
> 
> TL;DR: should NATTkA reject request to keyword on arch if the ebuild has
> '-arch' (or '-*') in KEYWORDS already?
> 
> 
> Background: I've recently been rekeywording two packages that gained
> dependency on gevent.  When I was mass-requesting rekeywording, it
> escaped my attention that gevent is explicitly marked '-ia64'.  The arch
> team apparently got mad at me and added gevent to their package.mask to
> make its breakage more explicit.
> 
> I think it would make sense if NATTkA detected '-ia64' there and told me
> that the package is keyword-masked on ia64.
> 
> The flip side is that it would prevent people from using NATTkA to
> restore keywords that were marked '-arch' before.  Of course, if this
> would ever be necessary it could easily be resolved via removing '-arch' 
> first or adding some extra hack.
> 
> WDYT?

Play it safe. -* is frequently used for binary packages where an arch
will simply either work or it won't, with little likelihood of the
situation changing. -arch is so rare that I don't recall ever seeing
it. In either case, restoring an arch should be an explicit action.

-- 
James Le Cuirot (chewi)
Gentoo Linux Developer




Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Toralf Förster
On 5/5/20 10:26 PM, Daniel Pielmeier wrote:
> Actually the maintainer decided to continue the project.
> The code is now hosted at Github [1].
> The site moved to a new server and the upload is working again.
> 
> [1] https://github.com/portagefilelist
> 
> -- 
> Best regards
> Daniel

Indeed - I'm reactivating the pfl logic in the tinderbox script.

-- 
Toralf
PGP 23217DA7 9B888F45





Re: [gentoo-dev] user.eclass ignores ROOT/SYSROOT

2020-05-05 Thread David Michael
On Tue, May 5, 2020 at 4:22 PM Peter Stuge  wrote:
> Hi,
>
> I'm trying something out over here and I'm surprised to find that
> acct-group/* do not work with ROOT+SYSROOT != "/".
>
> Should I file yet another bug about this?
>
> I suppose the limitation is in user.eclass, but what about the 11 bugs
> already filed about exactly this problem?
>
> They are easy to see in the dup bug list at https://bugs.gentoo.org/53269
>
> Unfortunately mgorny closed 53269 WONTFIX because GLEP-27 is Deferred,
> causing all dup and dep bugs to be forgotten. Sad panda.
>
> --8<-- reproduce
> # export r=$(mktemp -d)
> # ROOT=$r SYSROOT=$r strace -fe execve emerge baselayout acct-group/ftp 2>&1 | grep groupadd
> [pid 13269] execve("/usr/sbin/groupadd", ["groupadd", "-r", "-g", "21", "ftp"], 0x5d7e299e2340 /* 227 vars */) = 0
> groupadd: cannot lock /etc/group; try again later.
>  *   groupadd -r ${opts} "${egroup}" || die
>  *   groupadd -r ${opts} "${egroup}" || die
> -->8--
>
> In my particular case -R $r would work just fine, but as can be seen
> in several of those 11 dup bugs it is not a general solution.
>
> Any ideas on how to solve this?

I know it's not a general fix, but my solution for building a separate
systemd root was to use sysusers.  You could try the eclass patch(es)
at https://bugs.gentoo.org/702624 if you're using at least systemd
245.  (Older versions work if you apply the upstream commit linked in
the bug.)  If you're running a systemd host with a non-systemd target,
you can still probably run "systemd-sysusers --root=$ROOT" after
emerge to generate the accounts that way.
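
To illustrate the sysusers route for a single group, here is a minimal sketch. The helper name write_sysusers_group and the choice of usr/lib/sysusers.d under the target root are assumptions made for this example; the eclass patches in the bug above may do it differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the eclass patches): write a sysusers.d
# entry for one group into a target root.
write_sysusers_group() {
    local root=$1 name=$2 gid=$3
    local dir="${root}/usr/lib/sysusers.d"
    mkdir -p "${dir}" || return 1
    # sysusers.d line format: type, name, id ("g" declares a group)
    printf 'g %s %s\n' "${name}" "${gid}" > "${dir}/${name}.conf"
}
```

After something like `write_sysusers_group "$r" ftp 21`, running `systemd-sysusers --root="$r"` on the host would be expected to create the group inside that root.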

Thanks.

David



Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Daniel Pielmeier
On May 5, 2020 7:31:34 PM UTC, "Toralf Förster" wrote:
>On 4/26/20 10:08 AM, Michał Górny wrote:
>> I don't think we really want to try to investigate
>> which files are actually used but focus on what's installed.
>Hi,
>
>I do wonder if the http://www.portagefilelist.de/site/start (package
>app-portage/pfl) would be part of that or not?
>The maintainer of the pfl stopped the import of new data last year due
>to lack of time to maintain that project and is looking for a
>successor.

Actually the maintainer decided to continue the project.
The code is now hosted at Github [1].
The site moved to a new server and the upload is working again.

[1] https://github.com/portagefilelist

-- 
Best regards
Daniel

[gentoo-dev] user.eclass ignores ROOT/SYSROOT

2020-05-05 Thread Peter Stuge
Hi,

I'm trying something out over here and I'm surprised to find that
acct-group/* do not work with ROOT+SYSROOT != "/".

Should I file yet another bug about this?

I suppose the limitation is in user.eclass, but what about the 11 bugs
already filed about exactly this problem?

They are easy to see in the dup bug list at https://bugs.gentoo.org/53269

Unfortunately mgorny closed 53269 WONTFIX because GLEP-27 is Deferred,
causing all dup and dep bugs to be forgotten. Sad panda.


--8<-- reproduce
# export r=$(mktemp -d)
# ROOT=$r SYSROOT=$r strace -fe execve emerge baselayout acct-group/ftp 2>&1 | grep groupadd
[pid 13269] execve("/usr/sbin/groupadd", ["groupadd", "-r", "-g", "21", "ftp"], 0x5d7e299e2340 /* 227 vars */) = 0
groupadd: cannot lock /etc/group; try again later.
 *   groupadd -r ${opts} "${egroup}" || die
 *   groupadd -r ${opts} "${egroup}" || die
-->8--

In my particular case -R $r would work just fine, but as can be seen
in several of those 11 dup bugs it is not a general solution.
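
For reference, the -R workaround mentioned above amounts to passing the target root to groupadd. A minimal sketch, assuming a hypothetical helper groupadd_args that only assembles the argument list; it is not what user.eclass does today and it does not run groupadd itself:

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the groupadd argument list for a given ROOT.
# It only prints the arguments; nothing is executed.
groupadd_args() {
    local root=${1:-/} gid=$2 name=$3
    local -a args=(-r)
    # Pass -R only for a non-default root; "/" needs no extra option
    if [[ ${root} != / ]]; then
        args+=(-R "${root}")
    fi
    # An empty gid means "let groupadd pick one"
    if [[ -n ${gid} ]]; then
        args+=(-g "${gid}")
    fi
    args+=("${name}")
    printf '%s\n' "${args[*]}"
}
```

With the values from the reproduce log, `groupadd_args "$r" 21 ftp` prints the arguments for the temporary root, while `groupadd_args / 21 ftp` prints the plain `-r -g 21 ftp` seen in the strace output.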


Any ideas on how to solve this?


Thanks

//Peter



[gentoo-dev] [RFC] Should NATTkA reject keywordreqs for packages with -arch (-*) keywords?

2020-05-05 Thread Michał Górny
Hi,

TL;DR: should NATTkA reject request to keyword on arch if the ebuild has
'-arch' (or '-*') in KEYWORDS already?


Background: I've recently been rekeywording two packages that gained
dependency on gevent.  When I was mass-requesting rekeywording, it
escaped my attention that gevent is explicitly marked '-ia64'.  The arch
team apparently got mad at me and added gevent to their package.mask to
make its breakage more explicit.

I think it would make sense if NATTkA detected '-ia64' there and told me
that the package is keyword-masked on ia64.

The flip side is that it would prevent people from using NATTkA to
restore keywords that were marked '-arch' before.  Of course, if this
would ever be necessary it could easily be resolved via removing '-arch' 
first or adding some extra hack.
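
As an illustration only, the check could be sketched in shell as follows. This is not NATTkA code and the function name keywordreq_blocked is made up. It also assumes that '-*' blocks only arches that are not listed explicitly in KEYWORDS, which is one possible reading of the proposal:

```bash
#!/usr/bin/env bash
# Hypothetical check, not NATTkA code: succeed (return 0) when a
# keywording request for ARCH should be refused for this KEYWORDS string.
keywordreq_blocked() {
    local arch=$1 kw
    local -a kws
    local wildcard=0 listed=0
    # read -a splits on whitespace without glob-expanding '-*'
    read -ra kws <<< "$2"
    for kw in "${kws[@]}"; do
        case ${kw} in
            "-${arch}") return 0 ;;               # explicit -arch: refuse
            "-*") wildcard=1 ;;
            "${arch}"|"~${arch}") listed=1 ;;
        esac
    done
    # Assumption: '-*' only refuses arches that are not listed explicitly
    (( wildcard == 1 && listed == 0 ))
}
```

For the gevent case, `keywordreq_blocked ia64 "amd64 -ia64 x86"` succeeds, so the request would be refused with a message instead of being filed.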

WDYT?

-- 
Best regards,
Michał Górny





Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Kent Fredric
On Tue, 5 May 2020 02:47:48 +0200
Thomas Deutschmann  wrote:

> Yes it would be a signal but a useless signal, not?

"There are no users reported using this dist, so we can nuke it" is
still far far superior to "there are no reverse dependencies, so we can
nuke it"

*Even* when the former is false information.

As presently, the "no reverse dependencies, therefore nuke" essentially
asserts there *are* no users to consider.

So the *worst* case scenario for decisions made with these statistics
is our *current* case.

Even if *nobody* uses the service and *all* results indicates "nobody
uses anything", then we'll just be reverting to what we currently do:
Remove things entirely on conjecture that they're not useful.





Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Toralf Förster
On 4/26/20 10:08 AM, Michał Górny wrote:
> I don't think we really want to try to investigate
> which files are actually used but focus on what's installed.
Hi,

I do wonder if the http://www.portagefilelist.de/site/start (package 
app-portage/pfl) would be part of that or not?
The maintainer of the pfl stopped the import of new data last year due to lack
of time to maintain that project and is looking for a successor.

-- 
Toralf
PGP 23217DA7 9B888F45





Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Jaco Kroon
Hi Michał, and the rest of the Gentoo devs,

I've been patiently sitting and watching this discussion.

I raised some ideas with another developer (Not Michał) just days before
he raised this thread to the ML.

I believe all points raised so far are valid; I'll try to summarise:

1.  This must be completely *opt in*.
2.  Anonymity was discussed by various parties (privacy).
3.  "spam" protection (ie, preventing bogus data from entering).
4.  Trustworthiness of data.
5.  Acceptance of some form of privacy policy.

In my opinion, points 2 and 3 work against each other, in that if
registration is compulsory if you would like to submit stats, then we
can control the spam more easily (not foolproof), but requiring
registration also raises the entry barrier.  I'd be completely willing
to provide at least an email address as part of a submission.

All of the replies seem to have focused purely on yes/no, do it or
don't.  Not many have addressed the benefits to end users/system
administrators.  The focus seems to be on what we as developers can get
out of this.

Regarding the above points:

1.  I fully agree.  This should not be forced on anyone.
2.  Happy to concede that some people may wish to submit anonymously. 
Let them.
3.  I'll address this below.
4.  A lot of the discussion has been around the usefulness of the data,
and I concede to Thomas that this may (or may not) generate "decision
blind spots" or "artificially increase decision certainty".  I
don't see how this is worse than what we've got now.
5.  We have the infrastructure for this already by way of licenses.  So
we ship with "GPLv2/3/whatever + GentooPrivacy", and users have to first
take explicit action to accept GentooPrivacy.

I have some other ideas around this, which will tread even further on
privacy, but again, all of this should be a kind of opt-in, and building
on the ideas by Kent where he suggested a form of submission proxy
(STATS_SERVER), we could potentially give the full benefit of the code
to such entities, but then still allow them to submit "upstream" in a
more filtered manner.

Bottom line, in my opinion:  Any data is better than no data!

Whilst we can't say "no one is using xyz", we will at least be able to
say "hey, some people are using xyz", and whilst this may generate some
blind spots it at least enables us to test known use cases during
test-builds, eg, we know for a fact a thousand users are using package X
with USE flags "-* a b c", so we should definitely run that as a compile
test.  Your build breaks frequently?  Would you mind submitting stats?
Great, thank you.  You're not willing to do that?  Then my stance becomes one
of "ok, I'll help where I can, but really, please consider letting us help
you: if you submit stats we can pre-emptively at least include build
tests for your specific USE flags." - and again, this means we can
actually have our tooling use these stats to generate build tests for
the "known popular" configs.
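
To make the "known popular" configs idea concrete, here is a rough sketch. The input format (one submission per line: package followed by its enabled USE flags, already in a normalised order) and the function name popular_use_configs are assumptions for this example, since no stats format has been agreed on:

```bash
#!/usr/bin/env bash
# Hypothetical sketch: read "category/package flag flag ..." lines on stdin
# and print the N most frequently reported configurations with their counts.
popular_use_configs() {
    local top=${1:-3}
    LC_ALL=C sort \
        | uniq -c \
        | LC_ALL=C sort -k1,1nr -k2 \
        | head -n "${top}" \
        | awk '{ $1 = $1; print }'   # squeeze the padding uniq -c adds
}
```

Each output line is a count followed by a configuration, so test-build tooling could turn the top entries into a list of USE combinations to compile.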

I point you to RHEL - why are people willing to pay for RHEL?  What
do they get for that buck?  Because I promise you, the support I get
from fellow Gentoo'ers FAR outweighs the support I have ever gotten from
(paid for) RHEL.  Most of the time.

I myself used to run 500+ Gentoo hosts more than 15 years back.  It was
fun.  I was also a student back then so had much more time on my hands
than I do now.  It was challenging, and fun to try and get things to
work exactly the way we envisioned it should.  I promise you, if what
Michał proposes was available for me back then to firstly keep track of
my own internal assets, and to submit stats upstream to help improve
Gentoo I would not have hesitated for 10 seconds.

And there I touch on a point I'm trying to make - this should be
something that not only helps devs, but brings benefit to users.  I'll
say more on this at the end of the email (possibly force users to run
some of their own infra for this at least, but these stats form the
framework for a multi-system management system too, potentially).  First
I'd like to pay more attention to the individual points raised by Michał.

On 2020/04/26 10:08, Michał Górny wrote:

> Hi,
>
> The topic of rebooting gentoostats comes here from time to time.  Unless
> I'm mistaken, all the efforts so far were superficial, lacking a clear
> plan and unwilling to research the problems.  I'd like to start
> a serious discussion focused on the issues we need to solve, and propose
> some ideas how we could solve them.
>
> I can't promise I'll find time to implement it.  However, I'd like to
> get a clear plan on how it should be done if someone actually does it.

My time is also limited, but I would love to be involved in some way or
another.

> The big questions
> =================
> The way I see it, the primary goal of the project would be to gather
> statistics on popularity of packages, in order to help us prioritize our
> attention and make decisions on what to keep and what to remove.  Unlike
> Debian's 

Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Nils Freydank
Hi all,

I find the idea of having data great, but agree that it can lead to a false
sense of having a correct data base. Therefore, two thoughts:

First, I'd like to propose that you introduce gentoostats as a
*strictly timed experiment*, evaluate whether it actually changed anything in
your decisions, and then either drop it or let it run permanently.

I have no proper solution for the parameters though, maybe something like
"I choose to keep X use flags based on g.s.", but this would ask every dev to
log plenty of decisions manually (read: I don't think this will happen).

Second, I'm a bit frightened of Whissi's thought of dropping anything
security related based on non-input via g.s. -- I'd like to ask you to use the
information based on g.s. *not* for security related decisions, more for
"harmless" ones like the one Matt mentioned: Should I really support feature X
while literally every one of 200 users uses feature Y instead and I have no real
testing ground for feature X? (Matt, yell at me if I got you wrong!)

Kind regards,
Nils (holgersson on Freenode)




Re: [gentoo-dev] acct-user/acct-group trouble

2020-05-05 Thread Alarig Le Lay
On Tue 05 May 2020 15:23:34 GMT, lego12...@yandex.ru wrote:
> Hm. And what version should I choose for such ebuilds?..

It's usually version 0.

-- 
Alarig



Re: [gentoo-dev] acct-user/acct-group trouble

2020-05-05 Thread lego12239
Hm. And what version should I choose for such ebuilds?..

-- 
Олег Неманов (Oleg Nemanov)



Re: [gentoo-dev] acct-user/acct-group trouble

2020-05-05 Thread lego12239
On Tue, May 05, 2020 at 01:34:41PM +0200, Michał Górny wrote:
> You don't understand correctly.  Read the doc for ACCT_GROUP_ID.

Found. Thanks.

-- 
Олег Неманов (Oleg Nemanov)



Re: [gentoo-dev] acct-user/acct-group trouble

2020-05-05 Thread Michał Górny
On Tue, 2020-05-05 at 14:24 +0300, Oleg wrote:
> Hi, all.
> 
> I'm creating an ebuild and need to create a user/group during its installation.
> As I'm creating the package/ebuild for our company's internal use, it is useless
> to request a uid/gid to be saved in uid-gid.txt. Also, I don't need a constant
> uid/gid; I just want the user/group created. This package will be installed on
> various machines and I don't want to track unused uid/gid values in uid-gid.txt
> and on each of these machines. Any available one is sufficient.
> 
> If I understand correctly, acct-user/acct-group doesn't allow me to do this,
> and I must choose a constant uid/gid and pray that
> it won't match an existing one on any machine, right?

You don't understand correctly.  Read the doc for ACCT_GROUP_ID.
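
For readers following along: as far as the acct-group.eclass documentation goes, overlays may set ACCT_GROUP_ID to -1 to have a GID allocated dynamically (this is not permitted in ::gentoo). A minimal sketch of setting that up in a local overlay, where the helper name make_acct_group_ebuild, the overlay path, and the choice of EAPI=7 are assumptions for this example:

```bash
#!/usr/bin/env bash
# Hypothetical helper: create acct-group/<name>/<name>-0.ebuild in a
# local overlay, requesting a dynamically allocated GID.
make_acct_group_ebuild() {
    local overlay=$1 name=$2
    local dir="${overlay}/acct-group/${name}"
    mkdir -p "${dir}" || return 1
    cat > "${dir}/${name}-0.ebuild" <<EOF
# Hypothetical overlay ebuild; -1 asks for any free GID
EAPI=7

inherit acct-group

DESCRIPTION="Group for ${name}"
ACCT_GROUP_ID=-1
EOF
}
```

The matching acct-user ebuild would set ACCT_USER_ID=-1 in the same way; check the eclass documentation for the remaining ACCT_USER_* variables before relying on this.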

-- 
Best regards,
Michał Górny





[gentoo-dev] acct-user/acct-group trouble

2020-05-05 Thread Oleg
Hi, all.

I'm creating an ebuild and need to create a user/group during its installation.
As I'm creating the package/ebuild for our company's internal use, it is useless
to request a uid/gid to be saved in uid-gid.txt. Also, I don't need a constant
uid/gid; I just want the user/group created. This package will be installed on
various machines and I don't want to track unused uid/gid values in uid-gid.txt
and on each of these machines. Any available one is sufficient.

If I understand correctly, acct-user/acct-group doesn't allow me to do this,
and I must choose a constant uid/gid and pray that
it won't match an existing one on any machine, right?

If so, can I use 'enewuser ${PN} -1 -1 "${UHOME}" ${PN} > /dev/null' as usual?

-- 
Олег Неманов (Oleg Nemanov)



Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Michał Górny
On Tue, 2020-05-05 at 02:47 +0200, Thomas Deutschmann wrote:
> Yes it would be a signal but a useless signal, not?
> 

You seem to aim for arbitrarily blocking developers from making
decisions by preventing them from having data.  This won't work.
Firstly, because *we have* to make decisions, and the worse data we
have, the more arbitrary decisions will be.  Secondly, because we will
always have some data; it will just probably be worse than what's being
proposed here.

Generally, having more data means making better informed decisions.
Of course, there's always the potential of having too much data (though
I honestly don't think we're anywhere near that).  There's also
the potential of being lazy and just taking the easiest available data. 
There's no way around that but then, you can also be lazy and make
decisions ignoring any data.


For example, one kind of data we have right now are bugs.  So a package
fails for me in an obvious way yet there's no bug open.  Does that mean
that the package has zero users?  Otherwise someone would have reported
the problem, right?  So here go last rites.

Gentoostats could tell me 'hey, this package has a bunch of users still'.
This questions my first assessment -- 'oh, they probably haven't had to
rebuild it since ...'


If we have no data, we have to rely on 'gut feelings'.  I have a gut
feeling that this package looks useless, why bother.  Is that more
worthwhile than having *some* number to look at?  Even if the data is
biased towards specific kind of users, it would probably work better
than guessing.  And if it looks unreasonable, nobody stops you from
guessing.  I guess that an informed guess is better than a random guess.

-- 
Best regards,
Michał Górny





Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation

2020-05-05 Thread Alec Warner
On Mon, May 4, 2020 at 10:14 PM Matt Turner  wrote:

> On Mon, May 4, 2020 at 5:48 PM Thomas Deutschmann wrote:
> >
> > On 2020-04-26 15:46, Kent Fredric wrote:
> > > On Sun, 26 Apr 2020 14:38:54 +0200
> > > Thomas Deutschmann  wrote:
> > >
> > >> Let's assume we will get reports that app-misc/foo is only installed 20
> > >> times. If you are going to judge based on this data, "Obviously, nobody
> > >> is using that package, it's stuck on ... safe to remove" your
> > >> view is biased:
> > >
> > > I see this as more like what bloom filters get you, but in reverse:
> > >
> > > [...]
> > >
> > > - But now, instead of having "we don't know if anybody uses this", you
> > >   *can* have a "we know for sure somebody uses this".
> >
> > But how does that information really help us to decide anything in the end?
> >
> > Case A, stats are showing 0 users:
> >
> > Like said, we can't know if this is true or if this package is only used
> > in setups where people don't report stats.
> >
> >
> > Case B, stats are showing x users:
> >
> > Now what? Package from case A could have similar users -- we just don't
> > know. Assume firefox has 1.000 users, chromium has 500 users and vivaldi
> > doesn't show up in stats. How does that help us? Would this allow us to
> > skip publishing GLSAs for vivaldi because we assume nobody in Gentoo is
> > using vivaldi? Does it allow Python project to go forward pushing a mask
> > for removal in case vivaldi would depend on Python version, Python
> > project want to get rid of? Would this allow Gentoo PR to make a public
> > statement like "Firefox is the most popular browser in Gentoo, with twice as
> > many users as chromium"?
>
> I hate the saying "the perfect is the enemy of the good" but I think
> it applies here.
>
> You're of course correct that we would not have perfect information.
> But the thing about statistics is that you can still know some things
> based on a sampling of that perfect information.
>
> I would personally like to have data on whether users of my packages
> have certain USE flags enabled. Knowing that would allow me to decide
> whether it's worth the maintenance burden of supporting features that I
> *think* are very rarely used. If instead the data showed me that 50%
> of users had IUSE=xyz enabled, I probably wouldn't consider removing
> it.
>
> I think your example of potential misuse of data is a bit overdramatic.
>

Let me present the same point another way.

Today we have no data, so we make an arbitrary decision. It might be right
or wrong; and we may not know until after we decide.
This is traditionally things like "break them and they will come" type of
process. "Mask it, if they complain, I'll unmask it."

In the future, we could have this package data. It may influence decision
making. However I'm not sure from a decision-making standpoint that it is
strictly worse than no data.
The danger (which is what I think Whissi's concern is) is that it could
artificially increase decision certainty.

For example, if I have to decide whether to keep a package, or a flag, or
whatever. I might make an arbitrary decision. I'm aware it's arbitrary, it
might be wrong, and so I'm not super attached to such a decision. I'm not
*certain* about it; but I have to decide one way or the other[0]. Then I
move to a world with package data. Now I'm no longer making an arbitrary
decision; I'm making a decision based on *data*. The *data* tells me my
decision is correct, resulting in a more *certain* decision outcome. I
think this is the fallacy we want to avoid. The data can be informative but
there are significant biases in it that should result in very *little*
certainty added to decision making.

Making decisions based on incomplete data is just life though, so I'm
fairly skeptical of a "we shouldn't collect any data" type of mindset. I'd
be curious to see if we can instill a *culture* component around the use of
data in our development workflows.

-A

[0] There are a bunch of other cultural components here, like different
decision types (1 vs 2) and the ability to make a mistake in public and not
feel bad about it; so I'm aware reality does not reflect this trivial
example. But those are hallmarks of cultural markers I'd like to aim for in
Gentoo, so I would prefer to discuss a world where they exist ;)