[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread Christopher Barker
On Sun, Jul 9, 2023 at 8:37 AM James Addison via Python-ideas <
python-ideas@python.org> wrote:

>>> ISTM the primary use cases advanced here have been for "naive" users.
>>> Likely they won't be in a position to decide whether they trust Guido
>>> van Rossum or Egg Rando more.
>>
There are 718,155 users on PyPI -- I can't imagine that trying to figure
out which of those hundreds of thousands of users you trust for
reviews would be at all helpful -- it simply doesn't scale.

I suppose if my fantasy "curated" site existed, and the curation group
were of a manageable size, then you could do that, but the point of having
a modest number of curators is that you can already trust them ;-)

>> Honestly, I'd be more likely to go with "I can assume that projects that
>> are dependencies of other projects that I already know are good quality,
>> are themselves good quality". Which excludes people from the
>> equation altogether,
>>
I think there are a number of metrics that could be used -- and "how many
projects use this project as a dependency" is a good one -- "which
projects" would be even stronger. And there are others.
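
To make the two metrics concrete, here is a minimal sketch. The project
names and the DEPENDS_ON mapping are made up for illustration; real data
would have to come from package metadata, and nothing here is an existing
PyPI feature. It counts direct dependents ("how many") and then counts
only dependents from a set you already trust ("which").

```python
from collections import defaultdict

# Hypothetical data: each project mapped to the projects it depends on.
DEPENDS_ON = {
    "webapp": ["http-lib", "templating"],
    "cli-tool": ["http-lib"],
    "data-kit": ["arrays", "http-lib"],
    "templating": [],
    "http-lib": [],
    "arrays": [],
}


def reverse_dependencies(depends_on):
    """Map each project to the set of projects that depend on it directly."""
    dependents = defaultdict(set)
    for project, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(project)
    return dependents


def dependent_counts(depends_on):
    """The "how many" metric: number of direct dependents per project."""
    dependents = reverse_dependencies(depends_on)
    return {p: len(dependents.get(p, set())) for p in depends_on}


def trusted_dependent_counts(depends_on, trusted):
    """The "which" metric: count only dependents found in a trusted set."""
    dependents = reverse_dependencies(depends_on)
    return {p: len(dependents.get(p, set()) & trusted) for p in depends_on}


if __name__ == "__main__":
    print(dependent_counts(DEPENDS_ON))
    print(trusted_dependent_counts(DEPENDS_ON, trusted={"webapp"}))
```

The second function is where the "which projects" idea gets its strength:
the count only moves when a project you already rate highly adopts the
dependency, so padding the index with throwaway dependents does nothing.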

Anything like that can be gamed, but I'm not sure that's as huge a
problem as it might be -- what is the incentive to game this system? This
is all open source, no one's making money, and frankly, having a lot of
users can be a burden as well!

Sure, many of us would really like a lot of people to use our code, but
the incentives to cheat to get more users really aren't that strong -- at
least if you can filter out the malware in some other way.

-CHB






>> but which falls apart when I'm looking for a library in a new area.
>>
>> Paul
>>
>
> Cautious +1, since PageRank did pretty well for a good stint in a somewhat
> analogous environment.
>


-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Q35LO7KS5XPZVGTYA2XEFEJVSVO27EBC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread James Addison via Python-ideas
On Sun, Jul 9, 2023, 16:25 Paul Moore  wrote:

>
>
> On Sun, 9 Jul 2023 at 15:56, Stephen J. Turnbull <
> turnbull.stephen...@u.tsukuba.ac.jp> wrote:
>
>> James Addison via Python-ideas writes:
>>
>>  > The implementation of such a system could either be centralized or
>>  > distributed; the trust signals that human users infer from it
>>  > should always be distributed.
>>
>> ISTM the primary use cases advanced here have been for "naive" users.
>> Likely they won't be in a position to decide whether they trust Guido
>> van Rossum or Egg Rando more.  So in practice they'll often want to go
>> with some kind of publicly weighted average of scores.
>>
>
> I'll also point out that I'm a long-standing Python developer, and a core
> dev, and I still *regularly* get surprised by finding out that community
> members that I know and respect are maintainers of projects that I had no
> idea they were associated with. Which suggests that I have no idea how many
> *other* people who I think of as "just another person" might be maintainers
> of key, high-profile projects. So I think that a model based round
> weighting results based on "who you trust" would have some rather
> unfortunate failure modes.
>
> Honestly, I'd be more likely to go with "I can assume that projects that
> are dependencies of other projects that I already know are good quality,
> are themselves good quality". Which excludes people from the
> equation altogether, but which falls apart when I'm looking for a library
> in a new area.
>
> Paul
>

Cautious +1, since PageRank did pretty well for a good stint in a somewhat
analogous environment.
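
For anyone who wants to see what that analogy would look like, here is a
rough sketch of PageRank-style power iteration over a dependency graph,
where "A depends on B" is treated as a link from A to B. The graph, the
damping factor of 0.85 and the iteration count are all assumptions chosen
for illustration; this is not a proposal for how PyPI should rank anything.

```python
# Hypothetical dependency graph: project -> projects it depends on.
# Every dependency is assumed to appear as a key as well.
DEPENDS_ON = {
    "webapp": ["http-lib", "templating"],
    "cli-tool": ["http-lib"],
    "data-kit": ["arrays", "http-lib"],
    "templating": [],
    "http-lib": [],
    "arrays": [],
}


def pagerank(depends_on, damping=0.85, iterations=50):
    """Rank projects by treating each dependency as an endorsement."""
    projects = list(depends_on)
    n = len(projects)
    rank = {p: 1.0 / n for p in projects}
    for _ in range(iterations):
        # Every project gets the same small baseline each round.
        new_rank = {p: (1.0 - damping) / n for p in projects}
        for project, deps in depends_on.items():
            if deps:
                # Split this project's rank evenly among its dependencies.
                share = damping * rank[project] / len(deps)
                for dep in deps:
                    new_rank[dep] += share
            else:
                # No dependencies: spread the rank over all projects so
                # the total stays at 1.
                share = damping * rank[project] / n
                for p in projects:
                    new_rank[p] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    for name, score in sorted(pagerank(DEPENDS_ON).items(),
                              key=lambda item: -item[1]):
        print(f"{name:12s} {score:.3f}")
```

Unlike a plain dependent count, a dependency of a highly ranked project
inherits more weight than a dependency of an obscure one, which is closer
to Paul's "dependencies of projects I already know are good" heuristic.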

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/J5RH7ZGWO23APG42E6ZU5QPRXMYKJ7W4/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread Paul Moore
On Sun, 9 Jul 2023 at 15:56, Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> James Addison via Python-ideas writes:
>
>  > The implementation of such a system could either be centralized or
>  > distributed; the trust signals that human users infer from it
>  > should always be distributed.
>
> ISTM the primary use cases advanced here have been for "naive" users.
> Likely they won't be in a position to decide whether they trust Guido
> van Rossum or Egg Rando more.  So in practice they'll often want to go
> with some kind of publicly weighted average of scores.
>

I'll also point out that I'm a long-standing Python developer, and a core
dev, and I still *regularly* get surprised by finding out that community
members that I know and respect are maintainers of projects that I had no
idea they were associated with. Which suggests that I have no idea how many
*other* people who I think of as "just another person" might be maintainers
of key, high-profile projects. So I think that a model based round
weighting results based on "who you trust" would have some rather
unfortunate failure modes.

Honestly, I'd be more likely to go with "I can assume that projects that
are dependencies of other projects that I already know are good quality,
are themselves good quality". Which excludes people from the
equation altogether, but which falls apart when I'm looking for a library
in a new area.

Paul
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/N6X7JFHR6U4TEE4YSZPTE2M4OPD6BMMM/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread James Addison via Python-ideas
I didn't really address your point there; indirectly mine was to reaffirm a
sense that not all participants may want to read the opinions of others
while learning technologies, and that's why I am skeptical of the
suggestions to include subjective user ratings of any kind within Python
packaging infrastructure.

On Sun, Jul 9, 2023, 16:09 James Addison  wrote:

> On Sun, Jul 9, 2023, 15:52 Stephen J. Turnbull <
> turnbull.stephen...@u.tsukuba.ac.jp> wrote:
>
>> James Addison via Python-ideas writes:
>>
>>  > The implementation of such a system could either be centralized or
>>  > distributed; the trust signals that human users infer from it
>>  > should always be distributed.
>>
>> ISTM the primary use cases advanced here have been for "naive" users.
>> Likely they won't be in a position to decide whether they trust Guido
>> van Rossum or Egg Rando more.  So in practice they'll often want to go
>> with some kind of publicly weighted average of scores.
>>
>> To avoid the problem of ballot-box stuffing, you could go the way that
>> pro sports often do for their All-Star teams: have one vote by anybody
>> who cares to register an ID, and another by verified committers,
>> including committers from "trusted" projects as well.
>>
>
> As someone who sometimes prefers to learn independently -- even if that
> takes longer and may produce unusual perspectives -- I remember learning
> web development by reading the source HTML of websites.
>
> Maybe that wouldn't be the typical way to learn programming -- but given
> the volume of successful and important software that exists in the world
> today, I think that having that code and the packages that it is composed
> of available to learn from would be highly beneficial to maintainers,
> educators and students, and other groups as well.
>
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/DLH44V4UUDUQN6NCMIXSADM6RE27RIEJ/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread James Addison via Python-ideas
On Sun, Jul 9, 2023, 15:52 Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> James Addison via Python-ideas writes:
>
>  > The implementation of such a system could either be centralized or
>  > distributed; the trust signals that human users infer from it
>  > should always be distributed.
>
> ISTM the primary use cases advanced here have been for "naive" users.
> Likely they won't be in a position to decide whether they trust Guido
> van Rossum or Egg Rando more.  So in practice they'll often want to go
> with some kind of publicly weighted average of scores.
>
> To avoid the problem of ballot-box stuffing, you could go the way that
> pro sports often do for their All-Star teams: have one vote by anybody
> who cares to register an ID, and another by verified committers,
> including committers from "trusted" projects as well.
>

As someone who sometimes prefers to learn independently -- even if that
takes longer and may produce unusual perspectives -- I remember learning
web development by reading the source HTML of websites.

Maybe that wouldn't be the typical way to learn programming -- but given
the volume of successful and important software that exists in the world
today, I think that having that code and the packages that it is composed
of available to learn from would be highly beneficial to maintainers,
educators and students, and other groups as well.

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/XRPK7LDU3JMP7NBY75SUOHUSHHW33BKA/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread Stephen J. Turnbull
James Addison via Python-ideas writes:

 > The implementation of such a system could either be centralized or
 > distributed; the trust signals that human users infer from it
 > should always be distributed.

ISTM the primary use cases advanced here have been for "naive" users.
Likely they won't be in a position to decide whether they trust Guido
van Rossum or Egg Rando more.  So in practice they'll often want to go
with some kind of publicly weighted average of scores.

To avoid the problem of ballot-box stuffing, you could go the way that
pro sports often do for their All-Star teams: have one vote by anybody
who cares to register an ID, and another by verified committers,
including committers from "trusted" projects as well.
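
A minimal sketch of that two-ballot idea, under assumptions of my own:
scores are plain numbers, each ballot is averaged separately, and the two
averages are blended with a fixed weight (50/50 here, a made-up default).
Because the public ballot is capped at its weight, stuffing it with extra
votes cannot push the result past that share.

```python
from statistics import mean


def combined_score(public_votes, committer_votes, committer_weight=0.5):
    """Blend two separately tallied ballots into one score.

    public_votes:    scores from anybody with a registered ID
    committer_votes: scores from verified committers
    Returns None when there is nothing to average.
    """
    if not 0.0 <= committer_weight <= 1.0:
        raise ValueError("committer_weight must be between 0 and 1")
    ballots = []
    if public_votes:
        ballots.append((1.0 - committer_weight, mean(public_votes)))
    if committer_votes:
        ballots.append((committer_weight, mean(committer_votes)))
    total_weight = sum(weight for weight, _ in ballots)
    if total_weight == 0:
        return None
    # Renormalise so that a missing ballot does not drag the score down.
    return sum(weight * score for weight, score in ballots) / total_weight


if __name__ == "__main__":
    # A hundred stuffed 5-star public votes against two committer votes.
    print(combined_score([5] * 100, [1, 1]))
```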

Steve
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/4XKWJDHQWBCX7HIX7UT5GJNXMFOLMDWY/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread James Addison via Python-ideas
On Sun, Jul 9, 2023, 09:13 Chris Angelico  wrote:

> On Sun, 9 Jul 2023 at 18:06, James Addison via Python-ideas
>  wrote:
> >
> > On Sun, 9 Jul 2023 at 02:11, Cameron Simpson  wrote:
> > > I have always thought that any community scoring system should allow
> > > other users to mark up/down other reviewers w.r.t the scores presented.
> > > That markup should only affect the scoring as presented to the person
> > > doing the markup, like a personal killfile. The idea is that you can
> > > have the ratings you see affected by notions that "I trust the opinions
> > > of user A" or "I find user B's opinion criteria not useful for my
> > > criteria".
> >
> > That sounds to me like the basis of a distributed trust network, and
> > could be useful.
> >
>
> Why distributed? This sounded more like a centralized system, but one
> where you can "ignore reviews from this user" for any other user.
>

The implementation of such a system could either be centralized or
distributed; the trust signals that human users infer from it should always
be distributed.  And I'd argue that it's more difficult to guarantee that
the trust presented to all participants is fair and accurate in either a
centralized or a proprietary system.

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LAVE5SWYASATB7H3D4CAKZOCZX4GT3SW/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread Chris Angelico
On Sun, 9 Jul 2023 at 18:06, James Addison via Python-ideas
 wrote:
>
> On Sun, 9 Jul 2023 at 02:11, Cameron Simpson  wrote:
> > I have always thought that any community scoring system should allow
> > other users to mark up/down other reviewers w.r.t the scores presented.
> > That markup should only affect the scoring as presented to the person
> > doing the markup, like a personal killfile. The idea is that you can
> > have the ratings you see affected by notions that "I trust the opinions
> > of user A" or "I find user B's opinion criteria not useful for my
> > criteria".
>
> That sounds to me like the basis of a distributed trust network, and
> could be useful.
>

Why distributed? This sounded more like a centralized system, but one
where you can "ignore reviews from this user" for any other user.
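
Roughly what I have in mind, as a sketch only: the class and method names
are invented for this example, and the "score" is just an average. All
reviews live in one central store; each viewer keeps a private ignore
list that is applied only when computing what that viewer sees.

```python
from statistics import mean


class ReviewStore:
    """Central review store with per-viewer ignore lists (a killfile)."""

    def __init__(self):
        self._reviews = {}   # package -> {reviewer: score}
        self._ignored = {}   # viewer -> set of reviewers they ignore

    def add_review(self, package, reviewer, score):
        self._reviews.setdefault(package, {})[reviewer] = score

    def ignore(self, viewer, reviewer):
        """Hide one reviewer's scores from one viewer only."""
        self._ignored.setdefault(viewer, set()).add(reviewer)

    def score_for(self, viewer, package):
        """Average score as seen by this viewer, or None if no reviews."""
        ignored = self._ignored.get(viewer, set())
        scores = [
            score
            for reviewer, score in self._reviews.get(package, {}).items()
            if reviewer not in ignored
        ]
        return mean(scores) if scores else None


if __name__ == "__main__":
    store = ReviewStore()
    store.add_review("pkg", "alice", 5)
    store.add_review("pkg", "bob", 1)
    store.ignore("carol", "bob")
    print(store.score_for("carol", "pkg"))  # carol no longer sees bob
    print(store.score_for("dave", "pkg"))   # dave sees everyone
```

Nothing about this needs to be distributed: the data is in one place, and
only the view of it differs per user.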

ChrisA
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/IOFZ4NR3XYQDUTD3FY2XUTRRADPMQ7AC/


[Python-ideas] Re: "Curated" package repo?

2023-07-09 Thread James Addison via Python-ideas
On Sun, 9 Jul 2023 at 02:11, Cameron Simpson  wrote:
>
> On 04Jul2023 17:21, Christopher Barker  wrote:
> >3) A rating system built into PyPI -- This could be a combination of two
> >things:
> >  A - Automated analysis -- download stats, dependency stats, release
> >frequency, etc, etc, etc.
> >  B - Community ratings -- upvotes, stars, whatever.
> >
> >If done well, that could be very useful -- search on PyPI listed by rating.
> >However -- "done well" is a huge challenge -- I don't think there's a way
> >to do the automated system right, and community scoring can be abused
> >pretty easily. But maybe folks smarter than me could make it work with one
> >or both of these approaches.
>
> I have always thought that any community scoring system should allow
> other users to mark up/down other reviewers w.r.t the scores presented.
> That markup should only affect the scoring as presented to the person
> doing the markup, like a personal killfile. The idea is that you can
> have the ratings you see affected by notions that "I trust the opinions
> of user A" or "I find user B's opinion criteria not useful for my
> criteria".
>
> Of course the "ignore user B" has some of the same downsides as trying
> to individually ignore certain spam sources: good for a single "bad" actor
> (by my personal criteria) to ignore their (apparent) gaming of the
> ratings but not good for a swarm of robots.

Hi Cameron,

That sounds to me like the basis of a distributed trust network, and
could be useful.

Some thoughts from experience working with Python (and other
ecosystem) packages: after getting to know the usernames of developers
and publishers of packages, I think that much of that trust can be
learned by individuals without the assistance of technology -- that is
to say, people begin to recognize authors that they trust, and authors
that they don't.

How to provide reassurance that each author's identity remains the
same between modifications to packages/code is a related challenge,
though.  FWIW, I don't really like many of the common multi-factor
authentication systems used today, because I don't like seeing
barriers to expression emerge, even when the intent is benevolent.
I'm not sure I yet have better alternatives to suggest, though.

Your message also helped me clarify why I don't like embedding any
review information at all within packaging ecosystems -- regardless of
whether transitive trust is additionally available in the form of
reviews.

The reason is that I would prefer to see end-to-end transparent supply
chain integrity for almost all, if not all, software products.  I'm
typing this in a Gmail web interface, but I do not believe that many
people have access to all of the source code for the version that I'm
using.  If everyone did, and if that source included strong dependency
hashes to indicate the dependencies used -- similar to the way that
pip-tools[1] can write a persistent record of a dependency set,
allowing the same dependencies to be inspected and installed by others
-- then people could begin to build their own mental models of what
packages -- and what specific versions of those packages -- are worth
trusting.

In other words: if all of the software and bill-of-materials for it
became open and published, and could be constructed reproducibly[2],
then social trust would emerge without a requirement for reviews.
That would not be mutually-exclusive with the presence of reviews --
verbal, written, or otherwise -- elsewhere.
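
As a small illustration of the dependency-hash idea (a sketch, not how
pip or pip-tools implement it internally): a pinned record lists the
digests that are acceptable for an artifact, and anyone who obtains the
artifact can recompute the digest and compare. The "sha256:<hex>" string
form and the function names below are assumptions made for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_hash(data: bytes, pinned_hashes) -> bool:
    """True if the artifact's digest is one of the recorded digests.

    pinned_hashes is a collection of "sha256:<hex digest>" strings, as
    might be copied out of a pinned requirements record.
    """
    return f"sha256:{sha256_hex(data)}" in pinned_hashes


if __name__ == "__main__":
    artifact = b"contents of a downloaded package file"
    pinned = {"sha256:" + sha256_hex(artifact)}  # recorded at pin time
    print(matches_pinned_hash(artifact, pinned))
    print(matches_pinned_hash(artifact + b" (tampered)", pinned))
```

The point is that the check needs no opinion from anyone: two people
holding the same record can confirm they are inspecting the same bytes.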

Thanks,
James

[1] - https://github.com/jazzband/pip-tools/

[2] - https://www.reproducible-builds.org/
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/OOPIHTBTJFHYVJLJVYHWAK4EPYKP6YBH/