Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-31 Thread Michał Górny
On Sat, 2023-01-28 at 17:38 +0100, Michał Górny wrote:
> Hi, everyone.
> 
> TL;DR: I'd like to propose naming dev-python/* packages following PyPI
> names whenever possible, case-preserving, with modifications only when
> necessary to match PN rules.
> 

The "relaxed" version is now official:

https://projects.gentoo.org/python/guide/package-maintenance.html#package-name-policy

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-30 Thread Michał Górny
On Mon, 2023-01-30 at 16:11 +0500, Anna (cybertailor) Vyalkova wrote:
> On 2023-01-30 12:00, Michał Górny wrote:
> > However, there's a can of worms around the corner -- should we also
> > allow normalizing "-" and "_" across different packages (see dev-
> > python/sphinx*)?
> 
> PyPI treats "-" and "_" separators as the same, so I'd not use
> underscores for in-repo consistency.

I suppose that's PEP 503.  It speaks of name normalization:

| The name should be lowercased with all runs of the characters ., -,
| or _ replaced with a single - character.  [1]

Technically, a policy that would require only "normalized" name match
would let us improve consistency when upstreams fail to do so. 
Unfortunately, while common tools search case-insensitively, they are
sensitive to these characters (and I'm not convinced of changing that).

[1] https://peps.python.org/pep-0503/#normalized-names
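
For reference, the normalization quoted above can be sketched in a few lines of Python. This is only an illustration of the PEP 503 rule; the function name `normalize` and the sample names (taken from elsewhere in this thread) are chosen here for the example and are not part of any Gentoo tooling.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name following the rule quoted from PEP 503."""
    # Collapse every run of ".", "-" or "_" into a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


for name in ("Flask-Babel", "flask_babel", "jaraco.classes", "cx_Freeze"):
    print(f"{name} -> {normalize(name)}")
```

Under this rule "Flask-Babel" and "flask_babel" compare equal, which is the kind of match a "normalized name" policy would rely on.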

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-30 Thread Arsen Arsenović

Andrew Ammerlaan  writes:

> On 28/01/2023 19:02, Ulrich Mueller wrote:
>>> On Sat, 28 Jan 2023, Michał Górny wrote:
>>>> However, it's been pointed out that this makes it hard for people to
>>>> find packages they're looking for.
>> I don't understand this argument. Why would all-lowercase make finding a
>> package harder?
>
> Here's an example, on pypi we have packages:
> - git-python
> - python-git
> - GitPython
> - git-py
>
> Each of these is a different package. The package you usually want is
> GitPython, but if we would name it gitpython or git-python, things would get
> very confusing very quickly. In fact, this package was renamed precisely to
> avoid this confusion in [1]. This is not the only case where there are very
> similarly named packages on pypi. By having a 1 to 1 mapping between names in
> pypi and names in ::gentoo we avoid this confusion.

AFAIK, but I cannot find a source confirming this, PyPI project names
are case-insensitive, so it should be okay to map to all lowercase.

> Best regards,
> Andrew
>
> [1]
> https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0dec450a90c7490f11df7e69cd9c6709c099285c


-- 
Arsen Arsenović




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-30 Thread Anna (cybertailor) Vyalkova
On 2023-01-30 12:00, Michał Górny wrote:
> However, there's a can of worms around the corner -- should we also
> allow normalizing "-" and "_" across different packages (see dev-
> python/sphinx*)?

PyPI treats "-" and "_" separators as the same, so I'd not use
underscores for in-repo consistency.



Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-30 Thread Michał Górny
On Sat, 2023-01-28 at 17:38 +0100, Michał Górny wrote:
> To improve consistency and make packages easier to find, I'd like to
> propose going forward that when packages are published on PyPI, we use
> their official PyPI names.  This also means preserving the case for
> the few packages that use CamelCase names and similar.
> 
> Some modifications will be necessary.  For example, it is legal for PyPI
> package names to include dot (".") — we normally translate that to a
> hyphen ("-").  We may also have use cases for creating multiple Gentoo
> packages from the same PyPI package (see e.g. dev-python/ensurepip-*). 
> Then, there are of course Python packages that aren't published on PyPI.
> 
> Still, I think as a general rule of thumb this would make sense.  WDYT?
> 

To add a data point, the "Flask-Babel" package has been renamed to
"flask-babel" upstream today.  Unfortunately, minor changes to names are
not that uncommon (pkgcheck regularly catches them via "mismatched"
remote-ids).  This also means that now this one package is inconsistent
with the rest of capitalized "Flask" packages.

In the end, I'm still not sure whether this policy really makes sense. 
Perhaps it should be relaxed to allow case mismatches, if only to allow
us to retain in-tree consistency when upstreams fail to be consistent.

However, there's a can of worms around the corner -- should we also
allow normalizing "-" and "_" across different packages (see dev-
python/sphinx*)?

Now you see why we didn't have a policy for this before.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-29 Thread John Helmert III
On Sun, Jan 29, 2023 at 02:15:19AM +0300, Torokhov Sergey wrote:
> The similar names in PyPi is a real problem for users when trying to
> find associated packages. It's also could be a security issue for them
> with malicious packages named like popular packages.
>
> So in ::guru I try to save package naming even if it's too CamelCase.
>
> As for replacing dot (".") with hyphen ("-") I have PyPi package
> "FoBiS.py" that is packaged in ::guru just as "FoBiS" as I wasn't sure
> is it worth to store ".py" suffix while github repo of this project is
> just "FoBiS". So there could be a problem if package named "fobis"
> will appear in PyPi.
>
> 28.01.2023, 19:38, "Michał Górny" <mgo...@gentoo.org>:
> > Hi, everyone.
> >
> > TL;DR: I'd like to propose naming dev-python/* packages following PyPI
> > names whenever possible, case-preserving, with modifications only when
> > necessary to match PN rules.
> >
> > So far the naming in dev-python/* hasn't been exactly consistent.
> > Myself I've been mostly following "whatever's the easiest" policy which
> > generally meant following GitHub project names whenever we fetched from
> > there.
> >
> > This mostly made sense so far, as I've been thinking of dev-python/
> > primarily in terms of dependencies of other packages.  However, it's
> > been pointed out that this makes it hard for people to find packages
> > they're looking for.
> >
> > The vast majority of packages in dev-python/ are also published on PyPI
> > [1].  They can afterwards be installed using tools such as pip, or
> > specified as dependencies of other projects -- using their PyPI names
> > in every case.
> >
> > On top of that, it is not unknown for multiple packages with very
> > similar names to coexist, say "foo", "pyfoo" and "python-foo".  When GH
> > project names come into the picture, this can get even more ambiguous.
> > Don't even get me started about developers pushing duplicate packages
> > because they didn't find the existing instance.
> >
> > To improve consistency and make packages easier to find, I'd like to
> > propose going forward that when packages are published on PyPI, we use
> > their official PyPI names.  This also means preserving the case for
> > the few packages that use CamelCase names and similar.
> >
> > Some modifications will be necessary.  For example, it is legal for PyPI
> > package names to include dot (".") -- we normally translate that to a
> > hyphen ("-").  We may also have use cases for creating multiple Gentoo
> > packages from the same PyPI package (see e.g. dev-python/ensurepip-*).
> > Then, there are of course Python packages that aren't published on PyPI.
> >
> > Still, I think as a general rule of thumb this would make sense.  WDYT?
> >
> > [1] https://pypi.org/
> >
> > --
> > Best regards,
> > Michał Górny

Can you send plaintext mail to gentoo-dev? HTML makes it very hard to read your 
mails in certain clients.




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-29 Thread John Helmert III
On Sat, Jan 28, 2023 at 10:23:45PM +0100, Ulrich Mueller wrote:
> > On Sat, 28 Jan 2023, Andrew Ammerlaan wrote:
> 
> > Each of these is a different package. The package you usually want is
> > GitPython, but if we would name it gitpython or git-python, things
> > would get very confusing very quickly. In fact, this package was
> > renamed precisely to avoid this confusion in [1]. This is not the only
> > case where there are very similarly named packages on pypi. By having
> > a 1 to 1 mapping between names in pypi and names in ::gentoo we avoid
> > this confusion.
> 
> Looking at mgorny's list, you cannot have an 1 to 1 mapping anyway,
> because that would result in invalid PN names.

Should imperfection get in the way of bettering the mapping?




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-29 Thread John Helmert III
On Sat, Jan 28, 2023 at 10:15:02PM +0500, Anna (cybertailor) Vyalkova wrote:
> I'd prefer if PyPI names are guidelines, not a strict policy. I don't
> like CamelCase and separators other than dash ("-") :P
> 
> Also I don't like when packages are named "dev-python/python-foo"
> instead of just "dev-python/foo".

So, two simply aesthetic opinions. I'm not sure it's appropriate to
suggest one's aesthetic preference as default when there's no further
benefit.




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Michał Górny
On Sun, 2023-01-29 at 02:15 +0300, Torokhov Sergey wrote:
> As for replacing dot  (".") with hyphen ("-") I have PyPi package
> "FoBiS.py" that is packaged in ::guru just as "FoBiS" as I wasn't sure
> is it worth to store ".py" suffix while github repo of this project is
> just "FoBiS". So there could be a problem if package named "fobis"
> will appear in PyPi.

Thanks for this example.  This is actually a perfect case that makes you
really, really think about dropping ".py" and a perfect explanation why
we should keep it, even if it makes the package name look "unnatural".

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Torokhov Sergey
The similar names in PyPi is a real problem for users when trying to
find associated packages. It's also could be a security issue for them
with malicious packages named like popular packages.

So in ::guru I try to save package naming even if it's too CamelCase.

As for replacing dot (".") with hyphen ("-") I have PyPi package
"FoBiS.py" that is packaged in ::guru just as "FoBiS" as I wasn't sure
is it worth to store ".py" suffix while github repo of this project is
just "FoBiS". So there could be a problem if package named "fobis"
will appear in PyPi.

28.01.2023, 19:38, "Michał Górny":
> Hi, everyone.
>
> TL;DR: I'd like to propose naming dev-python/* packages following PyPI
> names whenever possible, case-preserving, with modifications only when
> necessary to match PN rules.
>
> So far the naming in dev-python/* hasn't been exactly consistent.
> Myself I've been mostly following "whatever's the easiest" policy which
> generally meant following GitHub project names whenever we fetched from
> there.
>
> This mostly made sense so far, as I've been thinking of dev-python/
> primarily in terms of dependencies of other packages.  However, it's
> been pointed out that this makes it hard for people to find packages
> they're looking for.
>
> The vast majority of packages in dev-python/ are also published on PyPI
> [1].  They can afterwards be installed using tools such as pip, or
> specified as dependencies of other projects -- using their PyPI names
> in every case.
>
> On top of that, it is not unknown for multiple packages with very
> similar names to coexist, say "foo", "pyfoo" and "python-foo".  When GH
> project names come into the picture, this can get even more ambiguous.
> Don't even get me started about developers pushing duplicate packages
> because they didn't find the existing instance.
>
> To improve consistency and make packages easier to find, I'd like to
> propose going forward that when packages are published on PyPI, we use
> their official PyPI names.  This also means preserving the case for
> the few packages that use CamelCase names and similar.
>
> Some modifications will be necessary.  For example, it is legal for PyPI
> package names to include dot (".") -- we normally translate that to a
> hyphen ("-").  We may also have use cases for creating multiple Gentoo
> packages from the same PyPI package (see e.g. dev-python/ensurepip-*).
> Then, there are of course Python packages that aren't published on PyPI.
>
> Still, I think as a general rule of thumb this would make sense.  WDYT?
>
> [1] https://pypi.org/
>
> --
> Best regards,
> Michał Górny

Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Florian Schmaus

On 28/01/2023 17.38, Michał Górny wrote:

> To improve consistency and make packages easier to find, I'd like to
> propose going forward that when packages are published on PyPI, we use
> their official PyPI names.  This also means preserving the case for
> the few packages that use CamelCase names and similar.


Consistency is generally a good thing. So +1

FTR, I think this should probably be applied in general in such 
situations, and not just for the Python ecosystem.


- Flow



Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Ulrich Mueller
> On Sat, 28 Jan 2023, Andrew Ammerlaan wrote:

> Each of these is a different package. The package you usually want is
> GitPython, but if we would name it gitpython or git-python, things
> would get very confusing very quickly. In fact, this package was
> renamed precisely to avoid this confusion in [1]. This is not the only
> case where there are very similarly named packages on pypi. By having
> a 1 to 1 mapping between names in pypi and names in ::gentoo we avoid
> this confusion.

Looking at mgorny's list, you cannot have an 1 to 1 mapping anyway,
because that would result in invalid PN names.




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Michał Górny
On Sat, 2023-01-28 at 22:15 +0500, Anna (cybertailor) Vyalkova wrote:
> I'd prefer if PyPI names are guidelines, not a strict policy. I don't
> like CamelCase and separators other than dash ("-") :P
> 
> Also I don't like when packages are named "dev-python/python-foo"
> instead of just "dev-python/foo".
> 

So instead you claim "foo" and block adding actual "foo" later?

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Anna (cybertailor) Vyalkova
On 2023-01-28 19:02, Ulrich Mueller wrote:
> > On Sat, 28 Jan 2023, Michał Górny wrote:
> >> However, it's been pointed out that this makes it hard for people to
> >> find packages they're looking for.
> 
> I don't understand this argument. Why would all-lowercase make finding a
> package harder?

It doesn't.
`eix` search is case-insensitive.



Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Andrew Ammerlaan

On 28/01/2023 19:02, Ulrich Mueller wrote:

>> On Sat, 28 Jan 2023, Michał Górny wrote:
>>> However, it's been pointed out that this makes it hard for people to
>>> find packages they're looking for.
>
> I don't understand this argument. Why would all-lowercase make finding a
> package harder?


Here's an example, on pypi we have packages:
- git-python
- python-git
- GitPython
- git-py

Each of these is a different package. The package you usually want is 
GitPython, but if we would name it gitpython or git-python, things would 
get very confusing very quickly. In fact, this package was renamed 
precisely to avoid this confusion in [1]. This is not the only case 
where there are very similarly named packages on pypi. By having a 1 to 
1 mapping between names in pypi and names in ::gentoo we avoid this 
confusion.
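
As a rough illustration of the point, the sketch below groups the four names listed above by their PEP 503 normalized form. The helper names `normalize` and `group_by_normalized` are made up for this example; only the normalization rule itself comes from PEP 503.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """PEP 503 style normalization: lowercase, runs of . - _ become one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def group_by_normalized(names):
    """Map each normalized name to the original spellings that share it."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return dict(groups)


names = ["git-python", "python-git", "GitPython", "git-py"]
groups = group_by_normalized(names)
for key, members in sorted(groups.items()):
    print(key, members)
```

All four stay distinct after normalization, but lowercasing leaves "gitpython" and "git-python" differing only by a hyphen, which is where the confusion comes from.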


Best regards,
Andrew

[1] 
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0dec450a90c7490f11df7e69cd9c6709c099285c




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Ulrich Mueller
> On Sat, 28 Jan 2023, Michał Górny wrote:

> Based on existing remote-id entries, the following package names are
> mismatched (PN on left, PyPI name on right).  Note that some of the IDs
> could be wrong, particularly because PyPI "autocorrects" - vs _.

Are there any rules by which upstream use of upper vs lower case can be
predicted? On first glance they look completely random, which is exactly
the reason why we have an all-lowercase policy for PN.

>> However, it's been pointed out that this makes it hard for people to
>> find packages they're looking for.

I don't understand this argument. Why would all-lowercase make finding a
package harder?




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Ionen Wolkens
On Sat, Jan 28, 2023 at 05:38:05PM +0100, Michał Górny wrote:
> Hi, everyone.
> 
> TL;DR: I'd like to propose naming dev-python/* packages following PyPI
> names whenever possible, case-preserving, with modifications only when
> necessary to match PN rules.
> 
> 
> So far the naming in dev-python/* hasn't been exactly consistent. 
> Myself I've been mostly following "whatever's the easiest" policy which
> generally meant following GitHub project names whenever we fetched from
> there.
> 
> This mostly made sense so far, as I've been thinking of dev-python/
> primarily in terms of dependencies of other packages.  However, it's
> been pointed out that this makes it hard for people to find packages
> they're looking for.
> 
> The vast majority of packages in dev-python/ are also published on PyPI
> [1].  They can afterwards be installed using tools such as pip, or
> specified as dependencies of other projects — using their PyPI names
> in every case.
> 
> On top of that, it is not unknown for multiple packages with very
> similar names to coexist, say "foo", "pyfoo" and "python-foo".  When GH
> project names come into the picture, this can get even more ambiguous. 
> Don't even get me started about developers pushing duplicate packages
> because they didn't find the existing instance.
> 
> 
> To improve consistency and make packages easier to find, I'd like to
> propose going forward that when packages are published on PyPI, we use
> their official PyPI names.  This also means preserving the case for
> the few packages that use CamelCase names and similar.
> 
> Some modifications will be necessary.  For example, it is legal for PyPI
> package names to include dot (".") — we normally translate that to a
> hyphen ("-").  We may also have use cases for creating multiple Gentoo
> packages from the same PyPI package (see e.g. dev-python/ensurepip-*). 
> Then, there are of course Python packages that aren't published on PyPI.
> 
> Still, I think as a general rule of thumb this would make sense.  WDYT?

Just to say I'm all for it. As much as I don't like some of the
pypi^H^H^H^HPyPi^HI names and mismatches from the "typical" style
used in the tree, it's a small price to pay for consistency within
this large group of packages.

> 
> 
> [1] https://pypi.org/

-- 
ionen




Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Anna (cybertailor) Vyalkova
I'd prefer if PyPI names are guidelines, not a strict policy. I don't
like CamelCase and separators other than dash ("-") :P

Also I don't like when packages are named "dev-python/python-foo"
instead of just "dev-python/foo".



Re: [gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Michał Górny
On Sat, 2023-01-28 at 17:38 +0100, Michał Górny wrote:
> TL;DR: I'd like to propose naming dev-python/* packages following PyPI
> names whenever possible, case-preserving, with modifications only when
> necessary to match PN rules.

Based on existing remote-id entries, the following package names are
mismatched (PN on left, PyPI name on right).  Note that some of the IDs
could be wrong, particularly because PyPI "autocorrects" - vs _.



aiohttp-cors           | aiohttp_cors
anyqt                  | AnyQt
automat                | Automat
aws-xray-sdk-python    | aws-xray-sdk
blake3-py              | blake3
boolean-py             | boolean.py
bottleneck             | Bottleneck
cachecontrol           | CacheControl
cangjie                | CangJie
cerberus               | Cerberus
certifi                | certifi-system-store
chameleon              | Chameleon
charset_normalizer     | charset-normalizer
cheetah3               | Cheetah3
cherrypy               | CherryPy
cjkwrap                | CJKwrap
cli_helpers            | cli-helpers
collective-checkdocs   | collective.checkdocs
configupdater          | ConfigUpdater
cx_Freeze              | cx-Freeze
cython                 | Cython
deprecated             | Deprecated
discogs-client         | python3-discogs-client
django                 | Django
django_polymorphic     | django-polymorphic
dogpile-cache          | dogpile.cache
easyprocess            | EasyProcess
editorconfig-core-py   | EditorConfig
elasticsearch-py       | elasticsearch7
ensurepip-pip          | pip
ensurepip-setuptools   | setuptools
ensurepip-wheels       | pip
et_xmlfile             | et-xmlfile
eyeD3                  | eyed3
flask-api              | Flask-API
flask-babel            | Flask-Babel
flask-compress         | Flask-Compress
flask-cors             | Flask-Cors
flask-debug            | Flask-Debug
flask-gravatar         | Flask-Gravatar
flask-htmlmin          | Flask-HTMLmin
flask-login            | Flask-Login
flask                  | Flask
flask-migrate          | Flask-Migrate
flask-paranoid         | Flask-Paranoid
flask-script           | Flask-Script
flask-sphinx-themes    | Flask-Sphinx-Themes
flit_core              | flit-core
flit_scm               | flit-scm
flufl-lock             | flufl.lock
genshi                 | Genshi
github3                | github3.py
gmpy                   | gmpy2
google-reauth-python   | google-reauth
hcloud-python          | hcloud
imapclient             | IMAPClient
importlib_metadata     | importlib-metadata
importlib_resources    | importlib-resources
indexed_gzip           | indexed-gzip
jack-client            | JACK-Client
jaraco-classes         | jaraco.classes
jaraco-collections     | jaraco.collections
jaraco-context         | jaraco.context
jaraco-envs            | jaraco.envs
jaraco-functools       | jaraco.functools
jaraco-itertools       | jaraco.itertools
jaraco-logging         | jaraco.logging
jaraco-path            | jaraco.path
jaraco-stream          | jaraco.stream
jaraco-test            | jaraco.test
jaraco-text            | jaraco.text
jinja                  | Jinja2
js2py                  | Js2Py
jschema_to_python      | jschema-to-python
jupyter_client         | jupyter-client
jupyter_console        | jupyter-console
jupyter_core           | jupyter-core
jupyter_events         | jupyter-events
jupyter_kernel_test    | jupyter-kernel-test
jupyterlab_pygments    | jupyterlab-pygments
jupyterlab_server      | jupyterlab-server
jupyter_packaging      | jupyter-packaging
jupyter_server_mathjax | jupyter-server-mathjax
jupyter_server         | jupyter-server
keyrings-alt           | keyrings.alt
keystoneauth           | keystoneauth1
libcloud  
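
The mismatches in the list above fall into a few kinds, and a small sketch can sort them. This is not how the list was produced (that came from remote-id entries); the function `classify` and its labels are invented for illustration, and the sample pairs are rows copied from the list.

```python
import re


def normalize(name: str) -> str:
    """PEP 503 style normalization: lowercase, runs of . - _ become one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def classify(pn: str, pypi_name: str) -> str:
    """Label how a Gentoo PN differs from the PyPI name (labels are ad hoc)."""
    if pn == pypi_name:
        return "match"
    if pn.lower() == pypi_name.lower():
        return "case"          # differs only in letter case
    if normalize(pn) == normalize(pypi_name):
        return "separator"     # differs in ". - _" (and possibly case)
    return "different"         # a genuinely different name


pairs = [
    ("anyqt", "AnyQt"),
    ("aiohttp-cors", "aiohttp_cors"),
    ("boolean-py", "boolean.py"),
    ("cx_Freeze", "cx-Freeze"),
    ("jinja", "Jinja2"),
    ("hcloud-python", "hcloud"),
]
for pn, pypi_name in pairs:
    print(f"{pn:16} {pypi_name:16} {classify(pn, pypi_name)}")
```

Only the "different" rows would still be mismatches under a policy that compares normalized names.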

[gentoo-dev] dev-python/ package naming policy?

2023-01-28 Thread Michał Górny
Hi, everyone.

TL;DR: I'd like to propose naming dev-python/* packages following PyPI
names whenever possible, case-preserving, with modifications only when
necessary to match PN rules.


So far the naming in dev-python/* hasn't been exactly consistent. 
Myself I've been mostly following "whatever's the easiest" policy which
generally meant following GitHub project names whenever we fetched from
there.

This mostly made sense so far, as I've been thinking of dev-python/
primarily in terms of dependencies of other packages.  However, it's
been pointed out that this makes it hard for people to find packages
they're looking for.

The vast majority of packages in dev-python/ are also published on PyPI
[1].  They can afterwards be installed using tools such as pip, or
specified as dependencies of other projects — using their PyPI names
in every case.

On top of that, it is not unknown for multiple packages with very
similar names to coexist, say "foo", "pyfoo" and "python-foo".  When GH
project names come into the picture, this can get even more ambiguous. 
Don't even get me started about developers pushing duplicate packages
because they didn't find the existing instance.


To improve consistency and make packages easier to find, I'd like to
propose going forward that when packages are published on PyPI, we use
their official PyPI names.  This also means preserving the case for
the few packages that use CamelCase names and similar.

Some modifications will be necessary.  For example, it is legal for PyPI
package names to include dot (".") — we normally translate that to a
hyphen ("-").  We may also have use cases for creating multiple Gentoo
packages from the same PyPI package (see e.g. dev-python/ensurepip-*). 
Then, there are of course Python packages that aren't published on PyPI.
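
A minimal sketch of the dot-to-hyphen translation, assuming that is the only change needed: the helper name `pypi_to_pn` is hypothetical, and the remaining PN rules are not checked here.

```python
def pypi_to_pn(pypi_name: str) -> str:
    """Hypothetical helper: derive a candidate PN from a PyPI project name.

    Case is preserved as published on PyPI; only dots are translated,
    since "." is not allowed in PN. Other PN constraints are NOT verified.
    """
    return pypi_name.replace(".", "-")


for name in ("jaraco.classes", "dogpile.cache", "Flask-Babel", "GitPython"):
    print(f"{name} -> dev-python/{pypi_to_pn(name)}")
```

Names without dots, such as "Flask-Babel" or "GitPython", pass through unchanged.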

Still, I think as a general rule of thumb this would make sense.  WDYT?


[1] https://pypi.org/

-- 
Best regards,
Michał Górny