Re: [Distutils] Handling Case/Normalization Differences
On 28 August 2014 19:58, Donald Stufft don...@stufft.io wrote: To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names for its directories and such as well. These two changes will make it so that the client side will know ahead of time exactly what form the server expects any given name to be in. This will allow a change in pip to happen which will pre-normalize all names which will make the interaction with mirrors better and will reduce the number of HTTP requests that a single ``pip install`` needs to make. Just to clarify, this means that if I want to find the simple index page for a distribution, without hitting redirects, I should first normalise the project name (so Django becomes django) and then request https://pypi.python.org/simple/normalised_name/ (with a slash on the end). Is that correct? It seems to match what I see in practice (in particular, the version without a terminating slash redirects to the version with a terminating slash). The JSON API has the opposite behaviour - https://pypi.python.org/pypi/Django/json redirects to https://pypi.python.org/pypi/django/json. Should that not be changed to match? Will it be? Paul ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
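The lookup Paul describes (normalise the name, then request the /simple/ page with a trailing slash) can be sketched roughly as follows. This is only an illustration: the function name `simple_index_url` and its parameters are made up for the example, and because the normalisation rule has not been pinned down at this point in the thread, the sketch takes it as a parameter and defaults to plain lowercasing, which is enough for the Django case.

```python
from urllib.parse import quote

# Base URL taken from the message above; everything else is illustrative.
SIMPLE_INDEX = "https://pypi.python.org/simple/"


def simple_index_url(project_name, normalize=str.lower, base=SIMPLE_INDEX):
    """Build the simple-index URL for a project, always ending in a slash.

    `normalize` is a placeholder for whatever rule the index really uses;
    lowercasing alone is an assumption made for this sketch.
    """
    normalized = normalize(project_name)
    # quote() keeps the path segment URL-safe for unusual names.
    return base.rstrip("/") + "/" + quote(normalized, safe="") + "/"


print(simple_index_url("Django"))
```

Requesting the URL in this form up front is what should avoid both the case redirect and the missing-slash redirect mentioned above.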
Re: [Distutils] Handling Case/Normalization Differences
On 30 September 2014 15:25, Paul Moore p.f.mo...@gmail.com wrote: On 28 August 2014 19:58, Donald Stufft don...@stufft.io wrote: To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names for its directories and such as well. These two changes will make it so that the client side will know ahead of time exactly what form the server expects any given name to be in. This will allow a change in pip to happen which will pre-normalize all names which will make the interaction with mirrors better and will reduce the number of HTTP requests that a single ``pip install`` needs to make. Just to clarify, this means that if I want to find the simple index page for a distribution, without hitting redirects, I should first normalise the project name (so Django becomes django) and then request https://pypi.python.org/simple/normalised_name/ (with a slash on the end). Is that correct? It seems to match what I see in practice (in particular, the version without a terminating slash redirects to the version with a terminating slash). The JSON API has the opposite behaviour - https://pypi.python.org/pypi/Django/json redirects to https://pypi.python.org/pypi/django/json. Should that not be changed to match? Will it be? One further thought. Where is the definition of how to normalise a name? I could probably dig through the pip sources and find it, but it would be nice if it were documented somewhere. From experiment, it seems like lowercase, and with hyphens rather than underscores, is the definition. Does PyPI allow names not allowed by http://legacy.python.org/dev/peps/pep-0426/#name and if it does, how are they normalised? In case it's not obvious, I'm writing a client for the PyPI API, and these questions are coming out of that process. Paul. PS The Python wiki has pages for the XMLRPC and JSON API. 
Any objections to me adding a page for the simple API? (The obvious objection being that it's documented somewhere else, and I should just put a pointer to the real documentation...) Paul
Re: [Distutils] Handling Case/Normalization Differences
On Sep 30, 2014, at 11:14 AM, Paul Moore p.f.mo...@gmail.com wrote: On 30 September 2014 15:25, Paul Moore p.f.mo...@gmail.com wrote: On 28 August 2014 19:58, Donald Stufft don...@stufft.io wrote: To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names for its directories and such as well. These two changes will make it so that the client side will know ahead of time exactly what form the server expects any given name to be in. This will allow a change in pip to happen which will pre-normalize all names which will make the interaction with mirrors better and will reduce the number of HTTP requests that a single ``pip install`` needs to make. Just to clarify, this means that if I want to find the simple index page for a distribution, without hitting redirects, I should first normalise the project name (so Django becomes django) and then request https://pypi.python.org/simple/normalised_name/ (with a slash on the end). Is that correct? It seems to match what I see in practice (in particular, the version without a terminating slash redirects to the version with a terminating slash). The JSON API has the opposite behaviour - https://pypi.python.org/pypi/Django/json redirects to https://pypi.python.org/pypi/django/json. Should that not be changed to match? Will it be? One further thought. Where is the definition of how to normalise a name? I could probably dig through the pip sources and find it, but it would be nice if it were documented somewhere. From experiment, it seems like lowercase, and with hyphens rather than underscores, is the definition. Does PyPI allow names not allowed by http://legacy.python.org/dev/peps/pep-0426/#name and if it does, how are they normalised? In case it's not obvious, I'm writing a client for the PyPI API, and these questions are coming out of that process. 
Paul. PS The Python wiki has pages for the XMLRPC and JSON API. Any objections to me adding a page for the simple API? (The obvious objection being that it's documented somewhere else, and I should just put a pointer to the real documentation...) Paul PyPI follows PEP 426, I think we even include the confusables support. Generally the normalization is done with pkg_resources.safe_name(…).lower(). I don’t think there’s any reason not to document it; setuptools has its routine documented, but that doesn’t have everything that the /simple/ API supports documented since it’s really documentation for what setuptools does. The URL redirect for the json endpoint was made to match what happens with /pypi/django/. Lately I’ve been thinking that maybe we should just use the normalized form in URLs always and use the author provided name for display purposes. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
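For readers who want to see the rule Donald names without importing setuptools, here is a minimal stand-alone sketch. It assumes that pkg_resources.safe_name replaces every run of characters other than ASCII letters, digits and "." with a single hyphen; that is an assumption about setuptools' behaviour rather than something stated in this thread, and `normalize_name` is a name invented for the example. It does not attempt the PEP 426 confusables handling mentioned above.

```python
import re


def normalize_name(name):
    """Approximate pkg_resources.safe_name(name).lower().

    Assumption: safe_name collapses each run of characters outside
    [A-Za-z0-9.] into one "-". Dots are therefore left untouched.
    """
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


for raw in ("Django", "HeLlo_World", "zope.interface"):
    print(raw, "->", normalize_name(raw))
```

Under this rule underscores become hyphens and case is folded, which matches what Paul observed by experiment.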
Re: [Distutils] Handling Case/Normalization Differences
On Mon, Sep 01, 2014 at 19:07 -0400, Donald Stufft wrote: On Sep 1, 2014, at 4:53 PM, holger krekel hol...@merlinux.eu wrote: On Thu, Aug 28, 2014 at 14:58 -0400, Donald Stufft wrote: Right now the “canonical” page for a particular project on PyPI is whatever the author happened to name their package (e.g. Django). This requires PyPI to have some smarts so that it can redirect things like /simple/django/ to /simple/Django/ otherwise someone doing ``pip install django`` would fall back to a much worse behavior. If this redirect doesn't happen, then pip will issue a request for just /simple/ and look for a link that, when both sides are normalized, compares equal to the name it's looking for. It will then follow the link, get /simple/Django/ and everything works... Except it doesn't. The problem here comes from the external link classification that we have now. Pip sees the link to /simple/Django/ as an external link (because it lacks the required rels) and the installation finally fails. The /simple/ case rarely happens when installing from PyPI itself because of the redirect, however it happens quite often when someone is attempting to install from a mirror instead. Even when everything works correctly the penalty for not knowing exactly what name to type in results in at least 1 extra http request, one of which (/simple/) requires pulling down a 2.1MB file. To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. Of course you mean redirecting everything to the normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names ... devpi-server also broke and I did a hotfix release today. Older installs will still have a problem, though (not all companies run the newest version all the time). Apart from the fact i was on vacation and on business travels, the notice for that breaking change was only one day which i think is a bit too quick. 
I'd really appreciate if you send a mail to Christian for bandersnatch and me for devpi before such changes happen and with a bit more reasonable ahead time. Besides, i think it's a good change in principle. best and thanks, holger I can only really reply to this with https://xkcd.com/1172/. This shouldn’t have been a breaking change, anyone following the HTTP spec dealt with this change just fine. As far as I can tell the only reason it broke devpi was because of an assertion in the code that was asserting against an implementation detail, an implementation detail that I changed. Right, the assertion was there to ensure pypi's realname and devpi's internal realname of a project are the same. This check is now relaxed. FWIW I'd prefer it if we just said in all pypi APIs (http and xmlrpc/json) that a project name is always kept in canonical form, i.e. you can maybe register HeLlo_World but it just means hello-world next time someone asks for it. What is the relevance of the realname anyway? Do you keep realnames in warehouse? I’m sorry it broke devpi and that it happened at a time when you were on vacation, but honestly I don’t think it’s reasonable to expect every little thing to have to be run past a list of people. Due to the undocumented nature of these tools people have put a lot of (also undocumented) assumptions into their code, many of which are simply depending on implementation details. I try to test my changes against what I can, in this case pip, setuptools, and bandersnatch, but I can’t test against everything. Thanks for all your work and eagerness to improve things. I think it's safe to assume that any change in PyPI's pip/bandersnatch/devpi facing http API has potential for disruption even if some http specification says otherwise -- at least until we have some specification of how tool/pypi interactions work. best, holger
Re: [Distutils] Handling Case/Normalization Differences
On Sep 2, 2014, at 5:36 AM, holger krekel hol...@merlinux.eu wrote: On Mon, Sep 01, 2014 at 19:07 -0400, Donald Stufft wrote: On Sep 1, 2014, at 4:53 PM, holger krekel hol...@merlinux.eu wrote: On Thu, Aug 28, 2014 at 14:58 -0400, Donald Stufft wrote: Right now the “canonical” page for a particular project on PyPI is whatever the author happened to name their package (e.g. Django). This requires PyPI to have some smarts so that it can redirect things like /simple/django/ to /simple/Django/ otherwise someone doing ``pip install django`` would fall back to a much worse behavior. If this redirect doesn't happen, then pip will issue a request for just /simple/ and look for a link that, when both sides are normalized, compares equal to the name it's looking for. It will then follow the link, get /simple/Django/ and everything works... Except it doesn't. The problem here comes from the external link classification that we have now. Pip sees the link to /simple/Django/ as an external link (because it lacks the required rels) and the installation finally fails. The /simple/ case rarely happens when installing from PyPI itself because of the redirect, however it happens quite often when someone is attempting to install from a mirror instead. Even when everything works correctly the penalty for not knowing exactly what name to type in results in at least 1 extra http request, one of which (/simple/) requires pulling down a 2.1MB file. To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. Of course you mean redirecting everything to the normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names ... devpi-server also broke and I did a hotfix release today. Older installs will still have a problem, though (not all companies run the newest version all the time). 
Apart from the fact i was on vacation and on business travels, the notice for that breaking change was only one day which i think is a bit too quick. I'd really appreciate if you send a mail to Christian for bandersnatch and me for devpi before such changes happen and with a bit more reasonable ahead time. Besides, i think it's a good change in principle. best and thanks, holger I can only really reply to this with https://xkcd.com/1172/. This shouldn’t have been a breaking change, anyone following the HTTP spec dealt with this change just fine. As far as I can tell the only reason it broke devpi was because of an assertion in the code that was asserting against an implementation detail, an implementation detail that I changed. Right, the assertion was there to ensure pypi's realname and devpi's internal realname of a project are the same. This check is now relaxed. FWIW I'd prefer it if we just said in all pypi APIs (http and xmlrpc/json) that a project name is always kept in canonical form, i.e. you can maybe register HeLlo_World but it just means hello-world next time someone asks for it. What is the relevance of the realname anyway? Do you keep realnames in warehouse? As of right now we do, although I think it’s likely that Warehouse will end up with the normalized name being used as the “identifier” for a project and the name that an author typed in being used as the “display name”. I’m sorry it broke devpi and that it happened at a time when you were on vacation, but honestly I don’t think it’s reasonable to expect every little thing to have to be run past a list of people. Due to the undocumented nature of these tools people have put a lot of (also undocumented) assumptions into their code, many of which are simply depending on implementation details. I try to test my changes against what I can, in this case pip, setuptools, and bandersnatch, but I can’t test against everything. Thanks for all your work and eagerness to improve things. 
I think it's safe to assume that any change in PyPI's pip/bandersnatch/devpi facing http API has potential for disruption even if some http specification says otherwise -- at least until we have some specification of how tool/pypi interactions work. best, holger --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
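The split Donald describes for Warehouse, a normalized identifier plus an author-supplied display name, might look something like the sketch below. It is purely illustrative and says nothing about how Warehouse is actually built: `Project`, `Registry` and `normalize_name` are hypothetical names, and the normalization rule is the assumed safe_name-style one (runs of characters outside letters, digits and "." become a single hyphen, then lowercase).

```python
import re
from dataclasses import dataclass


def normalize_name(name):
    # Assumed rule, not taken from Warehouse itself.
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


@dataclass(frozen=True)
class Project:
    display_name: str  # exactly what the author typed, e.g. "HeLlo_World"

    @property
    def identifier(self):
        # The canonical key used for URLs and lookups.
        return normalize_name(self.display_name)


class Registry:
    """Toy in-memory index keyed by the normalized identifier."""

    def __init__(self):
        self._projects = {}

    def register(self, display_name):
        project = Project(display_name)
        if project.identifier in self._projects:
            raise ValueError("name already taken: " + project.identifier)
        self._projects[project.identifier] = project
        return project

    def lookup(self, any_spelling):
        # Any spelling that normalizes to the same key finds the project.
        return self._projects.get(normalize_name(any_spelling))


registry = Registry()
registry.register("HeLlo_World")
print(registry.lookup("hello-world").display_name)
```

With this shape, holger's example works out as he suggests: HeLlo_World can be registered, but every later request resolves through hello-world.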
Re: [Distutils] Handling Case/Normalization Differences
On Thu, Aug 28, 2014 at 14:58 -0400, Donald Stufft wrote: Right now the “canonical” page for a particular project on PyPI is whatever the author happened to name their package (e.g. Django). This requires PyPI to have some smarts so that it can redirect things like /simple/django/ to /simple/Django/ otherwise someone doing ``pip install django`` would fall back to a much worse behavior. If this redirect doesn't happen, then pip will issue a request for just /simple/ and look for a link that, when both sides are normalized, compares equal to the name it's looking for. It will then follow the link, get /simple/Django/ and everything works... Except it doesn't. The problem here comes from the external link classification that we have now. Pip sees the link to /simple/Django/ as an external link (because it lacks the required rels) and the installation finally fails. The /simple/ case rarely happens when installing from PyPI itself because of the redirect, however it happens quite often when someone is attempting to install from a mirror instead. Even when everything works correctly the penalty for not knowing exactly what name to type in results in at least 1 extra http request, one of which (/simple/) requires pulling down a 2.1MB file. To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. Of course you mean redirecting everything to the normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names ... devpi-server also broke and I did a hotfix release today. Older installs will still have a problem, though (not all companies run the newest version all the time). Apart from the fact i was on vacation and on business travels, the notice for that breaking change was only one day which i think is a bit too quick. 
I'd really appreciate if you send a mail to Christian for bandersnatch and me for devpi before such changes happen and with a bit more reasonable ahead time. Besides, i think it's a good change in principle. best and thanks, holger for its directories and such as well. These two changes will make it so that the client side will know ahead of time exactly what form the server expects any given name to be in. This will allow a change in pip to happen which will pre-normalize all names which will make the interaction with mirrors better and will reduce the number of HTTP requests that a single ``pip install`` needs to make. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
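The fallback described in the quoted message (fetch /simple/, then pick the link whose name compares equal once both sides are normalized) can be illustrated with a small offline sketch. This is not pip's code: `LinkCollector`, `find_project_link` and the sample HTML are invented for the example, and the normalization rule is the assumed safe_name-style one.

```python
import re
from html.parser import HTMLParser


def normalize_name(name):
    # Assumed rule: non-alphanumeric/"." runs become "-", then lowercase.
    return re.sub(r"[^A-Za-z0-9.]+", "-", name).lower()


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def find_project_link(index_html, requested):
    """Return the first href whose last path segment matches `requested`
    after both sides are normalized, or None if nothing matches."""
    parser = LinkCollector()
    parser.feed(index_html)
    wanted = normalize_name(requested)
    for href in parser.hrefs:
        candidate = href.strip("/").split("/")[-1]
        if normalize_name(candidate) == wanted:
            return href
    return None


# Hypothetical two-entry stand-in for the real (multi-megabyte) /simple/ page.
SAMPLE = '<a href="Django/">Django</a><a href="zope.interface/">zope.interface</a>'
print(find_project_link(SAMPLE, "django"))
```

The sketch also hints at the cost mentioned above: the whole index has to be downloaded and scanned just to learn how one name is spelled, which is the extra request that pre-normalized URLs are meant to remove.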
Re: [Distutils] Handling Case/Normalization Differences
On Sep 1, 2014, at 4:53 PM, holger krekel hol...@merlinux.eu wrote: On Thu, Aug 28, 2014 at 14:58 -0400, Donald Stufft wrote: Right now the “canonical” page for a particular project on PyPI is whatever the author happened to name their package (e.g. Django). This requires PyPI to have some smarts so that it can redirect things like /simple/django/ to /simple/Django/ otherwise someone doing ``pip install django`` would fall back to a much worse behavior. If this redirect doesn't happen, then pip will issue a request for just /simple/ and look for a link that, when both sides are normalized, compares equal to the name it's looking for. It will then follow the link, get /simple/Django/ and everything works... Except it doesn't. The problem here comes from the external link classification that we have now. Pip sees the link to /simple/Django/ as an external link (because it lacks the required rels) and the installation finally fails. The /simple/ case rarely happens when installing from PyPI itself because of the redirect, however it happens quite often when someone is attempting to install from a mirror instead. Even when everything works correctly the penalty for not knowing exactly what name to type in results in at least 1 extra http request, one of which (/simple/) requires pulling down a 2.1MB file. To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. Of course you mean redirecting everything to the normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names ... devpi-server also broke and I did a hotfix release today. Older installs will still have a problem, though (not all companies run the newest version all the time). Apart from the fact i was on vacation and on business travels, the notice for that breaking change was only one day which i think is a bit too quick. 
I'd really appreciate if you send a mail to Christian for bandersnatch and me for devpi before such changes happen and with a bit more reasonable ahead time. Besides, i think it's a good change in principle. best and thanks, holger I can only really reply to this with https://xkcd.com/1172/. This shouldn’t have been a breaking change, anyone following the HTTP spec dealt with this change just fine. As far as I can tell the only reason it broke devpi was because of an assertion in the code that was asserting against an implementation detail, an implementation detail that I changed. I’m sorry it broke devpi and that it happened at a time when you were on vacation, but honestly I don’t think it’s reasonable to expect every little thing to have to be run past a list of people. Due to the undocumented nature of these tools people have put a lot of (also undocumented) assumptions into their code, many of which are simply depending on implementation details. I try to test my changes against what I can, in this case pip, setuptools, and bandersnatch, but I can’t test against everything. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] Handling Case/Normalization Differences
FWIW, as a community member it doesn't seem unreasonable to me to expect that a certain amount of advance notice be given for changes like this, *especially* given that the tools are undocumented. Also, there's a difference between notifying people and running it by people (for permission). I think Holger is just asking for enough notice, which shouldn't slow you down like getting sign-off would, say. --Chris On Mon, Sep 1, 2014 at 4:07 PM, Donald Stufft don...@stufft.io wrote: On Sep 1, 2014, at 4:53 PM, holger krekel hol...@merlinux.eu wrote: On Thu, Aug 28, 2014 at 14:58 -0400, Donald Stufft wrote: Right now the “canonical” page for a particular project on PyPI is whatever the author happened to name their package (e.g. Django). This requires PyPI to have some smarts so that it can redirect things like /simple/django/ to /simple/Django/ otherwise someone doing ``pip install django`` would fall back to a much worse behavior. If this redirect doesn't happen, then pip will issue a request for just /simple/ and look for a link that, when both sides are normalized, compares equal to the name it's looking for. It will then follow the link, get /simple/Django/ and everything works... Except it doesn't. The problem here comes from the external link classification that we have now. Pip sees the link to /simple/Django/ as an external link (because it lacks the required rels) and the installation finally fails. The /simple/ case rarely happens when installing from PyPI itself because of the redirect, however it happens quite often when someone is attempting to install from a mirror instead. Even when everything works correctly the penalty for not knowing exactly what name to type in results in at least 1 extra http request, one of which (/simple/) requires pulling down a 2.1MB file. To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. 
Of course you mean redirecting everything to the normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names ... devpi-server also broke and I did a hotfix release today. Older installs will still have a problem, though (not all companies run the newest version all the time). Apart from the fact i was on vacation and on business travels, the notice for that breaking change was only one day which i think is a bit too quick. I'd really appreciate if you send a mail to Christian for bandersnatch and me for devpi before such changes happen and with a bit more reasonable ahead time. Besides, i think it's a good change in principle. best and thanks, holger I can only really reply to this with https://xkcd.com/1172/. This shouldn’t have been a breaking change, anyone following the HTTP spec dealt with this change just fine. As far as I can tell the only reason it broke devpi was because of an assertion in the code that was asserting against an implementation detail, an implementation detail that I changed. I’m sorry it broke devpi and that it happened at a time when you were on vacation, but honestly I don’t think it’s reasonable to expect every little thing to have to be run past a list of people. Due to the undocumented nature of these tools people have put a lot of (also undocumented) assumptions into their code, many of which are simply depending on implementation details. I try to test my changes against what I can, in this case pip, setuptools, and bandersnatch, but I can’t test against everything. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] Handling Case/Normalization Differences
I don't know exactly. I'd say a change that in your judgment you think has a non-trivial chance of breaking existing tools. Holger is probably in a better position to say. I was just speaking in support of his request, which seemed reasonable to me. --Chris On Mon, Sep 1, 2014 at 5:03 PM, Donald Stufft don...@stufft.io wrote: Changes like what exactly? This was a fairly minor change which is why there wasn't more notice. On Sep 1, 2014, at 7:44 PM, Chris Jerdonek chris.jerdo...@gmail.com wrote: FWIW, as a community member it doesn't seem unreasonable to me to expect that a certain amount of advance notice be given for changes like this, *especially* given that the tools are undocumented.
Re: [Distutils] Handling Case/Normalization Differences
On Mon, Sep 1, 2014, at 08:15 PM, Chris Jerdonek wrote: I don't know exactly. I'd say a change that in your judgment you think has a non-trivial chance of breaking existing tools. Holger is probably in a better position to say. I was just speaking in support of his request, which seemed reasonable to me. --Chris Which is exactly my point. This change was minor. It didn't break anything but devpi and it wouldn't have broken devpi to my knowledge except for an assert statement that wasn't particularly needed. I already give notice (and discussion, often times even PEPs) for any change that I believe to be breaking. Wanting more is wanting notice on every single change on the off chance someone somewhere might have some dependency on any random implementation detail. On Mon, Sep 1, 2014 at 5:03 PM, Donald Stufft don...@stufft.io wrote: Changes like what exactly? This was a fairly minor change which is why there wasn't more notice. On Sep 1, 2014, at 7:44 PM, Chris Jerdonek chris.jerdo...@gmail.com wrote: FWIW, as a community member it doesn't seem unreasonable to me to expect that a certain amount of advance notice be given for changes like this, *especially* given that the tools are undocumented.
Re: [Distutils] Handling Case/Normalization Differences
On Mon, Sep 1, 2014 at 7:15 PM, Donald Stufft don...@stufft.io wrote: On Mon, Sep 1, 2014, at 08:15 PM, Chris Jerdonek wrote: I don't know exactly. I'd say a change that in your judgment you think has a non-trivial chance of breaking existing tools. Holger is probably in a better position to say. I was just speaking in support of his request, which seemed reasonable to me. --Chris Which is exactly my point. This change was minor. It didn't break anything but devpi and it wouldn't have broken devpi to my knowledge except for an assert statement that wasn't particularly needed. I already give notice (and discussion, often times even PEPs) for any change that I believe to be breaking. Wanting more is wanting notice on every single change on the off chance someone somewhere might have some dependency on any random implementation detail. If you don't have a good sense of what changes might break existing tools and don't want to notify people, one possibility is to build in a delay between committing to the repo and deploying to production. Interested folks could monitor commits to the repo -- giving them a chance to ask questions and update their tools if necessary. --Chris On Mon, Sep 1, 2014 at 5:03 PM, Donald Stufft don...@stufft.io wrote: Changes like what exactly? This was a fairly minor change which is why there wasn't more notice. On Sep 1, 2014, at 7:44 PM, Chris Jerdonek chris.jerdo...@gmail.com wrote: FWIW, as a community member it doesn't seem unreasonable to me to expect that a certain amount of advance notice be given for changes like this, *especially* given that the tools are undocumented.
Re: [Distutils] Handling Case/Normalization Differences
On 2 September 2014 12:54, Chris Jerdonek chris.jerdo...@gmail.com wrote: On Mon, Sep 1, 2014 at 7:15 PM, Donald Stufft don...@stufft.io wrote: I already give notice (and discussion, often times even PEPs) for any change that I believe to be breaking. Wanting more is wanting notice on every single change on the off chance someone somewhere might have some dependency on any random implementation detail. If you don't have a good sense of what changes might break existing tools and don't want to notify people, one possibility is to build in a delay between committing to the repo and deploying to production. Interested folks could monitor commits to the repo -- giving them a chance to ask questions and update their tools if necessary. That will pick up noise from internal or web only changes that don't affect the programmatic APIs. Ideally, we'd have an integration environment where tests for pip, bandersnatch and devpi were all automatically run against pypi commits before they went live, but that's rather a lot of work to set up. Until we have such a system, we may continue to see occasional incidents like this one. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] Handling Case/Normalization Differences
Ah, I didn't think of that- good point. +1 to your suggested approach. On Thu, Aug 28, 2014 at 3:41 PM, Donald Stufft don...@stufft.io wrote: Since pip 1.4 it does yes, however the problem here is that typically bandersnatch mirrors are simply hosted by plain static web servers and don’t require any sort of runtime logic. On Aug 28, 2014, at 6:39 PM, Joe Smith yasumo...@gmail.com wrote: Naive question- does pip send over a UserAgent (or something) that contains a version number the server can use to determine which behavior to default to? That would allow a deprecation cycle of N months or so that will let people upgrade from 1.5 to 1.6. We could then watch usage of 1.5 decrease over time until it's a non-factor. On Thu, Aug 28, 2014 at 3:26 PM, Donald Stufft don...@stufft.io wrote: On Aug 28, 2014, at 6:09 PM, Donald Stufft don...@stufft.io wrote: On Aug 28, 2014, at 2:58 PM, Donald Stufft don...@stufft.io wrote: Right now the “canonical” page for a particular project on PyPI is whatever the author happened to name their package (e.g. Django). This requires PyPI to have some smarts so that it can redirect things like /simple/django/ to /simple/Django/ otherwise someone doing ``pip install django`` would fall back to a much worse behavior. If this redirect doesn't happen, then pip will issue a request for just /simple/ and look for a link that, when both sides are normalized, compares equal to the name it's looking for. It will then follow the link, get /simple/Django/ and everything works... Except it doesn't. The problem here comes from the external link classification that we have now. Pip sees the link to /simple/Django/ as an external link (because it lacks the required rels) and the installation finally fails. The /simple/ case rarely happens when installing from PyPI itself because of the redirect, however it happens quite often when someone is attempting to install from a mirror instead. 
Even when everything works correctly the penality for not knowing exactly what name to type in results in at least 1 extra http request, one of which (/simple/) requires pulling down a 2.1MB file. To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the non-normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names for it's directories and such as well. These two changes will make it so that the client side will know ahead of time exactly what form the server expects any given name to be in. This will allow a change in pip to happen which will pre-normalize all names which will make the interaction with mirrors better and will reduce the number of HTTP requests that a single ``pip install`` needs to make. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig Hm, so here’s the problem. I have this implemented and deployed to TestPyPI, it works great! However, the next step is to make the change to bandersnatch so that it saves things using their normalized name instead of using their proper name. Doing this will trigger it so that everyone using pip 1.5 won't be able to install anything from that mirror unless it's name is specified as the normalized name (e.g. ``pip install Django`` will fail without --allow-unverified but ``pip install django`` will work). This would be fixed with pip 1.6 (since it would know to normalize the name before fetching the URL). The same thing will occur if we make the change in pip first, it would normalize names so you'd need to use --allow-unverified for everything because it would act as if you typed ``pip install django`` instead of ``pip install Django``. To my knowledge, this *only* will affect pip 1.5.x. 
So the only way forward I can see to make this change, which I think is a good change and will remove a big gotcha from using a mirror, is to coordinate a release of bandersnatch that coincides with pip 1.6, and tell people they need to upgrade in lockstep. Does anyone have any other ideas? --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig Just thought of this, if the normalized name doesn’t match the real name, then add entries for both. This will make it so that pip 1.5 continues to work and pip 1.6+. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA ___ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926
Re: [Distutils] Handling Case/Normalization Differences
On Aug 28, 2014, at 2:58 PM, Donald Stufft don...@stufft.io wrote:

Right now the “canonical” page for a particular project on PyPI is whatever the author happened to name their package (e.g. Django). This requires PyPI to have some smarts so that it can redirect things like /simple/django/ to /simple/Django/; otherwise someone doing ``pip install django`` would fall back to a much worse behavior. If this redirect doesn't happen, then pip will issue a request for just /simple/ and look for a link that, when both sides are normalized, compares equal to the name it's looking for. It will then follow the link, get /simple/Django/ and everything works... Except it doesn't. The problem here comes from the external link classification that we have now. Pip sees the link to /simple/Django/ as an external link (because it lacks the required rels) and the installation finally fails.

The /simple/ case rarely happens when installing from PyPI itself because of the redirect; however, it happens quite often when someone is attempting to install from a mirror instead. Even when everything works correctly, not knowing exactly what name to type in costs at least 1 extra HTTP request, one of which (/simple/) requires pulling down a 2.1MB file.

To fix this I'm going to modify PyPI so that it uses the normalized name in the /simple/ URL and redirects everything else to the normalized name. I'm also going to submit a PR to bandersnatch so that it will use normalized names for its directories and such as well. These two changes will make it so that the client side will know ahead of time exactly what form the server expects any given name to be in. This will allow a change in pip to happen which will pre-normalize all names, which will make the interaction with mirrors better and will reduce the number of HTTP requests that a single ``pip install`` needs to make.

--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

Hm, so here’s the problem. I have this implemented and deployed to TestPyPI, and it works great! However, the next step is to make the change to bandersnatch so that it saves things using their normalized name instead of using their proper name. Doing this means that everyone using pip 1.5 won't be able to install anything from that mirror unless its name is specified as the normalized name (e.g. ``pip install Django`` will fail without --allow-unverified but ``pip install django`` will work). This would be fixed with pip 1.6 (since it would know to normalize the name before fetching the URL). The same thing will occur if we make the change in pip first: it would normalize names, so you'd need to use --allow-unverified for everything because it would act as if you typed ``pip install django`` instead of ``pip install Django``. To my knowledge, this will *only* affect pip 1.5.x.

So the only way forward I can see to make this change, which I think is a good change and will remove a big gotcha from using a mirror, is to coordinate a release of bandersnatch that coincides with pip 1.6, and tell people they need to upgrade in lockstep. Does anyone have any other ideas?

--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
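The name handling discussed in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thread does not spell out the normalization rule, so `normalize` below assumes lowercasing plus collapsing runs of `-`, `_` and `.` into a single hyphen (the rule later standardized in PEP 503). `simple_url` and `find_project_link` are hypothetical helpers for illustration, not pip's actual code.

```python
import re


def normalize(name):
    """Assumed rule: lowercase, and runs of '-', '_' and '.' become one hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_url(index_url, name):
    """Build the project URL from the normalized name, with a trailing slash."""
    return "{}/{}/".format(index_url.rstrip("/"), normalize(name))


def find_project_link(wanted, link_names):
    """Fallback lookup over the /simple/ listing: compare both sides normalized."""
    target = normalize(wanted)
    for candidate in link_names:
        if normalize(candidate) == target:
            return candidate
    return None


print(simple_url("https://pypi.python.org/simple", "Django"))
print(find_project_link("django", ["Flask", "Django", "zope.interface"]))
```

If the client pre-normalizes as in `simple_url`, the fallback in `find_project_link` (and the large /simple/ download it implies) should not be needed once the server stores projects under normalized names.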
Re: [Distutils] Handling Case/Normalization Differences
On Aug 28, 2014, at 6:09 PM, Donald Stufft don...@stufft.io wrote:

Hm, so here’s the problem. I have this implemented and deployed to TestPyPI, and it works great! However, the next step is to make the change to bandersnatch so that it saves things using their normalized name instead of using their proper name. Doing this means that everyone using pip 1.5 won't be able to install anything from that mirror unless its name is specified as the normalized name (e.g. ``pip install Django`` will fail without --allow-unverified but ``pip install django`` will work). This would be fixed with pip 1.6 (since it would know to normalize the name before fetching the URL). The same thing will occur if we make the change in pip first: it would normalize names, so you'd need to use --allow-unverified for everything because it would act as if you typed ``pip install django`` instead of ``pip install Django``. To my knowledge, this will *only* affect pip 1.5.x.

So the only way forward I can see to make this change, which I think is a good change and will remove a big gotcha from using a mirror, is to coordinate a release of bandersnatch that coincides with pip 1.6, and tell people they need to upgrade in lockstep. Does anyone have any other ideas?

Just thought of this: if the normalized name doesn’t match the real name, then add entries for both. That way both pip 1.5 and pip 1.6+ will continue to work.

--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
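The "add entries for both" idea can be sketched as follows. This is a hypothetical illustration, not bandersnatch's real code: `mirror_entries` is an invented helper, and `normalize` assumes the lowercase-and-hyphen rule (runs of `-`, `_` and `.` collapsed into one hyphen), which the thread itself does not define.

```python
import re


def normalize(name):
    """Assumed rule: lowercase, and runs of '-', '_' and '.' become one hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()


def mirror_entries(real_name):
    """Return the directory names a mirror would publish a project under.

    The normalized name is always present; the real name is added only
    when it differs, so older clients asking for it still find a page.
    """
    normalized = normalize(real_name)
    if normalized == real_name:
        return [normalized]
    return [normalized, real_name]


for project in ["Django", "requests"]:
    print(project, "->", mirror_entries(project))
```

Under this scheme a project whose name is already normalized costs nothing extra, and only mixed-case or underscore-style names get a second entry.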
Re: [Distutils] Handling Case/Normalization Differences
Since pip 1.4 it does yes, however the problem here is that typically bandersnatch mirrors are simply hosted by plain static web servers and don’t require any sort of runtime logic.

On Aug 28, 2014, at 6:39 PM, Joe Smith yasumo...@gmail.com wrote: Naive question- does pip send over a UserAgent (or something) that contains a version number the server can use to determine which behavior to default to? That would allow a deprecation cycle of N months or so that will let people upgrade from 1.5 to 1.6. We could then watch usage of 1.5 decrease over time until it's a non-factor.
Re: [Distutils] Handling Case/Normalization Differences
Naive question- does pip send over a UserAgent (or something) that contains a version number the server can use to determine which behavior to default to? That would allow a deprecation cycle of N months or so that will let people upgrade from 1.5 to 1.6. We could then watch usage of 1.5 decrease over time until it's a non-factor.

On Thu, Aug 28, 2014 at 3:26 PM, Donald Stufft don...@stufft.io wrote: Just thought of this, if the normalized name doesn’t match the real name, then add entries for both. This will make it so that pip 1.5 continues to work and pip 1.6+.
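For servers that do run code, the User-Agent idea raised in this message might look roughly like the sketch below. It assumes the header starts with `pip/<version>`, which is an assumption about the header format rather than something stated in the thread, and both helper names are hypothetical.

```python
import re

# Assumption: a pip User-Agent begins with "pip/" followed by a dotted version.
_PIP_UA = re.compile(r"^pip/(\d+(?:\.\d+)*)")


def pip_version(user_agent):
    """Return the pip version as a tuple of ints, or None if it is not pip."""
    match = _PIP_UA.match(user_agent or "")
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def expects_normalized_names(user_agent):
    """True for clients at or above 1.6, the release the thread says would normalize."""
    version = pip_version(user_agent)
    return version is not None and version >= (1, 6)


print(expects_normalized_names("pip/1.5.6 CPython/2.7.6 Linux/3.13.0"))
print(expects_normalized_names("pip/1.6 CPython/2.7.6 Linux/3.13.0"))
```

A check like this needs request-time logic, so it would not help a mirror that is served as plain static files.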
Re: [Distutils] Handling Case/Normalization Differences
On 29 Aug 2014 08:27, Donald Stufft don...@stufft.io wrote: Just thought of this: if the normalized name doesn’t match the real name, then add entries for both. That way both pip 1.5 and pip 1.6+ will continue to work.

Having bandersnatch mirrors publish under both names sounds like a good approach. Then the pip 1.6 release notes can just be explicit that using older mirrors will need the extra option - earlier versions won't have a problem.

Cheers, Nick.