Re: Fixing QNetworkAccessManager use
On Thu, Feb 20, 2020 at 2:09 AM Friedrich W. H. Kossebau wrote:
>
> On Wednesday, 19 February 2020, 08:05:01 CET, Ben Cooksley wrote:
> > On Mon, Feb 3, 2020 at 7:42 AM Volker Krause wrote:
> > > It would also help to know where specifically we have that problem, so we can actually solve it, and so we can figure out why we failed to fix this there earlier.
> >
> > Just bringing this up again - it seems we've not had much movement on this aside from the Wiki page.
>
> The wiki page currently still just recommends to set
> "networkAccessManger->setAttribute(QNetworkRequest::FollowRedirectsAttribute, true);"
>
> Which seems simple, but is possibly not enough in all cases.
>
> So my open questions here, to be able to act on code I contribute to, are:
>
> a) What about the mentioned QNetworkRequest::NoLessSafeRedirectPolicy, in which cases should that be used and when not?

For interacting with download.kde.org / files.kde.org, I would advise against using this policy, as they will in virtually all instances redirect to mirrors (which don't support https and are http only).

> b) What about the HSTS stuff, when is that recommended?

That should be enabled, yes.

> c) What is a sane number for QNetworkRequest::maximumRedirectsAllowed?

5 to 10 redirects is a relatively sane number, I would expect. At the most I would expect our servers to issue a maximum of 3 redirects in a given chain of URLs. If it is longer than that then we are doing something wrong.

> Both in general and when it comes to KDE servers.
>
> Personally I am still unsure what the actual issue is. Why are redirects needed at all? Why all the address changes all the time? The "U" in "URL"/"URI" is for "uniform", not "unstable", isn't it ;)

Please see my other email regarding this.

> Can you give some examples for URLs of resources our code uses on KDE servers, and why they needed to change?
Get Hot New Stuff functionality (Gen 1), originally using a static file tree under http://download.kde.org/khotnewstuff/

This needed to change for two reasons:
1) Mandatory HTTPS
2) The benefit of having these files mirrored, considering their extremely small size and declining client base (KDE 3 and parts of KDE 4), was negligible, and supporting the mirroring process created more load on our systems than we gained from having them mirrored.

We therefore transitioned to serving these through a CDN.

Get Hot New Stuff functionality (Gen 2), originally using a dynamic web service at http://newstuff.kde.org/ and http://data.kstuff.org/, needed to change for two reasons:
1) Mandatory HTTPS
2) The dynamic web service had not been updated in several years, and was dependent on a very specific system setup we hadn't been able to replicate and needed to decommission due to its age.

We therefore needed to convert it to static files, and arrange for those to be hosted elsewhere in our systems. newstuff.kde.org now converts the requests sent to it into redirects to specific static files to keep applications using it working (which includes KF5 era applications which still actively use this and in at least one case continue to be released using this).

Get Hot New Stuff functionality (Gen 3), originally used a file at http://download.kde.org/ocs/providers.xml (now at https://autoconfig.kde.org/ocs/providers.xml)

This needed to change for two reasons:
1) Mandatory HTTPS
2) It was necessary for non-sysadmins (particularly those involved in running store.kde.org) to be able to update the file directly. As the server hosting download.kde.org is sensitive and doesn't support deploying changes from Git when they are committed, we had to move the file to a different subdomain which could support this.
Marble maps, originally hosted under http://download.kde.org/ and later at http://files.kde.org/marble/maps/ and now at https://maps.kde.org/

These needed to be moved for a couple of reasons:
1) When we transitioned download.kde.org to be a mirror redirector, it was no longer possible for us to easily host non-mirrored resources under the same domain (and the maps weren't mirrored), requiring that they be moved to files.kde.org (which as an added benefit also made it possible for developers to update the maps themselves).
2) Later, it was discovered that Marble performance for loading maps using files.kde.org, after it transitioned to being a mirror redirector as well, was quite poor due to the large number of http requests involved. We therefore shifted it to a CDN based resource which eliminated these performance issues, known as maps.kde.org.

KStars resources, originally hosted under http://download.kde.org/apps/kstars/, needed to be moved to https://files.kde.org/ for the following reasons:
1) Mandatory HTTPS
2) To allow developers to freely update them as needed, something which isn't possible on download.kde.org (which is restricted due to it hosting the master copies of tarballs)

There have also been two instances where we have been de
Re: Fixing QNetworkAccessManager use
On Thu, Feb 20, 2020 at 9:58 AM Friedrich W. H. Kossebau wrote:
>
> On Wednesday, 19 February 2020, 21:01:20 CET, Johan Ouwerkerk wrote:
> > On Wed, Feb 19, 2020 at 2:09 PM Friedrich W. H. Kossebau wrote:
> > > Personally I am still unsure what the actual issue is. Why are redirects needed at all? Why all the address changes all the time?
> >
> > It is part of the HTTP spec for servers to be able to inform clients that resource /foo/bar has moved to /bar/baz, either temporarily or permanently.
>
> :) Thanks for that explanation, but that was not my question here (that part I am well aware of, done my share of web stuff).
>
> It was rather: why are subdomain names and/or access paths not once properly designed, but instead changed so often that redirection seems so important to be a default feature? Just because one can?

Things don't change extremely often. Sometimes, however, requirements or other factors change, which necessitates changing where a resource is hosted. When this happens, it is extremely useful to have the ability to relocate it elsewhere.

To use an example, when we first set up files.kde.org it was used by a couple of things, including Necessitas for the Qt binaries that get downloaded on to end user (Android) devices. When this was first established, traffic was well within the reasonable bounds we had expected when setting this up, and everything was served directly by our (single) server. This went quite well for a while.

Sometime a bit later, an application that used Necessitas was released on Google Play and was *extremely* popular, to the extent that it caused around a terabyte of data to be used within 48 hours or so. Hetzner bandwidth was at this time not only limited to 100 Mbps, but also capped - with the limit being 5 TB per month and overage after that resulting in a charge per terabyte.
We therefore made the decision to convert files.kde.org to a mirror network (like the one already in place for download.kde.org), with redirection taking place using Mirrorbrain. We were able to complete this transition quickly thanks to the generous support of some of our mirrors who established mirrors of files.kde.org.

Fortunately Necessitas had full support for handling redirects, so this is something we were able to accomplish without any issues. Had redirect support not been available, we would have been left with no way out at that time.

I also have other examples involving Marble (including where we got bitten by QNetworkAccessManager for the very first time - back in January 2012) and numerous other KDE Edu applications (all of which fortunately avoided QNAM).

> When we write code, we try to keep API stable as much as possible, and only change API when really useful, and that means for the consumer. When doing references in text we try to have eternally stable pointers (thanks ISBN & Co.),
>
> But this request for stable URLs on the internet might be an idealistic fight against windmills of a web 1.0 person...
>
> Cheers
> Friedrich

Cheers,
Ben
Re: Fixing QNetworkAccessManager use
On Wed, Feb 19, 2020 at 2:09 PM Friedrich W. H. Kossebau wrote:
>
> Personally I am still unsure what the actual issue is. Why are redirects needed at all. Why all the address changes all the time?
>

It is part of the HTTP spec for servers to be able to inform clients that resource /foo/bar has moved to /bar/baz, either temporarily or permanently. This can be used to do things like mapping /retrieve/document/by/alias -> /documents/actual/document-id, or to redirect to different hosts entirely, or to inform plain text HTTP clients to upgrade to using HTTPS instead. (HSTS is a spec describing how a server can then ask the client to subsequently enforce its policy preference for when to connect over HTTPS.)

The main difference between temporary and permanent redirects is that clients are allowed to "remember" when a resource moved in the case of permanent redirects so they can optimise subsequent calls to the moved resources (bypassing the redirect entirely).

But as you can see, the temporary redirect is something that could be used to do load balancing: assume /resource is expensive to compute or retrieve, then put a proxy in front which load balances to the actual pool of servers using temporary redirects. (Of course you could argue that in such a case maybe round-robin DNS is a better solution altogether.)

Regards,

- Johan
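The difference Johan describes between temporary and permanent redirects can be illustrated with a small client-side cache. This is only a sketch in plain C++ and not how Qt or any particular HTTP stack implements it: the RedirectCache class and its method names are made up for illustration, and the example.org URLs in the usage note are placeholders. It relies on the standard HTTP status codes, where 301 and 308 mark a permanent move and 302, 303 and 307 a temporary one.

```cpp
#include <cassert>
#include <map>
#include <string>

// 301 (Moved Permanently) and 308 (Permanent Redirect) are the permanent
// redirect codes; 302, 303 and 307 are temporary.
bool isPermanentRedirect(int status)
{
    return status == 301 || status == 308;
}

// Hypothetical client-side cache: it remembers permanent redirects only,
// so temporary ones are asked for again on every request.
class RedirectCache
{
public:
    // Record a redirect answer from the server; temporary ones are ignored.
    void record(const std::string &from, const std::string &to, int status)
    {
        if (isPermanentRedirect(status)) {
            moved_[from] = to;
        }
    }

    // Return the URL to request: the remembered target if there is one,
    // otherwise the URL itself. Only a single hop is resolved here.
    std::string resolve(const std::string &url) const
    {
        const auto it = moved_.find(url);
        return it == moved_.end() ? url : it->second;
    }

private:
    std::map<std::string, std::string> moved_;
};
```

With this model, a client that saw a 301 from /foo/bar to /bar/baz would go straight to /bar/baz next time, while a 302 used for load balancing would leave the original URL in place so the server can pick a different target on each request.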
Re: Fixing QNetworkAccessManager use
On Wednesday, 19 February 2020, 21:01:20 CET, Johan Ouwerkerk wrote:
> On Wed, Feb 19, 2020 at 2:09 PM Friedrich W. H. Kossebau wrote:
> > Personally I am still unsure what the actual issue is. Why are redirects needed at all? Why all the address changes all the time?
>
> It is part of the HTTP spec for servers to be able to inform clients that resource /foo/bar has moved to /bar/baz, either temporarily or permanently.

:) Thanks for that explanation, but that was not my question here (that part I am well aware of, done my share of web stuff).

It was rather: why are subdomain names and/or access paths not once properly designed, but instead changed so often that redirection seems so important to be a default feature? Just because one can?

When we write code, we try to keep API stable as much as possible, and only change API when really useful, and that means for the consumer. When doing references in text we try to have eternally stable pointers (thanks ISBN & Co.),

But this request for stable URLs on the internet might be an idealistic fight against windmills of a web 1.0 person...

Cheers
Friedrich
Re: Fixing QNetworkAccessManager use
On Wednesday, 19 February 2020, 08:05:01 CET, Ben Cooksley wrote:
> On Mon, Feb 3, 2020 at 7:42 AM Volker Krause wrote:
> > It would also help to know where specifically we have that problem, so we can actually solve it, and so we can figure out why we failed to fix this there earlier.
>
> Just bringing this up again - it seems we've not had much movement on this aside from the Wiki page.

The wiki page currently still just recommends to set
"networkAccessManger->setAttribute(QNetworkRequest::FollowRedirectsAttribute, true);"

Which seems simple, but is possibly not enough in all cases.

So my open questions here, to be able to act on code I contribute to, are:

a) What about the mentioned QNetworkRequest::NoLessSafeRedirectPolicy, in which cases should that be used and when not?
b) What about the HSTS stuff, when is that recommended?
c) What is a sane number for QNetworkRequest::maximumRedirectsAllowed? Both in general and when it comes to KDE servers.

Personally I am still unsure what the actual issue is. Why are redirects needed at all? Why all the address changes all the time? The "U" in "URL"/"URI" is for "uniform", not "unstable", isn't it ;)

Can you give some examples for URLs of resources our code uses on KDE servers, and why they needed to change? And if those redirects are permanent, should the client side not also permanently update to the new location then, instead of continuing to poke the old address every time again and again, until one day it will poke into a void because the backward compat redirect support has been dropped?

Cheers
Friedrich
Fixing QNetworkAccessManager use for KDE services
Hi Ben,

sorry to hear about this pain you have in all the good work you do that allows us to enjoy the high reliability of the KDE services. I would like to help to reduce that pain.

On Saturday, 1 February 2020, 23:24:14 CET, Ben Cooksley wrote:
> Hi all,
>
> For an extremely long time now, it has been a known issue with the QNetworkAccessManager that by default it does not follow redirects as issued by the server it is accessing. This is something the Qt Project has refused to address using the justification of 'behavioural compatibility'

This justification makes sense to me. People who have manual redirect handling in their code would not be happy if Qt suddenly started to do things internally in potentially different ways. They also announce in the API docs that they are considering changing the default:
"For backwards compatibility the default value is QNetworkRequest::ManualRedirectPolicy. This may change in the future and some type of auto-redirect policy will become the default; clients relying on manual redirect handling are encouraged to set this policy explicitly in their code."
https://doc.qt.io/qt-5/qnetworkaccessmanager.html#setRedirectPolicy

> This behaviour is contrary to that of just about every other HTTP stack (with the exception of libcurl from my understanding) and is behaviour that nobody expects.

In my case it would be: behaviour that nobody thought about. When talking to a given hardcoded address, e.g. to query a data blob, and it no longer resolves, I would rather expect by default that the service no longer exists. Mentally driven by C++ ABI concepts that method names & signatures have to be stable ;)

Possibly admins/web service developers might think differently, as they might like to be flexible about the URLs under which they respond to requests, given that redirects exist in the protocol and thus invite use. Might be a clash of cultures to some degree.
> As a consequence of this (broken) behaviour, Sysadmin has been effectively forced to put workarounds and other compatibility measures in place to keep applications functional.

It is also a consequence of developers not having picked up the built-in redirect options that appeared in Qt 5.6 and, with more control, in Qt 5.9. And they have not done so because they (at least I) have not been aware that KDE sysadmins would like to be flexible when it comes to data/web services on KDE servers and the addresses under which they are available. See below for a proposal on how to fix that.

[...]

> Therefore, given the Qt Project is unlikely to change their position on this, I would like to propose the following:
> 1) That effective immediately, QNetworkAccessManager and its child classes be banned and prohibited within KDE software, to be enforced by server side hooks.
> 2) That we fork QNetworkAccessManager and the associated classes within the appropriate Framework (to be determined), where the defective behaviour can then be corrected.

I cannot see how either helps with released code that is already out there in the wild.

To prepare future released code to support your redirecting desires when it comes to KDE services, I would rather propose this:

A) Document the need to enable redirects in requests when using KDE server services in the coding policies as well as in the documentation of the respective KDE services, including code examples of how to write those.
B) Have a rally to fix all current code.
C) Have an issue on bugreports.qt.io to see that Qt 6 will have changed the default to what web service developers/admins would prefer.

E.g. in KDevelop code there is a query for https://projects.kde.org/kde_projects.xml. What kind of redirect support do KDE server admins expect to be supported? So what QNetworkRequest::RedirectPolicy value should be set? What QNetworkRequest::maximumRedirectsAllowed?
Ideally one could find answers to these questions on community.kde.org.

Cheers
Friedrich
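To make the redirect questions raised in this thread more concrete, here is a rough model in plain C++ of how a client might decide whether to follow a redirect chain. It is a sketch under assumptions and not Qt's implementation: followChain, Outcome and the forbidHttpsToHttp flag are made-up names. The flag stands in for the behaviour of a policy like NoLessSafeRedirectPolicy (refusing a redirect from https to http), and maxRedirects plays the role of the maximum number of redirects allowed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Outcome { Delivered, TooManyRedirects, InsecureRedirect };

// True if the URL uses the https scheme.
bool isHttps(const std::string &url)
{
    return url.rfind("https://", 0) == 0;
}

// Walk a redirect chain. chain[0] is the requested URL, every later entry is
// the location the previous entry redirected to, and the last entry serves
// the content. Hypothetical helper for illustration only.
Outcome followChain(const std::vector<std::string> &chain,
                    int maxRedirects,
                    bool forbidHttpsToHttp)
{
    int followed = 0;
    for (std::size_t i = 1; i < chain.size(); ++i) {
        // Refuse to step down from https to plain http if the policy says so.
        if (forbidHttpsToHttp && isHttps(chain[i - 1]) && !isHttps(chain[i])) {
            return Outcome::InsecureRedirect;
        }
        // Give up once the chain is longer than the configured limit.
        if (++followed > maxRedirects) {
            return Outcome::TooManyRedirects;
        }
    }
    return Outcome::Delivered;
}
```

In this model, a request to an https URL that is redirected to an http-only mirror ends in InsecureRedirect when the flag is set, which matches the reason given in the thread for not using that policy with download.kde.org and files.kde.org. A limit of 5 to 10 leaves headroom over the chain of at most 3 redirects the KDE servers are expected to issue.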