Re: Fixing QNetworkAccessManager use

2020-02-20 Thread Ben Cooksley
On Thu, Feb 20, 2020 at 2:09 AM Friedrich W. H. Kossebau
 wrote:
>
> On Wednesday, 19 February 2020, 08:05:01 CET, Ben Cooksley wrote:
> > On Mon, Feb 3, 2020 at 7:42 AM Volker Krause  wrote:
> > > It would also help to know where specifically we have that problem, so we
> > > can actually solve it, and so we can figure out why we failed to fix this
> > > there earlier.
> >
> > Just bringing this up again - it seems we've not had much movement on
> > this aside from the Wiki page.
>
> The wiki page currently still just recommends setting
> "networkAccessManager->setAttribute(QNetworkRequest::FollowRedirectsAttribute,
> true);"
>
> That seems simple, but is possibly not enough in all cases.
>
> So my open questions here to be able to act on code I contribute to are:
>
> a) What about the mentioned QNetworkRequest::NoLessSafeRedirectPolicy, in
> which cases should that be used and when not?

For interacting with download.kde.org / files.kde.org, I would advise
against using this policy, as they will in virtually all instances
redirect to mirrors (which don't support HTTPS and are HTTP only).
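
As a rough illustration of why that policy breaks with our mirrors: Qt
documents NoLessSafeRedirectPolicy as following only http -> http,
http -> https and https -> https redirects. The sketch below models just
that rule in plain C++ without Qt. The names schemeOf and
allowedByNoLessSafePolicy are hypothetical and made up for this example,
and it assumes absolute URLs with lowercase schemes.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the scheme of a URL ("http", "https", ...),
// or an empty string if the URL has no "://" separator.
std::string schemeOf(const std::string &url)
{
    const auto pos = url.find("://");
    return pos == std::string::npos ? std::string() : url.substr(0, pos);
}

// Models the rule documented for QNetworkRequest::NoLessSafeRedirectPolicy:
// http -> http, http -> https and https -> https are followed,
// https -> http is refused. This is a model, not Qt's implementation.
bool allowedByNoLessSafePolicy(const std::string &from, const std::string &to)
{
    const std::string a = schemeOf(from);
    const std::string b = schemeOf(to);
    assert(!a.empty() && !b.empty()); // assumption: both URLs are absolute

    const bool knownSchemes = (a == "http" || a == "https")
                           && (b == "http" || b == "https");
    return knownSchemes && !(a == "https" && b == "http");
}
```

Under this model, a request that starts at an https:// address on
download.kde.org and gets redirected to an http-only mirror is refused,
while the same redirect is followed when the request starts over plain
http.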

>
> b) What about the HSTS stuff, when is that recommended?

Yes, that should be enabled.

>
> c) What is a sane number for QNetworkRequest::maximumRedirectsAllowed?

5 to 10 redirects is a relatively sane number I would expect. At the
most I would expect our servers to issue a maximum of 3 redirects in a
given chain of URLs.
If it is longer than that then we are doing something wrong.
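
To make the limit concrete, here is a minimal sketch in plain C++ (no Qt
involved) of a client that follows a redirect chain up to a fixed number
of hops, which is the role QNetworkRequest::maximumRedirectsAllowed
plays. RedirectTable and resolve are hypothetical names for this
example, the table stands in for real servers, and the exact counting
(exactly maxRedirects hops are permitted) is an assumption of the model.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for servers: maps a URL to the URL it redirects to.
// URLs that are not in the table are treated as final resources.
using RedirectTable = std::map<std::string, std::string>;

// Follows redirects starting at `start`. On success stores the final URL in
// `finalUrl` and returns true. Returns false if more than `maxRedirects`
// hops would be needed, which also catches redirect loops.
bool resolve(const RedirectTable &table, const std::string &start,
             int maxRedirects, std::string &finalUrl)
{
    assert(maxRedirects >= 0);
    std::string url = start;
    int hops = 0;
    for (auto it = table.find(url); it != table.end(); it = table.find(url)) {
        if (hops == maxRedirects)
            return false; // limit reached, give up
        url = it->second;
        ++hops;
    }
    finalUrl = url;
    return true;
}
```

With a chain of three redirects, a limit of 5 or 10 resolves without
trouble, a limit of 2 fails, and a loop between two addresses fails for
any limit instead of running forever.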

>
> Both in general and when it comes to KDE servers.
>
> Personally I am still unsure what the actual issue is. Why are redirects
> needed at all. Why all the address changes all the time? The "U" in
> "URL"/"URI" is for "uniform", not "unstable", isn't it ;)

Please see my other email regarding this.

>
> Can you give some examples for URLs of resources our code uses on KDE servers,
> and why they needed to change?

Get Hot New Stuff functionality (Gen 1), originally using a static
file tree under http://download.kde.org/khotnewstuff/
This needed to change for two reasons:
1) Mandatory HTTPS
2) Given their extremely small size and declining client base (KDE 3
and parts of KDE 4), the benefit of having these files mirrored was
negligible, and supporting the mirroring process created more load on
our systems than the mirroring saved. We therefore transitioned to
serving these through a CDN.

Get Hot New Stuff functionality (Gen 2), originally using a dynamic
web service at http://newstuff.kde.org/ and http://data.kstuff.org/
needed to change for two reasons:
1) Mandatory HTTPS
2) The dynamic web service had not been updated in several years, and
was dependent on a very specific system setup we hadn't been able to
replicate and needed to decommission due to its age. We therefore
needed to convert it to static files, and arrange for those to be
hosted elsewhere in our systems. newstuff.kde.org now converts the
requests sent to it to redirects to specific static files to keep
applications using it working (which includes KF5 era applications
which still actively use this and in at least one case continue to be
released using this).

Get Hot New Stuff functionality (Gen 3), originally used a file at
http://download.kde.org/ocs/providers.xml (now at
https://autoconfig.kde.org/ocs/providers.xml)
This needed to change for two reasons:
1) Mandatory HTTPS
2) It was necessary for non-sysadmins (particularly those involved in
running store.kde.org) to be able to update the file directly. As the
server hosting download.kde.org is sensitive and doesn't support
deploying changes from Git when they are committed, we had to move the
file to a different subdomain which could support this.

Marble maps, originally hosted under http://download.kde.org/ and
later at http://files.kde.org/marble/maps/ and now at
https://maps.kde.org/
These needed to be moved for a couple of reasons:
1) When we transitioned download.kde.org to be a mirror redirector, it
was no longer possible for us to easily host non-mirrored resources
under the same domain (and the maps weren't mirrored), requiring they
be moved to files.kde.org (which as an added benefit also made it
possible for developers to update the maps themselves)
2) Later, it was discovered that Marble performance for loading maps
using files.kde.org after it transitioned to being a mirror redirector
as well was quite poor due to the large number of http requests
involved. We therefore shifted it to a CDN based resource which
eliminated these performance issues, known as maps.kde.org.

KStars resources, originally hosted under
http://download.kde.org/apps/kstars/ needed to be moved to
https://files.kde.org/ for the following reasons:
1) Mandatory HTTPS
2) To allow developers to freely update them as needed, something
which isn't possible on download.kde.org (which is restricted due to
it hosting the master copies of tarballs)

There have also been two instances where we have been de

Re: Fixing QNetworkAccessManager use

2020-02-19 Thread Ben Cooksley
On Thu, Feb 20, 2020 at 9:58 AM Friedrich W. H. Kossebau
 wrote:
>
> On Wednesday, 19 February 2020, 21:01:20 CET, Johan Ouwerkerk wrote:
> > On Wed, Feb 19, 2020 at 2:09 PM Friedrich W. H. Kossebau
> >
> >  wrote:
> > > Personally I am still unsure what the actual issue is. Why are redirects
> > > needed at all. Why all the address changes all the time?
> >
> > It is part of the HTTP spec for servers to be able to inform clients
> > that resource /foo/bar has moved to /bar/baz, either temporarily or
> > permanently.
>
> :) Thanks for that explanation, but that was not my question here (that part I
> am well aware of, done my share of web stuff).
>
> It was rather: why are subdomain names and/or access paths not properly
> designed once, but instead changed so often that redirection seems important
> enough to be a default feature? Just because one can?

Things don't change extremely often.
Sometimes however requirements or other factors change, which
necessitates changing where a resource is hosted.

When this happens, it is extremely useful to have the ability to
relocate it elsewhere.

To use an example, when we first set up files.kde.org it was used by a
couple of things, including Necessitas for the Qt binaries that get
downloaded on to end user (Android) devices. When this was first
established, traffic was well within the reasonable bounds we had
expected when setting this up, and everything was served directly by
our (single) server. This went quite well for a while.

Sometime a bit later, an application was released on Google Play that
used Necessitas which was *extremely* popular, to the extent it caused
around a terabyte of data to be used within 48 hours or so. Hetzner
bandwidth was at this time not only limited to 100 Mbps, but also
capped - with the limit being 5 TB per month and overage after that
resulting in a charge per terabyte.

We therefore made the decision to convert files.kde.org to a mirror
network (as was already in place for download.kde.org), with
redirection taking place using Mirrorbrain. We were able to complete
this transition quickly thanks to the generous support of some of our
mirrors who established mirrors of files.kde.org. Fortunately
Necessitas had full support for handling redirects, so this is
something we were able to accomplish without any issues.

Had redirect support not been available, we would have been left with
no way out at that time.

I also have other examples involving Marble (including where we got
bitten by QNetworkAccessManager for the very first time - back in
January 2012) and numerous other KDE Edu applications (all of which
fortunately avoided QNAM).

> When we write code, we try to keep the API as stable as possible, and only
> change it when that is really useful, meaning useful for the consumer. When
> doing references in text we try to have eternally stable pointers (thanks
> ISBN & Co.).
>
> But this request for stable URLs on the internet might be an idealistic fight
> against windmills of a web 1.0 person...
>
> Cheers
> Friedrich
>
>

Cheers,
Ben


Re: Fixing QNetworkAccessManager use

2020-02-19 Thread Johan Ouwerkerk
On Wed, Feb 19, 2020 at 2:09 PM Friedrich W. H. Kossebau
 wrote:
>
> Personally I am still unsure what the actual issue is. Why are redirects
> needed at all. Why all the address changes all the time?
>

It is part of the HTTP spec for servers to be able to inform clients
that resource /foo/bar has moved to /bar/baz, either temporarily or
permanently.
This can be used to do things like mapping /retrieve/document/by/alias
-> /documents/actual/document-id, or to redirect to different hosts
entirely, or to inform plain text HTTP clients to upgrade to using
HTTPS instead. (HSTS is a spec describing how a server can then ask
the client to subsequently enforce its policy preference for when to
connect over HTTPS.)

The main difference between temporary and permanent redirects is that
clients are allowed to "remember" when a resource moved in the case of
permanent redirects so they can optimise subsequent calls to the moved
resources (bypassing the redirect entirely). But as you can see, the
temporary redirect is something that could be used to do load
balancing: assume /resource is expensive to compute or retrieve, then
put a proxy in front which load balances to the actual pool of servers
using temporary redirects. (Of course you could argue that in such a
case maybe round-robin DNS is a better solution altogether.)
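
To illustrate the difference, here is a minimal sketch in plain C++ of a
client-side cache that remembers only permanent redirects. It assumes
the usual HTTP status codes (301 and 308 are permanent; 302, 303 and 307
are temporary). RedirectCache and its methods are hypothetical names
invented for this example, not part of Qt or any other library.

```cpp
#include <cassert>
#include <map>
#include <string>

// 301 (Moved Permanently) and 308 (Permanent Redirect) are the permanent
// redirect status codes; other 3xx redirects are treated as temporary here.
bool isPermanentRedirect(int status)
{
    return status == 301 || status == 308;
}

// Hypothetical client-side cache that remembers only permanent redirects.
class RedirectCache
{
public:
    // Records a redirect response the client just received.
    void record(int status, const std::string &from, const std::string &to)
    {
        assert(status >= 300 && status < 400); // only redirect responses
        if (isPermanentRedirect(status))
            remembered_[from] = to;
    }

    // Returns the URL to contact first when asked for `url`: the remembered
    // target of a permanent redirect, or `url` itself otherwise.
    std::string startingPoint(const std::string &url) const
    {
        const auto it = remembered_.find(url);
        return it == remembered_.end() ? url : it->second;
    }

private:
    std::map<std::string, std::string> remembered_;
};
```

After a 301 the next request for the old address goes straight to the
new one, while after a 302 the client asks the original address again,
which is what leaves the server free to answer differently each time.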

Regards,

- Johan


Re: Fixing QNetworkAccessManager use

2020-02-19 Thread Friedrich W. H. Kossebau
On Wednesday, 19 February 2020, 21:01:20 CET, Johan Ouwerkerk wrote:
> On Wed, Feb 19, 2020 at 2:09 PM Friedrich W. H. Kossebau
> 
>  wrote:
> > Personally I am still unsure what the actual issue is. Why are redirects
> > needed at all. Why all the address changes all the time?
> 
> It is part of the HTTP spec for servers to be able to inform clients
> that resource /foo/bar has moved to /bar/baz, either temporarily or
> permanently.

:) Thanks for that explanation, but that was not my question here (that part I 
am well aware of, done my share of web stuff).

It was rather: why are subdomain names and/or access paths not properly
designed once, but instead changed so often that redirection seems important
enough to be a default feature? Just because one can?
When we write code, we try to keep the API as stable as possible, and only
change it when that is really useful, meaning useful for the consumer. When
doing references in text we try to have eternally stable pointers (thanks
ISBN & Co.).

But this request for stable URLs on the internet might be an idealistic fight 
against windmills of a web 1.0 person...

Cheers
Friedrich




Re: Fixing QNetworkAccessManager use

2020-02-19 Thread Friedrich W. H. Kossebau
On Wednesday, 19 February 2020, 08:05:01 CET, Ben Cooksley wrote:
> On Mon, Feb 3, 2020 at 7:42 AM Volker Krause  wrote:
> > It would also help to know where specifically we have that problem, so we
> > can actually solve it, and so we can figure out why we failed to fix this
> > there earlier.
> 
> Just bringing this up again - it seems we've not had much movement on
> this aside from the Wiki page.

The wiki page currently still just recommends setting
"networkAccessManager->setAttribute(QNetworkRequest::FollowRedirectsAttribute,
true);"

That seems simple, but is possibly not enough in all cases.

So my open questions here to be able to act on code I contribute to are:

a) What about the mentioned QNetworkRequest::NoLessSafeRedirectPolicy, in 
which cases should that be used and when not?

b) What about the HSTS stuff, when is that recommended?

c) What is a sane number for QNetworkRequest::maximumRedirectsAllowed?

Both in general and when it comes to KDE servers.

Personally I am still unsure what the actual issue is. Why are redirects 
needed at all. Why all the address changes all the time? The "U" in 
"URL"/"URI" is for "uniform", not "unstable", isn't it ;)

Can you give some examples for URLs of resources our code uses on KDE servers, 
and why they needed to change?

And if those redirects are permanent, should the client side not also 
permanently update to the new location then, instead of continuing to poke the 
old address every time again and again, until one day it will poke into a void 
because the backward compat redirect support has been dropped?

Cheers
Friedrich




Fixing QNetworkAccessManager use for KDE services

2020-02-02 Thread Friedrich W. H. Kossebau
Hi Ben,

sorry to hear about this pain you have in all the good work you do that allows 
us to enjoy the high reliability of the KDE services. I would like to help to 
reduce that pain.

On Saturday, 1 February 2020, 23:24:14 CET, Ben Cooksley wrote:
> Hi all,
> 
> For an extremely long time now, it has been a known issue with the
> QNetworkAccessManager that by default it does not follow redirects as
> issued by the server it is accessing. This is something the Qt Project
> has refused to address using the justification of 'behavioural
> compatibility'

This justification makes sense to me. People who have manual redirect
handling in their code would not be happy if Qt suddenly started to do things
internally in potentially different ways.

Also, they announce in the API docs that they are considering changing the default:
"For backwards compatibility the default value is 
QNetworkRequest::ManualRedirectPolicy. This may change in the future and some 
type of auto-redirect policy will become the default; clients relying on 
manual redirect handling are encouraged to set this policy explicitly in their 
code."
https://doc.qt.io/qt-5/qnetworkaccessmanager.html#setRedirectPolicy

> This behaviour is contrary to that of just about every other HTTP
> stack (with the exception of libcurl from my understanding) and is
> behaviour that nobody expects.

In my case it would be: nobody thought about it.
When talking to a given hardcoded address, e.g. to query a data blob, and it 
no longer resolves I would rather expect by default that the service is no 
longer existing. Mentally driven by C++ ABI concepts that method names & 
signatures have to be stable ;)
Possibly admins/web service developers think differently, as they might
like to be flexible about the URLs under which they respond to requests, given
that redirects exist in the protocol and thus invite being used.
Might be a clash of cultures to some degree.

> As a consequence of this (broken) behaviour, Sysadmin has been
> effectively forced to implement workarounds and other compatibility
> measures in place to keep applications functional.

It is also the consequence of no developers having picked up the new built-in
redirect-following options that appeared in Qt 5.6 and, with more control,
in Qt 5.9. And they have not done so because they (at least I) were not aware
that KDE sysadmins would like to be flexible when it comes to data/web services
on KDE servers and the addresses under which they are available. See below for
a proposal on how to fix that.

[...]

> Therefore, given the Qt Project is unlikely to change their position
> on this, I would like to propose the following:
> 1) That effective immediately, QNetworkAccessManager and its child
> classes be banned and prohibited within KDE software, to be enforced
> by server side hooks.
> 2) That we fork QNetworkAccessManager and the associated classes
> within the appropriate Framework (to be determined), where the
> defective behaviour can then be corrected.

I cannot see how either helps with released code that is already out in the
wild. To prepare future releases to support the redirects you want to use
for KDE services, I would rather propose this:

A) Document the need to enable redirects in requests when using KDE server
services, both in the coding policies and in the documentation of the
respective KDE services, including code examples of how to write those.
B) Have a rally to fix all current code.
C) Have an issue on bugreports.qt.io to see that Qt 6 will have changed the 
default, to what web service developers/admins would prefer.

E.g. in KDevelop code there is a query for
https://projects.kde.org/kde_projects.xml. What kind of redirect support do
KDE server admins expect to
be supported? So what QNetworkRequest::RedirectPolicy value should be set? 
What QNetworkRequest::maximumRedirectsAllowed?
Ideally one could find answers to these questions on community.kde.org.

Cheers
Friedrich