Here is a forward from xmlschema-...@w3.org; I think Xerces is affected by
this. There is an active thread on that mailing list, with archives
available at https://lists.w3.org/Archives/Public/xmlschema-dev/2022Aug/

Best regards,
Christophe

W3C's main web site https://www.w3.org/ will soon start to redirect
all http requests to https. Will this cause issues for XML
Schema-related resources hosted on www.w3.org?

We announced this intended change a few weeks ago:
[[
W3C’s main web site www.w3.org has been available via https for over
a decade, but until now we have not been redirecting all requests to
https as is commonly done on most other sites.
The primary reason for this is that we wanted to avoid causing
issues for software requesting machine-readable resources from
www.w3.org such as HTML DTDs, XML Schemas, and namespace documents.
We believe enough time has passed for most such software to have
been updated to handle redirects and https, so we are planning to
start redirecting all requests received over http to https within a
month or two.
]]
-- https://www.w3.org/blog/2022/07/redirecting-to-https-on-www-w3-org/

Following an initial test of this change on August 1, we received
feedback that it caused issues with XML Schema validation. We
are planning a follow-up test for 3 days starting at 14:00 UTC
tomorrow, August 18.

Some questions I have:

Is it intended that www.w3.org is in the critical path when
performing XML Schema validation? Are .xsd files and/or namespace
documents retrieved each time a validation is done? Are there other
use cases besides validation that might cause automated requests to
www.w3.org?

What are the most popular software packages that might be making
these requests to www.w3.org? In what contexts do they make these
requests? Do the latest versions typically have the ability to
follow http to https redirects? Would XML catalogs help?

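
On the catalog question, here is a minimal Python sketch of the idea behind a catalog: look up a requested schema URL in a local table before going to the network. It does not use the OASIS XML Catalogs file format. The names LOCAL_COPIES and resolve_locally and the local paths under schemas/ are hypothetical, chosen only for illustration. The sketch treats http and https spellings of the same URL as equivalent, so a lookup would not be affected by the redirect.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table mapping well-known schema URLs to local copies.
# The local paths are assumptions; adjust them to wherever copies are kept.
LOCAL_COPIES = {
    "http://www.w3.org/2001/XMLSchema.xsd": "schemas/XMLSchema.xsd",
    "http://www.w3.org/2001/xml.xsd": "schemas/xml.xsd",
}


def _scheme_neutral(url):
    """Return the URL with https folded to http and the host lowercased."""
    parts = urlsplit(url)
    scheme = "http" if parts.scheme in ("http", "https") else parts.scheme
    return urlunsplit((scheme, parts.netloc.lower(), parts.path, parts.query, ""))


def resolve_locally(url, catalog=LOCAL_COPIES):
    """Return the local path for a schema URL, or None if it is not listed."""
    neutral = {_scheme_neutral(key): path for key, path in catalog.items()}
    return neutral.get(_scheme_neutral(url))
```

A caller would try resolve_locally(url) first and only fetch from www.w3.org when it returns None, which keeps the site out of the critical path for the listed schemas.
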
The top UAs making requests for .xsd resources on www.w3.org are:
127574 Java/1.8.0_121
96712
25860 Python-urllib/2.7
16673 Apache-CXF/3.3.4
16215 Zeep/4.1.0 (www.python-zeep.org)
6481 Apache-CXF/3.2.10
6205 Java/1.6.0_26
4176 Java/17.0.2
1827 Java/1.8.0_162
1485 Python-urllib/3.7
(1st col is the number of requests in a 90-min sample of the logs)
Omitting version numbers:
159765 Java
101314
29012 Python-urllib
27912 Apache-CXF
17640 Zeep
1467 Mozilla
623 Apache CXF
322 sax Java
211 Apache-HttpClient
187 Oracle HTTPClient Version 10h
120 node-soap
88 SOA Model (see http:
87 Elastic-Heartbeat
74 python-requests
74 curl
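
For anyone wanting to repeat this kind of tally on their own logs, the grouping by product name can be approximated with the Python sketch below. The rule for stripping the version (keep the text before the first "/") is an assumption, not necessarily how the figures above were produced. The sample rows are only the ten per-version lines listed earlier, so the sums are smaller than the full totals shown above.

```python
from collections import Counter

# (request count, User-Agent) rows copied from the per-version list above.
# The empty string stands for the row with no User-Agent value.
SAMPLE = [
    (127574, "Java/1.8.0_121"),
    (96712, ""),
    (25860, "Python-urllib/2.7"),
    (16673, "Apache-CXF/3.3.4"),
    (16215, "Zeep/4.1.0 (www.python-zeep.org)"),
    (6481, "Apache-CXF/3.2.10"),
    (6205, "Java/1.6.0_26"),
    (4176, "Java/17.0.2"),
    (1827, "Java/1.8.0_162"),
    (1485, "Python-urllib/3.7"),
]


def product_name(user_agent):
    """Drop the version: keep whatever precedes the first '/'."""
    return user_agent.split("/", 1)[0].strip()


def totals_by_product(rows):
    """Sum request counts per product name."""
    totals = Counter()
    for count, user_agent in rows:
        totals[product_name(user_agent)] += count
    return totals


if __name__ == "__main__":
    for name, count in totals_by_product(SAMPLE).most_common():
        print(f"{count:>7} {name}")
```
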

Top UAs making requests matching /2001/XMLSchema:
43290 Java
15014 Python-urllib
8358
6106 ALTOVA
3427 Mozilla
364 Go-http-client
130 Java1.8.0_291
88 Zabbix
70 WebexTeams
66 MVision
53 curl
44 Baiduspider+(+http:
42 Apache-HttpClient
40 MapForce
40 cubebot
If we start redirecting http to https, will that fundamentally break
compliance with W3C RECs that specify http: in references to .xsd
files and namespaces? If so, which URIs would we need to continue to
serve via http?
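
As background for the namespace part of this question: namespace-aware parsers compare namespace names as plain strings, so an http: namespace name in a document is an identifier and is not in itself a request to the server. Retrieving an .xsd from a schema location is a separate step that some tools perform. The Python sketch below, using the standard library's xml.etree.ElementTree, illustrates only the string-comparison point. The helper root_namespace and the sample document are made up for this example.

```python
import xml.etree.ElementTree as ET

# The XML Schema namespace name, spelled with http: as in the specs.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A tiny illustrative schema document (not taken from any real project).
DOC = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note" type="xs:string"/>
</xs:schema>"""


def root_namespace(xml_text):
    """Return the namespace name of the root element, or None if it has none."""
    root = ET.fromstring(xml_text)
    # ElementTree writes qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


if __name__ == "__main__":
    # Parsing involves no network access; the name is only compared as text.
    print(root_namespace(DOC) == XSD_NS)
    # An https: spelling would be a different namespace name altogether.
    print(root_namespace(DOC.replace("http://", "https://")) == XSD_NS)
```
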
Thanks,
--
Gerald Oskoboiny <ger...@w3.org>
http://www.w3.org/People/Gerald/
tel:+1-604-906-1232 (mobile)