On 27/09/2020 16:28, Rodrigo Díez Villamuera wrote:

I am importing a subset of nodes from the UK (those tagged with amenity=pub) for a pet project.

Firstly - welcome!



When analysing the data I realised that some of these nodes contain a website tag whose value lacks an appropriate URL scheme (http/https).

I.e. www.mypub.com rather than http://www.mypub.com or https://www.mypub.com

I'm not actually convinced that's a problem - as others have said, web browsers are perfectly capable of converting "www.mypub.com" into either "https://www.mypub.com" or "http://www.mypub.com" as appropriate, so this doesn't really add any value.  "Letting the browser sort it out" is a great approach as it can also deal with current and near-future changes such as the removal of TLS 1.0 and 1.1 support.



This goes in contradiction with the Wiki documentation for website. <https://wiki.openstreetmap.org/wiki/Key:website>

Unfortunately, OSM's wiki doesn't always reflect actual usage and this is one example.  Changing "www.mypub.com" to "https://www.mypub.com" doesn't really add any value unless you're actually updating something else about the pub.  Actually, using "www.mypub.com" has some advantages here as it allows the user's web browser to negotiate https if available (the default nowadays) but fall back to http if not.
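
For any data consumer that does want a full URL, the "https first, http as a fallback" idea can be applied at read time rather than by retagging. The following is only a rough Python sketch of that idea, not a description of what any particular browser does; the names candidate_urls and first_reachable are made up for illustration, and first_reachable needs network access.

```python
import http.client
import urllib.error
import urllib.request


def candidate_urls(value):
    """Return the URLs to try, in order, for a raw website tag value."""
    value = value.strip()
    if value.lower().startswith(("http://", "https://")):
        # The mapper already chose a scheme, so respect it.
        return [value]
    # No scheme: prefer https, keep http as a fallback.
    return ["https://" + value, "http://" + value]


def first_reachable(value, timeout=10.0):
    """Return the first candidate URL that a server answers, or None.

    Hypothetical helper: performs real HTTP requests.
    """
    for url in candidate_urls(value):
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return url
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return url
        except (OSError, http.client.HTTPException):
            # DNS failure, refused connection, TLS failure, etc.
            continue
    return None
```

Doing this when the data is read leaves the tag exactly as the mapper entered it, and the result follows whatever the site supports on the day it is checked.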


I created a proposal for a one-off, scoped, automated edit for these nodes to find the appropriate scheme for the existing URL and retag the nodes.

I added the proposal to the Automated edits log. You can read it here <https://wiki.openstreetmap.org/wiki/Automated_edits/rodrigodiez/Add_missing_URL_scheme_to_pub_websites_in_UK>.


What would be rather more interesting would be detecting websites that "don't or no longer represent the pub" in some way:

 * Perhaps the pub had a website, but now has new tenants, and they now
   communicate with customers on the Facebook page?
 * Perhaps the website is (like one of your examples) just for the brewery?
 * Perhaps the website now points at domain parking?
 * Perhaps the HTTPS certificate has expired, which at the very least
   indicates that the website is unlikely to be kept up to date?

Any problems found would likely need to be resolved manually, but at least some of the above should be detectable automatically.
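
As a rough illustration of what "detectable automatically" might look like, here is a Python sketch of two of the checks: whether a tagged URL ends up on a different host (a Facebook page, a brewery site, a parking domain), and whether the HTTPS certificate fails verification. The function names are invented for this example, both URL arguments are assumed to include a scheme, and certificate_problem needs network access. Anything it flags would still want a human to look at it.

```python
import socket
import ssl
from urllib.parse import urlsplit


def bare_host(url):
    """Lower-cased hostname of a URL, with any leading 'www.' dropped."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def redirected_elsewhere(tagged_url, final_url):
    """True if the page finally served is on a different host than the tagged one."""
    return bare_host(tagged_url) != bare_host(final_url)


def certificate_problem(host, timeout=10.0):
    """Return None if the TLS handshake verifies, else a short description.

    An expired certificate fails Python's default verification, so it
    shows up here as a verification error.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, 443), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return None
    except ssl.SSLCertVerificationError as exc:
        return exc.verify_message
    except OSError as exc:
        return "connection failed: %s" % exc
```

A different final host is only a hint: a chain pub legitimately redirecting to its parent company's site would be flagged in the same way as a parked domain.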

Best Regards,

Andy


_______________________________________________
Talk-GB mailing list
https://lists.openstreetmap.org/listinfo/talk-gb
