Re: [OSM-talk] New tool/API to find local OSM mailing lists by location
Thanks for the feedback. It's obvious I need to make two changes: - support other forums beyond the mailing lists hosted on http://lists.openstreetmap.org/ - return all the matches within the result hierarchy, not just the most specific, so searching for Glasgow should give talk-scotland and talk-GB. It's best to avoid building anything against the existing API because I'm going to change it. I'll post here when I've made these changes. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
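A minimal sketch of the second change, returning every list in the hierarchy rather than only the most specific one. The PARENT table and the function name are assumptions made for illustration; they are not taken from the localosm code.

```python
# Hypothetical hierarchy: each list names its parent list (None at the top).
PARENT = {
    "talk-scotland": "talk-GB",
    "talk-GB": None,
}


def all_matches(most_specific):
    """Return the most specific list followed by each of its ancestors."""
    matches = []
    current = most_specific
    while current is not None:
        matches.append(current)
        # Unknown lists have no parent, so the walk stops there.
        current = PARENT.get(current)
    return matches


print(all_matches("talk-scotland"))
```

With this table a search that resolves to talk-scotland, such as Glasgow, would report talk-scotland first and talk-GB second.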
[OSM-talk] New tool/API to find local OSM mailing lists by location
I've made a tool with an API for finding local OSM mailing lists. https://local.openstreetmap.directory/ The web interface handles searching by place name or lat/lon; the API can be queried by place name, lat/lon and OSM object. This is a proof of concept, I threw it together to see if it is useful. How might it be used? Think about systems for organising remote mapathons, like Missing Maps. They could use the API to look up the local mailing list and remind the organiser to contact the local community to let them know about the mapathon. Or an assisted editing tool like MapRoulette might show details of the local mailing list so mappers know whom to contact if they're making a complex edit and want to check with the local community that the edit is correct. Here are some example queries and the results: Cuba: Talk-cu Rome: Talk-it-lazio Burkina Faso: Talk-bf Oxford: Talk-gb-oxoncotswolds Timbuktu: Talk-ml 47.6,-122.3: Talk-us-pugetsound How does it work? The heavy lifting is done via the Nominatim API. For every lookup it issues a query to Nominatim and uses the address information to determine the local mailing list. Code: https://github.com/EdwardBetts/localosm Service: https://local.openstreetmap.directory/ Is this useful? I'd love to hear your thoughts or questions. -- Edward.
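The lookup step might be sketched roughly as below: given the "address" object from a Nominatim response, pick the first rule whose fields all match. The RULES table, the function name and the Talk-it fallback are assumptions made for illustration; the real mapping lives in the localosm repository and may work quite differently. No network request is made here.

```python
# Hypothetical rules, most specific first. Each rule lists the address
# fields that must match and the mailing list to report.
RULES = [
    ({"country_code": "it", "state": "Lazio"}, "Talk-it-lazio"),
    ({"country_code": "it"}, "Talk-it"),  # assumed fallback for this sketch
    ({"country_code": "cu"}, "Talk-cu"),
]


def local_list(address):
    """Return the list for the first rule that the address satisfies."""
    for required, list_name in RULES:
        if all(address.get(key) == value for key, value in required.items()):
            return list_name
    return None


# An address object shaped like the one Nominatim returns for Rome.
print(local_list({"city": "Roma", "state": "Lazio", "country_code": "it"}))
```

Ordering the rules from most to least specific keeps the lookup a simple first-match scan.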
[Talk-gb-westmidlands] Wikidata tags for the West Midlands available as OSM XML
Hi, For a while now I've been working on an automated system to match OSM entities with Wikidata items. I've built something that will let you search for a place and find all Wikidata items in that location with a matching OSM entity. In the past there was interest from the West Midlands mapping community to adding wikidata tags. Here are some saved places in the West Midlands: https://osm.wikidata.link/filtered/West_Midlands Each result shows a list of matches between Wikidata and OSM. The matches in each location are available to download as an OSM change XML file. Local mappers can download this OSM XML, inspect it in JOSM and if it looks correct they can upload the changes. Code and documentation here: https://github.com/EdwardBetts/osm-wikidata Let me know if you've got any questions or ideas. -- Edward. ___ Talk-gb-westmidlands mailing list Talk-gb-westmidlands@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb-westmidlands
Re: [OSM-talk-ie] [Talk-GB] OSM with Wikidata: now covers UK and Ireland
Brian Prangle wrote: > I would import only after an invitation by a mapper or mappers in the > relevant county, and only after they've checked where your data has more > than one match and indicated which of the multiple matches is the > appropriate one Thanks Brian. My technique for dealing with the duplicate matches is just to skip them. Anybody is free to add the wikidata tag by hand. I made the duplicates visible on my list of matches because in some cases we might be able to devise a method for picking where to add the wikidata tag automatically. -- Edward. ___ Talk-ie mailing list Talk-ie@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-ie
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
Andrew Hain wrote: > I still get internal errors in a few places, for instance Richmond upon > Thames. It was failing because node 880543279 had been deleted by this recent changeset: http://www.openstreetmap.org/changeset/37791280 I changed the code to skip objects that have been deleted. Thanks for the report. -- Edward. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
Sorry about that. Thanks for the bug report. I've fixed it. Andrew Hain <andrewhain...@hotmail.co.uk> wrote: > I’m getting internal server errors when I try to look at the previews. > > -- > Andrew > > ________ > From: Edward Betts <edw...@4angle.com> > Sent: 11 March 2016 15:46 > To: talk-gb@openstreetmap.org > Subject: Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland > > I've added a code to preview the XML of the changeset that adds wikidata tags > to objects in a given area. You can find preview links on region, county and > district pages, but only if there are less than 150 objects to annotate with a > wikidata tag. > > Examples: > > Norwich > http://edwardbetts.com/osm-wikidata/gb-ie/district/Norwich > preview: http://edwardbetts.com/osm-wikidata/gb-ie/preview/8/Norwich > > Swindon > http://edwardbetts.com/osm-wikidata/gb-ie/county/Swindon > preview: http://edwardbetts.com/osm-wikidata/gb-ie/preview/6/Swindon > > Hackney > http://edwardbetts.com/osm-wikidata/gb-ie/preview/8/London_Borough_of_Hackney > > Isle of Man > http://edwardbetts.com/osm-wikidata/gb-ie/region/Isle_of_Man > preview: http://edwardbetts.com/osm-wikidata/gb-ie/preview/2/Isle_of_Man > > The preview page might be a little slow the first time, but the data is cached > so future access will be fast. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] [Imports] OSM with Wikidata: now covers UK and Ireland
Thanks SK53, you're right. I meant the West Midlands. SK53 <sk53@gmail.com> wrote: > I presume you mean the West Midlands: firstly Birmingham is there & > secondly, I suspect East Midland mappers are less enthusiastic about this > sort of thing. > > Jerry > > On 11 March 2016 at 19:15, Edward Betts <edw...@4angle.com> wrote: > > > Chris Hill <o...@raggedred.net> wrote: > > > On 11/03/16 18:03, Edward Betts wrote: > > > >I've uploaded my first changeset - Places of worship in Birmingham. > > > > > > > >https://www.openstreetmap.org/changeset/37766888 > > > > > > > >This is a modification to 10 ways and one relation. > > > > > > > >The relation is a multipolygon representing Birmingham Oratory. The > > uploader > > > >has put the wikidata tag on the relation, which looks wrong. The other > > tags for > > > >the church are on the outer way. I will fix this before I do any more > > uploads. > > > > > > > > > > Since more than one person has asked you to check with local mappers > > before > > > making these uploads and there is an unanswered question about the > > licence, > > > I would ask you to stop uploading any more of these data until these > > > questions have been addressed to the satisfaction of the mappers in GB. > > > > I was specially approached by mappers from the East Midlands and asked to > > upload this data. Sorry for not making this clear. > > > > There is no problem with the license. I'm not deriving anything from > > Wikipedia > > or Wikidata. I'm just adding a link to Wikidata. There are already lots of > > links to both Wikipedia and Wikidata in OSM. > > -- > > Edward. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
Richard Fairhurst wrote: > Colin Smale wrote: > > As we are not copying the content from Wikipedia/Wikidata, but just > > a reference > > Unfortunately it's not quite that simple. > > The matching is done by co-ordinates. The co-ordinates in Wikidata could be > held to be information copyrighted by Google. Consequently you could argue > that the matching - "the selection or the arrangement of the contents of the > database", to use the language from the Database Directive - is itself a > derivative of Google's map data. > > To be clear, I'm not arguing one way or another - I've probably studied the > related issues as much as anyone on this list and it's not obvious to me > which way it would fall. But anything with the potential to affect such a > large amount of OSM data in the UK needs a thorough legal review, lest we > inadvertently encumber thousands of uses of OSM with Google IP. The way I see it is like this. OSM uses the Open Database License (ODbL); Wikipedia is Creative Commons Attribution-ShareAlike 3.0 (CC-BY-SA). Wikidata is public domain or CC0. OSM uses European law, which includes database rights. Wikimedia projects are governed by US law, which does not recognize database rights. Under US law coordinates are facts and can't be copyrighted. There is no protection for coordinates under US law. Therefore Wikimedia has no problems importing coordinates from Wikipedia into Wikidata. Under European law individual coordinates aren't copyrightable, but a large group of them can be considered to constitute a database and database rights come into play. Wikimedia does not assert database rights for Wikidata or Wikipedia. According to taginfo there are 63,782 OSM objects with a wikidata tag. See https://taginfo.openstreetmap.org/keys/wikidata Nobody has any legal objection to adding Wikidata tags to OSM. We aren't violating the database rights of Wikidata by including 60k+ links to Wikidata. We assume that these wikidata tags have been added by hand.
The question is whether automating the process of adding wikidata tags to OSM somehow causes a copyright or database right violation. I'm specifically not copying any data at all from Wikidata to OSM. I'm not copying the coordinates, labels or any properties from Wikidata. The only thing I'm adding is a link. I think it is fine for the matching software to consider the location of an item in Wikidata and OSM when deciding if they are a match. This does not make OSM a derived work of Wikidata. There is significant fuzziness in the matching for the location. The matching algorithm considers Wikidata items and OSM objects within 1 kilometre of each other when looking for a match. I hope this answers the question about the legal ramifications of adding links from OSM to Wikidata. -- Edward.
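To illustrate how loose a 1 kilometre radius is, here is a minimal sketch of a distance check. It assumes a plain haversine great-circle distance; the function names and the use of haversine are assumptions, and the real matcher may well do the distance query differently (for example inside a spatial database).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
MAX_DISTANCE_KM = 1.0     # the radius mentioned in the message


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_candidate(wikidata_coords, osm_coords):
    """True when the two (lat, lon) pairs are close enough to compare names."""
    return haversine_km(*wikidata_coords, *osm_coords) <= MAX_DISTANCE_KM


print(is_candidate((52.0, -1.0), (52.005, -1.0)))
```

The coordinates only narrow down the candidates; names and tags still have to agree before a link is proposed.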
Re: [Talk-GB] [Imports] OSM with Wikidata: now covers UK and Ireland
Chris Hill <o...@raggedred.net> wrote: > On 11/03/16 18:03, Edward Betts wrote: > >I've uploaded my first changeset - Places of worship in Birmingham. > > > >https://www.openstreetmap.org/changeset/37766888 > > > >This is a modification to 10 ways and one relation. > > > >The relation is a multipolygon representing Birmingham Oratory. The uploader > >has put the wikidata tag on the relation, which looks wrong. The other tags > >for > >the church are on the outer way. I will fix this before I do any more > >uploads. > > > > Since more than one person has asked you to check with local mappers before > making these uploads and there is an unanswered question about the licence, > I would ask you to stop uploading any more of these data until these > questions have been addressed to the satisfaction of the mappers in GB. I was specially approached by mappers from the East Midlands and asked to upload this data. Sorry for not making this clear. There is no problem with the license. I'm not deriving anything from Wikipedia or Wikidata. I'm just adding a link to Wikidata. There are already lots of links to both Wikipedia and Wikidata in OSM. -- Edward. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
I've uploaded my first changeset - Places of worship in Birmingham. https://www.openstreetmap.org/changeset/37766888 This is a modification to 10 ways and one relation. The relation is a multipolygon representing Birmingham Oratory. The uploader has put the wikidata tag on the relation, which looks wrong. The other tags for the church are on the outer way. I will fix this before I do any more uploads. -- Edward.
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
Good question. OSM editors and viewers could be changed to integrate more closely with Wikidata. An OSM editor could suggest to the mapper a matching Wikidata item to tag. We could have the editor automatically look up the label from the Wikidata item in the preferred language of the person editing the map. The map view on the main OSM website includes basic Wikipedia support: the Wikipedia tags become links to Wikipedia. Support could be added for Wikidata; this could include looking up the Wikidata label in the appropriate language and links to Wikipedia. We could also show other information from Wikidata, like the name of the architect or a photograph. -- Edward. Philip Barnes wrote: > What plans are there for maintenance of this data in future, it is a move > away from human readable tags so errors will go unnoticed. > > Mappers are very unlikely to add new wikidata tags in the way we add > Wikipedia. > > Phil (trigpoint)
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
I've added admin_level=8 to the output. This is useful for seeing matches grouped by districts, London boroughs and metropolitan boroughs. For example: http://edwardbetts.com/osm-wikidata/gb-ie/region/Greater_London -- Edward.
Re: [Talk-GB] [Imports] OSM with Wikidata: now covers UK and Ireland
I've adjusted the output to show the Wikipedia category path that the matcher followed to find the object. These two items are within the 'Dublin (City)' category, so they end up being considered as cities. The reason they match is that one of the tags I consider for matching cities is landuse=residential. http://edwardbetts.com/osm-wikidata/gb-ie/county/County_Dublin/Cities My software tries to skip categories named after cities, but it is confused because the category "Dublin (City)" doesn't exactly match the city name. I'm going to remove landuse=residential as a tag that could match a city. Thanks for the bug report, -- Edward. Rory McCann wrote: > Hi, > > The list for "Cities in Dublin, Ireland" is strange. > http://edwardbetts.com/osm-wikidata/gb-ie/county/County_Dublin/Cities > > It accurately has the one city (Dublin), but it also lists the British > Ambassadors residence, and a student accommodation. Both are accurately > matched up OSM/Wikipedia/Wikidata, but they aren't cities. > > -- > Rory
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
I've updated the category names in the database to remove " by country" from the end. Edward Betts <edw...@4angle.com> wrote: > The category names are from Wikipedia. I start with the "Airports by country" > category and just grab the subcategories for United Kingdom and Ireland. In a > previous version I had code to strip the ' by country' from the end. I'll try > and restore it to reduce the confusion. > -- > Edward. > > Colin Smale <colin.sm...@xs4all.nl> wrote: > > Hi Edward, > > > > I took a look at the result pages and I noticed a small, but pervasive > > typo. All the listings of a category per county are actually titled per > > countRy on all the pages. For example on the Greater London page > > http://edwardbetts.com/osm-wikidata/gb-ie/region/Greater_London you see > > under the "categories" heading "Airports by country" instead of > > "Airports by county". From the content it is clear that you mean county. > > > > > > Only one letter, but a huge difference in meaning... > > > > Thanks > > > > Colin ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
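The clean-up described here amounts to trimming a fixed suffix from each category name. A minimal sketch, with a function name chosen for illustration (the actual database update is not shown in the thread):

```python
SUFFIX = " by country"


def clean_category(name):
    """Drop a trailing ' by country' so the label reads correctly per county."""
    if name.endswith(SUFFIX):
        return name[: -len(SUFFIX)]
    return name


print(clean_category("Airports by country"))
```

Names without the suffix pass through unchanged, so running the clean-up twice is harmless.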
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
The category names are from Wikipedia. I start with the "Airports by country" category and just grab the subcategories for United Kingdom and Ireland. In a previous version I had code to strip the ' by country' from the end. I'll try and restore it to reduce the confusion. -- Edward. Colin Smale wrote: > Hi Edward, > > I took a look at the result pages and I noticed a small, but pervasive > typo. All the listings of a category per county are actually titled per > countRy on all the pages. For example on the Greater London page > http://edwardbetts.com/osm-wikidata/gb-ie/region/Greater_London you see > under the "categories" heading "Airports by country" instead of > "Airports by county". From the content it is clear that you mean county. > > > Only one letter, but a huge difference in meaning... > > Thanks > > Colin
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
Now fixed: http://edwardbetts.com/osm-wikidata/gb-ie/region/Isle_of_Man Thanks for the bug report. -- Edward. Colin Spiller <co...@thespillers.org.uk> wrote: > Isle of Man conspicuous by its absence > Colin > > Edward Betts <edw...@4angle.com> wrote: > > >I've extended my search for matches between OSM and Wikidata again. It now > >covers all of the UK and Ireland. > > > >I used map data from http://download.geofabrik.de/europe/british-isles.html > > > >The results are grouped by region or county as well as by category. > > > >http://edwardbetts.com/osm-wikidata/gb-ie/ > > > >I'm going to figure out how to upload these matches to OSM. I've registered > >an > >account with the username Wikidata to use for the uploads. > > > >There will be one changeset per county + category for any category with 10 or > >more matches in that county. Categories with less than 10 matches in the > >county will be combined into a single changeset. > > > >OSM objects with an existing wikidata tag won't be changed. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
The matching code is looking for castles to be tagged with one of: historic=castle, building=castle, or tourism=attraction. '@ Bristol' is tagged as tourism=attraction. When my code is matching names it tries removing ' castle' from the end of castle names. It also removes any symbols, so we end up with 'Bristol Castle' matching '@ Bristol'. I will change the name matching so the @ symbol isn't removed. Thanks for the useful bug report. -- Edward. Neil Matthews wrote: > Had a look at Bristol -- seems fine... > ...except: > > Q4968836 — Bristol Castle — 2 matches found > @ Bristol > > Sometimes known as "At Bristol" -- a science-themed museum/attraction -- an > odd match to a ruined castle? > > Cheers, > Neil
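A rough sketch of the name normalisation and of the proposed fix. The function name, the keep_at switch and the exact set of characters treated as symbols are assumptions for illustration; only the two behaviours (strip a trailing ' castle', strip symbols) come from the message.

```python
import re


def normalise(name, keep_at=True):
    """Lower-case a name, drop a trailing ' castle' and strip symbols."""
    name = name.lower()
    if name.endswith(" castle"):
        name = name[: -len(" castle")]
    # Old behaviour (keep_at=False) removes every symbol, including '@'.
    pattern = r"[^\w\s@]" if keep_at else r"[^\w\s]"
    name = re.sub(pattern, "", name)
    # Collapse any whitespace left behind by removed symbols.
    return " ".join(name.split())


old_match = normalise("Bristol Castle", keep_at=False) == normalise("@ Bristol", keep_at=False)
new_match = normalise("Bristol Castle") == normalise("@ Bristol")
print(old_match, new_match)
```

Keeping the '@' means '@ Bristol' no longer collapses to the same string as 'Bristol Castle', so the false match goes away.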
Re: [Talk-GB] OSM with Wikidata: now covers UK and Ireland
Dave F wrote: > I haven't been paying full attention to this. Are we not meant to add a > wikipedia tag any more? > > Could you give a brief update on the proposal please. Here are the relevant pages on the wiki: http://wiki.openstreetmap.org/wiki/Wikidata http://wiki.openstreetmap.org/wiki/Mechanical_Edits/wikidata -- Edward.
[Talk-GB] OSM with Wikidata: now covers UK and Ireland
I've extended my search for matches between OSM and Wikidata again. It now covers all of the UK and Ireland. I used map data from http://download.geofabrik.de/europe/british-isles.html The results are grouped by region or county as well as by category. http://edwardbetts.com/osm-wikidata/gb-ie/ I'm going to figure out how to upload these matches to OSM. I've registered an account with the username Wikidata to use for the uploads. There will be one changeset per county + category for any category with 10 or more matches in that county. Categories with fewer than 10 matches in the county will be combined into a single changeset. OSM objects with an existing wikidata tag won't be changed. -- Edward.
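The changeset plan could be sketched as follows. The tuple layout, the "combined" label and the function name are assumptions for illustration; the threshold of 10 matches comes from the message.

```python
from collections import defaultdict

MIN_MATCHES = 10  # categories with fewer matches share one changeset


def plan_changesets(matches):
    """Group (county, category, osm_id) tuples into planned changesets."""
    grouped = defaultdict(list)
    for county, category, osm_id in matches:
        grouped[(county, category)].append(osm_id)

    changesets = defaultdict(list)
    for (county, category), ids in grouped.items():
        if len(ids) >= MIN_MATCHES:
            key = (county, category)
        else:
            # Hypothetical label for the per-county catch-all changeset.
            key = (county, "combined")
        changesets[key].extend(ids)
    return dict(changesets)


example = (
    [("Norfolk", "Schools", n) for n in range(12)]
    + [("Norfolk", "Castles", n) for n in range(100, 103)]
    + [("Norfolk", "Airports", n) for n in range(200, 202)]
)
for key, ids in plan_changesets(example).items():
    print(key, len(ids))
```

In this example the 12 schools get their own changeset while the 3 castles and 2 airports share the county's combined one.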
Re: [Talk-GB] OSM with Wikidata: 27,232 matches found in England
Edward Betts <edw...@4angle.com> wrote: > Colin Spiller <co...@thespillers.org.uk> wrote: > > Thank you. Are there no entries at all for West Yorkshire? > > West Yorkshire is missing because there is no admin_level tag on the relation. > > https://www.openstreetmap.org/relation/88079 > > boundary=ceremonial, name=West Yorkshire, type=boundary > > My system expects counties to be admin_level=6. This is now fixed. https://edwardbetts.com/osm-wikidata/england/region/Yorkshire_and_the_Humber -- Edward. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] OSM with Wikidata: 27,232 matches found in England
Colin Spiller wrote: > Thank you. Are there no entries at all for West Yorkshire? West Yorkshire is missing because there is no admin_level tag on the relation. https://www.openstreetmap.org/relation/88079 boundary=ceremonial, name=West Yorkshire, type=boundary My system expects counties to be admin_level=6. -- Edward.
Re: [Talk-GB] OSM with Wikidata: 27,232 matches found in England
Colin Spiller wrote: > This looks good but doesn't seem to work well for Yorkshire and the Humber. > West Yorkshire doesn't seem to be available at all, and when I select > Schools (or anything else) I just get > Internal Server Error > Sorry, there was a bug; it is now fixed. https://edwardbetts.com/osm-wikidata/england/region/Yorkshire_and_the_Humber/Schools -- Edward.
Re: [Talk-GB] [Imports] OSM with Wikidata: 27232 matches found in England
Neil Matthews wrote: > I had a look at your Bristol matches -- most are reasonable, a few issues: > > Q5015771 — Cabot Circus — Cabot Circus (way, distance: 165 m) building=yes > Matched to parking not the shopping area -- OSM updated, was a suburb > place I've added landuse=commercial to the list of tags that the matcher looks for. > University of Bristol > one of three matches is to operator UWE Bristol -- OSM updated should be > UWE This is because the UWE building has the name B, and B is a substring of 'University of Bristol'. I can adjust the matching to stop this happening. > Stoke Park > should probably match to Stoke Park Estate -- remove duplicating node > from OSM The matcher has a list of possible name endings for parks: park, gardens and common. I've added "estate" to this list. For parks the names "Stoke Park" and "Stoke Park Estate" will be considered to be a match. > Brislington West (ward) > matched to Saint Annes -- probably needs checking further? Saint Annes is given as a Polish-language alias for Brislington West on Wikidata. I'm going to check if non-English names are useful for matching, or if I should just ignore them. > P.S. Might be fun to see the items for Bristol that couldn't be matched :-) I'll see if I can produce a list. Wikidata contains items with geographic coordinates for things that no longer exist, like demolished buildings. Maybe I can detect if the Wikipedia article about the item is written in the past tense. -- Edward.
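One way the park name endings could be applied is sketched below. Stripping endings repeatedly until none remain is an assumption made so that "Stoke Park" and "Stoke Park Estate" compare equal; the message only says that the endings list exists and now includes "estate". The function names are hypothetical.

```python
PARK_ENDINGS = ("park", "gardens", "common", "estate")


def strip_park_endings(name):
    """Lower-case a name and drop trailing park-style words, keeping one word."""
    words = name.lower().split()
    while len(words) > 1 and words[-1] in PARK_ENDINGS:
        words.pop()
    return " ".join(words)


def park_names_match(first, second):
    """Compare two park names after their endings have been stripped."""
    return strip_park_endings(first) == strip_park_endings(second)


print(park_names_match("Stoke Park", "Stoke Park Estate"))
```

Comparing whole stripped names, rather than testing whether one name is a substring of the other, also avoids the kind of false match seen with the building named B.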
Re: [Talk-GB] [Imports] OSM with Wikidata: 27232 matches found in England
Dan S wrote: > I'm curious how The Shard ended up matching against its correct match > but also London Bridge station? The station doesn't seem to have any > matching metadata: > http://edwardbetts.com/osm-wikidata/england/region/Greater_London/Apartment_buildings The criterion for an apartment building is just anything tagged as a building or landuse=residential. The Shard has an alias of 'London Bridge tower'; my software trims these endings from apartment building names to find more matches: house, apartments, estate, and tower. I might add 'building=train_station' as an exclusion for the apartment building matching. The matcher will be changed so that if there is an OSM object whose name matches without trimming, it takes priority. > I managed to find a few erroneous one-to-one matches in London: > Q12048395 — The Queen's Walk (South Bank) — HMS Belfast (way, distance: 5.0 > km) > Q55019 — Covent Garden — Royal Opera House (way, distance: 71 m) > Q607700 — Monument to the Great Fire of London — Tower Bridge (node, > distance: 1.4 km) > Q5571009 — Globe Theatre (Newcastle Street) — Shakespeare's Globe > (relation, distance: 2.5 km) > - note that this one confuses the historic with the modern > theatre of the same name. The modern one has a separate wkp page. > Q43279 — Wembley Stadium (1923) — Wembley Stadium (way, distance: 0 m) > - again the correct connection should be with the modern stadium > Q3527632 — Dorset Garden Theatre — Queen's (way, distance: 2.8 km) > - another historic thing (Dorset Garden was called Queen's at one point) Thanks, I'll investigate these. I'm going to add more debugging to the output, so we can see the names that matched. -- Edward.
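The planned priority rule might look something like this sketch: try untrimmed names first and only fall back to trimmed names when nothing matches exactly. The function names are hypothetical and the OSM names in the example are illustrative, not read from the database.

```python
ENDINGS = ("house", "apartments", "estate", "tower")


def trim(name):
    """Lower-case a name and drop one trailing apartment-style word."""
    words = name.lower().split()
    if len(words) > 1 and words[-1] in ENDINGS:
        words = words[:-1]
    return " ".join(words)


def pick_matches(wikidata_names, osm_names):
    """Return OSM names that match, preferring matches that need no trimming."""
    wanted = {name.lower() for name in wikidata_names}
    exact = [name for name in osm_names if name.lower() in wanted]
    if exact:
        return exact
    trimmed = {trim(name) for name in wikidata_names}
    return [name for name in osm_names if trim(name) in trimmed]


# Label and alias on the Wikidata side, two nearby OSM names on the other.
print(pick_matches(["The Shard", "London Bridge Tower"], ["The Shard", "London Bridge"]))
```

Because an untrimmed match exists for The Shard, the station is never considered; the trimmed comparison only runs when no exact match is found.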
[Talk-GB] OSM with Wikidata: 27,232 matches found in England
I've extended my search for matches between OSM and Wikidata. It now covers all of England instead of just the West Midlands. The results are grouped by region or county as well as by category. http://edwardbetts.com/osm-wikidata/england/ It should be possible to use this as a basis for uploading. The results can be grouped by category and county when uploaded. -- Edward.
[Talk-gb-westmidlands] Review plan for adding 1,164 wikidata tags in the West Midlands
I've written software to match geographic objects in OSM with Wikidata items. Members of the England West Midlands community have asked me to add Wikidata tags to OSM in the West Midlands. I have produced a list of objects to modify; there are 1,164 of them in 44 categories. Wikidata identifiers look like Q2256, a Q followed by a number. The tags I plan to add will look like this: wikidata=Q2256 Wiki page: http://wiki.openstreetmap.org/wiki/Wikidata The matching is based on finding OSM objects with nearby Wikidata items and comparing the names and tags in OSM with labels and aliases in Wikidata. My plan is to add these tags using the API, one changeset per category. Here is a list of matches grouped by category with links to Wikidata, Wikipedia and OSM for each match. https://edwardbetts.com/osm-wikidata/west_midlands/2016-01-16.html This is the list of Wikidata tags that I actually plan to add: https://edwardbetts.com/osm-wikidata/west_midlands/matches_2016-01-16.txt This second list has duplicates removed. I've also removed any OSM objects that already have a wikidata tag. Wikidata is licensed as public domain or CC0. I'm not importing any location data from Wikidata, just the Wikidata identifier. There are 44,559 existing wikidata tags in the OSM database. I've added an entry to the Import Catalogue in the "Ongoing Imports, Semi-Automated" section. http://wiki.openstreetmap.org/wiki/Import/Catalogue#Ongoing_Imports.2C_Semi-Automated If this upload is successful I plan to repeat the process for other regions in the UK, then gradually expand to other parts of the world. The matching software is in a reasonably tidy state; after a bit more clean-up I will share it for others to review and contribute. Any questions or comments? -- Edward.
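A small sketch of the two checks implied by this plan: the identifier has the form Q followed by a number, and objects that already carry a wikidata tag are left alone. The function name and the choice to raise on a malformed identifier are assumptions for illustration.

```python
import re

# A Wikidata item identifier: 'Q' followed by a positive integer.
QID = re.compile(r"Q[1-9][0-9]*")


def tag_to_add(existing_tags, qid):
    """Return the tag to add, or None if the object is already tagged."""
    if not QID.fullmatch(qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    if "wikidata" in existing_tags:
        return None  # never overwrite an existing wikidata tag
    return {"wikidata": qid}


print(tag_to_add({"name": "Example", "amenity": "school"}, "Q2256"))
```

Only the identifier is written to OSM; no coordinates, labels or other properties are copied across.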
Re: [Talk-GB] Review plan for adding 1,164 wikidata tags in the West Midlands
Lester Caine <les...@lsces.co.uk> wrote: > On 18/01/16 06:54, Edward Betts wrote: > > The list of schools in the West Midlands is ready to go. Does anybody have > > an > > objection to me adding the wikidata tags now? If not I'll add them. > > Only done a quick scan, but Pershore High School no longer has just a > node. Not sure on some others since need to check objects to match on of > multiple names. Don't think any body else is editing these schools > currently? My work is based on a data dump downloaded a week ago. My code for adding the wikidata tags to OSM will skip any objects that no longer exist. > But *I* would appreciate breaking up the commits so we don't end up with all > 1164 on the one change set? :( The matches that I've found in the West Midlands are grouped into 44 categories. The plan is to have one changeset per category, not to upload 1164 changes in a single changeset. > The wikidata tag is in addition to the existing wikipedia tag? Yes. I'm not planning to make any changes to wikipedia tags. -- Edward. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [Talk-GB] [Imports] Review plan for adding 1,164 wikidata tags in the West Midlands
Martin Koppenhoefer wrote: > I have started some time ago to add wikidata tags manually myself and have > found that there are a few problems to be careful about. Will you be > checking the matches you have found to see if there would be contradictions > between single wikidata statements and current OSM tags, that will require > reorganization of either the OSM object or the wikidata object? "Partial" > matches are not so uncommon, e.g. you could have a wikidata object > referring to a museum and an OSM object referring to the building housing > the museum (or the other way round). My code tries to resolve duplicates; if it can't, it will skip the OSM object. My aim is to generate a one-to-one mapping: I only add a tag for a wikidata item to a single OSM object. I've written some heuristics for different cases. For settlements where there is a node and polygon (way or relation) I prefer the node. Hospitals and schools are sometimes tagged with a polygon for each building and a surrounding polygon for the site. In this case the matcher picks the surrounding site polygon. -- Edward.
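The heuristics above could be sketched like this. The candidate dictionaries, the category labels and the use of the largest area as a stand-in for "the surrounding site polygon" are all assumptions for illustration; the real matcher presumably tests containment rather than comparing areas.

```python
def pick_one(category, candidates):
    """Pick a single OSM object for a Wikidata item, or None to skip it.

    Each candidate is a dict with 'type' ('node', 'way' or 'relation')
    and, for polygons, an 'area' value.
    """
    if len(candidates) == 1:
        return candidates[0]
    if category == "settlement":
        # Prefer the place node over the boundary polygon.
        nodes = [c for c in candidates if c["type"] == "node"]
        if len(nodes) == 1:
            return nodes[0]
    elif category in ("hospital", "school"):
        # Assume the largest polygon is the site that surrounds the buildings.
        polygons = [c for c in candidates if c["type"] != "node"]
        if polygons:
            return max(polygons, key=lambda c: c["area"])
    return None  # unresolved duplicates are skipped


school = [
    {"type": "way", "id": 1, "area": 400.0},
    {"type": "way", "id": 2, "area": 350.0},
    {"type": "way", "id": 3, "area": 12000.0},
]
print(pick_one("school", school))
```

Returning None for anything the heuristics cannot settle keeps the mapping one-to-one and leaves the harder cases for a mapper to tag by hand.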
Re: [Talk-gb-westmidlands] [Talk-GB] Review plan for adding 1, 164 wikidata tags in the West Midlands
Lester Caine wrote: > If there is not a node for a village in the OSM data, then it needs > adding. While wikipedia may return the same page for the village and the > matching parish, I thought that wikidata should distinguish between a > village record and a parish one? For now Wikidata tends to be similar to Wikipedia. There is a single Wikipedia article that describes both a civil parish and village, so the same is true on Wikidata. Over time Wikidata might start adding extra items for the civil parish. This is happening already in Germany. Wikipedia has an article about a village called Brailes, but OSM has nodes for two villages, Upper Brailes and Lower Brailes. The matcher solves this problem by picking the civil parish. https://en.wikipedia.org/wiki/Brailes http://wikidata.org/wiki/Q2155031 http://www.openstreetmap.org/relation/2863394 Another example: Upton Snodsbury in Wikipedia is called Upper Snodsbury in OSM, so the matcher picks the civil parish. There is a note on the Upper Snodsbury node suggesting that the name might be wrong. https://en.wikipedia.org/wiki/Upton_Snodsbury http://www.openstreetmap.org/relation/1875319 -- Edward. ___ Talk-gb-westmidlands mailing list Talk-gb-westmidlands@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb-westmidlands
Re: [Talk-gb-westmidlands] [Talk-GB] Review plan for adding 1, 164 wikidata tags in the West Midlands
Lester Caine <les...@lsces.co.uk> wrote: > On 17/01/16 11:08, Edward Betts wrote: > > This is the list of Wikidata tags that I actually plan to add: > > > > https://edwardbetts.com/osm-wikidata/west_midlands/matches_2016-01-16.txt > > Please remove the School list from this. We are currently adding the > edubase references to each of these, and this will replace the need for > an additional wikidata tag. Better to just have the one primary reference. I can skip the Schools category, if that's the community consensus. > There are also suspicious duplicates through the Towns and Villages > lists, and certainly where I've been adding a wikipedia link for a town > in a number of cases it is the wikipedia end which was wrong. Two > wikidata tags for Warwick? Thanks for spotting the Warwick mistake. Q549761 is the town, Q611294 is the local government district. I'll check to make sure there are no other duplicates like this. > And there should be individual nodes for each village so not sure where the > relations come from? The villages matching relations are the civil parishes. The Wikidata item represents both the village and the civil parish. This happens when the matching algorithm can't find the appropriate village node. I can change the code so that towns and villages only match nodes. -- Edward. ___ Talk-gb-westmidlands mailing list Talk-gb-westmidlands@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb-westmidlands
[Talk-GB] Cambridge pub meetup, Tuesday 10th Feb
There is a meetup this evening in Cambridge. Location: The Castle Inn, 38 Castle Street, Cambridge Time: 19:00 http://www.meetup.com/Cambridge-OpenStreetMap/events/220223825/ Sorry for the late notice. I thought it would be useful to mention here, just in case anybody is interested and wants to come along. -- Edward. ___ Talk-GB mailing list Talk-GB@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk-gb
Re: [OSM-talk] 176k Wikidata tags to add to OSM
Rob Nickerson rob.j.nicker...@gmail.com wrote: I've started to produce that page now: http://wiki.openstreetmap.org/wiki/Mechanical_Edits/wikidata So far I've included comments from August and November. I need to review the Sep-Oct comments on talk mailing list. Great work, thanks Rob. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
[OSM-talk] 176k Wikidata tags to add to OSM
This is a progress report about my attempt to match Wikidata items and OSM objects automatically. Here are some pages about adding Wikidata identifiers to OSM: http://wiki.openstreetmap.org/wiki/Wikidata http://wiki.openstreetmap.org/wiki/Proposed_features/Wikidata The list is available here, it is split up by English Wikipedia category: http://edwardbetts.com/osm-wikidata/ Some OSM/Wikidata items will appear in multiple categories. Each page of results is sorted by distance, then by the English Wikidata label. The results include links to Wikidata, the location on OSM from Wikidata and the matched OSM object. A quick recap about how my system works. I have a list of categories on Wikipedia with the appropriate tags on OpenStreetMap. For example, articles in the subcategories of the category Airports by Country should appear on the map tagged as aeroway=aerodrome. I use a Wikimedia Labs tool called CatScan to get a list of every article in the category or subcategory: https://tools.wmflabs.org/catscan2/catscan2.php For each article in English Wikipedia there is a matching item in Wikidata. I use the Wikidata API to find the Wikidata items within the category. Items without coordinates are skipped. Once all the categories are processed I have a list of Wikidata items that include coordinates and the label in multiple languages. I split this list up by coordinates into half degree squares. I use the Overpass API to look for OSM objects (nodes, ways and relations) with a name and the expected tags. The acceptable distance for most objects is 1km, for some entity types it has been increased further. I've included a distance field in my results, so you can see how far apart the matched items are. The names in the OSM object are compared with the labels and aliases in the Wikidata item. The code looks at the various name keys listed in the http://wiki.openstreetmap.org/wiki/Key:name page. I exclude old_name from the comparison.
The matching code considers addr:housename and can match buildings with Wikidata item labels that are street addresses to the addr:housenumber and addr:street tags. For example 8 Canada Square will match a building tagged with addr:housenumber=8 and addr:street=Canada Square The Overpass API can calculate the centroid of an OSM object, this is what I used in the past. I've switched to using the bounding box for the object, this gives better results for large objects like lakes and forests. The result is that I now have a list of 176,794 OSM objects and matching Wikidata items. The whole process of extracting the data and looking for matches takes about three days to run. This is after quite a few changes to speed it up. I think there are still more improvements possible. I will post the code on github soon. It has been suggested that I shouldn't be using Wikipedia at all, instead I should be looking at the 'instance of' property in Wikidata. Using English Wikipedia introduces an English-language bias, there are items in Wikidata without an associated article in English Wikipedia. The reason for using Wikipedia Categories is because use of the 'instance of' property is very patchy. The majority of the items in my result list don't include the 'instance of' property. A related piece of work will be to populate this field in Wikidata, but for now I'm focused on linking OSM and Wikidata. The system gets confused by chains of restaurants and shops. The Wikidata item will often include the coordinates of the headquarters. The name will match with a nearby store. I should be able to fix this by filtering out Wikidata chain store items. Example: John Lewis - UK department store chain https://www.wikidata.org/wiki/Q1918981 Wikidata coordinates are 51.497, -0.144 near Victoria station. https://www.openstreetmap.org/?mlat=51.497&mlon=-0.14434#map=16/51.4970/-0.1443 The match is for the flagship store in Oxford Circus, 2km from the HQ.
http://www.openstreetmap.org/node/31314236 Some of the coordinates in Wikipedia and Wikidata are wrong, there are many cases where the location in Wikidata is 5km or more from where it should be. London Hackspace moved from Islington to Hackney in 2009, the location has been updated on OSM, but Wikidata still has the old location: http://wikidata.org/wiki/Q6670461 http://www.openstreetmap.org/browse/node/2218654057 There are two pubs in London called Barley Mow that are less than 1km apart, both are mapped on OSM. One of the pubs has an item in Wikidata (Q17985738). My code is matching it to the wrong pub. I will fix this. http://wikidata.org/wiki/Q17985738 is http://www.openstreetmap.org/way/148011247 not http://www.openstreetmap.org/node/462025244 When checking the results for fountains I found that the Butt-Millet Memorial Fountain is mapped twice in different locations: http://wikidata.org/wiki/Q5002757 http://www.openstreetmap.org/way/238456703 http://www.openstreetmap.org/node/358955161 There are already 25k things with a Wikidata tag
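The name and address comparison described in this progress report might look something like the sketch below. It is a guess at the logic, not the real code (which had not been published at this point): the rule used to select name keys is a simplification of the Key:name wiki page, and the function names are made up for illustration.

```python
def osm_names(tags):
    """Collect comparable names from an OSM tag dict, ignoring old_name."""
    names = set()
    for key, value in tags.items():
        if key == "old_name" or key.startswith("old_name:"):
            continue  # historic names are excluded from the comparison
        # Assumption: treat name, name:xx, *_name and addr:housename as names.
        if (key == "name" or key.startswith("name:")
                or key.endswith("_name") or key == "addr:housename"):
            names.add(value.casefold().strip())
    return names


def is_match(tags, labels):
    """True if any Wikidata label or alias matches a name or the street address."""
    wanted = {label.casefold().strip() for label in labels}
    if osm_names(tags) & wanted:
        return True
    number = tags.get("addr:housenumber")
    street = tags.get("addr:street")
    if number and street:
        # A label such as "8 Canada Square" matches housenumber plus street.
        return f"{number} {street}".casefold() in wanted
    return False
```

For example, is_match({"addr:housenumber": "8", "addr:street": "Canada Square"}, ["8 Canada Square"]) returns True, while a tag dict containing only old_name never matches.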
Re: [OSM-talk] amenity=bicycle_repair_station :::: only 18 so far
Bryce Nesbitt bry...@obviously.com wrote: I'd like to encourage people to map bicycle repair stations. There are only 18 in the database right now. Can we double that this week? http://wiki.openstreetmap.org/wiki/Tag:amenity%3Dbicycle_repair_station The airport at Portland, Oregon (PDX) now provides a bike assembly station. This should probably be tagged as amenity=bicycle_repair_station http://bikeportland.org/2010/06/28/pdx-airport-now-offers-bike-assembly-station-35768 -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] amenity=bicycle_repair_station :::: only 18 so far
Martin Koppenhoefer dieterdre...@gmail.com wrote: in Rome and Berlin (and surely many more places) another typology of places to repair your bike are common: workshops without commercial interest. They typically do have opening hours and you go there with your bicycle to repair it yourself. Typically there will also be volunteers to look after the tools and who might help you if you kindly ask. In some of those you can also assemble a working bike from broken and abandoned/donated ones. They're not the typical shop as using them is free or they ask for a voluntary donation/fee to keep the place running. If you need spare parts you'll normally buy them in an ordinary shop and bring them there. Here's one example, currently tagged as shop, what doesn't hit it 100%: http://www.openstreetmap.org/node/566512945 We have a community bike workshop in Cambridge called Wondergears. http://wondergearsbicycleworkshop.wordpress.com/ http://www.camcycle.org.uk/newsletters/115/article15.html It is just tagged as a building: http://www.openstreetmap.org/way/262729747 -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Andy Mabbett a...@pigsonthewing.org.uk wrote: On 27 August 2014 17:47, Edward Betts edw...@4angle.com wrote: I'd like to annotate these 70k objects in OSM with a Wikidata tag automatically. Can we now move forward with this? I'll generate a fresh list of Wikidata items. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] 112k Wikidata tags to add to OSM
I adjusted my criteria for islands, villages, towns and cities. There are now 102,691 matches and 230 mismatches. http://edwardbetts.com/osm-wikidata/ http://edwardbetts.com/osm-wikidata/mismatches.html -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] 112k Wikidata tags to add to OSM
Morray os...@go4more.de wrote: One thing I still don't get is why you are relying on wikipedia categories instead of on the instance of property in wikidata. In my view the correct way would be to use these and if they are not present at the moment make them become present. Since you already have good guesses about what an entity is an instance of, maybe your guesses could be integrated in a game like: https://tools.wmflabs.org/wikidata-game/# . Afterwards you could rely on these and both communities would have gained. I used the Wikipedia categories because the Wikidata 'instance of' property is often missing. I'm struggling to figure out how to use the Wikidata API to search for items by 'instance of' property. You're right, I should add 'instance of' claims to Wikidata items. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
[OSM-talk] 112k Wikidata tags to add to OSM
I modified my code, adding more categories and extending the matching distance for some objects. I started checking addr:housename, some buildings have this tag but are missing the name tag. http://edwardbetts.com/osm-wikidata/ There are now 112,278 matches found. I thought the extended range would help reduce the number of mismatches, but I now have 2,393 mismatches. http://edwardbetts.com/osm-wikidata/mismatches.html I've got some ideas about how to fix some of the mismatches. Many of the mismatches are villages represented by both a node and a relation, but the relation isn't tagged with place=village, so my code can't tell it represents the same thing. Maybe relations that represent villages should be tagged with place=village. I could modify my code so it rejects nodes inside a way or relation with the same name. Example: Sachsendorf, Germany http://www.openstreetmap.org/node/240130457 http://www.openstreetmap.org/relation/2253462 -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] 112k Wikidata tags to add to OSM
Archer arc...@gulli.com wrote: There is a difference between municipality and settlements like villages. Wikipedia articles are often about the whole municipality and not about a single village. So the wikidata-tag should only be tagged onto the administrative relation for the municipality Please do not add any Wikidata-Tags to German villages/municipality. This would cause a big mess as the mismatch list shows. I'd oppose this import at all at the moment. There are currently 21533 objects with wikidata tags in the OSM database at the moment. Your algorithm produces 2393 mismatches. This is a error rate of 10 % !! I think it's better to have not so much wikidata-tags in our database than about 11000 wrong tags. Don't be alarmed, this is a work in progress. I'm not going to add any wikidata tags until we have reached a consensus. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] 112k Wikidata tags to add to OSM
Pieren pier...@gmail.com wrote: On Fri, Sep 5, 2014 at 11:06 AM, Edward Betts edw...@4angle.com wrote: I started checking addr:housename, some buildings have this tag but are missing the name tag. addr:housename is most of the time improperly used in OSM (should be in the name tag). I understand it is increasing your matching results but it should be reviewed when this tag is corresponding to a wikipedia article. Agreed, here are the numbers for OSM objects where an addr:housename is present, but the name is missing: Apartment buildings: 11 Castles: 2 Commercial buildings: 3 Government buildings: 2 Houses: 9 Museums: 1 Office buildings: 1 Railway stations: 1 Residential buildings: 9 Schools: 1 Shopping malls: 1 Studios: 1 Try searching for 'no name' in this page for examples: http://edwardbetts.com/osm-wikidata/Apartment%20buildings.html -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Archer arc...@gulli.com wrote: Please don’t understand me wrong. I’m a big fan of Wikidata but I'm against an automated import. The mismatches list gives good examples that your matching algorithm doesn't work very well: http://edwardbetts.com/osm-wikidata/mismatches.html Some examples: 1. Isar Nuclear Power Plant http://wikidata.org/wiki/Q569510: your algorithm matches only one reactor of the power plant: Isar 2 http://www.openstreetmap.org/way/32918120 but the right matching would be Kernkraftwerke Isar http://www.openstreetmap.org/way/23802422 Q569510 is matching Isar 2 (Way 32918120) because Isar 2 is in the list of German aliases in the Wikidata object: [ KKW Isar, AKW Isar, Isar 2, Kernkraftwerk Isar I, Isar 1, Atomkraftwerk Isar ] The German label on the Wikidata item is Kernkraftwerke Isar, notice the extra 'e' on the end of the first word. I could add Levenshtein distance calculations to my matching, we could say if there is a single character difference the names match. With this change both OSM objects would match and my code would skip the wikidata item. The problem with this change is that hill and hall would match. 2. Heligoland http://wikidata.org/wiki/Q3038: you’ve matched the island Heligoland http://www.openstreetmap.org/relation/3787052 but the right match would be the municipality Heligoland http://www.openstreetmap.org/relation/1157962 (for the island there exists a different object in Wikidata) I can't find the Wikidata item that represents the island. 3. Puerto Rico http://wikidata.org/wiki/Q1183: the Wikidata objects says „is a unincorporated area of the United states“ – the right match therefore would be the administrative relation: Puerto Rico http://www.openstreetmap.org/relation/306157 but your algorithm matches the island: Island of Puerto Rico http://www.openstreetmap.org/node/357271412 The English Wikipedia article Puerto Rico is in the 'Islands of Puerto Rico' category, so my code considers Q1183 to represent an island. 
Node 357271412 is tagged as place=island, so it is a perfect match. We could argue that the node doesn't have much purpose in OSM, the tags could be merged into Relation 306157. I also don’t understand why you prefer nodes instead of ways or relations. Ways and relations provide more information (e.g. extent of an area) than nodes. The Matching algorithm should first look for relations, when there’s no relation it should search for ways. Nodes should come last. The matching algorithm is only considering objects within 400m, so the nodes happen to be close, but the centre of the relation is more than 400m from the location in Wikidata. I've modified my matching algorithm to use much larger distances for some types of object, it is running now. My hope is that when it is finished the code will detect the presence of the node and relation and skip the Wikidata item. Most of these node vs relation mismatches should disappear. What does your matching algorithm when a Wikidata object describes different objects and therefore should be split? A good example for this is the Wikidata object for Thasos https://www.wikidata.org/wiki/Q204096 (currently it describes the island and the municipality “Thasos”) but the object has to be split into two Wikidata objects so that you can say “the island Thasos lies in the administrative division Thasos”. There are also other examples like mixed up nature reserves, lakes and administrative divisions in Wikidata which you have to solve before you can import the IDs into OSM. My code doesn't do anything special with a wikidata item that represents multiple things like islands and municipalities. If Wikidata/Wikipedia claim a thing is an island, and in OSM there is a thing tagged with place=island and the same name they will match. OSM objects can be tagged as both an island and a municipality. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
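The Levenshtein idea floated in this reply, and the hill/hall problem with it, can be shown with a short sketch. The one-character threshold is the one suggested in the message; the function names are illustrative, and nothing here is taken from the actual matcher.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def names_close(a, b, max_distance=1):
    """Treat two names as matching if they differ by at most max_distance edits."""
    return levenshtein(a.casefold(), b.casefold()) <= max_distance
```

With this rule "Kernkraftwerk Isar" and "Kernkraftwerke Isar" are one edit apart and would match, but so would "hill" and "hall", which is exactly the false positive the message warns about. A length-relative threshold might reduce that risk, though that is speculation rather than anything proposed in the thread.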
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Archer arc...@gulli.com wrote: 2014-08-31 20:19 GMT+02:00 Edward Betts edw...@4angle.com: Archer arc...@gulli.com wrote: Please don’t understand me wrong. I’m a big fan of Wikidata but I'm against an automated import. The mismatches list gives good examples that your matching algorithm doesn't work very well: http://edwardbetts.com/osm-wikidata/mismatches.html Some examples: 1. Isar Nuclear Power Plant http://wikidata.org/wiki/Q569510: your algorithm matches only one reactor of the power plant: Isar 2 http://www.openstreetmap.org/way/32918120 but the right matching would be Kernkraftwerke Isar http://www.openstreetmap.org/way/23802422 Q569510 is matching Isar 2 (Way 32918120) because Isar 2 is in the list of German aliases in the Wikidata object: [ KKW Isar, AKW Isar, Isar 2, Kernkraftwerk Isar I, Isar 1, Atomkraftwerk Isar ] The German label on the Wikidata item is Kernkraftwerke Isar, notice the extra 'e' on the end of the first word. I could add Levenshtein distance calculations to my matching, we could say if there is a single character difference the names match. With this change both OSM objects would match and my code would skip the wikidata item. The problem with this change is that hill and hall would match. Ok, but the Wikidata object describes the whole power plant and not only one reactor. I'd propose to take is a (WD-Property: P31) into account. For example in Wikidata Q569510 is classified as a nuclear power plant (Q134447) the match algorithm should find the matching OSM tags. For example for power plants the right tag would be power=plant. Otherwise there should be no match. Thanks, that's the solution, my matching criteria included the power=generator tag, I'll remove it, the only matches for a power station are power=station and power=plant. I'm not looking at P31 (instance of) because many of the wikidata items don't include it. I depend on the article categories from English Wikipedia. 2.
Heligoland http://wikidata.org/wiki/Q3038: you’ve matched the island Heligoland http://www.openstreetmap.org/relation/3787052 but the right match would be the municipality Heligoland http://www.openstreetmap.org/relation/1157962 (for the island there exists a different object in Wikidata) I can't find the Wikidata item that represents the island. island: https://www.wikidata.org/wiki/Q3129772 municipality: https://www.wikidata.org/wiki/Q3038 archipelago: https://www.wikidata.org/wiki/Q17515918 Thanks. I also don’t understand why you prefer nodes instead of ways or relations. Ways and relations provide more information (e.g. extent of an area) than nodes. The Matching algorithm should first look for relations, when there’s no relation it should search for ways. Nodes should come last. The matching algorithm is only considering objects within 400m, so the nodes happen to be close, but the centre of the relation is more than 400m from the location in Wikidata. I've modified my matching algorithm to use much larger distances for some types of object, it is running now. My hope is that when it is finished the code will detect the presence of the node and relation and skip the Wikidata item. Most of these node vs relation mismatches should disappear. The radius for natural and administrative features should be much bigger. For example if you want to find the island Hispaniola you'll need a radius of 93 km. There are also big glaciers, lakes, etc. What does your matching algorithm when a Wikidata object describes different objects and therefore should be split? A good example for this is the Wikidata object for Thasos https://www.wikidata.org/wiki/Q204096 (currently it describes the island and the municipality “Thasos”) but the object has to be split into two Wikidata objects so that you can say “the island Thasos lies in the administrative division Thasos”.
There are also other examples like mixed up nature reserves, lakes and administrative divisions in Wikidata which you have to solve before you can import the IDs into OSM. My code doesn't do anything special with a wikidata item that represents multiple things like islands and municipalities. If Wikidata/Wikipedia claim a thing is an island, and in OSM there is a thing tagged with place=island and the same name they will match. OSM objects can be tagged as both an island and a municipality. I'd propose to drop Wikidata objects which have the following property combinations: is an island and at the same time administrative division is a nature reserve and administrative division is a lake and administrative division is a forest and administrative division These are the combinations where I've encountered problems in Wikidata yet. Thanks, this is good to know, I'll investigate these combinations. Another problem
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Archer arc...@gulli.com wrote: I've found some more examples for villages in the Czech Republic (I've looked only randomly) if you need some more please let me know. In your mismatch list there seem to be many german municipalities: http://edwardbetts.com/osm-wikidata/mismatches.html The mismatch list includes lots of German villages because somebody has already added wikidata tags to them. Matching of villages might improve significantly when the latest version of the matcher finishes running tomorrow. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Edward Betts edw...@4angle.com wrote: Andrew Guertin andrew.guer...@uvm.edu wrote: 1: Elsewhere in this thread it was mentioned that there are 22000 wikidata ids in OSM currently. Are there any objects which currently have a wikidata id that your code would assign a different id to? Similarly, are there any instances where your code would assign a wikidata id to something and a different object in OSM already has that wikidata id? I haven't checked for either of these conditions. These are both good points and I'll investigate. I downloaded the existing set of wikidata tags and compared it with my list. There are 281 cases where the OSM item tagged with a given wikidata ID is different from the one picked by my code. Here is the list: http://edwardbetts.com/osm-wikidata/mismatches.html -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Edward Betts edw...@4angle.com wrote: Edward Betts edw...@4angle.com wrote: Andrew Guertin andrew.guer...@uvm.edu wrote: 1: Elsewhere in this thread it was mentioned that there are 22000 wikidata ids in OSM currently. Are there any objects which currently have a wikidata id that your code would assign a different id to? Similarly, are there any instances where your code would assign a wikidata id to something and a different object in OSM already has that wikidata id? I haven't checked for either of these conditions. These are both good points and I'll investigate. I downloaded the existing set of wikidata tags and compared it with my list. There are 281 cases where the OSM item tagged with a given wikidata ID is different from the one picked by my code. Here is the list: http://edwardbetts.com/osm-wikidata/mismatches.html I've added OSM objects that link to a different Wikidata item to the end of that page. There are 15 of them. Some were duplicates in Wikidata, the items were merged, but OSM is pointing at the deleted item. In some cases my match is more specific, for example: Tesla Factory (Q7705509) instead of Tesla Motors (Q478214). -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Simon Poole si...@poole.ch wrote: Edward, just so there is no misunderstanding: you are saying of the 21'000 odd wikidata tags 281 gave different results? And if I understand the results correctly the majority of the 281 are simply due to the wikidata tag not being on the place node but on the corresponding admin boundary relation? Plus 15 real errors? I compared my set of 70k suggested matches with the 21k of existing wikidata tags. There are 281 cases where a wikidata item in my list of suggested matches already appears in OSM, but is linked to an object that is different from my suggestion. Often that is because an entity appears in OSM as a node and a relation, my code is matching the node. The second list, contains 15 OSM objects that have a wikidata item that is different from my suggestion. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
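The two lists described in this reply could be produced by a comparison along these lines. This is a sketch under assumptions: both inputs are taken to be plain dicts mapping an OSM object id to a Wikidata id, and the function name and id format ("node/123") are invented for the example.

```python
def find_conflicts(existing, suggested):
    """Compare existing wikidata tags with suggested matches.

    Both arguments map an OSM object id such as "node/123" to a Wikidata id.
    Returns (different_object, different_item):
      different_object: the item is already in OSM, but on another object
      different_item:   the object already carries a different Wikidata id
    """
    existing_by_qid = {}
    for osm, qid in existing.items():
        existing_by_qid.setdefault(qid, set()).add(osm)

    different_object = []
    different_item = []
    for osm, qid in sorted(suggested.items()):
        tagged = existing_by_qid.get(qid)
        if tagged and osm not in tagged:
            different_object.append((qid, osm, sorted(tagged)))
        current = existing.get(osm)
        if current and current != qid:
            different_item.append((osm, current, qid))
    return different_object, different_item
```

The first list corresponds to the 281 cases (often a node versus a relation for the same place), the second to the 15 objects whose existing tag points at a different item.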
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
SomeoneElse li...@mail.atownsend.org.uk wrote: * Personally, I'm actually fairly agnostic about the process of adding wikidata tags - I can't really see what I'd use them for myself, but am open to the possibility that someone could use them for something. However, an important part of things in OSM is surely that they are on-the-ground verifiable - wikipedia has articles for villages in the UK that don't exist, as do the OS OpenData StreetView maps, and people have added garbage data from both to OSM. How do we know that the wikidata items for which links are added are accurate? I'm going to add wikidata tags to existing OSM objects. There is no risk of adding villages that don't exist, because I'm not going to add any new entities to OSM. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Simon Poole si...@poole.ch wrote: On 28 August 2014 09:09, Simon Poole si...@poole.ch wrote: What you do avoid by not tagging in OSM is maintenance (given that OSM objects are not necessarily a persistent reference to a single real world entity). Very few Wikidata IDs will change (far fewer than Wikipedia article names, for instance; and far fewer than IDs or other tags in OSM). Again, this is a statistically-insignificant edge-case. I wasn't expecting wikidata IDs to change at all. OSM objects will get reused, copied, split, moved, deleted etc. leading to missing or wrong wikidata tags. Naturally these could be detected by re-running Edwards code, but that kind of proves my point. I don't think 'humans will make mistakes in future' should be used as an argument against an import of machine generated data. If it becomes a big problem we could modify the editors to warn when multiple items in a changeset have the same wikidata ID. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
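The editor warning suggested at the end of this message amounts to a duplicate check over the objects in a changeset. A minimal sketch, assuming the changeset is available as a list of (object id, tags) pairs; no real editor API is being described here.

```python
from collections import defaultdict


def duplicate_wikidata_ids(changeset_objects):
    """Return {wikidata id: [object ids]} for ids used on more than one object.

    changeset_objects is an iterable of (osm_id, tags) pairs, a hypothetical
    representation of the objects touched by one changeset.
    """
    seen = defaultdict(list)
    for osm_id, tags in changeset_objects:
        qid = tags.get("wikidata")
        if qid:
            seen[qid].append(osm_id)
    return {qid: ids for qid, ids in seen.items() if len(ids) > 1}
```

An editor could show a warning for each key in the returned dict, which would catch the common case of a way being split or copied with its wikidata tag intact.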
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Rob Nickerson rob.j.nicker...@gmail.com wrote: In my mind this is a good move and should be supported. Point 3 above could be resolved by running the script regularly to see if there are any new matches. There have also been some good suggestions on this list such as a KeepRight style (i.e. QA) map where problematic objects (e.g. script finds more than one match) can be manually reviewed, confirming whether the script conflicts with any existing wikidata tags in OSM, checking whether the script would add a wikidata tag to an object when there is already a different object in OSM with that wikidata tag, and a check on the 400m distance rule [2]. Are these things possible Edward? I can certainly make a list of cases where there are multiple OSM objects matching a single Wikidata item. Building the interface for viewing and fixing them might be more tricky, it is probably not something I can build right now. I've written the code to look for conflicts. You can see the results here: http://edwardbetts.com/osm-wikidata/mismatches.html I've been thinking about my choice of 400m for the matching. I'm going to change the matching criteria to include a distance field, then set a low distance (200m) for things like restaurants, and a much higher distance (50km) for bigger areas, like national parks. Thanks for the summary Rob, it is really helpful. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
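The per-type distance limit proposed in this reply might be sketched as below. The 200m and 50km figures come from the message and 400m is the existing default; the type names, the table layout and the use of the haversine formula are assumptions made for the example.

```python
from math import asin, cos, radians, sin, sqrt

# Maximum matching distance in metres per feature type (type names are illustrative).
MAX_DISTANCE_M = {"restaurant": 200, "national_park": 50_000}
DEFAULT_MAX_DISTANCE_M = 400


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    r = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def within_range(kind, wikidata_coords, osm_coords):
    """True if the two (lat, lon) points are close enough for this feature type."""
    limit = MAX_DISTANCE_M.get(kind, DEFAULT_MAX_DISTANCE_M)
    return haversine_m(*wikidata_coords, *osm_coords) <= limit
```

Two points 0.01 degrees of latitude apart are roughly 1.1km from each other, so they would be rejected for a restaurant but accepted for a national park.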
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Frederik Ramm frede...@remote.org wrote: What we have here is a third-party database whose object identifiers we add to OSM as tags in order to make linking things easier. This is something that has often been requested by people but never been granted on a large scale because we always said that it would be an abuse of our database and our mapper's patience to offload everyone's and their dog's linking requirements onto us. What about all the references listed in this table? http://wiki.openstreetmap.org/wiki/Key:ref#Key_variations There are 79k instances of the ref:INSEE key in OpenStreetMap. http://taginfo.openstreetmap.org/keys/ref%3AINSEE -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Andrew Guertin andrew.guer...@uvm.edu wrote: On 08/27/2014 12:47 PM, Edward Betts wrote: I'd like to annotate these 70k objects in OSM with a Wikidata tag automatically. I like the sound of this. Personally, I think it adds value, and having looked at the code your matching criteria sound good. Thanks! There are a couple of things it would make me happy to see before you go through with this: 1: Elsewhere in this thread it was mentioned that there are 22000 wikidata ids in OSM currently. Are there any objects which currently have a wikidata id that your code would assign a different id to? Similarly, are there any instances where your code would assign a wikidata id to something and a different object in OSM already has that wikidata id? I haven't checked for either of these conditions. These are both good points and I'll investigate. 2: You mention elsewhere in this thread that the maximum distance difference between the wikidata location and the osm object is 400 meters. How was this number arrived at? Could you make a list of matches including and sorted by the distance difference for people to look at? I think it's worth it for interested people to be able to independently verify at what distance the accuracy declines and what a good cutoff is. I don't have any basis for picking 400 meters, I just needed a number and that one seemed reasonable. It might be good to also include in that list what type of feature something is. If you're comparing using centroids, more leniency might be in order for, e.g., a large lake than a small building. This is also a good point. I tried using the overpass 'is_in' command, but I don't understand overpass areas. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
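The two conflict checks asked for in point 1 above could be sketched as follows. This is not the code from the osm-wikidata repository; the function name, the "type/id" reference strings and the sample IDs are invented for illustration.

```python
def find_conflicts(proposed, existing):
    """Compare proposed tags with tags already in OSM.

    Both arguments map an OSM object reference (e.g. "way/1") to a Wikidata ID.
    Returns two lists:
      different_id: (ref, existing_id, proposed_id) where the object already
                    carries a different Wikidata ID
      id_elsewhere: (ref, proposed_id, other_refs) where the proposed ID is
                    already on some other object
    """
    existing_by_qid = {}
    for ref, qid in existing.items():
        existing_by_qid.setdefault(qid, set()).add(ref)

    different_id = []
    id_elsewhere = []
    for ref, qid in proposed.items():
        current = existing.get(ref)
        if current is not None and current != qid:
            different_id.append((ref, current, qid))
        others = existing_by_qid.get(qid, set()) - {ref}
        if others:
            id_elsewhere.append((ref, qid, sorted(others)))
    return different_id, id_elsewhere


# Made-up sample data
proposed = {"way/1": "Q100", "way/2": "Q200", "node/3": "Q300"}
existing = {"way/1": "Q999", "node/9": "Q200", "node/3": "Q300"}

different_id, id_elsewhere = find_conflicts(proposed, existing)
print(different_id)
print(id_elsewhere)
```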
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Janko Mihelić jan...@gmail.com wrote: There's one fundamental question about wikidata tags; how do you tag multiple objects that have the same wikidata tag? For example, a wikidata entry about a church and a connected monastery. When I was writing the Wikidata proposal on our wiki, I've put a rule that only one object in OSM can have the same wikidata=* tag. So when there are more ways in OSM that represent one element in Wikidata, we should put them in a relation and put the wikidata tag in the relation. Since then I changed my opinion a bit, but I'm not sure if we should just put wikidata=* on all ways, or if we should invent a new tag, wikidata:part=* and put that tag on all the objects. For now I've sidestepped this problem. If you look at an institution like a hospital, university or school you'll often find multiple buildings, some of which might have a name and be tagged amenity=hospital/university/school. If my code spots two or more nearby items with the correct tags and matching names it skips them, so I don't have to deal with multiple OSM items having the same wikidata tag. I can also detect if there are two nearby items with the same name but different tagging. I found an article on Wikipedia that was in the categories for bridge and monument. In OSM we have a 'way' to represent the bridge and a 'node' in the middle of the bridge for the monument. I skip these as well; there are just over 200 of them. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
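The skip rule described above (accept a match only when exactly one nearby item shares the name) can be illustrated with a small sketch. The function name and data layout are assumptions, and the exact case-insensitive comparison stands in for the fuzzy name matching used by the real code.

```python
def unique_match(label, nearby_items, wanted_tag):
    """Return the single nearby item to tag, or None if the case is ambiguous.

    label:        name of the Wikidata item
    nearby_items: list of dicts with a "tags" dict (hypothetical layout)
    wanted_tag:   (key, value) pair the OSM item must carry
    """
    key, value = wanted_tag
    same_name = [
        item for item in nearby_items
        if item["tags"].get("name", "").casefold() == label.casefold()
    ]
    # Two or more items with the same name, whatever their tags: skip
    if len(same_name) != 1:
        return None
    item = same_name[0]
    return item if item["tags"].get(key) == value else None


# Made-up example: a hospital mapped as a site plus a named building
hospital_site = {"id": 1, "tags": {"amenity": "hospital", "name": "Example Hospital"}}
hospital_building = {"id": 2, "tags": {"building": "yes", "name": "Example Hospital"}}

print(unique_match("Example Hospital", [hospital_site], ("amenity", "hospital")))
print(unique_match("Example Hospital", [hospital_site, hospital_building], ("amenity", "hospital")))
```

Skipping ambiguous cases trades coverage for safety: some valid matches are lost, but no Wikidata ID ends up on two OSM objects.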
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Janko Mihelić jan...@gmail.com wrote: Bridges are bit of a grey area, is a highway with bridge=yes really a bridge, or is it a highway which has a property of being on a bridge? I think we should map these notable bridges as an area with man_made=bridge and put the tag on that. The very first example of a bridge on your list is already problematic: http://www.openstreetmap.org/way/5620489 That way represents both the street and the bridge. I don't think there is any problem with adding a tag for the matching item on wikidata. http://wikidata.org/wiki/Q4547392 -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Janko Mihelić jan...@gmail.com wrote: Here's another example: http://www.openstreetmap.org/way/34012792 This railroad track will get the wikidata tag, the other track and footway won't. And even the track that gets the tag, isn't the whole length of the bridge. And I didn't even look that hard. I found problems on 2 out of 6 bridges I clicked. I'm not against your import, I think your work on this is great, but the bridge part is just not that simple. Thanks, we might need to skip bridges and tunnels. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
[OSM-talk] Adding Wikidata tags to 70k items automatically
I've written some code to match items in Wikidata with items in OSM. Currently I have found 70,849 unique matches, where there is a one-to-one mapping between OSM and Wikidata objects. I'd like to annotate these 70k objects in OSM with a Wikidata tag automatically. For example: Way: Piper's Orchard (43246411) http://www.openstreetmap.org/way/43246411 And on Wikidata: https://www.wikidata.org/wiki/Q7197307 I would like to add wikidata=Q7197307 to Piper's Orchard. The code to find the matches is here: https://github.com/edwardbetts/osm-wikidata Matching criteria: https://github.com/EdwardBetts/osm-wikidata/blob/master/entity_types.json The results are here: http://edwardbetts.com/osm-wikidata/ The best approach is probably to update 100 items with wikidata tags, then we can check them to make sure the edit looks good. If everything is fine I can go ahead and load the other 70k. Does anybody have a strong preference that the edits are split up by region, or loaded in batches? Any objections? I've read https://wiki.openstreetmap.org/wiki/Mechanical_Edit_Policy - if there are no major objections I'll go ahead and create https://wiki.openstreetmap.org/wiki/Mechanical_Edits/edward See also: http://wiki.openstreetmap.org/wiki/Proposed_features/Wikidata http://wiki.openstreetmap.org/wiki/Wikidata http://wiki.openstreetmap.org/wiki/Key:wikidata Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Matthijs Melissen i...@matthijsmelissen.nl wrote: On 27 August 2014 17:47, Edward Betts edw...@4angle.com wrote: I've written some code to match items in Wikidata with items in OSM. Currently I have found 70,849 unique matches, where there is a one-to-one mapping between OSM and Wikidata objects. To clarify (this wasn't clear to me from your e-mail): you are not purely matching name tags, but you also require matching objects to be in close geographic proximity (using the location tag from Wikidata). Doing that, I think the chance of wrong matching is negligible. I also looked through the list myself, and couldn't find any wrong match. Sorry, I should've been clearer. My search starts with articles in English Wikipedia Categories, I look for the matching Wikidata items, then search OSM for items within 400m that have matching tags. For example 'Category:Castles by country' and 'historic=castle'. I compare the names with some fuzzy matching, if the names match then the two items are the same. The credit for this idea goes to Andy Mabbett (pigsonthewing), I just wrote the code. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
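The matching steps described above (matching tag, within 400m, fuzzy name match) might be sketched as follows. This is an illustration using only the Python standard library, not the code from the osm-wikidata repository: the function names, the data layout, the use of difflib and the 0.85 similarity threshold are all assumptions.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def names_match(a, b, threshold=0.85):
    """Fuzzy, case-insensitive name comparison (threshold is an assumed value)."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio() >= threshold


def is_match(wikidata_item, osm_item, required_tag, max_distance_m=400):
    """Apply the three checks: tag, distance, then name."""
    key, value = required_tag
    if osm_item["tags"].get(key) != value:
        return False
    distance = haversine_m(wikidata_item["lat"], wikidata_item["lon"],
                           osm_item["lat"], osm_item["lon"])
    if distance > max_distance_m:
        return False
    return names_match(wikidata_item["label"], osm_item["tags"].get("name", ""))


# Made-up example for 'Category:Castles by country' and historic=castle
castle_wd = {"label": "Example Castle", "lat": 51.5, "lon": -0.1}
castle_osm = {"lat": 51.501, "lon": -0.1,
              "tags": {"historic": "castle", "name": "Example castle"}}

print(is_match(castle_wd, castle_osm, ("historic", "castle")))
```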
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
SomeoneElse li...@mail.atownsend.org.uk wrote: On 27/08/2014 17:47, Edward Betts wrote: Matching criteria: https://github.com/EdwardBetts/osm-wikidata/blob/master/entity_types.json Presumably there's some geographical matching criteria too (so each Black Hill in the hills list is matched to the correct one)? If so, is there a licence issue where wikidata has imported from wikipedia, and wikipedia has obtained position information from who-knows-where (probably a source not compatibly licensed with OSM)? In at least one example(1) there's no wikipedia reference on the OSM node. Yes, I'm matching based on the coordinates in Wikidata and OSM, the coordinates need to be within 400m of each other. I don't think there is a license problem. I'm not planning to copy location information from Wikidata to OSM, just establishing the mapping between OSM and Wikidata. I think a good comparison is the various ref:*=* tags. See http://wiki.openstreetmap.org/wiki/Key:ref#Key_variations We use keys like ref:mhs and ref:sandre to establish a link to other datasets. There are no rules about how these datasets were generated. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Svavar Kjarrval sva...@kjarrval.is wrote: If you do this, please split by region. For those of us who monitor specific areas for new changesets, it would be better if we didn't see a whole lot of entries where only one or two of the items in each entry are actually related to the area we are monitoring. Thanks, this is good to know. I'll split by region. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
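Splitting the edits by region, as agreed above, could be sketched like this. It assumes each match already carries a "region" value (for example a country code) and that a changeset is capped at some size; both the field name and the cap are hypothetical.

```python
from collections import defaultdict


def changesets_by_region(matches, max_per_changeset=500):
    """Yield (region, batch) pairs so that no changeset mixes regions."""
    by_region = defaultdict(list)
    for match in matches:
        by_region[match["region"]].append(match)
    for region in sorted(by_region):
        items = by_region[region]
        # Cut each region's list into chunks of at most max_per_changeset
        for start in range(0, len(items), max_per_changeset):
            yield region, items[start:start + max_per_changeset]


# Made-up matches; the OSM references and Wikidata IDs are placeholders
matches = [
    {"osm": "way/1", "wikidata": "Q1", "region": "IS"},
    {"osm": "way/2", "wikidata": "Q2", "region": "GB"},
    {"osm": "way/3", "wikidata": "Q3", "region": "IS"},
]

for region, batch in changesets_by_region(matches, max_per_changeset=1):
    print(region, [m["osm"] for m in batch])
```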
Re: [OSM-talk] Adding Wikidata tags to 70k items automatically
Paul Norman penor...@mac.com wrote: On 8/27/2014 9:47 AM, Edward Betts wrote: Does anybody have a strong preference that the edits are split up by region, or loaded in batches? Any objections? When the idea of a mechanical edit to add wikidata tags to objects in GB came up, the local view was against it. How will you make sure your mechanical edit doesn't edit objects there? Hi Paul, thanks for pointing out the previous discussion. I've just read it. https://lists.openstreetmap.org/pipermail/talk-gb/2014-June/016096.html It looks like the conclusion of the thread was it would need somebody to write the code. I see no need to exclude GB. -- Edward. ___ talk mailing list talk@openstreetmap.org https://lists.openstreetmap.org/listinfo/talk
[Talk-gb-midanglia] Bus stops at Cambridge railway station moved
The bus stops at Cambridge railway station have moved. They are now along Station Place, rather than clustered at the station. The map needs to be updated. http://www.openstreetmap.org/?lat=52.19306&lon=0.13647&zoom=17&layers=T Also the bus route relations need updating. -- Edward. ___ Talk-gb-midanglia mailing list Talk-gb-midanglia@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb-midanglia
[Talk-GB] Largest buildings in Great Britain
Largest 1000 buildings in Great Britain sorted by size and node count: http://edwardbetts.com/osm/buildings_by_size.html http://edwardbetts.com/osm/buildings_by_node_count.html -- Edward. ___ Talk-GB mailing list Talk-GB@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk-gb
[OSM-talk] Street guide built from OSM data, looking for feedback
I wrote some code (in Python) to generate a very simple street guide using data from OpenStreetMap. Code here: http://github.com/EdwardBetts/streetguide/ Sample output here: http://edwardbetts.com/streetguide/ I just have five samples, all in London: Upper Street, Oxford Street, Regent Street, Covent Garden and Brick Lane. It works in Firefox and Google Chrome, but not yet in Internet Explorer. The output is very rough. You should see a map of the area on the left and a list of Points Of Interest on the right. The list of POIs includes nodes and ways; if you click on the name of a node it'll show you the location on the map. I show extra tags on the node or way. Nodes link to their page in the openstreetmap browse interface, I will add the same for ways. Does this look useful? Has anybody else done something similar? I plan to filter out more of the boring nodes and ways. I'm mostly interested in shops, restaurants, amenities and stations. Things I should add: - Use Nominatim to search for street names and let people generate new guides. - Find nearest station for subway entrances. - Don't include house number if street name is not included. - Improve the HTML and CSS. If you want to see some more samples, send me the bounding box you want, no need to send it to the mailing list. Like this: -0.10849,51.53161,-0.10047,51.54661 I'm interested in ideas about how to better organize the data, or for any other suggestions. -- Edward. ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
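For anyone sending a bounding box in the form shown above, a small sketch of how such a string could be parsed and checked. It assumes the order is minimum longitude, minimum latitude, maximum longitude, maximum latitude, which is consistent with the example values; the class and function names are made up and are not taken from the streetguide code.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lat, lon):
        """True if the point lies inside the box (edges included)."""
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def parse_bbox(text):
    """Parse 'min_lon,min_lat,max_lon,max_lat' into a BBox (assumed order)."""
    parts = [float(part) for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated numbers")
    box = BBox(*parts)
    if not (box.min_lon < box.max_lon and box.min_lat < box.max_lat):
        raise ValueError("minimum values must be smaller than maximum values")
    return box


box = parse_bbox("-0.10849,51.53161,-0.10047,51.54661")
print(box)
print(box.contains(51.54, -0.105))
```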
[OSM-talk] Possible source: Books scanned by the Internet Archive
Title: Survey of properties owned by the city of Boston, part 6: Jamaica Plain - parker hill (1970) Author: Boston Redevelopment Authority http://www.archive.org/details/surveyofproperti0607bost To read the book have a look at http://www.archive.org/stream/surveyofproperti0607bost This is the area covered by the book. http://www.openstreetmap.org/?lat=42.33074&lon=-71.10506&zoom=16&layers=B00FTF Just thought it might be useful. -- Edward. ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] OSMXAPI
On Wed, Jul 23, 2008 at 5:54 PM, 80n [EMAIL PROTECTED] wrote: If you find that it is rejecting a request that you think is reasonable then please let me know. I want to find the point where all reasonable requests are still accepted but silly or badly formed ones are rejected. Every few days I download the list of UK stations: http://www.informationfreeway.org/api/0.5/node[railway=station][bbox=-6,50,2,61] I compare that with my previous download to see if any stations have been deleted. Should I be using the planet file for this? -- Edward. ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/talk
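The comparison described above (diffing two downloads to spot deleted stations) could be sketched as below. It assumes the downloads are standard OSM XML with node elements and tag children; the function names and the sample data, including the node IDs and station names, are made up.

```python
import xml.etree.ElementTree as ET


def stations(osm_xml):
    """Map node ID -> station name for every railway=station node in an OSM XML string."""
    result = {}
    for node in ET.fromstring(osm_xml).iter("node"):
        tags = {tag.get("k"): tag.get("v") for tag in node.iter("tag")}
        if tags.get("railway") == "station":
            result[node.get("id")] = tags.get("name", "")
    return result


def deleted_stations(previous_xml, current_xml):
    """Stations present in the previous download but missing from the current one."""
    previous = stations(previous_xml)
    current = stations(current_xml)
    return {node_id: name for node_id, name in previous.items() if node_id not in current}


# Made-up downloads
previous_download = """<osm version="0.5">
  <node id="101" lat="52.19" lon="0.13"><tag k="railway" v="station"/><tag k="name" v="Station A"/></node>
  <node id="102" lat="51.53" lon="-0.12"><tag k="railway" v="station"/><tag k="name" v="Station B"/></node>
</osm>"""

current_download = """<osm version="0.5">
  <node id="101" lat="52.19" lon="0.13"><tag k="railway" v="station"/><tag k="name" v="Station A"/></node>
</osm>"""

print(deleted_stations(previous_download, current_download))
```

Comparing by node ID reports a station as deleted if it was replaced by a new node with the same name, so matching on names as well may be worth considering.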
Re: [OSM-talk] UK post box data
On Sat, Jul 5, 2008 at 2:36 PM, Tom Taylor [EMAIL PROTECTED] wrote: I recently made a Freedom of Information Act request for the location of every UK post box. Royal Mail responded with a 1600 page PDF containing their info. I parsed the PDF and re-sorted it; the result is in tab-separated format: http://edwardbetts.com/postboxes/postboxes.tsv 116089 postboxes. There are three invalid records in the PDF: CA7, CA54, SMARTHILL L345 541, ST JAMES ROAD NR6, HA0, 61 SUDBURY AVENUE -- Edward. ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/talk