Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-18 Thread Tobias Wrede

On 17.04.2019 at 22:03, Mateusz Konieczny wrote:

Among other popular wikipedia links

"wikipedia='de:Stolpersteine'",
"wikipedia='nl:Toeristisch Overstappunt'",

are also clearly invalid, though here brand:wikipedia would
be wrong and complete removal is probably necessary.


Martin already commented on de:Stolpersteine.

The adding of nl:Toeristisch Overstappunt has been questioned on the 
tagging list (starting with messages 
 and 
) but with no 
apparent outcome. I agree with you, in my opinion they are still dead wrong.


Tobias

___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-18 Thread Mateusz Konieczny



Apr 18, 2019, 12:02 AM by a...@pigsonthewing.org.uk:

> On Wed, 17 Apr 2019 at 22:38, Mateusz Konieczny <matkoni...@tutanota.com> wrote:
>
>> Apr 17, 2019, 11:19 PM by a...@pigsonthewing.org.uk:
>>
>> On Wed, 17 Apr 2019 at 21:03, Mateusz Konieczny <matkoni...@tutanota.com> wrote:
>>
>> [list of examples]
>>
>> It would seem reasonable to have a bot routinely convert those to
>> brand:wikipedia tags, with (say) a white-list for HQ objects.
>>
>> Also HQ should not be linked to Wikipedia article describing company.
>>
>
> That depends, surely, on whether the HQ is represented by tags on the
> building-object, or on a node within a building, which may be a
> building with shared occupancy.
>
Yes, a building element can be linked to an entry describing the building
(not the company).

An office=* element separate from the building may be linked to an entry
describing that specific office (unlikely to have a Wikipedia article,
though one could probably create a Wikidata entry).

> And if one is adding, say, the company's URL to an object, it is
> logical to add the Wikidata ID to the same object.
>
AFAIK we use the website tag for URLs about the given element
and the url tag for any loosely related link.

On an office=* element, I would consider a website tag linking to the
company website to be incorrect.

>> We could also suggest that tools (JOSM, iD, etc.) issue a warning when
>> such values are added, either based on matching items in a list, or
>> fetching the item's "instance of" value from Wikidata.
>>
>> I am not sure whether using "instance of" data from Wikidata is limited by
>> copyright issues.
>>
>
> Wikidata data is PD/CC0, so that is not an issue. Even if it were,
> referring to something is not the same as copying it.
>
Wikidata is CC0 under US law. For example, it includes location data from
Google Maps (via the import of location data from Wikipedia).

In general, at the moment Wikidata ignores sui generis database rights.

That does not prevent claiming that the database is CC0 under US law,
but the claim does not hold under, for example, EU law.

And editing OSM based on Wikidata data is not the same as referring to it
(though editing OpenStreetMap only to change wikidata/wikipedia links based
on Wikidata/Wikipedia is likely fine. After all, one needs to know
what is linked; anything else would make creating any pointers
to external databases illegal).



Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-18 Thread Mateusz Konieczny



Apr 18, 2019, 5:40 AM by yuriastrak...@gmail.com:

> Andy, you could also use one of the example queries in sophox.org to find
> all objects that are usually tagged with brand:wikidata=* (e.g. have more
> than 10 instances of that), and find all other objects that use that same
> value for the wikidata tag. The query currently finds over 15,000 such
> objects, so it runs for about a minute, but limiting it to 1000 results
> returns much faster -- http://tinyurl.com/y5n42qlx
>
Note that this query mostly finds objects with operator:wikidata (and
operator:wikipedia) tags, which is wrong only for franchises.

For example 
https://www.openstreetmap.org/node/2592353066 

https://www.openstreetmap.org/way/191158292 

https://www.openstreetmap.org/way/241947392 

https://www.openstreetmap.org/node/3300180823 

https://www.openstreetmap.org/node/2356411906

> A random look at some of those features shows that many have both wikidata 
> and brand:wikidata set to the same value (which is surprising).
>
Is this something that may be correct tagging in some situations? Maybe
adding it to a JOSM/iD validator would be a good idea.
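A minimal sketch of what such a validator rule could check, written in Python rather than in JOSM's or iD's own validator formats. The function name is hypothetical and an object's tags are assumed to be available as a plain dict:

```python
def same_wikidata_and_brand(tags):
    """Return True when wikidata and brand:wikidata are both set to the same value.

    `tags` is assumed to be a dict of OSM tags for one object.
    """
    wikidata = tags.get("wikidata")
    # Only flag objects where the plain wikidata tag exists and duplicates the brand item.
    return wikidata is not None and wikidata == tags.get("brand:wikidata")


# Usage: a bank branch tagged with the same item twice gets flagged.
branch = {"amenity": "bank", "wikidata": "Q1340361", "brand:wikidata": "Q1340361"}
print(same_wikidata_and_brand(branch))
```

Whether a flagged object is actually wrong would still need a human decision; the check only finds candidates.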

It happens 4801 times - http://tinyurl.com/yxg3x3fc
(sorry for using a suspicious link shortener; the full link is quite long
and I see no option to use a better link shortener, as there is with
Overpass Turbo).

https://sophox.org/#%23defaultView%3AMap%0ASELECT%20%3FosmId%20%3Flocation%20%3Fbwd%20%3FbwdLabel%20%3FbwdDescription%20WHERE%20%7B%0A%0A%20%20%23%20Subquery%20finds%20brand%3Awikidata%20IDs%20used%20more%20than%2010%20times%0A%20%20%7B%0A%20%20%20%20SELECT%20%3Fbwd%20%28count%28%2a%29%20as%20%3Fcount%29%20WHERE%20%7B%0A%20%20%20%20%20%20%3Fo%20osmt%3Abrand%3Awikidata%20%3Fbwd%20.%0A%20%20%20%20%7D%0A%20%20%20%20group%20by%20%3Fbwd%0A%20%20%20%20having%20%28%3Fcount%20%3E%2010%29%0A%20%20%7D%0A%0A%20%20%23%20Find%20OSM%20objects%20where%20wikidata%20or%20operator%3Awikidata%20tag%0A%20%20%23%20is%20one%20of%20the%20common%20brand%3Awikidata%20IDs%0A%20%20VALUES%20%3Ftag%20%7B%20osmt%3Awikidata%20%7D%0A%0A%20%20%3FosmId%20%3Ftag%20%3Fbwd%20%3B%0A%20%20%20%20%20%20%20%20%20osmm%3Aloc%20%3Flocation%20.%0A%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%2Cfr%2Cru%2Ces%2Cde%2Czh%2Cja%22.%20%7D%0A%7D
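The query text is stored percent-encoded in the fragment of the sophox.org link, so it can be read without opening the link. A small Python sketch (the function name is hypothetical; it assumes the whole query sits after the "#", as in the link above):

```python
from urllib.parse import unquote, urlsplit


def sophox_query_from_link(link):
    """Return the SPARQL text stored in the fragment of a sophox.org link."""
    fragment = urlsplit(link).fragment  # everything after the first '#'
    return unquote(fragment)            # turn %20, %3F, %0A ... back into characters


# Usage with just the beginning of the link above:
print(sophox_query_from_link("https://sophox.org/#%23defaultView%3AMap%0ASELECT%20%3FosmId"))
```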


Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Yuri Astrakhan
Andy, you could also use one of the example queries in sophox.org to find
all objects that are usually tagged with brand:wikidata=* (e.g. have more
than 10 instances of that), and find all other objects that use that same
value for the wikidata tag. The query currently finds over 15,000 such
objects, so it runs for about a minute, but limiting it to 1000 results
returns much faster -- http://tinyurl.com/y5n42qlx

A random look at some of those features shows that many have both wikidata
and brand:wikidata set to the same value (which is surprising).

One relatively painless way of fixing those could be to use the Sophox editor
-- essentially a query that will find all cases you want to fix, and allow
you to manually review each one of them (if you are familiar with the issue
and the area), and fix them. See
https://wiki.openstreetmap.org/wiki/Sophox_Editor

On Wed, Apr 17, 2019 at 8:56 AM Andy Mabbett wrote:

> There are currently 956 objects in OSM with the tag "wikidata=Q1340361":
>
>    https://taginfo.openstreetmap.org/tags/wikidata=Q1340361
>
> where:
>
>    https://www.wikidata.org/wiki/Q1340361
>
> is the item for the State Bank of India.
>
> The tag should almost certainly be:
>
>    operator:wikidata=Q1340361
>
> or, less likely,
>
>    brand:wikidata=Q1340361
>    franchise:wikidata=Q1340361
>
> with the only exception perhaps being the bank's HQ.
>
> Can anyone confirm what the correct tag should be, and can we use an
> automated process to correct them?
>
> It's possible that the same issue applies to some of the other
> high-use tags listed at:
>
>    https://taginfo.openstreetmap.org/keys/wikidata#values
>
> I've just raised a ticket to ask that Taginfo display Wikidata labels
> on the latter page, which will make error fixing easier:
>
>    https://github.com/taginfo/taginfo/issues/262
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk
>


Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Martin Koppenhoefer


sent from a phone

> On 17. Apr 2019, at 22:03, Mateusz Konieczny wrote:
> 
> Among other popular wikipedia links
> 
> "wikipedia='de:Stolpersteine'",
> 
> are also clearly invalid


This one is not clearly invalid: you could take the stance that the article
is about the artwork as a whole, and since the artwork is distributed, every
part would get the tag (or do you propose a relation so that one object
represents the whole?). I agree it is clearly redundant, though.

Cheers, Martin 


Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Andy Mabbett
On Wed, 17 Apr 2019 at 22:38, Mateusz Konieczny wrote:

> Apr 17, 2019, 11:19 PM by a...@pigsonthewing.org.uk:
>
> On Wed, 17 Apr 2019 at 21:03, Mateusz Konieczny wrote:

> [list of examples]

> It would seem reasonable to have a bot routinely convert those to
> brand:wikipedia tags, with (say) a white-list for HQ objects.
>
> Also HQ should not be linked to Wikipedia article describing company.

That depends, surely, on whether the HQ is represented by tags on the
building-object, or on a node within a building, which may be a
building with shared occupancy.

And if one is adding, say, the company's URL to an object, it is
logical to add the Wikidata ID to the same object.

> We could also suggest that tools (JOSM, iD, etc.) issue a warning when
> such values are added, either based on matching items in a list, or
> fetching the item's "instance of" value from Wikidata.
>
> I am not sure whether using "instance of" data from Wikidata is limited by
> copyright issues.

Wikidata data is PD/CC0, so that is not an issue. Even if it were,
referring to something is not the same as copying it.

> Blacklist of very popular values may be a good idea,
> though personally I think that automatic edit fixing this would be preferable

The two are not mutually exclusive.


--
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Mateusz Konieczny



Apr 17, 2019, 11:19 PM by a...@pigsonthewing.org.uk:

> On Wed, 17 Apr 2019 at 21:03, Mateusz Konieczny <matkoni...@tutanota.com> wrote:
>
>>
>> [list of examples]
>>
>
> It would seem reasonable to have a bot routinely convert those to
> brand:wikipedia tags, with (say) a white-list for HQ objects.
>
Also, an HQ should not be linked to the Wikipedia article describing the company.

An HQ object may be linked using wikipedia (and optionally also wikidata)
only if there is an article specifically about the HQ building.

> We could also suggest that tools (JOSM, iD, etc.) issue a warning when
> such values are added, either based on matching items in a list, or
> fetching the item's "instance of" value from Wikidata.
>
I am not sure whether using "instance of" data from Wikidata is limited by
copyright issues. A blacklist of very popular values may be a good idea,
though personally I think that an automatic edit fixing this would be preferable
(it is on my TODO list, but I expect that I will not start working on it before
2021).

>> "wikipedia='de:Stolpersteine'",
>>
>> "wikipedia='nl:Toeristisch Overstappunt'",
>>
>
> Those should perhaps be wikipedia:type= ?
>
wikipedia:type=en:Tree on natural=tree would make just as much sense.

>> and complete removal is probably necessary.
>>
Note that wikipedia:type would invite wikipedia:type=en:Tree on natural=tree
or wikipedia:type=en:Road on every single road.

> And programmatically addressing the commonest cases
> (such as those discussed above) will reduce the number considerably.
>
Feel free to go through the discussion that is necessary before the edit:
https://wiki.openstreetmap.org/wiki/Automated_Edits_code_of_conduct

(the edit itself can be done within minutes using JOSM).



Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Andy Mabbett
On Wed, 17 Apr 2019 at 21:03, Mateusz Konieczny wrote:
>
> Apr 17, 2019, 5:53 PM by a...@pigsonthewing.org.uk:
>
> Can anyone confirm what the correct tag should be, and can we use an
> automated process to correct them?
>
> It seems likely that it should be brand:wikidata and brand:wikipedia.

That seems reasonable.

> Though, if we are lucky this mistake was added by an undiscussed
> automated edit and may be simply reverted.

I do not consider the loss of potentially-useful data to be "lucky".

> I am running monitoring of blatant misuses of Wikipedia tags.

[list of examples]

It would seem reasonable to have a bot routinely convert those to
brand:wikipedia tags, with (say) a white-list for HQ objects.

We could also suggest that tools (JOSM, iD, etc.) issue a warning when
such values are added, either based on matching items in a list, or
fetching the item's "instance of" value from Wikidata.
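The list-matching variant of such a warning could look roughly like this. It is only a Python sketch, not how JOSM or iD implement validation; the function name and the shape of the list are assumptions, and the three sample values are taken from examples quoted elsewhere in this thread:

```python
# Sample entries only; a real list would be maintained separately and be much longer.
SUSPICIOUS_WIKIPEDIA_VALUES = {
    "en:State Bank of India",
    "en:Bank of India",
    "de:Stolpersteine",
}


def wikipedia_warning(tags, suspicious=SUSPICIOUS_WIKIPEDIA_VALUES):
    """Return a warning text if the wikipedia tag holds a listed value, else None."""
    value = tags.get("wikipedia")
    if value in suspicious:
        return f"wikipedia={value} usually describes a brand or a type, not this object"
    return None


# Usage: warn when a mapper adds the company article to a branch.
print(wikipedia_warning({"amenity": "bank", "wikipedia": "en:Bank of India"}))
```

A white-list for HQ objects could be handled by skipping the check for specific object IDs before calling the function.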

> But for some reason people complain less if wikipedia tag is turned
> into brand:wikipedia rather than simply removed so usually I just turn
> invalid wikipedia links to company page into brand:wikipedia tags
> (and do the same with wikidata tags).

Again, that seems like the reasonable and sensible approach. It is
clear what the original editor was aiming at.

> Among other popular wikipedia links
>
> "wikipedia='de:Stolpersteine'",
> "wikipedia='nl:Toeristisch Overstappunt'",

Those should perhaps be wikipedia:type= ?

> and complete removal is probably necessary.

Again, that is throwing away useful data, where the intent of the
person adding it can be deduced with a very high degree of certainty.

> Overall, there are about 30 000 blatantly incorrect wikipedia tags

Though those need to be addressed, 30K out of over 1 million cases is
less than 3%. And programmatically addressing the commonest cases
(such as those discussed above) will reduce the number considerably.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Mateusz Konieczny

Apr 17, 2019, 5:53 PM by a...@pigsonthewing.org.uk:

> Can anyone confirm what the correct tag should be, and can we use an
> automated process to correct them?
>
It seems likely that it should be brand:wikidata and brand:wikipedia.
Though, if we are lucky this mistake was added by an undiscussed 
automated edit and may be simply reverted.

> It's possible that the same issue applies to some of the other
> high-use tags listed at:
>
>    https://taginfo.openstreetmap.org/keys/wikidata#values
>
I am running monitoring of blatant misuses of Wikipedia tags.

"wikipedia='es:Café Martínez'",
"wikipedia='en:Indian Overseas Bank'",
"wikipedia='en:Syndicate Bank'",
"wikipedia='en:Bank of India'",
"wikipedia='en:ICICI Bank'",
"wikipedia='en:Punjab National Bank'",
"wikipedia='en:State Bank of India'",
"wikipedia='fr:Algérie Poste'",
"wikipedia='es:Banco de la Nación Argentina'",
"wikipedia='en:Corporation Bank'",
"wikipedia='en:Bank of Baroda'",
"wikipedia='en:Federal Bank'",
"wikipedia='en:Indian Bank'",
"wikipedia='en:Andhra Bank'",

are ones that are fairly popular and almost certainly should
not be added (or added as brand:wikipedia if someone
really, really, really must link it).

Many of them have equally wrong brand:wikidata.

Personally, I consider brand:wikipedia and brand:wikidata as
completely useless (any potential use is lost as people keep adding them
by automatic edits, based on name tag without any verification).

But for some reason people complain less if wikipedia tag is turned
into brand:wikipedia rather than simply removed so usually I just turn
invalid wikipedia links to company page into brand:wikipedia tags
(and do the same with wikidata tags).
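The retagging described above amounts to moving two keys. A minimal Python sketch, with a hypothetical function name and tags assumed to be a plain dict (the actual edits may be done differently, for example directly in JOSM):

```python
def move_to_brand(tags):
    """Return a copy of `tags` with wikipedia/wikidata moved to brand:wikipedia/brand:wikidata.

    If the brand:* key already exists, the original key is left in place so that
    a human can decide which value is right.
    """
    result = dict(tags)
    for key in ("wikipedia", "wikidata"):
        brand_key = "brand:" + key
        if key in result and brand_key not in result:
            result[brand_key] = result.pop(key)
    return result


# Usage on a branch that links to the company article:
print(move_to_brand({"amenity": "bank", "wikipedia": "en:State Bank of India", "wikidata": "Q1340361"}))
```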

Among other popular wikipedia links

"wikipedia='de:Stolpersteine'",
"wikipedia='nl:Toeristisch Overstappunt'",

are also clearly invalid, though here brand:wikipedia would
be wrong and complete removal is probably necessary.

Overall, there are about 30 000 blatantly incorrect wikipedia tags
(repeated wikipedia values, after excluding objects like rivers or pipelines
where repeated wikipedia tags may be valid tagging); this number is slowly
decreasing.
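Counting repeated values can be sketched as follows. This is not the monitoring code itself, which is not shown in this thread; the exclusion rule (waterway=river, man_made=pipeline), the threshold and all names are assumptions for illustration:

```python
from collections import Counter


def is_excluded(tags):
    """Assumed exclusion rule: linear features where repeated links may be valid."""
    return tags.get("waterway") == "river" or tags.get("man_made") == "pipeline"


def repeated_wikipedia_values(objects, min_count=2):
    """Map each wikipedia value used on at least `min_count` non-excluded objects to its count.

    `objects` is assumed to be an iterable of tag dicts, one per OSM object.
    """
    counts = Counter(
        tags["wikipedia"]
        for tags in objects
        if "wikipedia" in tags and not is_excluded(tags)
    )
    return {value: n for value, n in counts.items() if n >= min_count}


# Usage with a few made-up objects:
sample = [
    {"amenity": "bank", "wikipedia": "en:State Bank of India"},
    {"amenity": "bank", "wikipedia": "en:State Bank of India"},
    {"waterway": "river", "wikipedia": "en:Nile"},
    {"waterway": "river", "wikipedia": "en:Nile"},
]
print(repeated_wikipedia_values(sample))
```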


Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Yves
If I follow Michael here, the mechanical edit you can do for sure is a revert 
of the mechanical edit that added the tag in the first place.
Yves


Re: [OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Michael Reichert
Hi Andy,

On 17/04/2019 at 17.53, Andy Mabbett wrote:
> There are currently 956 objects in OSM with the tag "wikidata=Q1340361":
>
>    https://taginfo.openstreetmap.org/tags/wikidata=Q1340361
>
> where:
>
>    https://www.wikidata.org/wiki/Q1340361
>
> is the item for the State Bank of India.
>
> The tag should almost certainly be:
>
>    operator:wikidata=Q1340361
>
> or, less likely,
>
>    brand:wikidata=Q1340361
>    franchise:wikidata=Q1340361
>
> with the only exception perhaps being the bank's HQ.
>
> Can anyone confirm what the correct tag should be, and can we use an
> automated process to correct them?
>
> It's possible that the same issue applies to some of the other
> high-use tags listed at:
>
>    https://taginfo.openstreetmap.org/keys/wikidata#values

The following Overpass query (bbox filter not required) shows all
features with wikidata=Q1340361, which is the Wikidata ID of the State
Bank of India.

(node[wikidata=Q1340361];way[wikidata=Q1340361];);out geom meta;

By looking at the result, the following observations can be made:

- nyuriks is the last modifier of most objects
- most objects are banks having wikipedia="en:State Bank of India"
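The first observation can be reproduced from the query result instead of by eye. A Python sketch, assuming the query is run with the [out:json] setting and "out meta", in which case each element in the response carries a "user" field with its last modifier; the function name and the sample user names are made up:

```python
from collections import Counter


def last_modifiers(overpass_json):
    """Count how many returned elements were last modified by each user."""
    elements = overpass_json.get("elements", [])
    # "user" is only present when metadata was requested (out meta).
    return Counter(element.get("user", "<unknown>") for element in elements)


# Usage with a hand-written response fragment:
response = {
    "elements": [
        {"type": "node", "id": 1, "user": "mapper_a"},
        {"type": "way", "id": 2, "user": "mapper_a"},
        {"type": "node", "id": 3, "user": "mapper_b"},
    ]
}
print(last_modifiers(response).most_common())
```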

wikipedia=* on shops of chains is considered wrong. Usually, the article
is about the chain, not the individual shop itself.

Opening the changesets which modified the objects in their last version
leads to mechanical edits setting wikidata=* tags by simply taking the
value of wikipedia=* and looking up its Wikidata ID. This should not have
happened for the following reasons:

- computer programs are better at copying, so enhancing a planet dump
  with Wikidata IDs would be the better approach
- adding Wikidata IDs pretends a quality these objects do not have
  because no manual review happened
- the meanings of a Wikipedia article and its associated Wikidata
  entry do not fully overlap
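To illustrate the first point: a data consumer can derive the Wikidata ID from a wikipedia=* value on its own side. The Python sketch below only builds the request URL for the MediaWiki API (action=query with prop=pageprops and ppprop=wikibase_item); the function name is hypothetical, and sending the request and reading the response are left out:

```python
from urllib.parse import urlencode


def wikidata_lookup_url(wikipedia_value):
    """Build a MediaWiki API URL that returns the Wikidata item of an article.

    `wikipedia_value` is an OSM wikipedia=* value such as "en:State Bank of India".
    """
    lang, _, title = wikipedia_value.partition(":")
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",  # the page property holding the Q-ID
        "titles": title,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


print(wikidata_lookup_url("en:State Bank of India"))
```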

The errors you pointed out, Andy, prove that it is an automated edit.
The Automated Edits Code of Conduct applies but was ignored at the time. The
Automated Edits Code of Conduct exists to prevent such issues. I haven't
dug through the archives of the Talk mailing list in detail, but I am
pretty sure I would find emails that mentioned these issues. However, the
issues raised back then were ignored.

The Bank of India issue is not an isolated incident. Looking deeper
into the series of nyurik's mechanical Wikidata edits unveils more
issues. Cleaning up banks in India might remove one of the most obvious
and annoying errors of the mechanical edit but it does not solve all the
other errors still present in OSM. Each of them affects a smaller number
of objects, not hundreds but only tens per error. They won't appear on the
first pages of Taginfo's list of tag values. But they still sum up to a
significant amount and make the wikidata=* tag, as it is, unreliable.
That's why I think that going back a step in this case would be the only
sustainable solution.

Best regards

Michael



PS: nyurik's edit isn't the only problem here. The iD editor adds
wikidata=* when wikipedia=* is added, without checking whether the link
from the Wikipedia entry to the Wikidata entry is right or whether the
meaning of the Wikipedia article is wider. I called this a mechanical edit.

-- 
I prefer GPG encryption of emails. (Does not apply to mailing lists.)





[OSM-talk] Bank of India (and other) Wikidata tags

2019-04-17 Thread Andy Mabbett
There are currently 956 objects in OSM with the tag "wikidata=Q1340361":

   https://taginfo.openstreetmap.org/tags/wikidata=Q1340361

where:

   https://www.wikidata.org/wiki/Q1340361

is the item for the State Bank of India.

The tag should almost certainly be:

   operator:wikidata=Q1340361

or, less likely,

   brand:wikidata=Q1340361
   franchise:wikidata=Q1340361

with the only exception perhaps being the bank's HQ.

Can anyone confirm what the correct tag should be, and can we use an
automated process to correct them?

It's possible that the same issue applies to some of the other
high-use tags listed at:

   https://taginfo.openstreetmap.org/keys/wikidata#values

I've just raised a ticket to ask that Taginfo display Wikidata labels
on the latter page, which will make error fixing easier:

   https://github.com/taginfo/taginfo/issues/262

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk
