Re: [OSM-talk] Could we just pause any wikidata edits for a month or two?

Lester Caine Wed, 18 Oct 2017 03:03:42 -0700

On 18/10/17 05:14, Yuri Astrakhan wrote:
> Lester, I agree with you that Wikidata should not contain an object for
> everything that OSM may have.  I don't believe there should be an entry
> for every McDonalds on the planet, or for every artist's work that
> someone may decide to include in OSM.  But that's up to Wikidata
> contributors.  Lets instead talk about practical usages of our data.


A good example: a list of every McDonalds on the planet is a job for
McDonalds. Does that database exist in a freely indexed form? I've not
looked, but if all of the objects in OSM have an operator=mcdonalds:MCID
tag, then an automated bot can identify problems against that
cross-referenced list. With growing access to these sorts of datasets, a
standard method of using them would be nice, and I include using wikidata
feeds in that 'good practice'.
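
As a rough sketch of the kind of check such a bot could run (assuming the
operator=mcdonalds:MCID value format from the paragraph above, which is only
an illustration and not an established tagging scheme, and assuming the
operator's ID list is already available as plain strings), something like
this would flag the obvious mismatches. The function name and the result
keys are made up for the example.

```python
PREFIX = "mcdonalds:"  # hypothetical value prefix, taken from the example above


def cross_reference(osm_objects, official_ids):
    """Compare store IDs tagged in OSM against an operator-supplied ID list.

    osm_objects: iterable of dicts like {"id": 1, "tags": {"operator": "..."}}
    official_ids: iterable of ID strings published by the operator (assumed)
    """
    seen = {}        # store ID -> list of OSM object ids carrying it
    malformed = []   # OSM object ids with the prefix but no ID after it
    for obj in osm_objects:
        operator = obj.get("tags", {}).get("operator", "")
        if not operator.lower().startswith(PREFIX):
            continue  # object does not use the hypothetical scheme
        store_id = operator[len(PREFIX):].strip()
        if store_id:
            seen.setdefault(store_id, []).append(obj["id"])
        else:
            malformed.append(obj["id"])

    official = set(official_ids)
    return {
        # tagged in OSM but not known to the operator's list
        "unknown_in_official": sorted(set(seen) - official),
        # in the operator's list but not found in OSM
        "missing_from_osm": sorted(official - set(seen)),
        # the same store ID attached to more than one OSM object
        "duplicates": {k: v for k, v in seen.items() if len(v) > 1},
        "malformed": malformed,
    }
```

The output is only a report for a human to review; nothing here edits OSM.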

> Here is a wonderful site I saw at a conference a few days ago.  It lets
> you plan your trip based on the places you are interested in.  You can
> visualise all sorts of places - cultural, religious, hotels, bars -
> anything, and plot your course.  And it uses Wikidata, images from
> Commons, and Wikipedia text itself to describe the places.  The authors
> spoke at length how Wikidata tags in OSM has helped them build it, and
> the difficulty they had in all sorts of "data voodoo" to figure things
> out.  For example, they often correlate OSM & Wikidata locations by
> proximity, and try to guess if it's a match. They have done an
> outstanding job making sense of our data, but I think we could have made
> their job a lot easier with our communal data curation capabilities, and
> also help others who may have similar needs.
> 
> https://opentripmap.com/en/#14/40.7355/-73.9806
> <https://opentripmap.com/en/#14/40.7355/-73.9806>
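
For anyone wondering what "correlate by proximity" amounts to, here is a
minimal sketch, not the OpenTripMap authors' actual method, which is not
described in this thread. It assumes each record carries plain lat/lon
fields, and the 100 m threshold, the field names and the function names are
all arbitrary choices for the example. It only lists nearby candidates for
a person to confirm and decides nothing on its own.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def candidate_matches(osm_obj, wikidata_items, max_distance_m=100.0):
    """Return (distance_m, qid) pairs within max_distance_m, nearest first.

    osm_obj: dict with "lat" and "lon" (assumed field names)
    wikidata_items: iterable of dicts with "qid", "lat", "lon" (assumed)
    """
    hits = []
    for item in wikidata_items:
        d = haversine_m(osm_obj["lat"], osm_obj["lon"],
                        item["lat"], item["lon"])
        if d <= max_distance_m:
            hits.append((d, item["qid"]))
    return sorted(hits)
```

Distance alone cannot tell a statue from the square it stands in, which is
exactly the guessing that an explicit tag on the OSM object would remove.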

Nice example of cross referencing, but it would be enhanced by links back
to the websites of the various locations identified ... and this would
also help with the sort of thing my daughter fell foul of recently ... she
was in London and, being 'vegan', went to a vegan restaurant ... which was
closed short term for refurbishment. Road closures and other short term
changes are not something OSM can really manage ...

> You do raise an important point about 1:1 vs part of vs ...  In order to
> be useful in data processing by 3rd party, data needs to answer a
> simple  questions:  does the linked Wikidata/Wikipedia represents this
> whole object, or is it simply related to it in some way.  Here, the 1:1
> is meant somewhat loosely - there are some cases when things don't align
> perfectly, but that's a separate topic.
> 
> If wiki* page is about that object, the consumer may choose to use
> multilingual names, show a portion of Wikipedia articles in the user's
> language, use Wikidata statements, and show images from Commons.

Current searches on wikipedia for things like 'sculptors' and the
location of their installations are somewhat difficult when there is not
an English article. wikidata is helping to find other language
references to a subject or object, but in many cases 'commons' is still
not well indexed into that mix. Not an OSM problem, but one that
currently holds up adding links easily.

> If wiki* is only *related* somehow to the object, no such automatic
> usage is possible. The link is still very valuable for the editors of
> the map, but not as much to the data consumers.  Examples include a wiki
> article that has just a section about this work of art, or wiki page is
> a list that includes all churches in the area, or describe a class of
> these objects (e.g. brand) but not this object itself.  Moreover, I
> suspect our favorite tools like Nominatim would also be mislead if they
> rely on Wiki* links that relate to the object, but not about the object
> itself. After all, if the object is well known, it would probably have
> its own wiki page, or at least a wikidata entry.

Past 'bad' experiences of wikipedia have not helped in my adoption of
that as a repository of material, but I think the sort of material I've
had stripped from wikipedia in the past SHOULD have a safe home in
wikidata.

>     Some translations are completely different articles?
> 
> I'm not sure what you meant here. I have heard of rare cases when
> unrelated wikipedia articles are connected to each other, but usually
> those get fixed as soon as someone notices.

Again ... wikidata by its nature is probably helping to identify
'multiple' articles across the different wiki language bases. Again it's
articles on artists that I've hit with different content where there is
not an English version currently.

>     The problem I still see is that many of the items I am looking to
>     link to
>     are elements of an article rather than the whole article, such as the
>     location of the works of a particular artist. At some point in the
>     future wikidata may well have a complete index of QID's for every
>     artist's work, but currently I don't have the time to add wikidata
>     entries where they don't exist, so a link to the artists wikipedia
>     article which may or may not actually list this particular work is
>     second best and in many cases there is not even an english version :(
>  
> Sure, lets just add it as a different tag, not wikipedia/wikidata. We
> could call it related:wikidata or related:wikipedia:en, or subdivide it
> even further. Note that here, unlike the main wikipedia tag, the
> related:wikipedia:en might not be the same as wikidata. Moreover, I
> would argue that here we should use related:wikipedia:xx format with the
> language code, because the article content is likely to differ between
> languages.

This is the 'good practice' document that I think needs drafting in
relation to using wikidata/wikipedia/commons material.
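
To make the distinction concrete for data consumers, here is a small
sketch of how a tool might separate "is about this object" links from
"is related to this object" links. It assumes the related:wikidata and
related:wikipedia:xx keys floated in the quoted text above, which are only
a proposal in this thread and not agreed tagging, and the function name is
made up for the example.

```python
RELATED_WP_PREFIX = "related:wikipedia:"  # proposed key form, not agreed tagging


def classify_wiki_tags(tags):
    """Split an object's tags into primary and related wiki links.

    Returns (primary, related):
      primary: {"wikidata": ..., "wikipedia": ...} when those keys are present
      related: {"wikidata": ..., "wikipedia": {lang: title, ...}}
    """
    primary, related = {}, {}
    for key, value in tags.items():
        if key in ("wikidata", "wikipedia"):
            primary[key] = value          # link is about the object itself
        elif key == "related:wikidata":
            related["wikidata"] = value   # link is merely related
        elif key.startswith(RELATED_WP_PREFIX):
            lang = key[len(RELATED_WP_PREFIX):]
            if lang:  # ignore a key with no language code
                related.setdefault("wikipedia", {})[lang] = value
    return primary, related
```

A consumer would then pull names, article text or images only from the
primary links, and treat the related ones as pointers for editors.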

>     Some bot then modifying that link out of context is not helpful and
>     while the idea of 'nobot' flags may seem a solution, it's just adding
>     another layer of complexity which potentially needs to exist for EVERY
>     tag on EVERY object. Something I don't think should be allowed!
> 
> Agree - I think a bot injecting wikipedia/wikidata tags based on some
> heuristics, e.g. "has the same object class and is nearby" is not very
> good and error prone. This could be a human-curated process, e.g. ask
> the user to help deciding which  Wikipedia articles does this object
> represent, and offer some likely candidates, but it shouldn't be
> automatic.  I think Mapbox was working on something like that?

There are many lists of data which I now expect to appear in wikidata.
Whether THAT is appropriate is not a question to answer here, but like
the 'McDonalds' list, a list of, say, 'National Trust' properties would
be better based on the National Trust website than on a secondary copy in
wikidata or some other list?

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

_______________________________________________
talk mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/talk
