ssion list for the Wikidata project.
> Subject: Re: [Wikidata] Which external identifiers are worth covering?
>
> Hi Marco,
>
> I guess this depends what you mean by "exhaustive". Exhaustive in that every
> Wikidata item has ID X, or exhaustive in that we have every in
Hi Jane and Gerard,
Thanks for the suggestions! Labels are definitely a very important
consideration - will have a think about this.
On your questions, Jane, the people we've been working with have
increasingly been adding quite a lot of "occupation:politician" statements,
and even more specific
Hi Marco,
On 07-09-17 20:51, Marco Fossati wrote:
Hi everyone,
As a data quality addict, I've been investigating the coverage of
external identifiers linked to Wikidata items about people.
Given the numbers on SQID [1] and some SPARQL queries [2, 3], it seems
that even the second most used
Hoi,
I understand the tendency to have English labels for items seen as
important. However, there are bots who add labels to many, many languages
as the labels tend to be the same. In my opinion we should encourage the
inclusion of information. Yes, we may get duplicates but having the data
early a
Interesting! I noticed that suddenly a lot more politicians were showing up
in my queries - have you been adding the occupation=politician property? I
believe politicians are severely underrepresented on Wikipedia projects
(except for the top people in the news) so if you have good metadata, then
y
Hi folks,
Very happy to see this discussion happening.
I work on the EveryPolitician project [0] and for several years, we have
been mapping official IDs to Wikidata IDs. We have probably half the
national legislators in the world mapped this way, and many of the ones
we’re missing are because we
On 7 September 2017 at 19:51, Marco Fossati wrote:
> external identifiers linked to Wikidata items about people.
I'll take this as an invitation to remind everyone about ORCID iDs ;-)
See:
https://www.wikidata.org/wiki/Wikidata:ORCID
and:
https://en.wikipedia.org/wiki/Wikipedia:ORCID
On 7 September 2017 at 20:16, john cummings wrote:
> I guess this question for me is how do we do this in practice? How do we
> make sure Wikidata stays up to date/synced with external databases we think
> are important?
At least part of that solution has to be to get the maintainers of
those ex
Somewhat related to this discussion is the coli-conc project, which
collects statistics about KOS-type (thesaurus, authority file etc.)
identifier links in Wikidata:
http://coli-conc.gbv.de/concordances/wikidata/
You can also find statistics about indirect mappings, from one KOS via
Wikidata
In general, I think it would be great to store inside Wikidata the graph
of relations between identifiers. Something like:
VIAF linksTo ISNI
VIAF linksTo GND
…
GRID linksTo ISNI
arXiv linksTo DOI
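
As a rough illustration only, the relations listed above could be held in a small directed graph outside Wikidata. This is a minimal sketch: the names `LINKS_TO` and `reachable` are hypothetical, the edges are just the ones in the list, and the elided entries are deliberately not guessed.

```python
from collections import deque

# Edges copied from the list above; the elided "…" entries are not filled in.
LINKS_TO = {
    "VIAF": ["ISNI", "GND"],
    "GRID": ["ISNI"],
    "arXiv": ["DOI"],
}


def reachable(start, graph=LINKS_TO):
    """Return every identifier scheme reachable from `start` via linksTo edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in graph.get(node, []):
            # Skip schemes already visited, and never report the start itself.
            if target not in seen and target != start:
                seen.add(target)
                queue.append(target)
    return seen
```

With these edges, `reachable("VIAF")` gives the set containing ISNI and GND, while a scheme with no outgoing links gives an empty set.
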
Last time I looked, there was no simple way to do that. So for
WikiProject Universities we have used
Is anyone working on an "auto-resolve" bot? If you have VIAF (but nothing
else), you can resolve other identifiers via the VIAF site; similarly, if
you have only GND, you could try to reverse-lookup VIAF.
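
A minimal sketch of what such an "auto-resolve" step might look like, under loud assumptions: `CROSS_REFERENCES` is a hypothetical in-memory stand-in for whatever a source like VIAF would return, `resolve_missing` is an invented helper name, and every identifier value is a made-up placeholder. No real VIAF or Wikidata API is called here.

```python
# Hypothetical stand-in for a cross-reference source such as VIAF: each record
# lists identifiers that the source asserts belong to the same entity.
# All values are made-up placeholders, not real identifiers.
CROSS_REFERENCES = [
    {"VIAF": "viaf-0001", "GND": "gnd-0001", "ISNI": "isni-0001"},
    {"VIAF": "viaf-0002", "GND": "gnd-0002"},
]


def resolve_missing(known, records=CROSS_REFERENCES):
    """Suggest identifiers missing from `known`, based on matching records.

    A record matches when it shares at least one (scheme, value) pair with
    `known` and contradicts none of the other known identifiers.
    """
    suggestions = {}
    for record in records:
        shared = [s for s in known if record.get(s) == known[s]]
        conflicting = [s for s in known if s in record and record[s] != known[s]]
        if not shared or conflicting:
            continue
        for scheme, value in record.items():
            if scheme not in known:
                # Keep the first suggestion seen for each scheme.
                suggestions.setdefault(scheme, value)
    return suggestions
```

The same function covers the reverse lookup: starting from only a GND value, a matching record yields the VIAF value. Records that conflict with an already known identifier are skipped rather than merged, so a human can review them.
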
I think a list of items that have zero external identifiers, ordered by
"importance" (incomin
As a basic rule for "which external identifiers are worth covering", I
would begin with any national identifiers we have for people (politicians,
artists, writers, theologians, scientists, etc), then national identifiers
for organizations (government-related, GNP-related businesses, nonprofits,
ed
Hi Marco,
I guess this depends what you mean by "exhaustive". Exhaustive in that
every Wikidata item has ID X, or exhaustive in that we have every
instance of ID X in Wikidata?
The first is probably not going to happen, as the vast majority of
external identifiers have a defined scope for what th
I guess this question for me is how do we do this in practice? How do we
make sure Wikidata stays up to date/synced with external databases we think
are important?
On 7 September 2017 at 20:51, Marco Fossati wrote:
> Hi everyone,
>
> As a data quality addict, I've been investigating the coverage
Hi everyone,
As a data quality addict, I've been investigating the coverage of
external identifiers linked to Wikidata items about people.
Given the numbers on SQID [1] and some SPARQL queries [2, 3], it seems
that even the second most used ID (VIAF) only covers about *25%* of
people items.
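
For readers who want to reproduce this kind of coverage figure, here is a minimal sketch. The queries referenced as [2, 3] are not shown in this excerpt, so the query text below is an assumption, not the original. It relies on P31/Q5 ("instance of" human) and P214 (VIAF ID); the helper names `count_query` and `coverage_percent` are hypothetical.

```python
def count_query(with_property=None):
    """Build a SPARQL COUNT query over items that are instances of human (Q5).

    If `with_property` is given (e.g. "P214" for VIAF ID), only items that
    carry at least one statement with that property are counted.
    """
    extra = f" ?item wdt:{with_property} ?id ." if with_property else ""
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?n) WHERE {"
        " ?item wdt:P31 wd:Q5 ." + extra + " }"
    )


def coverage_percent(covered, total):
    """Share of items carrying the identifier, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * covered / total
```

Running `count_query()` and `count_query("P214")` against a SPARQL endpoint and passing the two counts to `coverage_percent` would give the coverage figure. Counting every person item may be slow or time out on a public endpoint, so sampling or a dump-based count may be needed in practice.
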
15 matches