Re: [Wikidata] Status and ETA External ID conversion

2016-05-09 Thread Lydia Pintscher
On Sun, May 8, 2016 at 9:54 PM Tom Morris  wrote:

> Has the identifier migration stalled? I was just looking at this page:
>
> https://www.wikidata.org/wiki/Q622828
>
> and the first 9 claims on the page are all identifiers. There are only two
> (Freebase & Disease Ontology) in the identifier section at the bottom of
> the page.
>

I just posted an update at
https://www.wikidata.org/wiki/User:Addshore/Identifiers#Let.27s_get_this_done.21


Cheers
Lydia
-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Status and ETA External ID conversion

2016-05-08 Thread Tom Morris
Has the identifier migration stalled? I was just looking at this page:

https://www.wikidata.org/wiki/Q622828

and the first 9 claims on the page are all identifiers. There are only two
(Freebase & Disease Ontology) in the identifier section at the bottom of
the page.

Tom


Re: [Wikidata] Status and ETA External ID conversion

2016-03-11 Thread Daniel Kinzler
Am 11.03.2016 um 11:20 schrieb Markus Kroetzsch:
> Maybe the community needs a bit more explanation as to why you "consciously"
> decide to override their judgement.

The idea is to give the community a tool to explicitly model their judgement
that something is an identifier, and introduce that idea of external identifiers
into the software exactly because that need was expressed by the community.
Relevant use cases: linking, mapping, and UI structure.

> The use of property P1921 clearly tells you
> what the community wants. If we want to have URIs only for some subset of
> properties, then we will use P1921 only on a subset. It is very easy and gives
> us complete control. The use of ExternalId as an additional restricting
> mechanism is neither helpful nor desired.

Can you give an example of something you want to map to a URI, but that is not
an external identifier? There are probably edge cases, and thinking about them
and deciding on the desired semantics is a good thing, I believe.

> We can decide for ourselves which
> properties should have URIs exported for them, without needing conscious but
> unprincipled development decisions to constrain us.

"unprincipled", wow. The decision followed the principle that we want to have
software that is extensible and maintainable, and we want a data model that
makes explicit the semantics of values.  Following these principles, the
declaration of what a value is dictates what you can do with it. That's the
basic idea of object oriented design.

Of course, it would be possible to ditch these principles, and use the "duck
typing" approach: anything that has a formatter URL could be linked, etc. But
that introduces several problems:

* modeling: values can suddenly stop "being" identifiers, or become other
things, based on the statements on the property definition. This can lead to
inconsistencies in the way values are represented in dumps etc.

* implementation: we would either need to hard code a special case, or a
mechanism to apply all kinds of behaviors (formatting, mapping, parsing, etc)
based on all kinds of statements on properties. We can hard code for a few
things, but a general mechanism would hardly be scalable or maintainable. We do
have a solid and simple mechanism based on data types that works fine to cover
the use cases for external identifiers.

* stability: if we base more and more behavior of the software on properties and
statements defined by the community, the community would no longer be free to
modify such properties and statements. That would break the software. We do
compromise about this sometimes: Wikibase can be configured to know about a few
properties and items (such as P1630). But we should be careful about it, because
it takes away control from the community.

* consistency: You can't link just any kind of value based on a formatter uri.
That only works for string values, and probably shouldn't be done for string
values that have the "url" data type. So linking would only work for properties
declared to be plain strings per their data type. Again, behavior is bound to
the data type.

These principles are actually why we have data types at all. You were there when
we decided for having them. If we don't care about the points above, we wouldn't
need data types at all, value types would be sufficient. Everything else would
be covered by "if it quacks like a duck...". That would mean a less expressive
data model, and more complicated software. A lot more complicated, if you want
to apply this for everything.
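
To make the contrast above concrete, here is a minimal sketch of the two approaches. It is not Wikibase code: the `Property` class and both helper functions are hypothetical, and the data type names are used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    """Hypothetical, simplified stand-in for a property definition."""
    pid: str
    datatype: str                                   # declared data type
    statements: dict = field(default_factory=dict)  # community-editable statements


def is_linkable_by_datatype(prop: Property) -> bool:
    # Behavior follows the declared data type only.
    return prop.datatype == "external-id"


def is_linkable_by_duck_typing(prop: Property) -> bool:
    # Behavior follows whatever statements happen to be on the property.
    return "P1630" in prop.statements


# Assumed example property; the formatter URL is a placeholder, not the real one.
freebase = Property("P646", "external-id", {"P1630": "https://example.org/$1"})
assert is_linkable_by_datatype(freebase)
assert is_linkable_by_duck_typing(freebase)

# Removing the formatter URL statement flips the duck-typed answer,
# while the data-type answer stays stable.
del freebase.statements["P1630"]
assert is_linkable_by_datatype(freebase)
assert not is_linkable_by_duck_typing(freebase)
```

The last two assertions illustrate the "modeling" and "stability" points: under duck typing, an edit to the property page silently changes how every value of that property is treated.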

> It would be helpful if you could share some pointers (1) to the original
> announcement and documentation for this restricting behaviour for URI exports
> (clearly, this information is vital for the ongoing discussion on property
> conversion),

It's a modeling tool, not a restriction. If there are things that should be
mapped to URIs but for some reason shouldn't have the ExternalId type, we should
look at these edge cases closely to find out what is wrong. Since clearly, if
it's not an identifier of some sort, it can't sensibly have a URI, and if it is
an identifier of some sort, there should be no reason not to mark it as such to
the software, by making it an ExternalId.

> and (2) to the discussions that have led to this design (surely you
> must have consulted with some RDF/SPARQL users and developers to conclude that
> some P1921 should be ignored).

I do not think any should be ignored. I think that properties that use P1921
should be ExternalIds. Please explain why you would not want that.

> I am really curious to learn what "we" refers to
> in "we made a conscious decision".

Decisions about the design and implementation of the software are made by the
development team ("us"), based on requirements and considerations on technical
as well as the product level, which in turn is informed from community
interaction, among other things.

As is often the case, solutions that have to be maintainable and scalable are
not quit

Re: [Wikidata] Status and ETA External ID conversion

2016-03-11 Thread Markus Kroetzsch

On 10.03.2016 18:43, Daniel Kinzler wrote:

Am 10.03.2016 um 10:26 schrieb Markus Kroetzsch:

I am surprised by the amount of confusion in this discussion. There is
absolutely no relationship between mapping of Wikidata values to URIs and the
external id datatype.


You are correct that such a relationship does not necessarily follow from first
principles. You are however incorrect in saying that there is no relationship in
Wikibase: The way the data model is currently defined and the way mappings are
implemented, we made a conscious decision to support such mappings only for
ExternalId values.


Maybe the community needs a bit more explanation as to why you 
"consciously" decide to override their judgement. The use of property 
P1921 clearly tells you what the community wants. If we want to have 
URIs only for some subset of properties, then we will use P1921 only on 
a subset. It is very easy and gives us complete control. The use of 
ExternalId as an additional restricting mechanism is neither helpful nor 
desired. We can decide for ourselves which properties should have URIs 
exported for them, without needing conscious but unprincipled 
development decisions to constrain us.


It would be helpful if you could share some pointers (1) to the original 
announcement and documentation for this restricting behaviour for URI 
exports (clearly, this information is vital for the ongoing discussion 
on property conversion), and (2) to the discussions that have led to this 
design (surely you must have consulted with some RDF/SPARQL users and 
developers to conclude that some P1921 should be ignored). I am really 
curious to learn what "we" refers to in "we made a conscious decision".


Markus




I think it would help the discussion if we could keep apart:
- what follows from formal principles
- what you (or I) consider best
- what the software currently does


(3) The external id datatype does not provide any mapping and the criteria used
for it by the community do not imply that such mappings should exist for these
cases, or that they should not exist for other cases.


That is incorrect from the way Wikibase defines and uses the ExternalId
datatype: the intent is indeed to say that something is an identifier that can
be mapped, and that such a (direct) mapping is not supported for other data
types. (That doesn't mean we will not offer different mappings for other data
types, perhaps URLs for looking up coordinates, etc).

Modeling this explicitly is indeed the reason to have this datatype.


I am most worried about Daniel's remark. He says that we wants to use external
ids to identify properties with "values that identify a resource", but does not
mention the existing, community-supported mechanism for doing just that (2), and
instead proposes another mechanism (3), which the community is clearly not using
for this purpose at all.


That's a misunderstanding. The plan is to support P1921 for URI mappings, and we
already do support P1630 for URL mappings. But we intentionally do this only for
ExternalId values, not for plain strings or other types.

So, the technical implementation does follow the community convention, with the
restriction that properties that should use this kind of mapping need to
explicitly be declared to be identifiers. We are also considering implementing
validation and normalization for ExternalId values, but it's not clear yet how
we can safely apply community supplied validation and normalization patterns.




--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Egon Willighagen
I think the predicate may need to depend on the type of thing we're
linking, and on what is being linked. A strong predicate, like
owl:sameAs, requires a (very) strong similarity between concepts...
this is often not the case... mind you, this applies also to the
things being linked... there are enough alternatives, like rdfs:seeAlso
as probably one of the least informative predicates, via
skos:closeMatch and skos:exactMatch ... for chemicals, I have been
involved in work by the Open PHACTS project (now foundation), led by
Alasdair Gray, on "scientific lenses"... I'm biased, but the
(conference) papers are a good read anyway... [eg Q23034460]
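
A small sketch of what "choosing the predicate per link" could look like when emitting triples. The `link_triple` helper and the weakest-to-strongest ordering are illustrative assumptions, not an agreed Wikidata convention; the namespace URIs are the standard OWL, SKOS and RDFS ones.

```python
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}
WD = "http://www.wikidata.org/entity/"

# Assumed ordering, from least to most committal.
PREDICATES_WEAK_TO_STRONG = [
    "rdfs:seeAlso",
    "skos:closeMatch",
    "skos:exactMatch",
    "owl:sameAs",
]


def link_triple(qid: str, predicate: str, target_uri: str) -> str:
    """Return one N-Triples line linking a Wikidata item to an external URI."""
    prefix, local = predicate.split(":", 1)
    return f"<{WD}{qid}> <{PREFIXES[prefix]}{local}> <{target_uri}> ."


# The same link, stated with increasing strength.
for pred in PREDICATES_WEAK_TO_STRONG:
    print(link_triple("Q1000336", pred, "https://rdf.freebase.com/ns/m.03pvzn"))
```

Which predicate is appropriate would have to be decided per external database (or even per link), which is the point being made above.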

Egon

On Thu, Mar 10, 2016 at 8:08 PM, Young,Jeff (OR)  wrote:
> Then perhaps umbel:isLike instead of owl:sameAs?
>
> http://wiki.opensemanticframework.org/index.php/UMBEL_Vocabulary#isLike_Property
>
> It conveys sameAs but with a hint of uncertainty.
>
>> -Original Message-
>> From: Wikidata [mailto:wikidata-boun...@lists.wikimedia.org] On Behalf Of
>> Stas Malyshev
>> Sent: Thursday, March 10, 2016 1:52 PM
>> To: Discussion list for the Wikidata project. 
>> Subject: Re: [Wikidata] Status and ETA External ID conversion
>>
>> Hi!
>>
>> > Couldn't you use P460 when there is doubt?
>> >
>> > https://www.wikidata.org/wiki/Property:P460
>>
>> P460's type is Item, which means it is relation between two Wikidata items.
>> External ID is relation between Wikidata item and something outside Wikidata.
>>
>> --
>> Stas Malyshev
>> smalys...@wikimedia.org
>>



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Daniel Kinzler
Am 10.03.2016 um 20:08 schrieb Young,Jeff (OR):
> Then perhaps umbel:isLike instead of owl:sameAs?
> 
> http://wiki.opensemanticframework.org/index.php/UMBEL_Vocabulary#isLike_Property

In some cases owl:equivalentProperty may be appropriate
https://www.w3.org/TR/owl-ref/#equivalentProperty-def


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Young,Jeff (OR)
Then perhaps umbel:isLike instead of owl:sameAs?

http://wiki.opensemanticframework.org/index.php/UMBEL_Vocabulary#isLike_Property

It conveys sameAs but with a hint of uncertainty.

> -Original Message-
> From: Wikidata [mailto:wikidata-boun...@lists.wikimedia.org] On Behalf Of
> Stas Malyshev
> Sent: Thursday, March 10, 2016 1:52 PM
> To: Discussion list for the Wikidata project. 
> Subject: Re: [Wikidata] Status and ETA External ID conversion
> 
> Hi!
> 
> > Couldn't you use P460 when there is doubt?
> >
> > https://www.wikidata.org/wiki/Property:P460
> 
> P460's type is Item, which means it is relation between two Wikidata items.
> External ID is relation between Wikidata item and something outside Wikidata.
> 
> --
> Stas Malyshev
> smalys...@wikimedia.org
> 


Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Stas Malyshev
Hi!

> Couldn't you use P460 when there is doubt?
> 
> https://www.wikidata.org/wiki/Property:P460

P460's type is Item, which means it is a relation between two Wikidata
items. An External ID is a relation between a Wikidata item and something
outside Wikidata.

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Stas Malyshev
Hi!

> From a machine processing point of view, a more interesting statement is
> probably: 
> 
> wd:Q1000336 owl:sameAs <https://rdf.freebase.com/ns/m.03pvzn>

That is a much bolder claim, since it essentially says this is identity,
they both refer to the same thing. And that may not always be true, as
different databases may have different rules and different approaches to
what entries mean - e.g. one database may talk about "book" meaning the
content of the book regardless of how it is materialized, and another
may mean by "book" a specific printed edition or even a specific
physical object. It _might_ be appropriate for some cases, but I'm not
sure we can say that for all cases where we have links between Wikidata and
other databases.

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Tom Morris
On Thu, Mar 10, 2016 at 12:58 PM, Egon Willighagen <
egon.willigha...@gmail.com> wrote:

> On Thu, Mar 10, 2016 at 6:12 PM, Tom Morris  wrote:
> > On Wed, Mar 9, 2016 at 7:37 PM, Stas Malyshev 
> wrote:
> > From a machine processing point of view, a more interesting statement is
> probably:
> >
> > wd:Q1000336 owl:sameAs <https://rdf.freebase.com/ns/m.03pvzn>
>
> Yes, but this proposal matches part of the discussion...


Actually, it doesn't, but for some reason you chose not to quote the
original URL which showed the difference. The URL
https://www.freebase.com/m/03pvzn  is
not the same as the URI above.


> owl:sameAs is
> in many cases not appropriate and likely should not be the goal in the
> first place: in many cases there is not such a clear 1-to-1 relation,
> and even if there is a 1-to-1 relation, the above may still be
> inappropriate.


So choose a predicate that you think is more appropriate. The important
thing is that the URIs match so that computers can tell that they're the
same thing.

Tom


Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Young,Jeff (OR)
Couldn't you use P460 when there is doubt?

https://www.wikidata.org/wiki/Property:P460

Jeff

> -Original Message-
> From: Wikidata [mailto:wikidata-boun...@lists.wikimedia.org] On Behalf Of
> Egon Willighagen
> Sent: Thursday, March 10, 2016 12:58 PM
> To: Discussion list for the Wikidata project. 
> Subject: Re: [Wikidata] Status and ETA External ID conversion
> 
> On Thu, Mar 10, 2016 at 6:12 PM, Tom Morris  wrote:
> > On Wed, Mar 9, 2016 at 7:37 PM, Stas Malyshev
>  wrote:
> > From a machine processing point of view, a more interesting statement is
> probably:
> >
> > wd:Q1000336 owl:sameAs <https://rdf.freebase.com/ns/m.03pvzn>
> 
> Yes, but this proposal matches part of the discussion... owl:sameAs is in many
> cases not appropriate and likely should not be the goal in the first place: in
> many cases there is not such a clear 1-to-1 relation, and even if there is a 
> 1-to-
> 1 relation, the above may still be inappropriate.
> 
> Egon
> 
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> LinkedIn: http://se.linkedin.com/in/egonw
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
> ORCID: -0001-7542-0286
> ImpactStory: https://impactstory.org/EgonWillighagen
> 


Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Luiz Augusto
tl;dr

As far as I can see, developers expect properties to be listed by the
community, but the listing is kept in a way that the community normally
treats as a draft that waits for complaints for some time and is only
implemented later... communication issues, eh?

BTW I've listed some properties on [1], all of them examples of ID
uniqueness being a non-issue.

The remaining user subpages may also fall within the very same scope, but I
will wait until the 'community process expectation' versus 'this is a thing
that will get done even if no one says a word' gets clarified.

[1] - https://www.wikidata.org/w/index.php?diff=311688067


Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Egon Willighagen
On Thu, Mar 10, 2016 at 6:12 PM, Tom Morris  wrote:
> On Wed, Mar 9, 2016 at 7:37 PM, Stas Malyshev  wrote:
> From a machine processing point of view, a more interesting statement is 
> probably:
>
> wd:Q1000336 owl:sameAs <https://rdf.freebase.com/ns/m.03pvzn>

Yes, but this proposal matches part of the discussion... owl:sameAs is
in many cases not appropriate and likely should not be the goal in the
first place: in many cases there is not such a clear 1-to-1 relation,
and even if there is a 1-to-1 relation, the above may still be
inappropriate.

Egon

-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Daniel Kinzler
Am 10.03.2016 um 10:26 schrieb Markus Kroetzsch:
> I am surprised by the amount of confusion in this discussion. There is
> absolutely no relationship between mapping of Wikidata values to URIs and the
> external id datatype.

You are correct that such a relationship does not necessarily follow from first
principles. You are however incorrect in saying that there is no relationship in
Wikibase: The way the data model is currently defined and the way mappings are
implemented, we made a conscious decision to support such mappings only for
ExternalId values.

I think it would help the discussion if we could keep apart:
- what follows from formal principles
- what you (or I) consider best
- what the software currently does

> (3) The external id datatype does not provide any mapping and the criteria 
> used
> for it by the community do not imply that such mappings should exist for these
> cases, or that they should not exist for other cases.

That is incorrect from the way Wikibase defines and uses the ExternalId
datatype: the intent is indeed to say that something is an identifier that can
be mapped, and that such a (direct) mapping is not supported for other data
types. (That doesn't mean we will not offer different mappings for other data
types, perhaps URLs for looking up coordinates, etc).

Modeling this explicitly is indeed the reason to have this datatype.

> I am most worried about Daniel's remark. He says that we wants to use external
> ids to identify properties with "values that identify a resource", but does 
> not
> mention the existing, community-supported mechanism for doing just that (2), 
> and
> instead proposes another mechanism (3), which the community is clearly not 
> using
> for this purpose at all.

That's a misunderstanding. The plan is to support P1921 for URI mappings, and we
already do support P1630 for URL mappings. But we intentionally do this only for
ExternalId values, not for plain strings or other types.

So, the technical implementation does follow the community convention, with the
restriction that properties that should use this kind of mapping need to
explicitly be declared to be identifiers. We are also considering implementing
validation and normalization for ExternalId values, but it's not clear yet how
we can safely apply community supplied validation and normalization patterns.

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Tom Morris
On Wed, Mar 9, 2016 at 7:37 PM, Stas Malyshev 
wrote:

>
> > I don't see why it is an issue that some external identifiers don't
> > translate to URIs. What complex logic is involved here? In RDF we should
> > just add the plain identifier like we have it now as the default value,
>
> If we say "since external IDs are in fact URIs, since they refer to
> external databases, then let's mark them as URI property and render them
> as full URI - i.e. let's instead of:
>
> wd:Q1000336 wdt:P646 "/m/03pvzn"
>
> say this:
>
> wd:Q1000336 wdt:P646 
>
> This may make a lot of sense, since the interesting URL that people
> would like to see may be the latter, and the former is kind of
> chopped-off form of it we use for our internal purposes. OTOH, what if
> it wasn't easy or possible to generate the latter from the former
> automatically? Then we need some logic to figure that out.


From a machine processing point of view, a more interesting statement is
probably:

wd:Q1000336 owl:sameAs <https://rdf.freebase.com/ns/m.03pvzn>

This is supposed to redirect to either RDF
https://rdf.freebase.com/rdf/m.03pvzn  or
HTML https://www.freebase.com/m/03pvzn based on content negotiation, but
that seems to be broken right now and it always returns RDF.
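
For readers unfamiliar with content negotiation, a rough sketch of the server-side choice described here. This is not Freebase's actual logic: the `choose_representation` function, the media type sets and the HTML default are assumptions, and q-values in the Accept header are deliberately ignored to keep it short. The two target URLs are the ones given above.

```python
RDF_TYPES = {"application/rdf+xml", "text/turtle", "application/n-triples"}
HTML_TYPES = {"text/html", "application/xhtml+xml"}

TARGETS = {
    "rdf": "https://rdf.freebase.com/rdf/m.03pvzn",
    "html": "https://www.freebase.com/m/03pvzn",
}


def choose_representation(accept_header: str) -> str:
    """Return "rdf" or "html" for an Accept header (first match wins, q-values ignored)."""
    for part in accept_header.split(","):
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type in RDF_TYPES:
            return "rdf"
        if media_type in HTML_TYPES:
            return "html"
    return "html"  # assumed default when nothing recognizable is requested


# A browser-style header is sent to the HTML page, an RDF client to the RDF document.
print(TARGETS[choose_representation("text/html,application/xhtml+xml")])
print(TARGETS[choose_representation("text/turtle")])
```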

Tom


Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Markus Kroetzsch

Dear all,

I am surprised by the amount of confusion in this discussion. There is 
absolutely no relationship between mapping of Wikidata values to URIs 
and the external id datatype. There is no reason in RDF or elsewhere why 
the two should be related.


(1) The mapping of Wikidata strings to URLs is controlled by the 
formatter URL (P1630) property and its qualifiers.
(2) The mapping of Wikidata strings to URIs is controlled by the URI 
pattern for RDF resource (P1921) property and its qualifiers.
(3) The external id datatype does not provide any mapping and the 
criteria used for it by the community do not imply that such mappings 
should exist for these cases, or that they should not exist for other cases.
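
As an illustration of (1) and (2), a minimal sketch of how such a pattern could be applied to a string value. Both patterns below are made-up placeholders (not the real values stored on any property), `expand_pattern` is a hypothetical helper, and percent-encoding the value is an assumption about what a consumer would want, not a statement about what Wikibase does.

```python
from urllib.parse import quote


def expand_pattern(pattern: str, value: str) -> str:
    """Replace the "$1" placeholder with the percent-encoded value."""
    return pattern.replace("$1", quote(value, safe=""))


# Hypothetical patterns, as they might be stored on a property page.
FORMATTER_URL = "https://example.org/view/$1"   # in the role of P1630
URI_PATTERN = "https://example.org/id/$1"       # in the role of P1921

print(expand_pattern(FORMATTER_URL, "abc 123"))  # https://example.org/view/abc%20123
print(expand_pattern(URI_PATTERN, "abc123"))     # https://example.org/id/abc123
```

Note that nothing in this substitution depends on the data type of the property, which is the argument being made in (3).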


We can add any amount of URLs (1) or URIs (2) to the RDF store without 
problems. We only need to make a pick which ones to use (this can be 
done using qualifiers, similar to the used by (P1535) qualifier that is 
already used for formatter URL if there are many). RDF imposes no 
requirements how these URLs/URIs should look, whether they are unique or 
not, or whether they are issued by a particular authority or not. There 
is no danger of confusion since URIs are by their very nature global 
IDs, so you can use many of them on one resource without any problems. 
None of the issues discussed in this thread seems to play much of a role 
in RDF or in RDF consumers.


I am most worried about Daniel's remark. He says that we wants to use 
external ids to identify properties with "values that identify a 
resource", but does not mention the existing, community-supported 
mechanism for doing just that (2), and instead proposes another 
mechanism (3), which the community is clearly not using for this purpose 
at all. In fact, there is no need for a technical change in Wikibase 
here: if we want external URIs in RDF, we just have to add them based on 
the data we find in P1921. If the developers don't like to follow the 
community in this case, they should explain their technical (!) concerns 
to the community and gather feedback.


Regards,

Markus


On 10.03.2016 01:37, Stas Malyshev wrote:

Hi!


In theory, having an identifier datatype and rendering strings as urls
are two separate things. We could dispatch the rendering based on
property_info and support the "formatter url" property for more values
(eg. coordinates) without even having an identifier datatype. It is just
a good idea to conceptually separate external identifiers from other
string values.


Correct in theory. In practice however if we create an implication between
the two, we need to be careful to not create cases where it would be
hard for automatic tools to produce the correct result.


I don't see why it is an issue that some external identifiers don't
translate to URIs. What complex logic is involved here? In RDF we should
just add the plain identifier like we have it now as the default value,


If we say "since external IDs are in fact URIs, since they refer to
external databases, then let's mark them as URI property and render them
as full URI - i.e. let's instead of:

wd:Q1000336 wdt:P646 "/m/03pvzn"

say this:

wd:Q1000336 wdt:P646 

This may make a lot of sense, since the interesting URL that people
would like to see may be the latter, and the former is kind of
chopped-off form of it we use for our internal purposes. OTOH, what if
it wasn't easy or possible to generate the latter from the former
automatically? Then we need some logic to figure that out.


and the expanded urls as derived values if available.


What do you mean by "derived values"?




--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/



Re: [Wikidata] Status and ETA External ID conversion

2016-03-09 Thread Stas Malyshev
Hi!

> In theory, having an identifier datatype and rendering strings as urls
> are two separate things. We could dispatch the rendering based on
> property_info and support the "formatter url" property for more values
> (eg. coordinates) without even having an identifier datatype. It is just
> a good idea to conceptually separate external identifiers from other
> string values.

Correct in theory. In practice however if we create an implication between
the two, we need to be careful to not create cases where it would be
hard for automatic tools to produce the correct result.

> I don't see why it is an issue that some external identifiers don't
> translate to URIs. What complex logic is involved here? In RDF we should
> just add the plain identifier like we have it now as the default value,

Suppose we say: since external IDs are in fact URIs (they refer to
external databases), let's mark them as URI properties and render them
as full URIs, i.e. instead of:

wd:Q1000336 wdt:P646 "/m/03pvzn"

say this:

wd:Q1000336 wdt:P646 

This may make a lot of sense, since the interesting URL that people
would like to see may be the latter, and the former is kind of
chopped-off form of it we use for our internal purposes. OTOH, what if
it wasn't easy or possible to generate the latter from the former
automatically? Then we need some logic to figure that out.
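
The Freebase case itself shows the kind of logic meant here. A sketch, assuming the RDF URI form quoted elsewhere in this thread (https://rdf.freebase.com/ns/m.03pvzn); the "$1" pattern and both helper functions are hypothetical.

```python
def naive_expand(pattern: str, value: str) -> str:
    """Plain "$1" substitution, with no knowledge of the identifier's syntax."""
    return pattern.replace("$1", value)


def freebase_mid_to_uri(mid: str) -> str:
    """Identifier-specific rule: drop the leading slash, turn the rest into dots."""
    local = mid.lstrip("/").replace("/", ".")
    return "https://rdf.freebase.com/ns/" + local


# Plain substitution does not produce the RDF URI ...
assert naive_expand("https://rdf.freebase.com/ns/$1", "/m/03pvzn") == \
    "https://rdf.freebase.com/ns//m/03pvzn"

# ... the stored value has to be transformed first.
assert freebase_mid_to_uri("/m/03pvzn") == "https://rdf.freebase.com/ns/m.03pvzn"
```

So even for a well-known identifier, the stored string and the URI differ by more than a prefix, and the exporter needs to know the transformation.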

> and the expanded urls as derived values if available.

What do you mean by "derived values"?

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Status and ETA External ID conversion

2016-03-09 Thread Bene*
Hey

Am 10.03.2016 um 00:49 schrieb James Heald:
> On 09/03/2016 23:09, Stas Malyshev wrote:
>> Hi!
>>
>>> Technically, the main purpose of having a separate datatype was to
>>> explicitly
>>> model values that identify a resource (in the RDF sense, where
>>> resource means
>>> "anything that can be identified unambiguously"), so we can apply
>>> mappings (e.g.
>>> to URIs and URLs) when exporting and displaying them.
>>
>> We need also to be careful here as we have some external ID-like types
>> that do not translate to URIs. So we should either not convert them to
>> that type, or we'd have complex logic of which type the corresponding
>> properties should be (since we should tell whether this property uses
>> string or URI, and we should be able to do it automatically when
>> generating RDF).
>>
>> Right now I'd suggest not converting such properties, unless there's a
>> good reason to.
>>

In theory, having an identifier datatype and rendering strings as urls
are two separate things. We could dispatch the rendering based on
property_info and support the "formatter url" property for more values
(eg. coordinates) without even having an identifier datatype. It is just
a good idea to conceptually separate external identifiers from other
string values.

I don't see why it is an issue that some external identifiers don't
translate to URIs. What complex logic is involved here? In RDF we should
just add the plain identifier like we have it now as the default value,
and the expanded urls as derived values if available.

> 
> Somewhat related to what Stas writes, can I remind again that we have
> many properties that have single identifiers, that map to different URLs
> for different purposes (eg a URL for human readers, a slightly different
> URL for RDF).
> 
> Both of those URLs should be available /somewhere/ in the triplestore or
> the RDF dump -- but probably neither of them are what one would want the
> simple wdt: form of the property on SPARQL to return.

I agree that for the simple wdt: form we should still have the plain id
without any expanded urls. For the derived values (full urls but also
relevant for other data types), we still need to find a proper way to
represent those derived data values in our data model. As soon as we
tackle that issue, it will be possible to provide those urls in the api
output as well as the serialized RDF.

Best regards
Bene





Re: [Wikidata] Status and ETA External ID conversion

2016-03-09 Thread James Heald

On 09/03/2016 23:09, Stas Malyshev wrote:

> Hi!
>
>> Technically, the main purpose of having a separate datatype was to explicitly
>> model values that identify a resource (in the RDF sense, where resource means
>> "anything that can be identified unambiguously"), so we can apply mappings (e.g.
>> to URIs and URLs) when exporting and displaying them.
>
> We need also to be careful here as we have some external ID-like types
> that do not translate to URIs. So we should either not convert them to
> that type, or we'd have complex logic of which type the corresponding
> properties should be (since we should tell whether this property uses
> string or URI, and we should be able to do it automatically when
> generating RDF).
>
> Right now I'd suggest not converting such properties, unless there's a
> good reason to.



Somewhat related to what Stas writes, can I remind again that we have 
many properties that have single identifiers, that map to different URLs 
for different purposes (eg a URL for human readers, a slightly different 
URL for RDF).


Both of those URLs should be available /somewhere/ in the triplestore or 
the RDF dump -- but probably neither of them are what one would want the 
simple wdt: form of the property on SPARQL to return.


  -- James.



Re: [Wikidata] Status and ETA External ID conversion

2016-03-09 Thread Stas Malyshev
Hi!

> Technically, the main purpose of having a separate datatype was to
> explicitly model values that identify a resource (in the RDF sense, where
> resource means "anything that can be identified unambiguously"), so we can
> apply mappings (e.g. to URIs and URLs) when exporting and displaying them.

We need also to be careful here as we have some external ID-like types
that do not translate to URIs. So we should either not convert them to
that type, or we'd have complex logic of which type the corresponding
properties should be (since we should tell whether this property uses
string or URI, and we should be able to do it automatically when
generating RDF).

Right now I'd suggest not converting such properties, unless there's a
good reason to.
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Status and ETA External ID conversion

2016-03-09 Thread Daniel Kinzler
Am 07.03.2016 um 11:54 schrieb Markus Kroetzsch:
> In general, the community uses several classes for properties that could have
> been used for UI organisation, rather than introducing new datatypes. 

Technically, the main purpose of having a separate datatype was to explicitly
model values that identify a resource (in the RDF sense, where resource means
"anything that can be identified unambiguously"), so we can apply mappings (e.g.
to URIs and URLs) when exporting and displaying them.

Using the datatype for the UI structure is an attempt to kill two birds with one
stone. I think it's a pretty good start, but I agree that we should revisit this
once we have gathered some feedback. It would not be too hard to base the
structure on different criteria (well, depends on the criteria).

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Status and ETA External ID conversion

2016-03-08 Thread Andy Mabbett
On 8 March 2016 at 21:43, Lydia Pintscher  wrote:
> On Tue, Mar 8, 2016 at 10:31 PM Andy Mabbett 
> wrote:
>>
>> On 5 March 2016 at 15:09, Maarten Dammers  wrote:
>>
>> > You call
>> > https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script
>> > a dying pace?
>>
>> Only twelve items converted, in the first 8 days of March; and none
>> since the 2nd...
>
>
> Yes because until 2 days ago there wasn't more to convert. Marius will be
> back from learning for his exams in the next days and then we'll continue.

This was in response to the comment "the whole process has slowed down
to a dying pace"; not a criticism of Marius. That "there wasn't more
to convert" suggests that the original comment was well-founded.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Status and ETA External ID conversion

2016-03-08 Thread Andy Mabbett
On 6 March 2016 at 16:37, Tom Morris  wrote:

> I've looked at the identifier list a couple of times with an eye towards
> helping with the curation, but I could never make heads nor tails of what
> the criteria were, whether there was consensus about the criteria, why some
> perfectly acceptable identifiers were being vehemently argued against and
> on what grounds, etc. The "community" driving this process on those wiki
> pages seems to be just a handful of vocal and opinionated people. Is that
> going to generate good results?

No - and you're correct in your assessment.

We haven't even managed to convert ISBN-10, ISBN-13, ISSN, or various
ISO International Standard identifiers.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Status and ETA External ID conversion

2016-03-08 Thread Lydia Pintscher
On Tue, Mar 8, 2016 at 10:31 PM Andy Mabbett 
wrote:

> On 5 March 2016 at 15:09, Maarten Dammers  wrote:
>
> > You call
> > https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script
> > a dying pace?
>
> Only twelve items converted, in the first 8 days of March; and none
> since the 2nd...
>

Yes, because until 2 days ago there wasn't more to convert. Marius will be
back from studying for his exams in the next few days and then we'll continue.

Cheers
Lydia
-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata



Re: [Wikidata] Status and ETA External ID conversion

2016-03-08 Thread Andy Mabbett
On 5 March 2016 at 15:09, Maarten Dammers  wrote:

> You call
> https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script
> a dying pace?

Only twelve items converted, in the first 8 days of March; and none
since the 2nd...

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Status and ETA External ID conversion

2016-03-07 Thread Egon Willighagen
On Mon, Mar 7, 2016 at 9:13 AM, Lydia Pintscher
 wrote:
> Ok. I think we're making this much more complicated than necessary. The
> question you should ask yourself is: Does this identify a concept in another
> database/website/...? Nice to have: a website to link to.
> Once we have that we can look at corner cases and exceptions.

OK, thanks for the clarification. Then I will oppose arguments about
uniqueness with my opinions, experiences, and arguments, and focus on
this instead.

This helps a lot!

Egon




-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-07 Thread Markus Kroetzsch

On 07.03.2016 09:13, Lydia Pintscher wrote:

> On Mon, Mar 7, 2016 at 2:57 AM Tom Morris wrote:
>> On Sun, Mar 6, 2016 at 5:31 PM, Lydia Pintscher wrote:
>>> On Sun, Mar 6, 2016 at 10:56 PM Stas Malyshev wrote:
>>>> Is there a process somewhere of how the checking is done, what are
>>>> criteria, etc.? I've read
>>>> https://www.wikidata.org/wiki/User:Addshore/Identifiers but there's a
>>>> lot of discussion but not clear if it ever came to some end. Also not
>>>> clear what the process is - should I just move a property I like to
>>>> "good to convert"? Should I run it through some checklist first? Should
>>>> I ask somebody?
>>>
>>> Yes. Good ones should be moved to good to convert. If no-one
>>> disagrees we'll convert them.
>>
>> So, no decision criteria? Just whatever we individually like?
>>
>>>> What are the rules for "disputed" - is some process for review planned?
>>>
>>> Let's concentrate on the ones people can agree on for now. We'll
>>> tackle the ones that are disputed in the next step. If editors
>>> can't sort it out I will make an executive decision at some
>>> point but I don't think this will be needed.
>>
>> I think the fact that some obvious good identifiers like IMDb have
>> been blocked has made potential contributors unsure how to evaluate
>> other candidates which would also, on the surface, seem obviously good.
>>
>> Perhaps since the criteria aren't being used, someone could just
>> delete all the proposed criteria from the page and replace the old
>> text with something like "Whatever you, personally, think is best"
>> so that people know what's expected of them? That might help break
>> the logjam. I know it would make me more comfortable in contributing.
>
> Ok. I think we're making this much more complicated than necessary. The
> question you should ask yourself is: Does this identify a concept in
> another database/website/...? Nice to have: a website to link to.
> Once we have that we can look at corner cases and exceptions.


The community actually already has a class for such properties:

"Wikidata property representing a unique identifier" 
http://www.wikidata.org/entity/Q19847637


In general, the community uses several classes for properties that could 
have been used for UI organisation, rather than introducing new 
datatypes. The current discussion is caused mainly by the fact that 
there is just *one* new datatype, but many types of identifiers based on 
different criteria -- so people argue which one the new datatype should 
represent. The classes used on properties are much less controversial, 
because one just has one for each criterion that people consider 
relevant. For example, there also is


"multi-source external identifier"
http://www.wikidata.org/entity/Q21264328

There are many other classes that could be used in the interface, e.g., 
"Wikidata property for human relationships" 
 that one could use very well 
to group properties. One would not need to use all classes to group 
properties: there would be a (short) list that the community would 
decide on. I think this is the best approach to get reasonable property 
groups Reasonator-style into Wikidata at some point. It works much 
better than creating new datatypes for each case, it can build on 
existing data (rather than starting new discussions on datatype 
conversion), and it has the advantage that it can also group properties 
of different types.
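
A minimal sketch of such class-based grouping, assuming the class memberships have already been fetched. The property ids "P-A", "P-B" and "P-C" and their memberships are invented placeholders; only the two class ids are the ones mentioned above.

```python
from typing import Dict, List, Set


def group_properties(
    property_classes: Dict[str, Set[str]], ui_groups: List[str]
) -> Dict[str, List[str]]:
    """Group properties by the (short) list of classes chosen for the UI.

    A property may appear in several groups; properties that match none of
    the chosen classes end up under "other".
    """
    groups: Dict[str, List[str]] = {cls: [] for cls in ui_groups}
    groups["other"] = []
    for prop, classes in sorted(property_classes.items()):
        matched = [cls for cls in ui_groups if cls in classes]
        for cls in matched:
            groups[cls].append(prop)
        if not matched:
            groups["other"].append(prop)
    return groups


# Placeholder properties with invented class memberships.
property_classes = {
    "P-A": {"Q19847637"},
    "P-B": {"Q19847637", "Q21264328"},
    "P-C": set(),
}
groups = group_properties(property_classes, ["Q19847637", "Q21264328"])
```

Because the grouping only reads class statements, it works the same way for properties of any datatype.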


Markus


--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/



Re: [Wikidata] Status and ETA External ID conversion

2016-03-07 Thread Pine W
This use of a WMF email account raises some legal and wikipolitical
ambiguities that are best avoided. I strongly recommend using a non-WMF
email account for anyone who is speaking outside of a WMF role. Pinging
James to ask for clarification on the policy. And let's fork this portion
of the discussion. (:

Pine
On Mar 7, 2016 00:39, "Stas Malyshev"  wrote:

> Hi!
>
> > Yes, sure, your free time is a different matter. I just thought you are
> > speaking as a WMF employee here, since you were using this email. I am
>
> It's Sunday here, so no :) I do use two separate logins for WMF official
> and volunteer work on Wiki, but using two emails is too cumbersome for me.
>
> > probably over-sensitive there since I am used to the very strict
> > policies of WMDE. They are very careful to keep paid and private
> > activities separate by using different accounts.
>
> Surely, it is common in WMF too. But again two email accounts seems
> excessive to me. Usually it's pretty clear from the context but if
> needed, I will clarify.
> --
> Stas Malyshev
> smalys...@wikimedia.org


Re: [Wikidata] Status and ETA External ID conversion

2016-03-07 Thread Stas Malyshev
Hi!

> Yes, sure, your free time is a different matter. I just thought you are
> speaking as a WMF employee here, since you were using this email. I am

It's Sunday here, so no :) I do use two separate logins for WMF official
and volunteer work on Wiki, but using two emails is too cumbersome for me.

> probably over-sensitive there since I am used to the very strict
> policies of WMDE. They are very careful to keep paid and private
> activities separate by using different accounts.

Surely, it is common in WMF too. But again two email accounts seems
excessive to me. Usually it's pretty clear from the context but if
needed, I will clarify.
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Status and ETA External ID conversion

2016-03-07 Thread Markus Kroetzsch

On 06.03.2016 23:31, Stas Malyshev wrote:

> Hi!
>
>> In your case, however, the answer probably is: you cannot contribute
>> there at all, since you are a Wikimedia employee and this is a
>> content-related community discussion. ;-)
>
> Many WMF employees contribute to wikis in their non-work time, as far as
> I know. I don't even seek to participate in the discussion (though I
> don't think WMF employment would disqualify me from contributing in
> volunteer capacity, given my affiliations - as they are - are clearly
> stated) - but only to know the results so I could contribute in editor
> capacity, following whatever rules are there.


Yes, sure, your free time is a different matter. I just thought you are 
speaking as a WMF employee here, since you were using this email. I am 
probably over-sensitive there since I am used to the very strict 
policies of WMDE. They are very careful to keep paid and private 
activities separate by using different accounts.


Markus

--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/



Re: [Wikidata] Status and ETA External ID conversion

2016-03-07 Thread Lydia Pintscher
On Mon, Mar 7, 2016 at 2:57 AM Tom Morris  wrote:

> On Sun, Mar 6, 2016 at 5:31 PM, Lydia Pintscher <
> lydia.pintsc...@wikimedia.de> wrote:
>
>> On Sun, Mar 6, 2016 at 10:56 PM Stas Malyshev 
>> wrote:
>>
>>> Is there a process somewhere of how the checking is done, what are
>>> criteria, etc.? I've read
>>> https://www.wikidata.org/wiki/User:Addshore/Identifiers but there's a
>>> lot of discussion but not clear if it ever came to some end. Also not
>>> clear what the process is - should I just move a property I like to
>>> "good to convert"? Should I run it through some checklist first? Should
>>> I ask somebody?
>>>
>>
>> Yes. Good ones should be moved to good to convert. If no-one disagrees
>> we'll convert them.
>>
>
> So, no decision criteria? Just whatever we individually like?
>
>>> What are the rules for "disputed" - is some process for review planned?
>>>
>>
>> Let's concentrate on the ones people can agree on for now. We'll tackle
>> the ones that are disputed in the next step. If editors can't sort it out I
>> will make an executive decision at some point but I don't think this will
>> be needed.
>>
>
> I think the fact that some obvious good identifiers like IMDb have been
> blocked has made potential contributors unsure how to evaluate other
> candidates which would also, on the surface, seem obviously good.
>
> Perhaps since the criteria aren't being used, someone could just delete
> all the proposed criteria from the page and replace the old text with
> something like "Whatever you, personally, think is best" so that people
> know what's expected of them? That might help break the logjam. I know it
> would make me more comfortable in contributing.
>


Ok. I think we're making this much more complicated than necessary. The
question you should ask yourself is: Does this identify a concept in
another database/website/...? Nice to have: a website to link to.
Once we have that we can look at corner cases and exceptions.

Cheers
Lydia
-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata



Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Tom Morris
On Sun, Mar 6, 2016 at 5:31 PM, Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Sun, Mar 6, 2016 at 10:56 PM Stas Malyshev 
> wrote:
>
>> Is there a process somewhere of how the checking is done, what are
>> criteria, etc.? I've read
>> https://www.wikidata.org/wiki/User:Addshore/Identifiers but there's a
>> lot of discussion but not clear if it ever came to some end. Also not
>> clear what the process is - should I just move a property I like to
>> "good to convert"? Should I run it through some checklist first? Should
>> I ask somebody?
>>
>
> Yes. Good ones should be moved to good to convert. If no-one disagrees
> we'll convert them.
>

So, no decision criteria? Just whatever we individually like?

>> What are the rules for "disputed" - is some process for review planned?
>>
>
> Let's concentrate on the ones people can agree on for now. We'll tackle
> the ones that are disputed in the next step. If editors can't sort it out I
> will make an executive decision at some point but I don't think this will
> be needed.
>

I think the fact that some obvious good identifiers like IMDb have been
blocked has made potential contributors unsure how to evaluate other
candidates which would also, on the surface, seem obviously good.

Perhaps since the criteria aren't being used, someone could just delete all
the proposed criteria from the page and replace the old text with something
like "Whatever you, personally, think is best" so that people know what's
expected of them? That might help break the logjam. I know it would make me
more comfortable in contributing.

Tom


Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Lydia Pintscher
On Sun, Mar 6, 2016 at 10:56 PM Stas Malyshev 
wrote:

> Is there a process somewhere of how the checking is done, what are
> criteria, etc.? I've read
> https://www.wikidata.org/wiki/User:Addshore/Identifiers but there's a
> lot of discussion but not clear if it ever came to some end. Also not
> clear what the process is - should I just move a property I like to
> "good to convert"? Should I run it through some checklist first? Should
> I ask somebody?
>

Yes. Good ones should be moved to good to convert. If no-one disagrees
we'll convert them.


> What are the rules for "disputed" - is some process for review planned?
>

Let's concentrate on the ones people can agree on for now. We'll tackle the
ones that are disputed in the next step. If editors can't sort it out I
will make an executive decision at some point but I don't think this will
be needed.

Cheers
Lydia
-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata



Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Stas Malyshev
Hi!

> In your case, however, the answer probably is: you cannot contribute
> there at all, since you are a Wikimedia employee and this is a
> content-related community discussion. ;-)

Many WMF employees contribute to wikis in their non-work time, as far as
I know. I don't even seek to participate in the discussion (though I
don't think WMF employment would disqualify me from contributing in
volunteer capacity, given my affiliations - as they are - are clearly
stated) - but only to know the results so I could contribute in editor
capacity, following whatever rules are there.
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Markus Kroetzsch

On 06.03.2016 22:56, Stas Malyshev wrote:

> Hi!
>
>> The community is checking each property to verify it should be converted:
>>
>> https://www.wikidata.org/wiki/User:Addshore/Identifiers/0
>>
>> https://www.wikidata.org/wiki/User:Addshore/Identifiers/1
>>
>> https://www.wikidata.org/wiki/User:Addshore/Identifiers/2
>
> Is there a process somewhere of how the checking is done, what are
> criteria, etc.? I've read
> https://www.wikidata.org/wiki/User:Addshore/Identifiers but there's a
> lot of discussion but not clear if it ever came to some end. Also not
> clear what the process is - should I just move a property I like to
> "good to convert"? Should I run it through some checklist first? Should
> I ask somebody?
> What are the rules for "disputed" - is some process for review planned?
>
> I think some more definite statement would help, especially to people
> willing to contribute.


+1 I have had the same questions.

In your case, however, the answer probably is: you cannot contribute 
there at all, since you are a Wikimedia employee and this is a 
content-related community discussion. ;-)


Best,

Markus






--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/



Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Stas Malyshev
Hi!

> The community is checking each property to verify it should be converted:
> 
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/0
> 
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/1
> 
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/2

Is there a process somewhere of how the checking is done, what are
criteria, etc.? I've read
https://www.wikidata.org/wiki/User:Addshore/Identifiers but there's a
lot of discussion but not clear if it ever came to some end. Also not
clear what the process is - should I just move a property I like to
"good to convert"? Should I run it through some checklist first? Should
I ask somebody?
What are the rules for "disputed" - is some process for review planned?

I think some more definite statement would help, especially to people
willing to contribute.
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Tom Morris
If an identifier system provides for merging of entities along with the
retention of both their previous IDs (as all good identifier systems which
guarantee stable identifiers should), duplicate IDs are inevitable.  Well
known examples include Freebase, MusicBrainz, OpenLibrary, and yes, even
Wikipedia & Wikidata. Duplicates may be silently resolved as is the case
with Freebase, redirects like OpenLibrary and Wiki*, or a hybrid like
MusicBrainz (some page types redirect, others don't). Merged identities may
be relatively rare (Freebase) or more common (OpenLibrary, MusicBrainz),
but they'll always happen. Mandating uniqueness would force the "losing"
IDs to be deleted from Wikidata, losing the benefit that they bring for
enhancing and strengthening the mesh of identifiers.
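
A minimal sketch of why retained IDs need not be a problem, assuming a redirect table is available from the external database. The table, the sample IDs and the function names are invented for illustration and do not describe any particular service's API.

```python
from typing import Dict, Iterable, List, Tuple


def canonical_id(external_id: str, redirects: Dict[str, str]) -> str:
    """Follow merge redirects until an ID is reached that is not redirected."""
    seen = set()
    # The `seen` set guards against accidental redirect cycles.
    while external_id in redirects and external_id not in seen:
        seen.add(external_id)
        external_id = redirects[external_id]
    return external_id


def ids_by_entity(
    statements: Iterable[Tuple[str, str]], redirects: Dict[str, str]
) -> Dict[str, Dict[str, List[str]]]:
    """For each item, group its stored IDs by the entity they resolve to."""
    grouped: Dict[str, Dict[str, List[str]]] = {}
    for item, external_id in statements:
        target = canonical_id(external_id, redirects)
        grouped.setdefault(item, {}).setdefault(target, []).append(external_id)
    return grouped


# Invented data: "old-1" was merged into "new-1" in the external database.
redirects = {"old-1": "new-1"}
statements = [("Q1", "old-1"), ("Q1", "new-1"), ("Q2", "new-2")]
grouped = ids_by_entity(statements, redirects)
```

Here Q1 carries two ID strings, yet both resolve to a single external entity, so keeping the "losing" ID costs nothing in precision.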

I've looked at the identifier list a couple of times with an eye towards
helping with the curation, but I could never make heads nor tails of what
the criteria were, whether there was consensus about the criteria, why some
perfectly acceptable identifiers were being vehemently argued against and
on what grounds, etc. The "community" driving this process on those wiki
pages seems to be just a handful of vocal and opinionated people. Is that
going to generate good results?

Tom

On Sun, Mar 6, 2016 at 4:17 AM, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> Another reason why "uniqueness" is not such a good criterion: it cannot be
> applied to decide the type of a newly created property (no statements, no
> uniqueness score). In general, the fewer statements there are for a
> property, the more likely they are to be unique. The criterion rewards data
> incompleteness (example: if Luca deletes the six multiple ids he mentioned,
> then the property could be converted -- and he could later add the
> statements again). If you think about it, it does not seem like a very good
> idea to make the datatype of a property depend on its current usage in
> Wikidata.
>
> Markus
>
>
> On 05.03.2016 17:15, Markus Krötzsch wrote:
>
>> Hi,
>>
>> I agree with Egon that the uniqueness requirement is rather weird. What
>> it means is that a thing is only considered an "identifier" if it points
>> to a database that uses a similar granularity for modelling the world as
>> Wikidata. If the external database is more fine-grained than Wikidata
>> (several ids for one item), then it is not a valid "identifier",
>> according to the uniqueness idea. I wonder what good this may do. In
>> particular, anybody who cares about uniqueness can easily determine it
>> from the data without any property type that says this.
>>
>> Markus
>>
>>
>> On 05.03.2016 15:35, Egon Willighagen wrote:
>>
>>> On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher wrote:
>>>> On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen wrote:
>>>>> What is the exact process? Do you just plan to wait longer to see if
>>>>> anyone supports/contradicts my tagging? Should I get other Wikidata
>>>>> users and contributors to back up my suggestion?
>>>>
>>>> Add them to the list Katie linked if you think they should be
>>>> converted. We wait a bit to see if anyone disagrees and I also do a
>>>> quick sanity check for each property myself before conversion.
>>>
>>> I am adding comments for now. I am also looking at the comments for
>>> what it takes to be "identifier":
>>>
>>>
>>> https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers
>>>
>>>
>>> What is the resolution in these? There are some strong, often
>>> contradictory, opinions...
>>>
>>> For example, the uniqueness requirement is interesting... if an
>>> identifier must be unique for a single Wikidata entry, this is
>>> effectively disqualifying most identifiers used in the life
>>> sciences... simply because Wikidata rarely has the exact same concept
>>> in Wikidata as it has in the remote database.
>>>
>>> I'm sure we can give examples from any life science field, but
>>> consider a gene: the concept of a gene in Wikidata is not like a gene
>>> sequence in a DNA sequence database. Hence, an identifier from that
>>> database could not be linked as "identifier" to that Wikidata entry.
>>>
>>> Same for most identifiers for small organic compounds (like drugs,
>>> metabolites, etc). I already commented on CAS (P231) and InChI (P234),
>>> both are used as identifier, but none are unique to concepts used as
>>> "types" in Wikidata. The CAS for formaldehyde and formaline is
>>> identical. The InChI may be unique, but only if you strongly type the
>>> definition of a chemical graph instead of a substance (as is now)...
>>> etc.
>>>
>>> So, in order to make a decision which chemical identifiers should be
>>> marked as "identifier" type depends on resolution of those required
>>> characteristics...
>>>
>>> Can you please inform me about the state of those characteristics
>>> (accepted or declined)?
>>>
>>> Egon
>>>
>>>> Cheers
>>>> Lydia
>>>> --
>>>> Lydia Pintscher - http://abou

Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Magnus Manske
Agreed. In Mix'n'match, 145 out of 177 catalogs have at least one instance
of two or more external IDs matched to a single Wikidata item. External
datasets, even curated ones, are messy.
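
A check of this kind can be sketched as follows. This is a minimal illustration, not how Mix'n'match computes its figures; the (item, external id) pairs and the function name are invented for the example. It also shows the effect Markus describes below: removing statements is enough to make a property look unique.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Statement = Tuple[str, str]  # (Wikidata item, external identifier)


def items_with_multiple_ids(statements: Iterable[Statement]) -> List[str]:
    """Return the items that carry two or more distinct external IDs."""
    # set() drops exact duplicate statements before counting per item.
    counts = Counter(item for item, _ in set(statements))
    return sorted(item for item, n in counts.items() if n > 1)


# Invented sample data for one property.
statements = [("Q1", "a"), ("Q1", "b"), ("Q2", "c")]
before = items_with_multiple_ids(statements)

# Dropping one statement makes the property look "unique" again.
after = items_with_multiple_ids([s for s in statements if s != ("Q1", "b")])
```

Since the result depends only on the statements present at the time, it says little about what the identifier is intended to be.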

Maybe the criterion should be "intended to be unique", or somesuch.

On Sun, Mar 6, 2016 at 9:18 AM Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> Another reason why "uniqueness" is not such a good criterion: it cannot
> be applied to decide the type of a newly created property (no
> statements, no uniqueness score). In general, the fewer statements there
> are for a property, the more likely they are to be unique. The criterion
> rewards data incompleteness (example: if Luca deletes the six multiple
> ids he mentioned, then the property could be converted -- and he could
> later add the statements again). If you think about it, it does not seem
> like a very good idea to make the datatype of a property depend on its
> current usage in Wikidata.
>
> Markus
>
> On 05.03.2016 17:15, Markus Krötzsch wrote:
> > Hi,
> >
> > I agree with Egon that the uniqueness requirement is rather weird. What
> > it means is that a thing is only considered an "identifier" if it points
> > to a database that uses a similar granularity for modelling the world as
> > Wikidata. If the external database is more fine-grained than Wikidata
> > (several ids for one item), then it is not a valid "identifier",
> > according to the uniqueness idea. I wonder what good this may do. In
> > particular, anybody who cares about uniqueness can easily determine it
> > from the data without any property type that says this.
> >
> > Markus
> >
> >
> > On 05.03.2016 15:35, Egon Willighagen wrote:
> >> On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher
> >>  wrote:
> >>> On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen
> >>> 
> >>>> What is the exact process? Do you just plan to wait longer to see if
> >>>> anyone supports/contradicts my tagging? Should I get other Wikidata
> >>>> users and contributors to back up my suggestion?
> >>>
> >>> Add them to the list Katie linked if you think they should be
> >>> converted. We
> >>> wait a bit to see if anyone disagrees and I also do a quick sanity
> >>> check for
> >>> each property myself before conversion.
> >>
> >> I am adding comments for now. I am also looking at the comments for
> >> what it takes to be "identifier":
> >>
> >>
> https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers
> >>
> >>
> >> What is the resolution in these? There are some strong, often
> >> contradictory, opinions...
> >>
> >> For example, the uniqueness requirement is interesting... if an
> >> identifier must be unique for a single Wikidata entry, this is
> >> effectively disqualifying most identifiers used in the life
> >> sciences... simply because Wikidata rarely has the exact same concept
> >> in Wikidata as it has in the remote database.
> >>
> >> I'm sure we can give examples from any life science field, but
> >> consider a gene: the concept of a gene in Wikidata is not like a gene
> >> sequence in a DNA sequence database. Hence, an identifier from that
> >> database could not be linked as "identifier" to that Wikidata entry.
> >>
> >> Same for most identifiers for small organic compounds (like drugs,
> >> metabolites, etc). I already commented on CAS (P231) and InChI (P234),
> >> both are used as identifier, but none are unique to concepts used as
> >> "types" in Wikidata. The CAS for formaldehyde and formalin is
> >> identical. The InChI may be unique, but only if you strongly type the
> >> definition of a chemical graph instead of a substance (as is now)...
> >> etc.
> >>
> >> So, in order to make a decision which chemical identifiers should be
> >> marked as "identifier" type depends on resolution of those required
> >> characteristics...
> >>
> >> Can you please inform me about the state of those characteristics
> >> (accepted or declined)?
> >>
> >> Egon
> >>
> >>> Cheers
> >>> Lydia

Re: [Wikidata] Status and ETA External ID conversion

2016-03-06 Thread Markus Krötzsch
Another reason why "uniqueness" is not such a good criterion: it cannot 
be applied to decide the type of a newly created property (no 
statements, no uniqueness score). In general, the fewer statements there 
are for a property, the more likely they are to be unique. The criterion 
rewards data incompleteness (example: if Luca deletes the six multiple 
ids he mentioned, then the property could be converted -- and he could 
later add the statements again). If you think about it, it does not seem 
like a very good idea to make the datatype of a property depend on its 
current usage in Wikidata.


Markus

On 05.03.2016 17:15, Markus Krötzsch wrote:

Hi,

I agree with Egon that the uniqueness requirement is rather weird. What
it means is that a thing is only considered an "identifier" if it points
to a database that uses a similar granularity for modelling the world as
Wikidata. If the external database is more fine-grained than Wikidata
(several ids for one item), then it is not a valid "identifier",
according to the uniqueness idea. I wonder what good this may do. In
particular, anybody who cares about uniqueness can easily determine it
from the data without any property type that says this.

Markus


On 05.03.2016 15:35, Egon Willighagen wrote:

On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher
 wrote:

On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen


What is the exact process? Do you just plan to wait longer to see if
anyone supports/contradicts my tagging? Should I get other Wikidata
users and contributors to back up my suggestion?


Add them to the list Katie linked if you think they should be
converted. We
wait a bit to see if anyone disagrees and I also do a quick sanity
check for
each property myself before conversion.


I am adding comments for now. I am also looking at the comments for
what it takes to be "identifier":

https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers


What is the resolution in these? There are some strong, often
contradictory, opinions...

For example, the uniqueness requirement is interesting... if an
identifier must be unique for a single Wikidata entry, this is
effectively disqualifying most identifiers used in the life
sciences... simply because Wikidata rarely has the exact same concept
in Wikidata as it has in the remote database.

I'm sure we can give examples from any life science field, but
consider a gene: the concept of a gene in Wikidata is not like a gene
sequence in a DNA sequence database. Hence, an identifier from that
database could not be linked as "identifier" to that Wikidata entry.

Same for most identifiers for small organic compounds (like drugs,
metabolites, etc). I already commented on CAS (P231) and InChI (P234),
both are used as identifier, but none are unique to concepts used as
"types" in Wikidata. The CAS for formaldehyde and formalin is
identical. The InChI may be unique, but only if you strongly type the
definition of a chemical graph instead of a substance (as is now)...
etc.

So, in order to make a decision which chemical identifiers should be
marked as "identifier" type depends on resolution of those required
characteristics...

Can you please inform me about the state of those characteristics
(accepted or declined)?

Egon


Cheers
Lydia

Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Maarten Dammers

Hi Luca,

Op 5-3-2016 om 16:45 schreef Luca Martinelli:


> Point taken, I apologise for using too dramatic tones.

Looks like more people are eager to get this over with and can't wait to
get everything converted.

> Nonetheless, I stick to the point that probably a ">99% unique
> identifier" threshold is too high. Just to make another example
> (disclaimer: I asked for this property since it is yet another
> catalogue that my institution runs), P1949 has not been converted to
> identifier because it has "only 98.82% unique out of 507 uses", which
> translates to only *six* cases out of 505 items which have two P1949
> identifiers.

That's correct. As I said in my previous email: We're first doing the 
easy properties. You can see the easy properties at 
https://www.wikidata.org/wiki/User:ArthurPSmith/Identifiers/1 . The easy 
ones are the ones that have 99%+ single value and 99%+ unique. Compare 
that with https://www.wikidata.org/wiki/User:Addshore/Identifiers/1 and 
you'll notice we still have loads of easy ones we have to process (the 
unchecked list is still quite long).
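
To make the "easy property" rule concrete, here is a minimal Python sketch, assuming the two percentages shown on the report pages are available as fractions. The `PropertyStats` type, the `is_easy` helper and the sample values are hypothetical illustrations and not part of any Wikidata tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyStats:
    """Hypothetical per-property figures, expressed as fractions in [0, 1]."""
    property_id: str
    single_value_share: float  # share of items carrying exactly one value
    unique_share: float        # share of values used on exactly one item


def is_easy(stats: PropertyStats, threshold: float = 0.99) -> bool:
    """Apply the 99%+ single value and 99%+ unique rule described above."""
    return (stats.single_value_share >= threshold
            and stats.unique_share >= threshold)


# Made-up sample values, for illustration only.
candidates = [
    PropertyStats("P-example-1", 0.999, 0.998),
    PropertyStats("P-example-2", 0.999, 0.9882),
]
easy = [s.property_id for s in candidates if is_easy(s)]
print(easy)  # ['P-example-1']
```

A property that misses either threshold is not rejected by this check; it simply falls into the "more difficult" group that is looked at later.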


Once we get those out of the way, we'll get to the more difficult ones. 
I prefer quality over speed here. I don't expect any problems with 
converting P1949.


Maarten




Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Gerard Meijssen
Hoi,
Let's take things slowly. It is vital that we get Wikipedia well connected
first. Plenty of challenges there. If we concentrate on what Wikipedia
needs in all its languages, we will get a perspective of what is notable
for us. Other sources have their criteria.
Thanks,
 GerardM

On 5 March 2016 at 19:56, Andy Mabbett  wrote:

> On 5 March 2016 at 16:15, Markus Krötzsch 
> wrote:
>
> > I agree with Egon that the uniqueness requirement is rather weird. What
> it
> > means is that a thing is only considered an "identifier" if it points to
> a
> > database that uses a similar granularity for modelling the world as
> > Wikidata. If the external database is more fine-grained than Wikidata
> > (several ids for one item), then it is not a valid "identifier",
> according
> > to the uniqueness idea.
>
> Then we should create a Wikidata item for each concept on that
> external database.
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Andy Mabbett
On 5 March 2016 at 16:15, Markus Krötzsch  wrote:

> I agree with Egon that the uniqueness requirement is rather weird. What it
> means is that a thing is only considered an "identifier" if it points to a
> database that uses a similar granularity for modelling the world as
> Wikidata. If the external database is more fine-grained than Wikidata
> (several ids for one item), then it is not a valid "identifier", according
> to the uniqueness idea.

Then we should create a Wikidata item for each concept on that
external database.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Andy Mabbett
On 5 March 2016 at 14:25, Lydia Pintscher  wrote:

> I also do a quick sanity check for each property myself before conversion.

You might also like to do a sanity check on those marked as not
suitable for conversion.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread James Heald

Just do them all, as fast as the bot can go.

Revert them /if/ somebody complains (which is unlikely).

Make this a process of having to contract out for an identifier /not/ to 
be done, rather than having to contract in for it to be done.



Personally, I am rather more interested in what happens next, after the 
datatype-renaming stage is done.


How does the external-ID datatype then evolve?

How does it cope with an external ID possibly having a short-form
representation, a URL for humans (currently specified by P1630 for the
group as a whole), a URL for RDF (currently specified by P1921 for the
group as a whole), and sometimes also a locally preferred name, or a
locally disambiguated name in the external source?
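
As a rough illustration of the "URL for humans" case: a P1630 formatter URL is a template in which `$1` stands for the identifier value. The sketch below shows that substitution in Python. The function name, the example.org template and the choice to percent-encode the identifier are assumptions made for this sketch, not a description of how the Wikidata software does it.

```python
from urllib.parse import quote


def expand_formatter_url(formatter_url: str, external_id: str) -> str:
    """Substitute the identifier for the "$1" placeholder of a formatter URL."""
    if "$1" not in formatter_url:
        raise ValueError("formatter URL has no $1 placeholder")
    # Percent-encoding the identifier is an assumption of this sketch.
    return formatter_url.replace("$1", quote(external_id, safe=""))


# example.org is a placeholder, not a real catalogue.
print(expand_formatter_url("https://example.org/entry/$1", "AB 12/3"))
# https://example.org/entry/AB%2012%2F3
```

The same helper would apply to a P1921-style template for RDF, since only the template differs.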


What becomes its wdt: value for SPARQL?

What other object-values will get hung off its detailed statement form?

What will be specified using qualifiers?


Some more clarifications of current forward thinking on this might also 
help with people's concerns about how to respond to departures from 
strict 1-to-1-ness in the mappings (whether many-to-one or one-to-many).



  -- James.




On 05/03/2016 16:15, Markus Krötzsch wrote:

Hi,

I agree with Egon that the uniqueness requirement is rather weird. What
it means is that a thing is only considered an "identifier" if it points
to a database that uses a similar granularity for modelling the world as
Wikidata. If the external database is more fine-grained than Wikidata
(several ids for one item), then it is not a valid "identifier",
according to the uniqueness idea. I wonder what good this may do. In
particular, anybody who cares about uniqueness can easily determine it
from the data without any property type that says this.

Markus


On 05.03.2016 15:35, Egon Willighagen wrote:

On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher
 wrote:

On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen


What is the exact process? Do you just plan to wait longer to see if
anyone supports/contradicts my tagging? Should I get other Wikidata
users and contributors to back up my suggestion?


Add them to the list Katie linked if you think they should be
converted. We
wait a bit to see if anyone disagrees and I also do a quick sanity
check for
each property myself before conversion.


I am adding comments for now. I am also looking at the comments for
what it takes to be "identifier":

https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers


What is the resolution in these? There are some strong, often
contradictory, opinions...

For example, the uniqueness requirement is interesting... if an
identifier must be unique for a single Wikidata entry, this is
effectively disqualifying most identifiers used in the life
sciences... simply because Wikidata rarely has the exact same concept
in Wikidata as it has in the remote database.

I'm sure we can give examples from any life science field, but
consider a gene: the concept of a gene in Wikidata is not like a gene
sequence in a DNA sequence database. Hence, an identifier from that
database could not be linked as "identifier" to that Wikidata entry.

Same for most identifiers for small organic compounds (like drugs,
metabolites, etc). I already commented on CAS (P231) and InChI (P234),
both are used as identifier, but none are unique to concepts used as
"types" in Wikidata. The CAS for formaldehyde and formalin is
identical. The InChI may be unique, but only if you strongly type the
definition of a chemical graph instead of a substance (as is now)...
etc.

So, in order to make a decision which chemical identifiers should be
marked as "identifier" type depends on resolution of those required
characteristics...

Can you please inform me about the state of those characteristics
(accepted or declined)?

Egon


Cheers
Lydia


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch

Hi,

I agree with Egon that the uniqueness requirement is rather weird. What 
it means is that a thing is only considered an "identifier" if it points 
to a database that uses a similar granularity for modelling the world as 
Wikidata. If the external database is more fine-grained than Wikidata 
(several ids for one item), then it is not a valid "identifier", 
according to the uniqueness idea. I wonder what good this may do. In 
particular, anybody who cares about uniqueness can easily determine it 
from the data without any property type that says this.
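
The point that uniqueness can be read off the data can be sketched in a few lines of Python, assuming the statements for one property are available as (item, external ID) pairs. The function name and the sample pairs are hypothetical; the shared CAS number mirrors the formaldehyde/formalin example from the thread, and the gene entries are made up.

```python
from collections import defaultdict


def uniqueness_report(statements):
    """Return (items with several IDs, IDs shared by several items)."""
    ids_per_item = defaultdict(set)
    items_per_id = defaultdict(set)
    for item, external_id in statements:
        ids_per_item[item].add(external_id)
        items_per_id[external_id].add(item)
    multi_id_items = sorted(i for i, ids in ids_per_item.items() if len(ids) > 1)
    shared_ids = sorted(e for e, items in items_per_id.items() if len(items) > 1)
    return multi_id_items, shared_ids


# Illustrative pairs only; item names stand in for Wikidata items.
statements = [
    ("formaldehyde", "50-00-0"),
    ("formalin", "50-00-0"),   # same CAS number on two items
    ("gene-x", "SEQ-1"),       # one item, two finer-grained external records
    ("gene-x", "SEQ-2"),
]
multi_id_items, shared_ids = uniqueness_report(statements)
print(multi_id_items)  # ['gene-x']
print(shared_ids)      # ['50-00-0']
```

Both directions of non-uniqueness fall out of the same pass over the data, without the property's datatype having to encode either of them.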


Markus


On 05.03.2016 15:35, Egon Willighagen wrote:

On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher
 wrote:

On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen 

What is the exact process? Do you just plan to wait longer to see if
anyone supports/contradicts my tagging? Should I get other Wikidata
users and contributors to back up my suggestion?


Add them to the list Katie linked if you think they should be converted. We
wait a bit to see if anyone disagrees and I also do a quick sanity check for
each property myself before conversion.


I am adding comments for now. I am also looking at the comments for
what it takes to be "identifier":

https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers

What is the resolution in these? There are some strong, often
contradictory, opinions...

For example, the uniqueness requirement is interesting... if an
identifier must be unique for a single Wikidata entry, this is
effectively disqualifying most identifiers used in the life
sciences... simply because Wikidata rarely has the exact same concept
in Wikidata as it has in the remote database.

I'm sure we can give examples from any life science field, but
consider a gene: the concept of a gene in Wikidata is not like a gene
sequence in a DNA sequence database. Hence, an identifier from that
database could not be linked as "identifier" to that Wikidata entry.

Same for most identifiers for small organic compounds (like drugs,
metabolites, etc). I already commented on CAS (P231) and InChI (P234),
both are used as identifier, but none are unique to concepts used as
"types" in Wikidata. The CAS for formaldehyde and formalin is
identical. The InChI may be unique, but only if you strongly type the
definition of a chemical graph instead of a substance (as is now)...
etc.

So, in order to make a decision which chemical identifiers should be
marked as "identifier" type depends on resolution of those required
characteristics...

Can you please inform me about the state of those characteristics
(accepted or declined)?

Egon


Cheers
Lydia


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Luca Martinelli
2016-03-05 16:09 GMT+01:00 Maarten Dammers :
> Hi Luca,
>
> Op 5-3-2016 om 14:30 schreef Luca Martinelli:
>>
>> Probably the threshold we set up for the conversion is too high, and
>> this might be one of the causes why the whole process has slowed down
>> to a dying pace.
>
> You call
> https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script a
> dying pace?
>
> Instead of complaining here people should participate in
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/0 . Still plenty of
> easy properties that are clearly distinct, unique and have an external url.
> It doesn't make sense to discus the more complicated cases if we haven't
> gotten the easy cases out of the way yet.

Point taken, I apologise for using too dramatic tones.

Nonetheless, I stick to the point that probably a ">99% unique
identifier" threshold is too high. Just to make another example
(disclaimer: I asked for this property since it is yet another
catalogue that my institution runs), P1949 has not been converted to
identifier because it has "only 98.82% unique out of 507 uses", which
translates to only *six* cases out of 505 items which have two P1949
identifiers.
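
As a quick check of that figure, here is a small Python sketch. It assumes the report computes the share as (uses minus non-unique uses) divided by uses, which is a guess that happens to reproduce the quoted 98.82%; the helper names are made up for this example.

```python
def unique_share(uses: int, non_unique: int) -> float:
    """Fraction of uses that are unique, under the assumed formula."""
    return (uses - non_unique) / uses


def max_exceptions(uses: int, threshold_percent: int) -> int:
    """Largest number of non-unique uses that still meets the threshold."""
    return uses * (100 - threshold_percent) // 100


print(f"{unique_share(507, 6):.2%}")  # 98.82%
print(max_exceptions(507, 99))        # 5
```

Under this assumption, a property with 507 uses tolerates at most five exceptions at the 99% threshold, so six exceptions miss it by a single statement.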

Moreover, I did not intervene because of my blatant conflict of interest
AND because I do not know with whom to discuss this or where, not even
for the general "what is an identifier" discussion. Probably there is a
place where this discussion is going on, and I apologise again for not
knowing (though I have some pretty good excuses), and I'm serious when
I say that I'd be thankful if you could please point me in the
general direction of where this is happening. :)
(https://www.wikidata.org/wiki/User:Addshore/Identifiers maybe? Though
that discussion seems to be pretty blocked)

L.



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Maarten Dammers

Hi Luca,

Op 5-3-2016 om 14:30 schreef Luca Martinelli:

Probably the threshold we set up for the conversion is too high, and
this might be one of the causes why the whole process has slowed down
to a dying pace.
You call 
https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script a 
dying pace?


Instead of complaining here people should participate in 
https://www.wikidata.org/wiki/User:Addshore/Identifiers/0 . Still plenty 
of easy properties that are clearly distinct, unique and have an 
external url.
It doesn't make sense to discuss the more complicated cases if we haven't
gotten the easy cases out of the way yet.


Maarten




Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Never mind. I found these among the ones already done.

Egon

On Sat, Mar 5, 2016 at 3:42 PM, Egon Willighagen
 wrote:
> Mmm... I previously added a few chemical identifiers, like KEGG,
> ChEBI, DrugBank, but I cannot find them anymore... :/
>
> Egon
>
> On Sat, Mar 5, 2016 at 3:16 PM, Egon Willighagen
>  wrote:
>> Hi Lydia, all,
>>
>> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
>>  wrote:
>>> On 05.03.2016 14:45, Lydia Pintscher wrote:
>>>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>>>> are exposed to the separation in the UI now and start noticing the ones
>>>> that intuitively should be moved into the identifier section.
>>>
>>> Ok, let's see what happens. I am not saying that the other criteria applied
>>> now in the discussions are bad. It's just another use of the datatype than I
>>> would have expected.
>>
>> I'm one of the people who noticed the separation and indeed wondered
>> why some of the chemistry-related identifiers I tagged and added in
>> the long lists of identifiers were not included yet...
>>
>> What is the exact process? Do you just plan to wait longer to see if
>> anyone supports/contradicts my tagging? Should I get other Wikidata
>> users and contributors to back up my suggestion?
>>
>> Originally, I thought the idea was just to remove/leave/add them in/to
>> the list, but people started making comments now. I will do this more
>> explicitly now. Also for the IDs I added.
>>
>> Egon



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Mmm... I previously added a few chemical identifiers, like KEGG,
ChEBI, DrugBank, but I cannot find them anymore... :/

Egon

On Sat, Mar 5, 2016 at 3:16 PM, Egon Willighagen
 wrote:
> Hi Lydia, all,
>
> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
>  wrote:
>> On 05.03.2016 14:45, Lydia Pintscher wrote:
>>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>>> are exposed to the separation in the UI now and start noticing the ones
>>> that intuitively should be moved into the identifier section.
>>
>> Ok, let's see what happens. I am not saying that the other criteria applied
>> now in the discussions are bad. It's just another use of the datatype than I
>> would have expected.
>
> I'm one of the people who noticed the separation and indeed wondered
> why some of the chemistry-related identifiers I tagged and added in
> the long lists of identifiers were not included yet...
>
> What is the exact process? Do you just plan to wait longer to see if
> anyone supports/contradicts my tagging? Should I get other Wikidata
> users and contributors to back up my suggestion?
>
> Originally, I thought the idea was just to remove/leave/add them in/to
> the list, but people started making comments now. I will do this more
> explicitly now. Also for the IDs I added.
>
> Egon



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher
 wrote:
> On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen 
>> What is the exact process? Do you just plan to wait longer to see if
>> anyone supports/contradicts my tagging? Should I get other Wikidata
>> users and contributors to back up my suggestion?
>
> Add them to the list Katie linked if you think they should be converted. We
> wait a bit to see if anyone disagrees and I also do a quick sanity check for
> each property myself before conversion.

I am adding comments for now. I am also looking at the comments for
what it takes to be "identifier":

https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers

What is the resolution in these? There are some strong, often
contradictory, opinions...

For example, the uniqueness requirement is interesting... if an
identifier must be unique for a single Wikidata entry, this is
effectively disqualifying most identifiers used in the life
sciences... simply because Wikidata rarely has the exact same concept
in Wikidata as it has in the remote database.

I'm sure we can give examples from any life science field, but
consider a gene: the concept of a gene in Wikidata is not like a gene
sequence in a DNA sequence database. Hence, an identifier from that
database could not be linked as "identifier" to that Wikidata entry.

Same for most identifiers for small organic compounds (like drugs,
metabolites, etc). I already commented on CAS (P231) and InChI (P234),
both are used as identifier, but none are unique to concepts used as
"types" in Wikidata. The CAS for formaldehyde and formalin is
identical. The InChI may be unique, but only if you strongly type the
definition of a chemical graph instead of a substance (as is now)...
etc.

So, in order to make a decision which chemical identifiers should be
marked as "identifier" type depends on resolution of those required
characteristics...

Can you please inform me about the state of those characteristics
(accepted or declined)?

Egon

> Cheers
> Lydia



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Lydia Pintscher
On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen 
wrote:

> Hi Lydia, all,
>
> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch wrote:
> > On 05.03.2016 14:45, Lydia Pintscher wrote:
> >> Give it another 2 to 3 weeks and it'll get there. More and more editors
> >> are exposed to the separation in the UI now and start noticing the ones
> >> that intuitively should be moved into the identifier section.
> >
> > Ok, let's see what happens. I am not saying that the other criteria
> > applied now in the discussions are bad. It's just another use of the
> > datatype than I would have expected.
>
> I'm one of the people who noticed the separation and indeed wondered
> why some of the chemistry-related identifiers I tagged and added in
> the long lists of identifiers were not included yet...
>
> What is the exact process? Do you just plan to wait longer to see if
> anyone supports/contradicts my tagging? Should I get other Wikidata
> users and contributors to back up my suggestion?


Add them to the list Katie linked if you think they should be converted. We
wait a bit to see if anyone disagrees and I also do a quick sanity check
for each property myself before conversion.

Cheers
Lydia


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Hi Lydia, all,

On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch wrote:
> On 05.03.2016 14:45, Lydia Pintscher wrote:
>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>> are exposed to the separation in the UI now and start noticing the ones
>> that intuitively should be moved into the identifier section.
>
> Ok, let's see what happens. I am not saying that the other criteria applied
> now in the discussions are bad. It's just another use of the datatype than I
> would have expected.

I'm one of the people who noticed the separation and indeed wondered
why some of the chemistry-related identifiers I tagged and added in
the long lists of identifiers were not included yet...

What is the exact process? Do you just plan to wait longer to see if
anyone supports/contradicts my tagging? Should I get other Wikidata
users and contributors to back up my suggestion?

Originally, I thought the idea was just to remove, leave, or add them in
the list, but people have started making comments now. I will do this more
explicitly from now on, also for the IDs I added.

Egon



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread David Cuenca Tudela
Markus, you are not the only one; I am also skeptical about the criteria
used. For me the main problem is perhaps the misunderstanding that the
"external identifier" label creates. What I was actually expecting was
something more like "external references": one place in which to put all
the sources external to Wikidata. But we'll see how it goes.

Cheers,
Micru

On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> On 05.03.2016 14:45, Lydia Pintscher wrote:
>
>> On Sat, Mar 5, 2016 at 1:28 PM Markus Krötzsch
>> <mar...@semantic-mediawiki.org> wrote:
>>
>> Thanks, Katie. I see that the external ID datatype does not work as
>> planned. At least I thought the original idea was to clean up the UI by
>> moving hard-to-understand string IDs to a separate section. From the
>> discussions on these pages, I see that the community uses criteria that
>> are completely unrelated to UI aspects, but have something to do with
>> the degree to which the property encodes a one-to-one mapping. I guess
>> this is also valid, but won't be useful for UI purposes. I will need to
>> use another solution for my case then.
>>
>>
>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>> are exposed to the separation in the UI now and start noticing the ones
>> that intuitively should be moved into the identifier section.
>>
>
> Ok, let's see what happens. I am not saying that the other criteria
> applied now in the discussions are bad. It's just another use of the
> datatype than I would have expected.
>
> Markus
>
>
>> Cheers
>> Lydia



-- 
Etiamsi omnes, ego non


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch

On 05.03.2016 14:45, Lydia Pintscher wrote:
> On Sat, Mar 5, 2016 at 1:28 PM Markus Krötzsch
> <mar...@semantic-mediawiki.org> wrote:
>
>> Thanks, Katie. I see that the external ID datatype does not work as
>> planned. At least I thought the original idea was to clean up the UI by
>> moving hard-to-understand string IDs to a separate section. From the
>> discussions on these pages, I see that the community uses criteria that
>> are completely unrelated to UI aspects, but have something to do with
>> the degree to which the property encodes a one-to-one mapping. I guess
>> this is also valid, but won't be useful for UI purposes. I will need to
>> use another solution for my case then.
>
> Give it another 2 to 3 weeks and it'll get there. More and more editors
> are exposed to the separation in the UI now and start noticing the ones
> that intuitively should be moved into the identifier section.


Ok, let's see what happens. I am not saying that the other criteria 
applied now in the discussions are bad. It's just another use of the 
datatype than I would have expected.


Markus





Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Lydia Pintscher
On Sat, Mar 5, 2016 at 1:28 PM Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> Thanks, Katie. I see that the external ID datatype does not work as
> planned. At least I thought the original idea was to clean up the UI by
> moving hard-to-understand string IDs to a separate section. From the
> discussions on these pages, I see that the community uses criteria that
> are completely unrelated to UI aspects, but have something to do with
> the degree to which the property encodes a one-to-one mapping. I guess
> this is also valid, but won't be useful for UI purposes. I will need to
> use another solution for my case then.
>

Give it another 2 to 3 weeks and it'll get there. More and more editors are
exposed to the separation in the UI now and start noticing the ones that
intuitively should be moved into the identifier section.

Cheers
Lydia


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Luca Martinelli
2016-03-05 13:26 GMT+01:00 Markus Krötzsch:
> Thanks, Katie. I see that the external ID datatype does not work as planned.
> At least I thought the original idea was to clean up the UI by moving
> hard-to-understand string IDs to a separate section. From the discussions on
> these pages, I see that the community uses criteria that are completely
> unrelated to UI aspects, but have something to do with the degree to which
> the property encodes a one-to-one mapping. I guess this is also valid, but
> won't be useful for UI purposes. I will need to use another solution for my
> case then.

My 2c, sorry if I'm going off-topic.

My impression is that, for some properties, we're probably
underestimating some problems that are beyond our control, such
as:
* the possibility that the original catalogue might have some
duplicates, in which case we can actually help the original catalogue
correct them;
* the possibility that the Wikimedia approach and the catalogue's
approach might lead one side to define something as two
different things, while the other side treats it as a whole (for
example, "palace+gardens");
* the possibility that some identifiers *are* standardised, but the
authority has not published a single catalogue, leaving the individual
institutes to maintain their own catalogues (for example, the
International Standard Identifier for Libraries and Related
Organizations, aka P791);
* and so on.

The ISIL one is a particularly important example to me, since I work
for the Italian institution that is actually entitled to conduct the
census of Italian libraries and assign the ISIL code to each and every
library in Italy. There is no single world catalogue of that
identifier? I really don't see that as a problem, as long as there is
at least one national authority that does that job. We're probably
underestimating the fact that not everything has been standardised at
a world level, and that we can live with that just fine.

The threshold we set for the conversion is probably too high, and
this might be one of the reasons why the whole process has slowed
to a crawl.

L.



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch
Thanks, Katie. I see that the external ID datatype does not work as
planned. At least I thought the original idea was to clean up the UI by
moving hard-to-understand string IDs to a separate section. From the 
discussions on these pages, I see that the community uses criteria that 
are completely unrelated to UI aspects, but have something to do with 
the degree to which the property encodes a one-to-one mapping. I guess 
this is also valid, but won't be useful for UI purposes. I will need to 
use another solution for my case then.


Markus

On 05.03.2016 11:20, Katie Filbert wrote:
> On Sat, Mar 5, 2016 at 11:14 AM, Markus Krötzsch
> <mar...@semantic-mediawiki.org> wrote:
>
>> Hi,
>>
>> I noticed that many id properties still use the string datatype
>> (including extremely frequent ids like
>> https://www.wikidata.org/wiki/Property:P213 and
>> https://www.wikidata.org/wiki/Property:P227).
>>
>> Why is the conversion so slow, and when is it supposed to be completed?
>
> The community is checking each property to verify it should be converted:
>
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/0
>
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/1
>
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/2
>
> I'm sure help is welcome in checking properties.
>
> and then we convert them in batches.
>
> Cheers,
> Katie
>
>> Cheers,
>>
>> Markus



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Katie Filbert
On Sat, Mar 5, 2016 at 11:14 AM, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> Hi,
>
> I noticed that many id properties still use the string datatype (including
> extremely frequent ids like https://www.wikidata.org/wiki/Property:P213
> and https://www.wikidata.org/wiki/Property:P227).
>
> Why is the conversion so slow, and when is it supposed to be completed?
>

The community is checking each property to verify it should be converted:

https://www.wikidata.org/wiki/User:Addshore/Identifiers/0

https://www.wikidata.org/wiki/User:Addshore/Identifiers/1

https://www.wikidata.org/wiki/User:Addshore/Identifiers/2

I'm sure help is welcome in checking properties.

and then we convert them in batches.

Cheers,
Katie



>
> Cheers,
>
> Markus
>



-- 
Katie Filbert
Wikidata Developer

Wikimedia Germany e.V. | Tempelhofer Ufer 23-24, 10963 Berlin
Phone (030) 219 158 26-0

http://wikimedia.de

Wikimedia Germany - Society for the Promotion of free knowledge eV Entered
in the register of Amtsgericht Berlin-Charlottenburg under the number 23
855 as recognized as charitable by the Inland Revenue for corporations I
Berlin, tax number 27/681/51985.


[Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch

Hi,

I noticed that many id properties still use the string datatype 
(including extremely frequent ids like 
https://www.wikidata.org/wiki/Property:P213 and 
https://www.wikidata.org/wiki/Property:P227).


Why is the conversion so slow, and when is it supposed to be completed?

Cheers,

Markus
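
A note for readers who want to check the state of a given property
themselves: a property's datatype can be read through the Wikidata API's
wbgetentities module. What follows is only a minimal Python sketch and was
not posted in this thread. The helper names (build_query_url,
datatypes_from_response, still_string, fetch_datatypes) and the User-Agent
string are made up for illustration. It assumes the JSON response has an
"entities" object whose entries carry a "datatype" field, with "string"
marking an unconverted property and "external-id" a converted one.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_query_url(property_ids):
    """Build a wbgetentities URL asking only for the datatype of each property."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(property_ids),  # several ids are separated by "|"
        "props": "datatype",
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def datatypes_from_response(response):
    """Map each property id in a decoded response to its datatype (or None)."""
    entities = response.get("entities", {})
    return {pid: entity.get("datatype") for pid, entity in entities.items()}


def still_string(datatypes):
    """Return the sorted ids whose datatype is still plain 'string'."""
    return sorted(pid for pid, dtype in datatypes.items() if dtype == "string")


def fetch_datatypes(property_ids):
    """Query the live API; needs network access. The User-Agent is a placeholder."""
    request = urllib.request.Request(
        build_query_url(property_ids),
        headers={"User-Agent": "datatype-check-example/0.1"},
    )
    with urllib.request.urlopen(request) as reply:
        return datatypes_from_response(json.load(reply))
```

Calling still_string(fetch_datatypes(["P213", "P227"])) would then list
whichever of the two properties mentioned above still report the string
datatype at the time of the query.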
