Re: [Wikidata] WCQS Beta Downtime beginning Feb 4 18:30 UTC

2021-02-04 Thread Maarten Dammers

Hi Ryan and Guillaume,

Last time I checked, WCQS was short for "Wikimedia Commons Query Service" 
( https://commons.wikimedia.org/wiki/Commons:SPARQL_query_service ), so 
I'm a bit puzzled as to why you posted this on the Wikidata mailing list 
instead of the Wikimedia Commons list. I hope it will be back soon.


Maarten

On 03-02-2021 22:39, Guillaume Lederrey wrote:
We ran some numbers and it looks like the data reload is going to take 
around 2.5 days, during which WCQS will be unavailable. Sorry for this 
service interruption.


On Wed, 3 Feb 2021, 21:16 Guillaume Lederrey wrote:


On Wed, Feb 3, 2021 at 8:53 PM Ryan Kemper <rkem...@wikimedia.org> wrote:

Hi all,

Our host *wcqs-beta-01.eqiad.wmflabs* is running low on disk
space due to its blazegraph journal dataset size. In order to
free up space we will need to take the service down, delete
the journal and re-import from the latest dump. Service
interruption will begin at *Feb 4 18:30 UTC* and continue
until the data reload is complete.


Just to be clear, this is the host behind
https://wcqs-beta.wmflabs.org/.

We'll send out a notification when the downtime begins and
when it ends as well.

*Note*: This doesn't affect WDQS, only the WCQS beta.
___
Wikidata mailing list
Wikidata@lists.wikimedia.org 
https://lists.wikimedia.org/mailman/listinfo/wikidata



-- 
	*Guillaume Lederrey* (he/him)

Engineering Manager
Wikimedia Foundation 


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] ACM UMAP 2021: Third Call-for-Papers - Updated submission information

2021-01-12 Thread Maarten Dammers

Hi Violeta,

On 12-01-2021 00:01, Violeta Ilik wrote:
I want to say I am surprised but no, I am not. This list is full of 
unwelcoming people who somehow continue to thrive here.


This list is very welcoming to people who want to discuss anything related 
to Wikidata; off-topic, spammy conference calls-for-papers, on the other 
hand, are not welcome. Please don't attack other members of this list. 
Play the ball, not the player.


Maarten


Unbelievable.

-vi


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] ACM UMAP 2021: Third Call-for-Papers - Updated submission information

2021-01-11 Thread Maarten Dammers

Hi Oana,

On 11-01-2021 08:58, Oana Inel wrote:

--- Apologies for cross-posting ---


Apologies not accepted. This doesn't seem to be on topic for this list. 
Have a look at 
https://ruben.verborgh.org/blog/2014/01/31/apologies-for-cross-posting/


Maarten



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata-tech] Remote hackathon (May 9 - 11, 2020)

2020-04-20 Thread Maarten Dammers

Hi everyone,

The Wikimedia Hackathon 2020 was supposed to take place in Tirana the 
first weekend of May. As an alternative, we're organizing a remote 
hackathon. Please have a look at 
https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2020/Remote_Hackathon 
and sign up if you're interested in participating.


All of the sessions, projects, and initiatives are run by the 
participants themselves and not overseen by a core organizing team. All 
ideas, including technical projects and sessions, non-technical projects 
and sessions, and social events/social time, are very welcome.


Please spread the word!

Maarten


___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


Re: [Wikidata-tech] Blank node deprecation in WDQS & Wikibase RDF model

2020-04-17 Thread Maarten Dammers

Hi David,

Peter brings up some very valid points and I agree with him. I don't 
really like how you present this as a done deal to the community. It 
looks like you have a software performance problem, you think you have 
found a solution, and you're pushing it through without any community 
consultation.


Maarten

On 17-04-20 16:11, David Causse wrote:

Thanks for the feedback,
just a note to say that I responded via 
https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Query_Service_and_search


David Causse

On Thu, Apr 16, 2020 at 8:16 PM Peter F. Patel-Schneider 
<pfpschnei...@gmail.com> wrote:


I am taking the liberty of replying to the list because of the
problems with the supplied justification for this change that are
part of the original message.

I believe that https://phabricator.wikimedia.org/T244341#5889997
is inadequate
for determining that blank nodes are problematic.  First, the fact
that
determining isomorphism in RDF graphs with blank nodes is
non-polynomial is a
red herring.  If the blank nodes participate in only one triple then
isomorphism remains easy.  Second, the query given to remove a
some-value SNAK
is incorrect in general - it will remove all triples with the
blank node as
object.  (Yes, if the blank nodes found are leaves then no extra
triples are
removed.)  A simpler DELETE WHERE will have the seemingly-desired
result.

This is not to say that blank nodes do not cause problems.
According to the
semantics of both RDF and SPARQL blank nodes are anonymous, so to
repeatedly
access the same blank node in a graph one has to access the stored
graph using
an interface that exposes the retained identity of blank nodes. 
It looks as
if the WDQS is built on a system that has such an interface. As
the WDQS
already uses user-visible features that are not part of SPARQL,
adding (or
maybe even only utilizing) a non-standard interface that is only used
internally would not be a problem.

One problem when using generated URLs to replace blank nodes is
that these
generated URLs have to be guaranteed stable and unique (not just
stable) for
the lifetime of the query service.  Another problem is that yet
another
non-standard function is being introduced, pulling the RDF dump of
Wikidata
yet further from RDF.

So this is a significant change as far as users are concerned that
also has
potential implementation issues.   Why not just use an internal
interface that
exposes a retained identity for blank nodes?

Peter F. Patel-Schneider



On 4/16/20 8:34 AM, David Causse wrote:
> Hi,
>
> This message is relevant for people writing SPARQL queries and
using the
> Wikidata Query Service:
>
> As part of the work of redesigning the WDQS updater[0] we
identified that
> blank nodes[1] are problematic[2] and we plan to deprecate their
usage in
> the wikibase RDF model[3]. To ease the deprecation process we are
> introducing the new function wikibase:isSomeValue() that can be
used in
> place of isBlank() when it was used to filter SomeValue[4].
>
> What does this mean for you: nothing will change for now, we are
only
> interested to know if you encounter any issues with the
> wikibase:isSomeValue() function when used as a replacement of
the isBlank()
> function. More importantly, if you used the isBlank() function
for other
> purposes than identifying SomeValue (unknown values in the UI),
please let
> us know as soon as possible.
>
> The current plan is as follows:
>
> 1. Introduce a new wikibase:isSomeValue() function
> We are at this step. You can already use wikibase:isSomeValue()
in the Query
> Service. Here’s an example query (Humans whose gender we know we
don't know):
> SELECT ?human WHERE {
> ?human wdt:P21 ?gender
> FILTER wikibase:isSomeValue(?gender) .
> }
> You can also search the wikis[8] to find all the pages where the
function
> isBlank is referenced in a SPARQL query.
>
> 2. Generate stable labels for blank nodes in the wikibase RDF output
> Instead of "autogenerated" blank node labels wikidata will now
provide a
> stable label for blank nodes. In other words the wikibase
triples using
> blank nodes such as:
> s:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 ps:P576 _:genid2 ;
> will become
> s:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 ps:P576
> _:1668ace9a6860f7b32569c45fe5a5c0d ;
> This is not a breaking change.
>
> 3. [BREAKING CHANGE] Convert blank nodes to IRIs in the WDQS updater
> At this point some WDQS servers will start returning IRIs such
> as
http://www.wikidata.org/somevalue/1668ace9a6860f7b32569c45fe5a5c0d (the
> exact form of the IRI is 

Re: [Wikidata] WDQS and SPARQL Endpoint Compatibility

2020-03-31 Thread Maarten Dammers

Hi Egon,

On 31-03-20 09:02, Egon Willighagen wrote:


My bot produces a weekly federation report at

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Federation_report



The WikiPathways SPARQL endpoint URL has changed, and I have requested 
an update (Jan 2020, [0]), but no update or reply yet.


Maarten, that is causing the simple query in this report to fail.


I created https://phabricator.wikimedia.org/T249041 for this. The 
Wikimedia site requests board is one of the more active ones in my 
experience. Let's see how it goes. If it goes well, we can just start 
moderating 
https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input and 
create a site request for everything that gets approved.


Maarten



Egon

0.https://www.wikidata.org/wiki/Wikidata_talk:SPARQL_federation_input#Updated_URL_for_the_WikiPathways_SPARQL_endpoint

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] WDQS and SPARQL Endpoint Compatibility

2020-03-30 Thread Maarten Dammers
Since Stas left last year, unfortunately nobody from the WMF has done 
anything with 
https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input . I don't 
know if the new SPARQL people are even aware of this page.


My bot produces a weekly federation report at 
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Federation_report 



Maarten

On 30-03-20 22:41, Lucas Werkmeister wrote:

The current whitelist is documented at
https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/SPARQL_Federation_endpoints
and new additions can be proposed at
https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input.

Cheers,
Lucas

On 30.03.20 20:31, Kingsley Idehen wrote:

All,

I am opening up this thread to discuss the generic support of SPARQL
endpoints by WDQS. Correct me if I am wrong, but right now it can use
SPARQL-FED against a select number of registered endpoints?

As you all know, the LOD Cloud Knowledge Graph is a powerful repository
of loosely-coupled data, information, and knowledge. One that could
really help humans and software agents in the collective quest to defeat
the COVID-19 disease.


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Notability and classing "notable" properties

2019-11-19 Thread Maarten Dammers
Are you looking for 
https://www.wikidata.org/wiki/MediaWiki:Wikibase-SortedProperties ?


On 19-11-19 22:15, Thad Guidry wrote:
When viewing Items on Wikidata that I am researching or quickly having 
to disambiguate, I often end up scrolling down endlessly to see 
important "notable" properties for People.
Some of them we are familiar with, such as "award received" or "notable work".


For example, Frank Lloyd Wright 

So my 3 Questions ::

1. I'm curious if there is already a preference or tool that would 
allow those "popular" or "notable" kinds of properties to be shown 
further up on the Item pages when looking at People and deriving 
Notability for them?


2. More generally, what or who controls the listview-item divs 
inside div.wikibase-statementgrouplistview?
Perhaps one way to look at this is that of "notability", where we can 
definitely see that some properties lend themselves to that concept of 
"notability", like "award received", and others not so much, like 
"sex or gender". For instance, properties that are an instance of

"Wikidata property related to awards, prizes and honours"

3. How do others collect the "notable" properties floating around 
Wikidata?


Thad
https://www.linkedin.com/in/thadguidry/

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata-tech] How to respect throttling and retry-after headers on the Wikidata Query Service.

2019-11-17 Thread Maarten Dammers

Hi Andra,

Pywikibot should take care of that for you, see 
https://github.com/wikimedia/pywikibot/blob/master/pywikibot/comms/http.py#L317 
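
(If you are not using Pywikibot, the underlying idea is simple enough to 
implement yourself. Below is a minimal sketch, not Pywikibot's actual 
implementation, of honouring a Retry-After header with the requests library; 
the user-agent string and the 60-second fallback are just illustrative 
assumptions.)

import time
import requests

WDQS = "https://query.wikidata.org/sparql"

def run_query(query, max_attempts=5):
    """Run a SPARQL query, sleeping for Retry-After seconds when throttled."""
    headers = {"User-Agent": "example-retry-bot/0.1 (someone@example.org)"}
    for _ in range(max_attempts):
        response = requests.get(WDQS, params={"query": query, "format": "json"},
                                headers=headers)
        if response.status_code == 200:
            return response.json()
        if response.status_code in (429, 403):
            # Back off for as long as the service asks; assume seconds, default 60.
            time.sleep(int(response.headers.get("Retry-After", "60")))
        else:
            response.raise_for_status()
    raise RuntimeError("Still throttled after {} attempts".format(max_attempts))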



Maarten

On 02-11-19 11:36, Andra Waagmeester wrote:

Hi,

    I hope this is the right mailing list to discuss this issue.
Some time ago I ran into a series of temporary bans. I thought I had 
managed to tackle this, basically by doing a full stop once the bot gets 
any response status code other than 200.


However, this seems not to have fixed it, since I received the 
following message:


"requests.exceptions.HTTPError: 403 Client Error: You have been banned 
until 2019-10-18T10:21:36.495Z, please respect throttling and 
retry-after headers. for url: https://query.wikidata.org/sparql;


I am looking into this from scratch to see if I can implement a 
better solution, certainly one that really respects the retry-after 
time instead of doing a full stop.


Whatever I try now, I keep getting 200 responses, and I don't want to 
start an excessive bot run just to get into a banned state to see the 
exact header that the bot needs to respect.


Is there an example of such a header which I can use to make my own 
test script?


Or is there example Python code that successfully deals with a 
retry-after header?


Regards,

Andra



___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


Re: [Wikidata] searching for Wikidata items

2019-06-05 Thread Maarten Dammers

Hi Tim,

Pywikibot has generators around the API. For example for search you have 
https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.html#pywikibot.pagegenerators.SearchPageGenerator 
. So basically anything you can search for as a user can also be used as 
a generator in Pywikibot.


Say for example all bands that have "Bush" in their name. We have the 
band Bush at https://www.wikidata.org/wiki/Q247949 . With a bit of a 
trick you can see what the search engine knows about a page: 
https://www.wikidata.org/w/index.php?title=Q247949&action=cirrusdump . 
We can use this to tell the search engine to limit the results to only 
instance of (P31) band (Q215380), see 
https://www.wikidata.org/w/index.php?search=bush+haswbstatement%3A%22P31%3DQ215380%22&title=Special%3ASearch&profile=advanced&fulltext=1&ns0=1&ns120=1 
or as API output at 
https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=bush%20haswbstatement:%22P31=Q215380%22&format=json


Pywikibot accepts the same search string:
>>> import pywikibot
>>> from pywikibot import pagegenerators
>>> query = 'bush haswbstatement:"P31=Q215380"'
>>> repo = pywikibot.Site().data_repository()
>>> searchgen = pagegenerators.SearchPageGenerator(query,site=repo)
>>> for item in searchgen:
... print (item.title())
...
Q1156378
Q16945866
Q16953971
Q247949
Q2928714
Q5001360
Q5001432
Q7720714
Q7757229
>>>

Maarten

On 04-06-19 15:44, Marielle Volz wrote:
Yes, the API is at 
https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=Bush


There's a sandbox where you can play with the various options:
https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&srsearch=Bush


On Tue, Jun 4, 2019 at 2:22 PM Tim Finin > wrote:


What's the best way to search Wikidata for items whose name or
alias matches a string?  The search available via pywikibot seems
to only find a match if the search string is a prefix of an item's
name or alias, so searching for "Bush" does not return any of the
George Bush items. I don't want to use a SPARQL query with a
regex, since I expect that to be slow.

The search box on the Wikidata pages is closer to what I want.  Is
there a good way to call this via an API?

Ideally, I'd like to be able to specify a language and also a set
of types, but I can do that once I've identified candidates based
on a simple match with a query string.
___
Wikidata mailing list
Wikidata@lists.wikimedia.org 
https://lists.wikimedia.org/mailman/listinfo/wikidata


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata-tech] wb_terms redesign

2019-05-04 Thread Maarten Dammers

Hi Alaa,

On 25-04-19 16:38, Alaa Sarhan wrote:
> This is really a defective redesign. It reintroduces the numeric IDs 
that were to be removed by T114902. See also T179928. We should consider 
reintroducing a new table to link unprefixed and prefixed entity IDs.


The new schema has been optimized as much as possible to allow maximum 
scalability, as it will contain a massive amount of data that we hope 
will double or even triple in size as soon as we can.


The new schema has been optimized for your use cases and completely breaks 
any tools combining page table data with Wikibase data. If you really 
cared about tool developers, you wouldn't trash the unprefixed ID.


Maarten


___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech



Re: [Wikidata] Semantic annotation of red links on Wikipedia

2018-09-29 Thread Maarten Dammers
New property proposed at 
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Wikipedia_suggested_article_name



On 28-09-18 11:27, Lucie-Aimée Kaffee wrote:
The idea of linking red links is very interesting, I believe, 
especially as we have Wikidata items for many of the missing articles.
We discussed the concept of "smart red links" (linking to the 
ArticlePlaceholder pages, as someone pointed out before) a while ago, 
documented at 
https://www.mediawiki.org/wiki/Extension:ArticlePlaceholder/Smart_red_links


I believe it's a very interesting direction to explore, especially for 
Wikipedias with a smaller number of articles and therefore naturally a 
higher number of red links.


On Thu, 27 Sep 2018 at 21:06, Maarten Dammers <maar...@mdammers.nl> wrote:


Hello,

On 27-09-18 01:16, Andy Mabbett wrote:
> On 24 September 2018 at 18:48, Maarten Dammers
> <maar...@mdammers.nl> wrote:
>
>> Wouldn't it be nice to be able to make a connection between the
red link on
>> Wikipedia and the Wikidata item?
> This facility already exists:
>
>

https://en.wikipedia.org/wiki/Template:Interlanguage_link#Link_to_Reasonator_and_Wikidata
You seem to have done some selective quoting and selective reading. I
addressed this in my original email:

On 24-09-18 19:48, Maarten Dammers wrote:
> Where to store this link? I'm not sure about that. On some
Wikipedia's
> people have tested with local templates around the red links.
That's
> not structured data, clutters up the Wikitext, it doesn't scale and
> the local communities generally don't seem to like the approach.
> That's not the way to go.
James also shared some links related to this.

Maarten




___
Wikidata mailing list
Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
https://lists.wikimedia.org/mailman/listinfo/wikidata



--
Lucie-Aimée Kaffee
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Looking for "data quality check" bots

2018-09-29 Thread Maarten Dammers

Hi Ettore,


On 26-09-18 14:31, Ettore RIZZA wrote:

Dear all,

I'm looking for Wikidata bots that perform accuracy audits. For 
example, comparing the birth dates of persons with the same date 
indicated in databases linked to the item by an external-id.
Let's have a look at the evolution of automated editing. The first step 
is to add missing data from anywhere. Bots importing date of birth are 
an example of this. The next step is to add data from somewhere with a 
source or add sources to existing unsourced or badly sourced statements. 
As far as I can see that's where we are right now, see for example edits 
like 
https://www.wikidata.org/w/index.php?title=Q41264&type=revision&diff=619653838&oldid=616277912 . 
Of course the next step would be to be able to compare existing 
sourced statements with external data to find differences. But how would 
the work flow be? Take for example Johannes Vermeer ( 
https://www.wikidata.org/wiki/Q41264 ). Extremely well documented and 
researched, but 
http://www.getty.edu/vow/ULANFullDisplay?find500032927 
and https://rkd.nl/nl/explore/artists/80476 combined provide 3 different 
dates of birth and 3 different dates of death. When it comes to these 
kind of date mismatches, it's generally first come, first served (first 
date added doesn't get replaced). This mismatch could show up in some 
report. I can check it as a human and maybe do some adjustments, but how 
would I sign it of to prevent other people from doing the same thing 
over and over again?


With federated SPARQL queries it becomes much easier to generate reports 
of mismatches. See for example 
https://www.wikidata.org/wiki/Property_talk:P1006/Mismatches .
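
To give an idea of what such a mismatch report can be built on (this is a 
sketch, not the actual query behind that page), the outline below assumes 
Pywikibot's SPARQL helper; the SERVICE endpoint URL and the schema:birthDate 
predicate are placeholders for whatever the whitelisted external endpoint 
really uses.

from pywikibot.data import sparql

# Sketch only: compare Wikidata dates of birth with an external endpoint,
# joining on an external identifier (P1006 here). Endpoint URL and remote
# predicate are placeholders.
QUERY = """
SELECT ?item ?wdDob ?extDob WHERE {
  ?item wdt:P1006 ?id ;
        wdt:P569 ?wdDob .
  BIND(IRI(CONCAT("http://data.bibliotheken.nl/id/thes/p", ?id)) AS ?extUri)
  SERVICE <https://example.org/sparql> {
    ?extUri schema:birthDate ?extDob .
  }
  FILTER(YEAR(?wdDob) != YEAR(?extDob))
}
LIMIT 100
"""

for row in sparql.SparqlQuery().select(QUERY):
    print(row['item'], row['wdDob'], row['extDob'])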


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Semantic annotation of red links on Wikipedia

2018-09-27 Thread Maarten Dammers

Hello,

On 27-09-18 01:16, Andy Mabbett wrote:

On 24 September 2018 at 18:48, Maarten Dammers  wrote:


Wouldn't it be nice to be able to make a connection between the red link on
Wikipedia and the Wikidata item?

This facility already exists:


https://en.wikipedia.org/wiki/Template:Interlanguage_link#Link_to_Reasonator_and_Wikidata
You seem to have done some selective quoting and selective reading. I 
addressed this in my original email:


On 24-09-18 19:48, Maarten Dammers wrote:
Where to store this link? I'm not sure about that. On some Wikipedias 
people have tested with local templates around the red links. That's 
not structured data, it clutters up the wikitext, it doesn't scale, and 
the local communities generally don't seem to like the approach. 
That's not the way to go.

James also shared some links related to this.

Maarten




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Semantic annotation of red links on Wikipedia

2018-09-24 Thread Maarten Dammers

Hi James,


On 24-09-18 20:08, James Heald wrote:
The problem, if you don't put something on the wikipage itself, is how 
then do you determine which [[John A. Smith]] a redlink was intended 
to refer to, if there is more than one possibility.
That's a classic disambiguation problem. Most Wikipedias seem to be 
pretty good at dealing with these. At least for the Dutch Wikipedia I 
know people working on disambiguation are quite active and I encounter 
quite a few disambiguated red links. If this really became an 
issue, a qualifier could be used to track from which article (its 
linked item) the link was made. So in the case of Friedrich Ris, that 
would be https://www.wikidata.org/wiki/Q1624113 
(https://nl.wikipedia.org/wiki/Aethriamanta_aethra).


Maarten



But Maarten is right that, at least on en-wiki, the suggestion of 
adding templates to link to Wikidata content has met with considerable 
hostility, expressed in two recent RfCs:
https://en.wikipedia.org/wiki/Wikipedia_talk:Manual_of_Style/Archive_202#RfC:_Linking_to_wikidata 



https://en.wikipedia.org/wiki/Wikipedia_talk:Manual_of_Style/Archive_204#New_RFC_on_linking_to_Wikidata 



  -- James.



On 24/09/2018 18:48, Maarten Dammers wrote:

Hi everyone,

According to https://www.youtube.com/watch?v=TLuM4E6IE5U : "Semantic 
annotation is the process of attaching additional information to 
various concepts (e.g. people, things, places, organizations etc) in 
a given text or any other content. Unlike classic text annotations 
for reader's reference, semantic annotations are used by machines to 
refer to."
(more at 
https://ontotext.com/knowledgehub/fundamentals/semantic-annotation/ )


On Wikipedia a red link is a link to an article that hasn't been 
created (yet) in that language. Often another language does have an 
article about the subject or at least we have a Wikidata item about 
the subject. Take for example 
https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris . It has 
over 250 incoming links, but the person doesn't have an article in 
Dutch. We have a Wikidata item with links to 7 Wikipedias at 
https://www.wikidata.org/wiki/Q116510 , but no way to relate 
https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris with 
https://www.wikidata.org/wiki/Q116510 .


Wouldn't it be nice to be able to make a connection between the red 
link on Wikipedia and the Wikidata item?


Let's assume we have this list somewhere. We would be able to offer 
all sorts of nice features to our users like:

* Hover over the link to get a hovercard in your favorite backup language
* Generate an article placeholder for the user with basic information 
in the local language
* Pre-populate the translate extension so you can translate the 
article from another language

(probably plenty of other good uses)

Where to store this link? I'm not sure about that. On some 
Wikipedias people have tested with local templates around the red 
links. That's not structured data, it clutters up the wikitext, it 
doesn't scale, and the local communities generally don't seem to like 
the approach. That's not the way to go. Maybe a better option would 
be to create a new property on Wikidata to store the name of the 
future article. Something like Q116510: Pxxx -> (nl)"Friedrich Ris". 
That would be easiest because the infrastructure is there and you can 
just build tools on top of it, but I'm afraid this will cause a lot of 
noise on items. A couple of suggestions wouldn't be a problem, but 
what is keeping people from adding the suggestion in 100 languages? 
Or maybe restrict the usage so that a red link must have at least 1 
(or n) incoming links on a Wikipedia before people are allowed to add it?
We could create a new project on the Wikimedia Cloud to store the 
links, but that would be quite an extra time investment to set 
everything up.


What do you think?

Maarten




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata



---
This email has been checked for viruses by AVG.
https://www.avg.com


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Semantic annotation of red links on Wikipedia

2018-09-24 Thread Maarten Dammers

Hi everyone,

According to https://www.youtube.com/watch?v=TLuM4E6IE5U : "Semantic 
annotation is the process of attaching additional information to various 
concepts (e.g. people, things, places, organizations etc) in a given 
text or any other content. Unlike classic text annotations for reader's 
reference, semantic annotations are used by machines to refer to."
(more at 
https://ontotext.com/knowledgehub/fundamentals/semantic-annotation/ )


On Wikipedia a red link is a link to an article that hasn't been created 
(yet) in that language. Often another language does have an article 
about the subject or at least we have a Wikidata item about the subject. 
Take for example 
https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris . It has over 
250 incoming links, but the person doesn't have an article in Dutch. We 
have a Wikidata item with links to 7 Wikipedias at 
https://www.wikidata.org/wiki/Q116510 , but no way to relate 
https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris with 
https://www.wikidata.org/wiki/Q116510 .


Wouldn't it be nice to be able to make a connection between the red link 
on Wikipedia and the Wikidata item?


Let's assume we have this list somewhere. We would be able to offer all 
sorts of nice features to our users like:

* Hover over the link to get a hovercard in your favorite backup language
* Generate an article placeholder for the user with basic information in 
the local language
* Pre-populate the translate extension so you can translate the article 
from another language

(probably plenty of other good uses)

Where to store this link? I'm not sure about that. On some Wikipedias 
people have tested with local templates around the red links. That's not 
structured data, it clutters up the wikitext, it doesn't scale, and the 
local communities generally don't seem to like the approach. That's not 
the way to go. Maybe a better option would be to create a new property 
on Wikidata to store the name of the future article. Something like 
Q116510: Pxxx -> (nl)"Friedrich Ris". That would be easiest because the 
infrastructure is there and you can just build tools on top of it, but 
I'm afraid this will cause a lot of noise on items. A couple of 
suggestions wouldn't be a problem, but what is keeping people from 
adding the suggestion in 100 languages? Or maybe restrict the usage so 
that a red link must have at least 1 (or n) incoming links on a 
Wikipedia before people are allowed to add it?
We could create a new project on the Wikimedia Cloud to store the 
links, but that would be quite an extra time investment to set 
everything up.


What do you think?

Maarten




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Mapping Wikidata to other ontologies

2018-09-22 Thread Maarten Dammers

Hi everyone,

Last week I presented Wikidata at the Semantics conference in Vienna ( 
https://2018.semantics.cc/ ). One question I asked people was: What is 
keeping you from using Wikidata? One of the common responses is that 
it's quite hard to combine Wikidata with the rest of the semantic web. 
We have our own private ontology that's a bit of an island. Most of our 
triples are in our own private format and not available in a more 
generic, more widely used ontology.


Let's pick an example: Claude Lussan. No clue who he is, but my bot 
seems to have added some links and the item isn't too big. Our URI is 
http://www.wikidata.org/entity/Q2977729 and this is the equivalent of 
http://viaf.org/viaf/29578396 and 
http://data.bibliotheken.nl/id/thes/p173983111 . If you look at 
http://www.wikidata.org/entity/Q2977729.rdf this equivalence is 
represented as:

<wdtn:P214 rdf:resource="http://viaf.org/viaf/29578396"/>
<wdtn:P1006 rdf:resource="http://data.bibliotheken.nl/id/thes/p173983111"/>

Also outputting it in a more generic way would probably make using it 
easier than it is right now. Last discussion about this was at 
https://www.wikidata.org/wiki/Property_talk:P1921 , but no response 
since June.


That's one way of linking up, but another way is using equivalent 
property ( https://www.wikidata.org/wiki/Property:P1628 ) and equivalent 
class ( https://www.wikidata.org/wiki/Property:P1709 ). See for example 
sex or gender ( https://www.wikidata.org/wiki/Property:P21 ) for how it's 
mapped to other ontologies. This won't produce easier RDF, but some 
smart downstream users have figured out some SPARQL queries. So linking 
up our properties and classes to other ontologies will make using our 
data easier. This is a first step. Maybe it will be used in the future 
to generate more RDF, maybe not and we'll just document the SPARQL 
approach properly.


The equivalent property and equivalent class are used, but not that 
much. Did anyone already try a structured approach with reporting? I'm 
considering parsing popular ontology descriptions and producing reports 
of what is linked to what so it's easy to make missing links, but I 
don't want to do double work here.
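
To give an idea of the kind of report I mean, a first version could simply 
count the existing equivalent property (P1628) mappings per target namespace. 
A quick sketch against the query service follows; the grouping regex and the 
user-agent string are just illustrative.

import requests

WDQS = "https://query.wikidata.org/sparql"

# Count how many Wikidata properties already map to each external namespace via
# equivalent property (P1628); the same idea works for equivalent class (P1709).
QUERY = """
SELECT ?namespace (COUNT(DISTINCT ?property) AS ?mapped) WHERE {
  ?property wikibase:propertyType ?type ;
            wdt:P1628 ?equivalent .
  BIND(REPLACE(STR(?equivalent), "(^.*[/#]).*$", "$1") AS ?namespace)
}
GROUP BY ?namespace
ORDER BY DESC(?mapped)
"""

response = requests.get(WDQS, params={"query": QUERY, "format": "json"},
                        headers={"User-Agent": "mapping-report-sketch/0.1 (example)"})
for row in response.json()["results"]["bindings"]:
    print(row["namespace"]["value"], row["mapped"]["value"])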


What ontologies are important because these are used a lot? Some of the 
ones I came across:

* https://www.w3.org/2009/08/skos-reference/skos.html
* http://xmlns.com/foaf/spec/
* http://schema.org/
* https://creativecommons.org/ns
* http://dbpedia.org/ontology/
* http://vocab.org/open/

Any suggestions?

Maarten


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Indexing everything (was Re: Indexing all item properties in ElasticSearch)

2018-08-04 Thread Maarten Dammers

Hi Stas and Hay,


On 28-07-18 02:12, Stas Malyshev wrote:

Hi!


I could definitely see a usecase for 1) and maybe for 2). For example,
let's say i remember that one movie that Rutger Hauer played in, just
searching for 'movie rutger hauer' gives back nothing:

https://www.wikidata.org/w/index.php?search=movie+rutger+hauer

While Wikipedia gives back quite a nice list of options:

https://en.wikipedia.org/w/index.php?search=movie+rutger+hauer

Well, this is not going to change with the work we're discussing. The
reason you don't get anything from Wikidata is because "movie" and
"rutger hauer" are labels from different documents and ElasticSearch
does not do joins. We only index each document in itself, and possibly
some additional data, but indexing labels from other documents is now
beyond what we're doing. We could certainly discuss it but that would be
separate (and much bigger) discussion.
Changing the topic because I would like to start this separate and 
bigger discussion. Query and search are quite similar, but also very 
different (if you search you'll run into nice articles like 
https://everypageispageone.com/2011/07/13/search-vs-query/ ). Currently 
our query service is a very strong and complete service, but Wikidata 
search is very poor. Let's take Blade Runner.

* https://www.wikidata.org/wiki/Q184843 is what a human sees
* http://www.wikidata.org/entity/Q184843.json our internal JSON structure
* http://www.wikidata.org/entity/Q184843.rdf source for the query engine
* https://www.wikidata.org/w/index.php?title=Q184843&action=cirrusdump 
what's indexed in the search engine


In my ideal world, everything I see as a human gets indexed into the 
search engine, preferably in a per-language index. For example, for Dutch 
something like a text_nl field with the label, description, aliases, 
statements and references in there. So index *everything* and never see 
a Q-number or P-number in there (an extra incentive for people to add 
labels in their language). Probably also everything duplicated in the 
text field to fall back to. In this index you would have the "movie 
Rutger Hauer", you would have the cast members ("rolverdeling: Harrison 
Ford" etc.). Yes, this will give a significant increase in index size, 
but will make it much easier to actually find things.
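
To make that concrete, here is a rough sketch of what such a per-language 
document for Blade Runner could look like before it is pushed to the search 
index; the field names and the exact selection of rendered statements are 
invented for illustration only.

# Illustrative only: field names (text_en, text_nl, text) and contents are made
# up, but show labels, descriptions and statements rendered with labels.
blade_runner_doc = {
    "id": "Q184843",
    "text_en": [
        "Blade Runner",                      # label
        "1982 film by Ridley Scott",         # description
        "director: Ridley Scott",            # statement rendered with labels
        "cast member: Harrison Ford",
    ],
    "text_nl": [
        "Blade Runner",
        "film uit 1982 van Ridley Scott",
        "regisseur: Ridley Scott",
        "rolverdeling: Harrison Ford",
    ],
    # Everything duplicated into a language-independent field to fall back to.
    "text": [
        "Blade Runner", "1982 film by Ridley Scott",
        "film uit 1982 van Ridley Scott",
        "director: Ridley Scott", "regisseur: Ridley Scott",
        "cast member: Harrison Ford", "rolverdeling: Harrison Ford",
    ],
}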


As for implementation: we already have the logic to serialize our JSON 
to the RDF format. Maybe also add a serialization format for this that 
is easy for search engines to ingest? I noticed Google having a hard time 
indexing some of our items, see for example 
https://www.google.com/search?q=The+Feast+of+the+Seagods+site%3Awikidata.org&ie=utf-8&oe=utf-8 
. DuckDuckGo seems to be doing a better job: 
https://duckduckgo.com/?q=The+Feast+of+the+Seagods+site%3Awikidata.org&t=h_&ia=web 
. Making it easier to index not only for our own search would be a nice 
added benefit.


How feasible is this? Do we already have one or multiple tasks for this 
on Phabricator? Phabricator has gotten a bit unclear when it comes to 
Wikidata search, I think because of misunderstandings between people 
about what the goal of each task is. It might be worthwhile to spend 
some time on structuring that.


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [Wikimedia-l] Solve legal uncertainty of Wikidata

2018-07-04 Thread Maarten Dammers

Hi Mathieu,

On 04-07-18 11:07, mathieu stumpf guntz wrote:

Hi,

Le 19/05/2018 à 03:35, Denny Vrandečić a écrit :


Regarding attribution, commonly it is assumed that you have to 
respect it transitively. That is one of the reasons a license that 
requires BY sucks so hard for data: unlike with text, the attribution 
requirements grow very quickly. It is the same as with modified 
images and collages: it is not sufficient to attribute the last 
author, but all contributors have to be attributed.
If we want our data to be trustworthy, then we need traceability. That 
means reporting this chain of sources as extensively as possible, 
whether or not the license requires it as attribution. CC0 allows this 
traceability to be broken, which makes it an awful license for anyone 
concerned with obtaining reliable data.

A license is not the way to achieve this. We have references for that.


This is why I think that whoever wants to be part of a large 
federation of data on the web, should publish under CC0.
As long as one aims at making a federation of untrustworthy data banks, 
that's perfect. ;)
So I see you started forum shopping (trying to get the Wikimedia-l 
people in) and making contentious, trying-to-be-funny remarks. That's 
usually a good indication that a thread is going nowhere.


No, Wikidata is not going to change the CC0. You seem to be the only 
person wanting that and trying to discredit Wikidata will not help you 
in your crusade. I suggest the people who are still interested in this 
to go to https://phabricator.wikimedia.org/T193728 and make useful 
comments over there.


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata in the LOD Cloud

2018-06-30 Thread Maarten Dammers
The domain mcc.ae is down for email (see 
https://mxtoolbox.com/domain/mcc.ae/ ) and http://www.mcc.ae/ shows a 
for sale sign. Any idea how to reach the maintainer?


Maarten


On 29-06-18 17:27, David Abián wrote:

I guess Wikidata disappeared from the files yesterday, a few minutes
before 14:00 GMT, when a new version of the cloud was generated. It's
probably a mistake/bug in that generation process.


On 29/06/18 at 15:13, Maarten Dammers wrote:

Looks like after the last update Wikidata dropped out again?
https://lod-cloud.net/versions/2018-30-05/lod-data.json contains
Wikidata, but in https://lod-cloud.net/lod-data.json it seems to be
currently missing; it does list Wikidata as a target in some other sets.

CCed the maintainer.

Maarten





___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata in the LOD Cloud

2018-06-29 Thread Maarten Dammers
Looks like after the last update Wikidata dropped out again? 
https://lod-cloud.net/versions/2018-30-05/lod-data.json contains 
Wikidata, but in https://lod-cloud.net/lod-data.json it seems to be 
currently missing; it does list Wikidata as a target in some other sets.


CCed the maintainer.

Maarten


On 27-06-18 22:26, Maarten Dammers wrote:


Hi Léa and Lucas,

Excellent news! https://lod-cloud.net/dataset/wikidata seems to 
contain the info in a more human-readable (and machine-readable) way. 
If we add some URI link, does it automagically appear or does Lucas 
have to do some manual work? I assume Lucas has to do some manual work. 
I would suggest you document this somewhere more central so we don't 
have to bother Lucas all the time for updates. Do you already have a 
Phabricator task for that?


Maarten


On 11-06-18 17:17, Léa Lacroix wrote:

Hello all,

Thanks to Lucas, who fulfilled the necessary requirements, Wikidata now 
appears in the LOD cloud graph: http://lod-cloud.net


Currently, the graph doesn't display all the actual connections of 
Wikidata. The only connections that show up are the properties that 
link to other projects or databases, and having a specific statement 
on them to link to an RDF endpoint.


If you see something missing, you can contribute by adding the 
statement “formatter URI for RDF resource” on properties where the 
resource supports RDF (example 
<https://www.wikidata.org/wiki/Property:P214#P1921>).


You can learn more about the procedure to update the graph and a list 
of the existing and missing datasets here 
<https://www.wikidata.org/wiki/User:Lucas_Werkmeister_%28WMDE%29/LOD_Cloud>, 



Thanks to Lucas and John for making this happen!

--
Léa Lacroix
Project Manager Community Communication for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de <http://www.wikimedia.de>

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg 
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das 
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata




___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata in the LOD Cloud

2018-06-27 Thread Maarten Dammers

Hi Léa and Lucas,

Excellent news! https://lod-cloud.net/dataset/wikidata seems to contain 
the info in a more human-readable (and machine-readable) way. If we add 
some URI link, does it automagically appear or does Lucas have to do some 
manual work? I assume Lucas has to do some manual work. I would suggest 
you document this somewhere more central so we don't have to bother 
Lucas all the time for updates. Do you already have a Phabricator task 
for that?


Maarten


On 11-06-18 17:17, Léa Lacroix wrote:

Hello all,

Thanks to Lucas, who fulfilled the necessary requirements, Wikidata now 
appears in the LOD cloud graph: http://lod-cloud.net


Currently, the graph doesn't display all the actual connections of 
Wikidata. The only connections that show up are the properties that 
link to other projects or databases, and having a specific statement 
on them to link to an RDF endpoint.


If you see something missing, you can contribute by adding the 
statement “formatter URI for RDF resource” on properties where the 
resource supports RDF (example 
).


You can learn more about the procedure to update the graph and a list 
of the existing and missing datasets here 
, 



Thanks to Lucas and John for making this happen!

--
Léa Lacroix
Project Manager Community Communication for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de 

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg 
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das 
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikiata and the LOD cloud

2018-05-04 Thread Maarten Dammers
It almost feels like someone doesn’t want Wikidata in there? Maybe that website 
is maintained by DBpedia fans? Just thinking out loud here because DBpedia is 
very popular in the academic world and Wikidata is a huge threat to that 
popularity.

Maarten

> On 4 May 2018 at 17:20, Denny Vrandečić wrote:
> 
> I'm pretty sure that Wikidata is doing better than 90% of the current bubbles 
> in the diagram.
> 
> If they wanted to have Wikidata in the diagram it would have been there 
> before it was too small to read it. :)
> 
>> On Tue, May 1, 2018 at 7:47 AM Peter F. Patel-Schneider 
>>  wrote:
>> Thanks for the corrections.
>> 
>> So https://www.wikidata.org/entity/Q42 is *the* Wikidata IRI for Douglas
>> Adams.  Retrieving from this IRI results in a 303 See Other to
>> https://www.wikidata.org/wiki/Special:EntityData/Q42, which (I guess) is the
>> main IRI for representations of Douglas Adams and other pages with
>> information about him.
>> 
>> From https://www.wikidata.org/wiki/Special:EntityData/Q42 content
>> negotiation can be used to get the JSON representation (the default), other
>> representations including Turtle, and human-readable information.  (Well
>> actually I'm not sure that this is really correct.  It appears that instead
>> of directly using content negotiation, another 303 See Other is used to
>> provide an IRI for a document in the requested format.)
>> 
>> https://www.wikidata.org/wiki/Special:EntityData/Q42.json and
>> https://www.wikidata.org/wiki/Special:EntityData/Q42.ttl are the useful
>> machine-readable documents containing the Wikidata information about Douglas
>> Adams.  Content negotiation is not possible on these pages.
>> 
>> https://www.wikidata.org/wiki/Q42 is the IRI that produces a human-readable
>> version of the information about Douglas Adams.  Content negotiation is not
>> possible on this page, but it does have link rel="alternate" to the
>> machine-readable pages.
>> 
>> Strangely this page has a link rel="canonical" to itself.  Shouldn't that
>> link be to https://www.wikidata.org/entity/Q42?  There is a human-visible
>> link to this IRI, but there doesn't appear to be any machine-readable link.
>> 
>> RDF links to other IRIs for Douglas Adams are given in RDF pages by
>> properties in the wdtn namespace.  Many, but not all, identifiers are
>> handled this way.  (Strangely ISNI (P213) isn't even though it is linked on
>> the human-readable page.)
>> 
>> So it looks as if Wikidata can be considered as Linked Open Data but maybe
>> some improvements can be made.
>> 
>> 
>> peter
>> 
>> 
>> 
>> On 05/01/2018 01:03 AM, Antoine Zimmermann wrote:
>> > On 01/05/2018 03:25, Peter F. Patel-Schneider wrote:
>> >> As far as I can tell real IRIs for Wikidata are https URIs.  The http IRIs
>> >> redirect to https IRIs.
>> >
>> > That's right.
>> >
>> >>   As far as I can tell no content negotiation is
>> >> done.
>> >
>> > No, you're mistaken. You tried the URL of a wikipage in your curl command.
>> > Those are for human consumption, thus not available in turtle.
>> >
>> > The "real IRIs" of Wikidata entities are like this:
>> > https://www.wikidata.org/entity/Q{NUMBER}
>> >
>> > However, they 303 redirect to
>> > https://www.wikidata.org/wiki/Special:EntityData/Q{NUMBER}
>> >
>> > which is the identifier of a schema:Dataset. Then, if you HTTP GET these
>> > URIs, you can content negotiate them to JSON
>> > (https://www.wikidata.org/wiki/Special:EntityData/Q{NUMBER}.json) or to
>> > turtle (https://www.wikidata.org/wiki/Special:EntityData/Q{NUMBER}.ttl).
>> >
>> >
>> > Surprisingly, there is no connection between the entity IRIs and the 
>> > wikipage
>> > URLs. If one was given the IRI of an entity from Wikidata, and had no
>> > further information about how Wikidata works, they would not be able to
>> > retrieve HTML content about the entity.
>> >
>> >
>> > BTW, I'm not sure the implementation of content negotiation in Wikidata is
>> > correct because the server does not tell me the format of the resource to
>> > which it redirects (as opposed to what DBpedia does, for instance).
>> >
>> >
>> > --AZ
>> 
>> 
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata + Wikipedia outreach

2018-01-06 Thread Maarten Dammers

On 05-01-18 22:55, Jane Darnell wrote:
I object to your use of the catalog property to link to something that 
is not a catalog. I don't see why my objection leads you to expect me 
to offer an alternative way to track your project. I am not 
responsible for your project and don't understand what it is. If you 
can't understand that, then you should probably not be editing 
Wikidata.

To add to that. I see three things:
1. Using the wrong property ( catalog (P972) ). Solution -> move to 
another property, this depends on point 3
2. Notability of the people BLT. Solution -> Add more information and 
links to establish notability (or worse case, delete)
3. Using Wikidata as a shopping list for a WikiProject. Have a 
discussion about whether we, the Wikidata community, want that (point 1 
might not be needed if the end result is that we don't want it)


For people like Jane and me, you're basically squatting on the current 
catalog (P972) property. So we care most about point 1. Points 2 and 3 
are for the BLT community to solve.


Point 3 is probably the hardest one. On 
https://en.wikipedia.org/wiki/Wikipedia:Meetup/Black_Lunch_Table/Lists_of_Articles 
I found the shopping lists for the BLT project. People seem to be in the 
hand-curated list and in the Listeria list. Clicking around I found 
https://www.wikidata.org/wiki/Q20011585 which seems to indicate that you 
had a Black Lunch Table meetup on 9 December 2017 at "The 8th Floor" 
and judging from 
https://en.wikipedia.org/wiki/Wikipedia:Meetup/Black_Lunch_Table/Triangle_Jan_2018 
that seems correct. At the bottom of this page is another Listeria 
shopping list based on this. I'm not sure we should store this kind of 
data on Wikidata.


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] claim change ignored

2017-10-22 Thread Maarten Dammers

Hi Marco,

Op 21-10-2017 om 14:48 schreef Marco Neumann:

in any event it's a false claim in this example and I will remove the
claim now. 2-2=0 ;)
I undid your edit. You seem to be mixing up father ( 
https://www.wikidata.org/wiki/Q2650401 ) and child ( 
https://www.wikidata.org/wiki/Q15434505). Description also updated to 
make the difference clearer.


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Which external identifiers are worth covering?

2017-09-08 Thread Maarten Dammers

Hi Marco,

On 07-09-17 20:51, Marco Fossati wrote:

Hi everyone,

As a data quality addict, I've been investigating the coverage of 
external identifiers linked to Wikidata items about people.


Given the numbers on SQID [1] and some SPARQL queries [2, 3], it seems 
that even the second most used ID (VIAF) only covers circa *25%* of 
people items.

Then, there is a long tail of IDs that are barely used at all.

So here is my question:
*which external identifiers deserve an effort to achieve exhaustive 
coverage?*
I've been doing this for painters. See 
https://www.wikidata.org/wiki/Wikidata:WikiProject_sum_of_all_paintings/Creator_no_authority_control 
and 
https://www.wikidata.org/wiki/Wikidata:WikiProject_sum_of_all_paintings/Creator_missing_collection_authority_control 
.
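
Something along these lines is the kind of check behind those pages (a
rough sketch against the query service, not the exact query used there):
it lists painters that have no VIAF ID (P214) yet.

    import requests

    # Rough sketch: humans (P31 = Q5) with occupation painter (P106 = Q1028181)
    # that have no VIAF ID (P214).
    QUERY = """
    SELECT ?item WHERE {
      ?item wdt:P31 wd:Q5 ;
            wdt:P106 wd:Q1028181 .
      FILTER NOT EXISTS { ?item wdt:P214 ?viaf . }
    }
    LIMIT 100
    """

    response = requests.get(
        'https://query.wikidata.org/sparql',
        params={'query': QUERY, 'format': 'json'},
    )
    for row in response.json()['results']['bindings']:
        print(row['item']['value'])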


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Script and API module for constraint checks

2017-04-27 Thread Maarten Dammers

Hi Léa,


On 27-04-17 14:47, Léa Lacroix wrote:

Hello all,

In the past few months, the development team has mentored a student, 
Olga, to help us develop a user script 
that displays the constraints on the item pages.


To use the script, add the following line to your user/common.js:

mw.loader.load( '//www.wikidata.org/w/index.php?title=User:Jonas_Kress_(WMDE)/check_constraints.js&action=raw&ctype=text/javascript' );
Is it a conscious choice not to make a gadget of this, or did you just not 
think of it? A gadget means no messing with JavaScript, which makes it much 
easier for users to try it.


Maarten
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Sitelink removal in Wikidata

2017-04-26 Thread Maarten Dammers

Hi Amir,


On 26-04-17 05:30, Amir Ladsgroup wrote:

Hey,
One common form of vandalism in Wikidata is removing sitelinks (we 
already have an abuse filter flagging them).

Yes, that seems to happen quite a lot.
One of my friends in Persian Wikipedia (who is not a wikidata editor 
and only cares about Persian Wikipedia) asked me to write a tool that 
lists all Persian Wikipedia sitelink removals. So I wrote something 
small and fast but it's usable for any wiki. For example English 
Wikipedia: 
http://tools.wmflabs.org/dexbot/tools/deleted_sitelinks.php?wiki=enwiki


It's slow due to nature of the database query but once it responds, 
you might find good things to revert.


Since this is the most useful for Wikipedia editors who don't want to 
patrol Wikidata (in that case, this query 
 is 
the most useful) I'm reaching to wider audiences. Sorry for spamming.
Looks the same as 
https://www.wikidata.org/wiki/Wikidata:Database_reports/removed_sitelinks/nlwiki 
to me. I updated 
https://www.wikidata.org/wiki/Wikidata:Database_reports/removed_sitelinks/Configuration 
to also create 
https://www.wikidata.org/wiki/Wikidata:Database_reports/removed_sitelinks/fawiki 
.


Maarten
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-23 Thread Maarten Dammers

Hi Egon,

Yaroslav is one of our (very active) users/admins/bureaucrats and a 
professor at TU Delft.  Maybe he can join you?


Maarten


On 23-04-17 12:34, Egon Willighagen wrote:

Dear Amit,

you make me painfully aware of something I should have mentioned (my
apologies for having forgotten that; still tired from the science march
yesterday): it is not primarily an international Open Science meeting...
the context is really about how to implement this Dutch Plan Open
Science... so, I was planning to target researchers using Wikidata in
the local Dutch region... other sessions will be about Open Science in
the Dutch funding environment, etc.

That said: there is more information which will get more informative
over the next weeks here:

"Open Science: the National Plan and you" ->
https://www.openscience.nl/nationaal-plan

Your reply also makes me wonder if there are Mozilla Open Science
projects running in NL?

Egon


On Sun, Apr 23, 2017 at 12:04 PM, AMIT KUMAR JAISWAL
 wrote:

Hey Egon,

Thanks for letting us know about this Open Science meeting.

I'm interested in forming a team and currently I'm working with couple
of Open Source projects ranges from Machine Learning/AI, Natural
Language Processing and recently started with Deep Learning.
Apart from this I'm also doing few competitions on Kaggle :
https://www.kaggle.com/amitkumarjaiswal.

Please let me know how can I join/participate in this meeting.

Regards

Thank you
Amit Kumar Jaiswal

On 4/23/17, Egon Willighagen  wrote:

Hi Wikidata community,

on May 29 in Delft, The Netherlands, the first national meeting is
planned for researchers (at a various stage of their career) about the
Dutch National Plan Open Science. There will be a large session where
organisations and individuals can present their experiences with Open
Science...

I will join the meeting and want to see Wikidata there, and plan to
host a table about Wikidata in research... ever since the joined
H2020 funding application (which we didn't get), I have been using
Wikidata for our interoperability work in various research projects...

However, the more the merrier, and I'm hoping to co-host a Wikidata
table at this meeting... who else is interested in teaming up and
showing the Dutch research community how Wikidata can help them with
their Open Science? My own work is in the area of the life sciences,
but I know many others are using Wikidata for other research fields,
and the meeting is for all research, not just the natural sciences...

Looking forward to hearing from you,

greetings Egon

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/u/egonwillighagen

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata



--
Amit Kumar Jaiswal
Mozilla Representative  | LinkedIn
 | Portfolio

Kanpur, India
Mo. : +91-8081187743 | T : @AMIT_GKP | PGP : EBE7 39F0 0427 4A2C

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata






___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Federation in the Wikidata Query Service

2017-04-01 Thread Maarten Dammers

Hi Stas,


On 01-04-17 00:54, Stas Malyshev wrote:

Hi!


How about adding an ODbL licensed service? Would it be possible? I am
thinking about SPOI and their SPARQL endpoint.

ODBL seems to be in the same vein as CC-BY-SA, so if CC-BY is ok, that
should be OK too. Please add the descriptions to
https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input

Great new feature you have here! I would only add endpoints that use 
free licenses that are compatible with our terms of use ( 
https://wikimediafoundation.org/wiki/Terms_of_Use#7._Licensing_of_Content 
). See http://freedomdefined.org/Definition for a more general 
explanation. This would include ODbL ( 
https://opendatacommons.org/licenses/odbl/summary/ ), but would exclude 
any ND (NoDerivatives) and any NC (NonCommercial) licenses.


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] New Status Indicator Icon about Relative Page Completeness

2016-11-16 Thread Maarten Dammers

Hi Simon,


On 15-11-16 09:51, Simon Razniewski wrote:

It can be enabled by adding the following line to your common.js:
  importScript( 'User:Ls1g/recoin-core.js' );
Why don't you turn it into a gadget before promoting it? That would 
lower the barrier a lot for people just wanting to try out your new cool 
tool. Maybe creating a gadget seems to be too complicated or cumbersome? 
An overview of the current gadgets is at 
https://www.wikidata.org/wiki/Special:Gadgets


I was about to create a gadget out of it when I noticed the line 
https://tadaqua.inf.unibz.it/api/getmissingattributes.php in 
https://www.wikidata.org/wiki/User:Ls1g/recoin-core.js . I'm pretty sure 
grabbing data from a third-party domain is a violation of the WMF 
privacy policy, because the owner of the domain tadaqua.inf.unibz.it is 
able to track users who enable this script. I'm not sure though; someone 
from WMF legal could probably confirm. It's probably best to move it to 
http://tools.wmflabs.org/ .


Maarten



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] SPARQL power users and developers

2016-09-30 Thread Maarten Dammers

Hi Denny,

On 30-09-16 20:47, Denny Vrandečić wrote:
Markus, do you have access to the corresponding HTTP request logs? The 
fields there might be helpful (although I might be overly optimistic 
about it)
I was about to say the same. I use pywikibot quite a lot and it sends 
some nice headers like described at 
https://www.mediawiki.org/wiki/API:Main_page#Identifying_your_client .
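
For tools that don't go through pywikibot it's only a line or two to do the
same; a small sketch (tool name, URL and contact address are made up) of
what then shows up in the request logs:

    import requests

    # Made-up tool name, URL and address; the point is that the User-Agent
    # identifies the client and its operator, as the linked page asks.
    HEADERS = {
        'User-Agent': 'ExampleSparqlTool/0.1 '
                      '(https://example.org/tool; tool-operator@example.org)'
    }

    response = requests.get(
        'https://query.wikidata.org/sparql',
        params={'query': 'SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 5',
                'format': 'json'},
        headers=HEADERS,
    )
    print(response.status_code)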


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata-tech] Wikidata for cultural heritage?

2016-07-01 Thread Maarten Dammers

Hi Richard,

I think that persons are probably the most mature. All humans have instance 
of (P31) human (Q5), and for some subsets (like artists) a lot of properties 
have high coverage. Wikidata already connects to a lot of other 
sources on the web, making it a central node for linked open data. For 
painters I keep track of this at 
https://www.wikidata.org/wiki/User:Multichill/Paintings_creator_no_authority_control 
.


For events you probably need subclass of event (Q1656682), see 
https://tools.wmflabs.org/wikidata-todo/tree.html?lang=en&q=Q1656682&rp=279
For places, geographic location (Q2221906) is probably the best one, see 
https://tools.wmflabs.org/wikidata-todo/tree.html?lang=en&q=Q2221906&rp=279
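
In query terms that boils down to following subclass of (P279); a sketch of
the analogous test for events (swap Q1656682 for Q2221906 to get geographic
locations instead):

    import requests

    # Instance of (P31) anything that is a subclass (P279, any depth) of
    # event (Q1656682).
    QUERY = """
    SELECT ?item WHERE {
      ?item wdt:P31/wdt:P279* wd:Q1656682 .
    }
    LIMIT 100
    """

    response = requests.get(
        'https://query.wikidata.org/sparql',
        params={'query': QUERY, 'format': 'json'},
    )
    print(len(response.json()['results']['bindings']))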


Also take a look at 
https://www.wikidata.org/wiki/Wikidata:WikiProject_Visual_arts/Item_structure 
. We tried to link to external concepts here.


Maarten

Op 30-6-2016 om 12:07 schreef Richard Light:


Lydia,

Thank you for your responses.  I suggest that it would add to the 
utility of Wikidata entities if their basic type ('person', 'place', 
'event', etc.) was explicitly stated in the RDF.


Best wishes,

Richard

On 2016-06-29 4:56 PM, Lydia Pintscher wrote:

On Sun, Jun 19, 2016 at 1:27 PM, Richard Light
  wrote:

Hi,

My view is that, in order for there to be any point in cultural heritage
bodies (museums, libraries, archives, historians) publishing their
collections etc. as Linked Data, there needs to be a common Linked Data
framework representing the historical space-time universe, which they can
all quote.  Current practice (such as the British Museum Linked Data
offering) suggests that concepts such as people, places and events will
otherwise be represented either by useless string values or by
system-specific URLs which have no wider meaning.

As a result, I would like to explore the potential for Wikidata to act as
this lingua franca for the cultural heritage community.

You'll see from my earlier messages to this list that I have been grappling
with the SPARQL end-point. Initially I was confused by the interactive
version of the Query Service [1], which differs in its response format from
the similarly-URLed end-point and doesn't provide an RDF/XML response.  I
have now managed to set up Wikidata as a 'web termlist' service for artists,
within the Modes software (see attached screenshot). (The data in the pop-up
window is generated on the fly from the Wikidata RDF.)

At this point, I have the following questions:

1. what level of stability is planned as regards Wikidata identifiers/URLs?
Can I treat the full URL (e.g.[3]) as persistent, or can I only rely on the
core Wikidata identifier (e.g. [4]) remaining unchanged into the indefinite
future?  (Can I even rely on that?)

Wikidata's IDs are supposed to be stable.


2. what is the policy on inclusivity?  Do entities need to be 'notable' in
some sense to be accepted into Wikidata?  (I'm imagining a research body
wanting to offer very precise place or event data, or someone with the
ambition to include in Wikidata details of any person who ever lived.)

https://www.wikidata.org/wiki/Wikidata:Notability


3. is there a template for each entity type (e.g. person, place, event)
which guarantees that a query for certain properties will at least identify
entities of the desired type?  (My artist termlist query includes a test '$s
ps:P31 wd:Q5' which picks out humans: I'm not clear how I would do the same
for events or places.)

No that does not exist.


Cheers
Lydia



--
*Richard Light*


___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech



___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Maarten Dammers

Hi Luca,

Op 5-3-2016 om 16:45 schreef Luca Martinelli:


Point taken, I apologise for using too dramatic tones.
Looks like more people are eager to get this over with and can't wait to 
get everything converted.

Nonetheless, I stick to the point that probably a ">99% unique
identifier" threshold is too high. Just to make another example
(disclaimer: I asked for this property since it is yet another
catalogue that my institution runs), P1949 has not been converted to
identifier because it has "only 98.82% unique out of 507 uses", that
translates in only *six* cases out of 505 items which have two P1949
identifiers.
That's correct. As I said in my previous email: We're first doing the 
easy properties. You can see the easy properties at 
https://www.wikidata.org/wiki/User:ArthurPSmith/Identifiers/1 . The easy 
ones are the ones that have 99%+ single value and 99%+ unique. Compare 
that with https://www.wikidata.org/wiki/User:Addshore/Identifiers/1 and 
you'll notice we still have loads of easy ones we have to process (the 
unchecked list is still quite long).


Once we get those out of the way, we'll get to the more difficult ones. 
I prefer quality over speed here. I don't expect any problems with 
converting P1949.


Maarten


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Maarten Dammers

Hi Luca,

Op 5-3-2016 om 14:30 schreef Luca Martinelli:

Probably the threshold we set up for the conversion is too high, and
this might be one of the causes why the whole process has slowed down
to a dying pace.
You call 
https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script a 
dying pace?


Instead of complaining here, people should participate in 
https://www.wikidata.org/wiki/User:Addshore/Identifiers/0 . Still plenty 
of easy properties there that are clearly distinct, unique and have an 
external URL.
It doesn't make sense to discuss the more complicated cases if we haven't 
gotten the easy cases out of the way yet.


Maarten


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Place name ambiguation

2015-12-29 Thread Maarten Dammers

Hi Tom,

Op 29-12-2015 om 19:33 schreef Tom Morris:
Thanks Stas & Thomas.  That's unambiguous. :-)  (And thanks to 
Jdforrester who went through and fixed all my examples)
Please keep the long label around as an alias. This really helps when 
you enter data. I wonder if someone ever ran a bot to clean up these 
disambiguations. The US alone must account for thousands of items.
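
Such a bot wouldn't need to be complicated. A minimal pywikibot sketch for a
single item (the item id is just a placeholder, and all the real-world
sanity checks are left out):

    import pywikibot

    site = pywikibot.Site('wikidata', 'wikidata')
    repo = site.data_repository()

    # Placeholder id; a real run would iterate over the US place items.
    item = pywikibot.ItemPage(repo, 'Q1234567')
    item.get()

    old_label = item.labels.get('en')
    if old_label and ',' in old_label:
        new_label = old_label.split(',')[0].strip()
        aliases = item.aliases.get('en', [])
        if old_label not in aliases:
            aliases.append(old_label)
        # Keep the long form around as an alias, then shorten the label.
        item.editAliases({'en': aliases}, summary='Keep long label as alias')
        item.editLabels({'en': new_label}, summary='Shorten label')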


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-21 Thread Maarten Dammers

Hi Dario,

Op 21-11-2015 om 18:34 schreef Dario Taraborelli:
- shouldn’t a manually unmatched item be created directly on Wikidata 
(after all DBI is all about notable individuals who would easily pass 
Wikidata’s notability threshold for biographies)

If the person in question is notable, you should create an item.
- shouldn’t the relation between Giulio (Cesare) Baldigara (Q1010811 
) and the newly created item 
for Giulio Baldigara be explicitly represented via a "not the same 
as" property, to prevent future humans or machines from accidentally 
re-merging the two items based on some kind of heuristics
You can use P1889: "different from" 
(https://www.wikidata.org/wiki/Property:P1889)
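
Adding it is just a few lines with pywikibot; a sketch (the second id is a
placeholder for the newly created item):

    import pywikibot

    site = pywikibot.Site('wikidata', 'wikidata')
    repo = site.data_repository()

    item = pywikibot.ItemPage(repo, 'Q1010811')    # Giulio (Cesare) Baldigara
    other = pywikibot.ItemPage(repo, 'Q99999999')  # placeholder: the new item

    claim = pywikibot.Claim(repo, 'P1889')         # different from
    claim.setTarget(other)
    item.addClaim(claim, summary='Different person, do not merge')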


Maarten
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Welcome Meta, MediaWiki and Wikispecies

2015-10-23 Thread Maarten Dammers

Op 21-10-2015 om 12:32 schreef Magnus Manske:

Anyone running a bot to integrate Wikispecies pages?
This moved to 
https://www.wikidata.org/wiki/Wikidata:Bot_requests#Wikispecies_sitelinks :-)


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] We plan to drop the wb_entity_per_page table

2015-08-09 Thread Maarten Dammers

Hi Marius,

hoo schreef op 7-8-2015 om 19:55:

Hey folks,

we plan to drop the wb_entity_per_page table sometime soon[0], because
it is just not required (as we will likely always have a programmatic
mapping from entity id to page title) and it does not supported non
-numeric entity ids as it is now.
In the past I was always told to use the wb_entity_per_page table instead 
of doing page_title=CONCAT('Q', id). The Wikibase code used to contain 
warnings not to make this assumption. I don't know, they might still be 
there.

Due to this removing it is a blocker
for the commons metadata.

That's unfortunate.

Is anybody using that for their tools (on tool labs)? If so, please
tell us so that we can give you instructions and a longer grace period
to update your scripts.
Of the 117 Wikidata-related SQL queries that seem to be in my home directory, 
48 of them use this table. Basically any Wikidata-related tool that uses 
the SQL database will break. What do you propose? That we start messing 
around with CONCAT()s in our SQL queries? Besides the hours of wasted 
volunteer time, that's probably a lot slower.
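
For the record, this is the kind of rewrite we'd be looking at (a sketch
against the Labs replica, with the host and table layout written down from
memory; not something I look forward to doing 48 times):

    import os
    import pymysql

    conn = pymysql.connect(
        host='wikidatawiki.labsdb',
        db='wikidatawiki_p',
        read_default_file=os.path.expanduser('~/replica.my.cnf'),
    )

    # Today: resolve a numeric entity id to its page via wb_entity_per_page.
    OLD_QUERY = """
        SELECT page_title FROM page
        JOIN wb_entity_per_page ON epp_page_id = page_id
        WHERE epp_entity_id = %s
    """

    # After the table is gone: rebuild the page title from the id ourselves.
    NEW_QUERY = """
        SELECT page_title FROM page
        WHERE page_namespace = 0 AND page_title = CONCAT('Q', %s)
    """

    with conn.cursor() as cursor:
        cursor.execute(NEW_QUERY, (42,))
        print(cursor.fetchall())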


Maarten


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Mexico / Building up Wikidata, country by country

2015-06-15 Thread Maarten Dammers

Andrew Gray schreef op 15-6-2015 om 14:00:

The map can also be used to highlight other country-specific differences,
such as the unusually large amount of orphan items in The Netherlands and
UK.

WLM-related historic site imports, I think...
That's probably the 60,000 Rijksmonumenten (historic sites) and that bot 
run where someone created an item for *every* street in the Netherlands.


Maarten

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata-l] [Labs-l] Yet another partial labs outage

2015-05-16 Thread Maarten Dammers

Hi Andrew,

Andrew Bogott schreef op 16-5-2015 om 6:31:
I did shut off one instance:  wikidata-wdq-mm.  I don't have a 
personal grudge, but it was gobbling CPU cycles and the system really 
needs a rest.  If loss of that instance is a disaster for anyone, 
contact me and I'll see if I can revive it and shut off ten or so 
other instances to make room.
With that you basically break the edit flow of most users on Wikidata, 
see 
https://www.wikidata.org/wiki/Wikidata:Project_chat#wdq.wmflabs.org.2Fapi . 
This is one of those tools that have silently become production.


Maarten


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] annotating red links

2015-02-11 Thread Maarten Dammers

Hi Amir,

Amir E. Aharoni schreef op 11-2-2015 om 13:12:
If I may dream for a moment, this should be something that can be used 
in all Wikipedias, and without copying this template everywhere, but 
built into the site's software :)
Exactly, the template-based approach doesn't scale at all. You have to 
somehow make it automatic. One thing I thought about is adding suggested 
sitelinks to Wikidata: the software would encounter a red link and would 
check Wikidata for an item with a suggested sitelink of the same title. 
That's a huge software overhaul, so I don't see it happening.


Another approach that is probably already possible right now:
* Take an article with a red link
* Look at the links in the article in other languages.
* If you find a link that points to another article which has the same 
label as the red link in the same language, link to it


I wonder how many good results that would give.
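
A rough pywikibot sketch of those three steps for a single page (naive and
slow, and the page title is just an example):

    import pywikibot

    site = pywikibot.Site('nl', 'wikipedia')
    page = pywikibot.Page(site, 'Haarlem')  # example article

    # 1. Red links on the article.
    red_links = {link.title() for link in page.linkedPages() if not link.exists()}

    # 2. Walk the links of the same article in the other languages.
    item = pywikibot.ItemPage.fromPage(page)
    matches = {}
    for other_version in item.iterlinks():
        if other_version.site == site:
            continue  # skip the Dutch article itself
        for linked in other_version.linkedPages():
            try:
                linked_item = pywikibot.ItemPage.fromPage(linked)
                label = linked_item.get()['labels'].get('nl')
            except pywikibot.exceptions.Error:
                continue
            # 3. A blue link elsewhere whose item carries the red link's title
            #    as its Dutch label is a candidate target.
            if label in red_links:
                matches[label] = linked_item.title()

    print(matches)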

Maarten

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Precision of globe coordinates

2015-01-11 Thread Maarten Dammers

Hi Markus,

Markus Krötzsch schreef op 11-1-2015 om 2:15:

Hi,

Does anybody know the current documentation of the precision of the 
globe coordinate datatype? This precision was introduced after the 
original datamodel discussions.
No clue, I do know we have to do some conversions. See 
https://git.wikimedia.org/blob/pywikibot%2Fcore.git/HEAD/pywikibot%2F__init__.py#L290 
for the relevant Pywikibot code. Do the reverse on seemingly odd values 
and you probably end up with a nice dimension. Dimension is documented 
at https://www.mediawiki.org/wiki/Extension:GeoData#Glossary
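
The rough idea behind it (the linked code does this more carefully, but one
degree of latitude on Earth is about 111 km, so precision in degrees and dim
in metres are two views of the same thing):

    METRES_PER_DEGREE = 111320.0  # rough length of one degree of latitude

    def precision_from_dim(dim_metres):
        return dim_metres / METRES_PER_DEGREE

    def dim_from_precision(precision_degrees):
        return int(round(precision_degrees * METRES_PER_DEGREE))

    # A seemingly odd precision turns out to be a tidy dimension:
    print(dim_from_precision(0.00089831528))  # -> 100 (metres)
    print(precision_from_dim(1000))           # -> ~0.009 degrees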


Maarten


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Python bot framework for wikidata

2014-08-30 Thread Maarten Dammers

Benjamin Good schreef op 29-8-2014 19:39:

It does, but only on the very bottom with a see also.
Hmm, that doesn't look very good. Pywikibot has full Wikidata support 
and contains several scripts you can run without writing a line of 
Python, see https://www.mediawiki.org/wiki/Manual:Pywikibot/Scripts#Wikidata
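
And writing your own is not much harder; a minimal read-only sketch with
pywikibot core:

    import pywikibot

    site = pywikibot.Site('wikidata', 'wikidata')
    repo = site.data_repository()

    item = pywikibot.ItemPage(repo, 'Q42')
    data = item.get()
    print(data['labels'].get('en'))            # English label
    print(sorted(data['claims'].keys())[:10])  # some properties used on the item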


Maybe we should update the bots page to make it clearer and easier to find.

Maarten


Somehow I ended up on
https://github.com/jcreus/pywikidata
first.

which is two years out of date and very similarly named..
-ben



On Fri, Aug 29, 2014 at 10:17 AM, Derric Atzrott 
datzr...@alizeepathology.com mailto:datzr...@alizeepathology.com 
wrote:


 There is https://www.wikidata.org/wiki/Wikidata:Bots which is the
 first hit on Google for me when searching for bots wikidata. Maybe
 it needs to be linked more on the site itself though.

I may just be blind, but it actually doesn't look like that page
mentions pywikibot anywhere.  I wonder if he may have found that
page, but it didn't answer all of the questions he had?

Thank you,
Derric Atzrott


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org mailto:Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l




___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Python bot framework for wikidata

2014-08-29 Thread Maarten Dammers

Don't use compat, use core.

Amir Ladsgroup schreef op 29-8-2014 2:27:

Hey,
It's pywikibot: https://www.mediawiki.org/wiki/PWB
Both branches support Wikidata

Best

On 8/29/14, Benjamin Good ben.mcgee.g...@gmail.com wrote:

Which python framework should a new developer use to make a wikidata
editing bot?

thanks
-Ben






___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] WikiProject sum of all paintings

2014-08-09 Thread Maarten Dammers

Hi everyone,

I'm starting a new project. Let's get all paintings on Wikidata! I could 
use some help. I already imported 4 museums in the Netherlands, but 
that's of course just a start. More information at 
https://www.wikidata.org/wiki/Wikidata:WikiProject_sum_of_all_paintings .


Maarten

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-tech] Heads up: Changes to API error codes

2014-04-20 Thread Maarten Dammers

Hi Daniel,

Daniel Kinzler schreef op 15-4-2014 19:16:

Today, Thiemo merged my patch introducing the ApiErrorReporter class:
https://gerrit.wikimedia.org/r/#/c/124323/. This should help us with
providing error reports from the API in a consistent manner; This way, we will
hopefully soon be able to provide more localized error messages too.

However, this means that some of the error codes used by the API may have
changed, and more will change when more API modules start using this module.
Also, it means that localized messages are included in a slightly different way.
If you rely on error codes or localized error messages, please keep an eye out
for breakage in that regard.
Did you document this somewhere? I assume we have to modify Pywikibot a 
bit, so it would be nice to have a good overview.


Maarten


___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


[Wikidata-tech] Typing of globe-coordinate

2014-03-01 Thread Maarten Dammers

Hi everyone,

I ran into some pretty annoying problems with coordinates, specifically 
the precision. The valid values of precision aren't properly documented 
and enforced at the moment, see 
https://bugzilla.wikimedia.org/show_bug.cgi?id=62105


I would like to know what is supposed to be valid so I know what 
direction to go in.


Maarten


___
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech


Re: [Wikidata-l] Countries ranked

2013-10-05 Thread Maarten Dammers

Hi Katie,

Op 4-10-2013 23:29, Katie Filbert schreef:
As many folks enjoy country rankings, I have generated a list of 
countries (Property:P17) ranked by number of coordinates (P625) in 
Wikidata.  Note this data is from the September 22 database dump.

Did you consider doing this as a query on the live database on Tool*?
You have a page for an item with coordinates.
* Join this page against pagelinks for P625 (coordinate location)
* Join this page against pagelinks for P17 (country)
* Join this page against pagelinks for countrypage
* Join countrypage against pagelinks for Q6256 (country) or Q1763527 
(constituent country)


Group it by country and do some ordering.
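
In SQL that looks roughly like this (a sketch for the Labs replica; it
follows the joins above literally, so it stays a heuristic: any country item
linked from the page counts, and constituent countries are left out for
brevity):

    import os
    import pymysql

    conn = pymysql.connect(
        host='wikidatawiki.labsdb',
        db='wikidatawiki_p',
        read_default_file=os.path.expanduser('~/replica.my.cnf'),
    )

    # Namespace 120 = Property, namespace 0 = Item on wikidatawiki.
    QUERY = """
    SELECT countrylink.pl_title AS country, COUNT(DISTINCT page.page_id) AS items
    FROM page
    JOIN pagelinks AS coord ON coord.pl_from = page.page_id
        AND coord.pl_namespace = 120 AND coord.pl_title = 'P625'
    JOIN pagelinks AS hascountry ON hascountry.pl_from = page.page_id
        AND hascountry.pl_namespace = 120 AND hascountry.pl_title = 'P17'
    JOIN pagelinks AS countrylink ON countrylink.pl_from = page.page_id
        AND countrylink.pl_namespace = 0
    JOIN page AS countrypage ON countrypage.page_namespace = 0
        AND countrypage.page_title = countrylink.pl_title
    JOIN pagelinks AS iscountry ON iscountry.pl_from = countrypage.page_id
        AND iscountry.pl_namespace = 0 AND iscountry.pl_title = 'Q6256'
    WHERE page.page_namespace = 0
    GROUP BY countrylink.pl_title
    ORDER BY items DESC
    LIMIT 25
    """

    with conn.cursor() as cursor:
        cursor.execute(QUERY)
        for country, items in cursor.fetchall():
            print(country, items)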

We do seem to have quite a few items where country is missing, see 
http://208.80.153.172/wdq/?q=claim[625]_AND_noclaim[17]_AND_noclaim[31] 
. We should probably work on that too.


I was wondering how many articles have coordinates at the Dutch 
Wikipedia, but not at Wikidata. For that I created a tracker category, 
see 
https://nl.wikipedia.org/wiki/Categorie:Wikipedia:Co%C3%B6rdinaten_niet_op_Wikidata 
. We could probably do some Lua magic to compare coordinates in Wikidata 
with local coordinates and see how far apart they are. Did anyone 
already build something in Lua that might be reused for this?


Maarten
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Dexbot problem

2013-09-23 Thread Maarten Dammers

Hi Andy,

Op 23-9-2013 17:18, Andy Mabbett schreef:


There's a problem with Dexbot adding bogus values.

For example, place of death = Welsh People on the entry for Catrin 
Collier (still very much alive):


http://www.wikidata.org/w/index.php?title=Q13416998&diff=72279065&oldid=49769674 



Since I have no broadband, due to an ISP failure, I can't post about 
this on-Wiki.



Looking at the source:
| death_date = <!-- {{Death date and age||MM|DD|1948|MM|DD|df=y}} -->
| death_place =
| residence =
| nationality = [[Welsh people|Welsh]]

Probably a parsing error here. Amir, can you have a look at it? Is this 
standard or custom code doing the template parsing?
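
Whatever code does the parsing now, the fix boils down to not letting empty
or commented-out parameters fall through to another field. A sketch with
mwparserfromhell (the infobox name is made up, and the commented-out
death_date value is simplified; the field names are the ones quoted above):

    import mwparserfromhell

    WIKITEXT = """{{Infobox writer
    | death_date  = <!-- commented out in the article -->
    | death_place =
    | residence   =
    | nationality = [[Welsh people|Welsh]]
    }}"""

    infobox = mwparserfromhell.parse(WIKITEXT).filter_templates()[0]

    for name in ('death_date', 'death_place', 'residence', 'nationality'):
        if not infobox.has(name):
            print(name, '-> skip (missing)')
            continue
        # strip_code() drops comments and markup; an empty result means skip.
        value = infobox.get(name).value.strip_code().strip()
        print(name, '->', repr(value) if value else 'skip (empty)')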


Maarten



___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Commons is coming \o/

2013-09-09 Thread Maarten Dammers

Hi,

Op 9-9-2013 12:10, Lydia Pintscher schreef:

In the end this is a decision of the Commons community what they want
to link their pages and categories to and should probably be discussed
there. As I said it'll be possible one way or another. See also
http://lists.wikimedia.org/pipermail/wikidata-l/2013-August/002626.html
and 
https://commons.wikimedia.org/wiki/Commons:Village_pump#Interwiki_links_via_Wikidata_coming_soon
Thank you for mentioning my previous email. I've been waiting for this 
announcement and proposed a new property at 
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Generic#Topic_main_category


Maarten


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Make Commons a wikidata client

2013-08-13 Thread Maarten Dammers

Hi Jane,

You completely missed my point. This is about galleries and categories, 
not about creator, institution, etc etc. That's all next phase. Please 
reread my original post.


Maarten

Op 12-8-2013 20:32, Jane Darnell schreef:

geocoordinates can be linked to places; creator templates, book
templates, and artwork templates can all be linked to people.

The problem is if you store the data on WikiData, but do not allow the
content to show up on WikiCommons (due to copyright problems), then
where does data-curation take place?

After I sent that email it occurred to me though that probably most,
if not all the people on Commons who understand this stuff are already
Wikidatans anyway. So maybe it's a moot point.

2013/8/12, Cristian Consonni kikkocrist...@gmail.com:

2013/8/11 Jane Darnell jane...@gmail.com:

Hmm, I am not quite sure how to see this. Places and people yes: It
would be nice to have the geo coordinates on Wikidata and for the
artist and writers

I am not sure I get what geocoordinates means for people.


  I also agree for the book and the artwork templates.
But how could you possibly move all of the Commons copyright logic? As
far as I know, it's really quite a small group of people who even
understand how all that stuff works on Commons and can untangle those
template categories and delete/keep workflows... if you open Wikidata
to keeping the data on copyrighted materials, like books and artworks,
is that metadata OK to move and manage there?

I think this was not the sense of Maarten's proposal.

Cristian

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l



___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] Make Commons a wikidata client

2013-08-09 Thread Maarten Dammers

Hi everyone,

At Wikimania we had several discussions about the future of Wikidata and 
Commons. Some broader feedback would be nice.
Now we have a property Commons category 
(https://www.wikidata.org/wiki/Property:P373). This is a string and an 
intermediate solution.
In the long run Commons should probably be a Wikibase instance in its 
own right (structured metadata stored at Commons) integrated with 
Wikidata.org, see 
https://www.wikidata.org/wiki/Wikidata:Wikimedia_Commons for more info.
In the meantime we should make Commons a wikidata client like Wikipedia 
and Wikivoyage. How would that work?


We have an item https://www.wikidata.org/wiki/Q9920 for the city 
Haarlem. It links to the Wikipedia article Haarlem and the Wikivoyage 
article Haarlem. It should link to the Commons gallery Haarlem 
(https://commons.wikimedia.org/wiki/Haarlem)


We have an item https://www.wikidata.org/wiki/Q7427769 for the category 
Haarlem. It links to the Wikipedia category Haarlem. It should link to 
the Commons category Haarlem 
(https://commons.wikimedia.org/wiki/Category:Haarlem).


The category item (Q7427769) links to article item (Q9920) using the 
property main category topic 
(https://www.wikidata.org/wiki/Property:P301).

We would need to make an inverse property of P301 to make the backlink.

Some reasons why this is helpful:
* Wikidata takes care of a lot of things like page moves, deletions, 
etc. Now with P373 (Commons category) it's all manual
* Having Wikidata on Commons means that you can automatically get 
backlinks to Wikipedia, have intros for categories, etc.
* It's a step in the right direction. It makes it easier to take the next steps.

Small change, lots of benefits!

Maarten

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Question about wikipedia categories.

2013-05-09 Thread Maarten Dammers

Hi Patrick,

Op 8-5-2013 17:19, Patrick Cassidy schreef:

Should we have more than one ontology?
Back in 2001, when I was doing some artificial intelligence courses, the 
semantic web was the next big thing. What I remember about ontologies is 
that an ontology of everything is next to impossible. Most ontologies work very 
well in a certain domain, but outside of that domain they stop being 
correct or turn into nonsense. So we should accept that we have multiple 
(overlapping) ontologies that are not redundant, but complementary to 
each other.


Maarten


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l