Re: [Wikidata] [wikicite-discuss] Invitation to join Linked Data for Libraries Wikidata Affinity Group

2019-04-12 Thread Dario Taraborelli
Hilary,

this is wonderful — I have a conflict on that date but we'll make sure to
circulate the announcement on social. Cross-posting on the main wikidata
mailing list, I bet many community members would be interested in
participating.

Best,
Dario

On Tue, Apr 9, 2019 at 4:39 PM Hilary K Thorsen 
wrote:

> Hello all,
>
>
> I'm a new Wikimedian in Residence as part of the Linked Data for
> Production project. One
> of the goals of the project is understanding how libraries can contribute
> to and leverage Wikidata as a platform for publishing, linking, and
> enriching library linked data. A number of institutions that are part of
> the grant are working on projects involving Wikidata and we decided to
> start an interest group with biweekly meetings to discuss various aspects
> of Wikidata in support of the projects. Possible topics include Wikidata
> best practices, documentation, communication channels, policies, and tools.
>
> At each meeting, I or a guest will present some relevant material
> related to the topic and we’ll discuss any issues members have encountered
> as well as helpful resources. At the first meeting on April 23rd, we’ll
> talk about the purpose and goals of the group as well as the Wikidata
> related projects that are part of the grant.
>
>
> I'd like to invite any interested Wikicite community members to join us.
> The call details and communication channels are below.
>
>
> *First call details:*
> April 23, 2019, 9am PDT / 12 noon EDT / 4pm GMT / 6pm CEST
> Agenda:
> https://docs.google.com/document/d/1BuszEQQxlOY14hK60Fl7n8Huvh6jEWXre0-wSvpyq84/edit?usp=sharing
> Join: https://stanford.zoom.us/j/204437188
>
>
> *Communication:*
> Ld4-wikidata Google group: https://groups.google.com/d/forum/ld4-wikidata
> #wikidata channel on LD4 Slack: http://bit.ly/joinld4slack
> Notes in public LD4 Wikidata folder:
> https://drive.google.com/drive/folders/1JwTulCABs0TkGQDVSnYbIYEb7bC-j4-n
> Website: https://wiki.duraspace.org/display/LD4P2/Wikidata+Affinity+Group
>
>
> The Affinity Group is open to anyone interested in libraries and Wikidata,
> so feel free to share this invitation.
>
>
> Cheers,
> Hilary
>
>
> Hilary Thorsen
>
> Wikimedian in Residence
>
> Digital Library Systems and Services
>
> Stanford Libraries
>
> Stanford, CA 94305
>
> thors...@stanford.edu
> 650-285-9429
>
> --
> Meta: https://meta.wikimedia.org/wiki/WikiCite
> Twitter: https://twitter.com/wikicite
> ---
> You received this message because you are subscribed to the Google Groups
> "wikicite-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to wikicite-discuss+unsubscr...@wikimedia.org.
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Save the date: Wiki Workshop 2019 to be hosted at The Web Conference 2019 in San Francisco (May 13-14, 2019)

2018-12-10 Thread Dario Taraborelli
Hi everyone,

We are thrilled to announce that the *6th annual Wiki Workshop* [1] will be
hosted at *The Web Conference 2019* (formerly known as WWW) in San
Francisco, CA, on May 13 or 14, 2019 [2]. The workshop provides an annual
forum for researchers exploring all aspects of Wikipedia, Wikidata, and
other Wikimedia projects to present their work. We'd love to have your
contributions, so please take a look at the details in this call:
http://wikiworkshop.org/2019/#call

Please note that *January 31, 2019* is the submission deadline if you want
your paper to appear in the (archival) conference proceedings, and *March
14, 2019* is the deadline for all other, non-archival submissions. [3]

Following last year's format, the workshop will include invited talks and a
poster session, and will offer an opportunity for participants to meet
and discuss future research directions. We look forward to receiving your
submissions and seeing you in San Francisco in May!

Best,
Dario on behalf of the organizers [4]


[image: ww19_banner_www.png]

[1] http://wikiworkshop.org/
[2] https://www2019.thewebconf.org/
[3] http://wikiworkshop.org/2019/#dates
[4] http://wikiworkshop.org/2019/#organization
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] The future of bibliographic data in Wikidata: 4 possible federation scenarios

2018-08-14 Thread Dario Taraborelli
Hey all,

we had a productive “strategy meetup” at Wikimania with a group of about 20 
people, to talk about the future of WikiCite and a roadmap for source metadata 
in Wikidata more generally.

The motivation for this meetup was a set of concerns about scalability and 
“growing pains” around bibliographic and citation data in Wikidata, as well as 
the need (that many in the community have expressed) for a clearer goal, value 
proposition, and scope for WikiCite. 

The result is a series of notes fleshing out 4 possible scenarios for the 
future of bibliographic data as structured data—from a centralized scenario to 
a fully federated one—discussing their possible risks and benefits at the 
technical, social, and governance level:

https://www.wikidata.org/wiki/Wikidata:WikiCite/Roadmap

The question these notes try to address is whether Wikimedia should aim to 
build a “bibliographic commons”, and if so, what it would look like, and where 
it should live. 

This document is not a formal proposal or an RfC open for a vote, but a 
conversation starter to evaluate what type of future makes most sense for this 
data and the communities and stakeholders that will benefit from it. A shared 
understanding of what we’re building toward will also help us shape 
the program of the upcoming WikiCite 2018 conference in November (the 
application process will open in a few days).

If you wish to share your thoughts on these four scenarios, please chime in on 
wiki, rather than replying here, to avoid thread splintering (I’m cross-posting 
this on a few mailing lists).

Dario
on behalf of the meetup participants 

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Knowledge Integrity: A proposed Wikimedia Foundation cross-departmental program for 2018-2019

2018-04-16 Thread Dario Taraborelli
Hey all,

(apologies for cross-posting)

We’re sharing a proposed program for the Wikimedia Foundation’s upcoming fiscal 
year (2018-19) and would love to hear from you. This plan builds extensively on 
projects and initiatives driven by volunteer contributors and organizations in 
the Wikimedia movement, so your input is critical.

Why a “knowledge integrity” program?

Increased global attention is directed at the problem of misinformation and how 
media consumers are struggling to distinguish fact from fiction. Meanwhile, 
thanks to the sources they cite, Wikimedia projects are uniquely positioned as 
a reliable gateway to quality information in the broader knowledge 
ecosystem. How can we mobilize these citations as a resource and turn them into 
a broader, linked infrastructure of trust to serve the entire internet? Free 
knowledge grounds itself in verifiability and transparent attribution policies. 
Let’s look at 4 data points as motivating stories:
- Wikipedia sends tens of millions of people to external sources each year. We 
  want to conduct research to understand why and how readers leave our site.
- The Internet Archive has fixed over 4 million dead links on Wikipedia. We want 
  to enable instantaneous archiving of every link on all Wikipedias to ensure the 
  long-term preservation of the sources Wikipedians cite.
- #1Lib1Ref reaches 6 million people on social media. We want to bring #1Lib1Ref 
  to Wikidata and more languages, spreading the message that references improve 
  quality.
- 33% of Wikidata items represent sources (journals, books, works). We want to 
  strengthen community efforts to build a high-quality, collaborative database of 
  all cited and citable sources.
A 5-year vision

Our 5-year vision for the Knowledge Integrity program is to establish Wikimedia 
as the hub of a federated, trusted knowledge ecosystem. We plan to get there by 
creating:
- A roadmap to a mature, technically and socially scalable, central repository 
  of sources.
- A developed network of partners and technical collaborators to contribute to 
  and reuse data about citations.
- Increased public awareness of Wikimedia’s vital role in information literacy 
  and fact-checking.

5 directions for 2018-2019

We have identified 5 levers of Knowledge Integrity: research, infrastructure 
and tooling, access and preservation, outreach, and awareness. Here’s what we 
want to do with each:

- Continue to conduct research to understand how readers access sources and how 
  to help contributors improve citation quality.
- Improve tools for linking information to external sources, catalogs, and 
  repositories.
- Ensure resources cited across Wikimedia projects are accessible in perpetuity.
- Grow outreach and partnerships to scale community and technical efforts to 
  improve the structure and quality of citations.
- Increase public awareness of the processes Wikimedians follow to verify 
  information and articulate a collective vision for a trustable web.

Who is involved?

The core teams involved in this proposal are:
- Wikimedia Foundation Technology’s Research Team
- Wikimedia Foundation Community Engagement’s Programs team (Wikipedia Library)
- Wikimedia Deutschland Engineering’s Wikidata team

The initiative also spans an ecosystem of possible partners, including 
the Internet Archive, ContentMine, Crossref, OCLC, OpenCitations, and Zotero. 
It is further made possible by funders including the Sloan, Gordon and Betty 
Moore, and Simons Foundations, which have been supporting the WikiCite 
initiative to date.

How you can participate

You can read the fine details of our proposed year-1 plan, and provide your 
feedback, on mediawiki.org: 
https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/CDP3:_Knowledge_Integrity

We’ve also created a brief introductory slidedeck about our motivation and 
goals: 
https://commons.wikimedia.org/wiki/File:Knowledge_Integrity_CDP_proposal_%E2%80%93_FY2018-19.pdf

WikiCite has laid the groundwork for many of these efforts. Read last year’s 
report: https://commons.wikimedia.org/wiki/File:WikiCite_2017_report.pdf

Recent initiatives like the just-released citation dataset foreshadow the work 
we want to do: 
https://medium.com/freely-sharing-the-sum-of-all-knowledge/what-are-the-ten-most-cited-sources-on-wikipedia-lets-ask-the-data-34071478785a

Lastly, this April we’re celebrating Open Citations Month; it’s right in the 
spirit of Knowledge Integrity: 
https://blog.wikimedia.org/2018/04/02/initiative-for-open-citations-birthday/


-- 
Dario Taraborelli  Director, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] WikiCite 2018 satellite track at the Wikimedia Hackathon in Barcelona

2018-03-23 Thread Dario Taraborelli
Hey all,

as mentioned in a previous thread on the wikicite-discuss list, we'll have
a WikiCite presence at the Wikimedia Hackathon in Barcelona in May.

If you're planning to attend and/or wish to pitch specific projects you
want to work on, please add your name to this page:

https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2018/WikiCite

Dario
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] WikiCite 2018

2017-12-11 Thread Dario Taraborelli
Hey all,

We have some important news to share on behalf of the WikiCite organizers.

First off – in case you missed it – we released our annual report last week,
and it's been getting a lot of traction.

So much has happened over the past 12 months: interest in creating a
collaborative knowledge base of sources to support free knowledge is
growing among libraries, linked data organizations, tool developers, and
many groups in the Wikimedia movement. New contributors and organizations
that were not part of this community over the past 2 years are now joining us,
and we need to make sure WikiCite remains sustainable in the years to come.

For this reason, we're working with our funders to ensure the next cycle of
WikiCite events is well supported. We have decided to move our main annual
event  from the spring to
the *fall of 2018*, in order to concentrate our efforts on fundraising and
creating a long term strategic plan.

Ahead of WikiCite 2018, we'll be supporting local events to continue to
grow the community. We'll be present at the *Wikimedia Hackathon* in
Barcelona. We will submit a session proposal to *Wikimania 2018*, and we'll
attend other events of interest to the movement. If you know of any
satellite session or event where WikiCite should have a presence, please let
us know.

Should you have any questions for the organizing committee, you can get in
touch at wikic...@wikimedia.org.

Best,
Dario and Sarah
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] The WikiCite 2017 annual report is live

2017-12-07 Thread Dario Taraborelli
We just published the WikiCite 2017 annual report, giving a broad – 
though definitely not complete – overview of what the community has achieved in 
the past 12 months in building a structured repository of sources in Wikidata. 

https://doi.org/10.6084/m9.figshare.5648233
https://twitter.com/wikicite/status/938778592653332480

Thanks to everyone who contributed; to the Alfred P. Sloan Foundation, the 
Gordon and Betty Moore Foundation, and the Science Sandbox initiative at the 
Simons Foundation for their generous support; and to everyone at Wikimedia 
Austria and WMF who helped get this off the ground. 

We’ll be posting an update about our plans for 2018 soon. Stay tuned!

Dario, on behalf of the WikiCite organizers.
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Cleaning up bibliographic collections in Wikidata

2017-11-24 Thread Dario Taraborelli
Hey all,

I'd like to hear from you on a proposal to add some order and structure to
the various bibliographic corpora we currently have in Wikidata.

As you may know, coverage of creative works in Wikidata has seen
significant growth over the last year. [1][2] Different groups and projects
have started importing source metadata for various reasons:

   - to provide sources for machine-extracted statements (WikiFactMine [3],
   StrepHit [4])
   - to represent sources cited in Wikipedia (e.g. DOIs and PMIDs imported
   via the mwcite identifier dumps) or other Wikimedia projects (Wikisource,
   Wikispecies, Wikinews)
   - to create collections of the open access literature that are citable and
   reusable in Wikimedia projects (e.g. open access PMC review articles)
   - to maintain small, curated corpora about specific topics (e.g. the
   Zika corpus [5])

Because all these efforts have grown organically and with little
coordination, it's hard to keep track of who initiated them, to clearly
communicate their purpose, to understand their completion criteria and
their data quality needs, and, last but not least, to offer contribution
opportunities (in terms of code or manual labor) to other community
members. It's unclear if the future of these efforts should continue to be
within Wikidata, or leverage the power of federated Wikibase-powered wikis
(see our discussion at the end of the WikiCite session at WikidataCon [6]).
Irrespective of the best long term solution, we need to provide some better
structure to these efforts today if we want to address the above problems.

I'd like to propose a fairly simple solution and hear your feedback on
whether it makes sense to implement it as is or with some modifications.

   1. create a Wikidata class called "Wikidata item collection" [Q-X]
   2. create and document individual collections (e.g. the Wikidata Zika
   corpus [Q-Y]) as instances of this class: [Q-Y] --P31--> [Q-X]
   3. add appropriate metadata to describe such collections (their main
   topic(s), creators, any external identifiers, if applicable)
   4. mark individual bibliographic items as part of [P361] the
   corresponding collections

Note that this approach can apply to bibliographic item collections but
also to any other set of items not directly identifiable via Wikidata
properties. The same items could, of course, be part of multiple
collections. Some criteria would be needed to determine an appropriate
threshold for legitimate collections (we wouldn't want arbitrary
collections to be created for sets of items generated as part of a test
import).

Beyond solving the issues listed above, this approach would also allow us
to generate dedicated statistics on the growth or data quality of each
collection via the SPARQL endpoint. It would also allow us to design
constraints for arbitrary item collections, something that right now is
not possible (unless these sets can already be identified via a query).
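As a concrete illustration, here is a minimal SPARQL sketch of such a
per-collection statistic (wd:Q00000001 below is a made-up placeholder for a
collection item like [Q-Y]; standard WDQS prefixes are assumed):

  # Count the members of a hypothetical collection, and how many of them
  # already carry a DOI (P356), as one possible data quality signal.
  SELECT (COUNT(DISTINCT ?item) AS ?members)
         (COUNT(DISTINCT ?doi) AS ?withDoi) WHERE {
    ?item wdt:P361 wd:Q00000001 .        # part of -> the collection item
    OPTIONAL { ?item wdt:P356 ?doi . }
  }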

If something similar already exists in the context of structured data
donations/imports for GLAM, I'd be most grateful for any pointers.

Dario


[1] http://wikicite.org/statistics.html
[2] https://doi.org/10.6084/m9.figshare.5548591.v1
[3] https://meta.wikimedia.org/wiki/Grants:Project/ContentMine/WikiFactMine
[4]
https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Renewal
[5] https://www.wikidata.org/wiki/Wikidata:WikiProject_Zika_Corpus
[6]
https://mirror.netcologne.de/CCC/events/wikidatacon/2017/h264-hd/wikidatacon2017-10009-eng-WikiCite_Wikidata_as_a_structured_repository_of_bibliographic_data_hd.mp4
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Deletion nomination of Template:Cite Q on English Wikipedia

2017-09-20 Thread Dario Taraborelli
Jane – I think you hit the nail on the head.

I don't know exactly how this should be designed (some user research seems
in order before coming up with any solution). The problem to me is how to
design subscription/synchronization mechanisms giving people freedom to
choose which data to reuse or not and which "fixes" to send upstream to a
centralized knowledge base. I believe this is how the relation between
Wikidata and other projects was originally conceived: something like this
would allow structured data to be broadly reused without neglecting the
very legitimate concerns, policies and expectations of data consumers.

Yaroslav – agreed, my mail was mostly a heads up about a problem that's an
instance of something much bigger the Wikidata community needs to think
about.


On Wed, Sep 20, 2017 at 1:03 AM, Jane Darnell <jane...@gmail.com> wrote:

> Yes Yaroslav, I totally agree with you (and don't worry, I wouldn't dream
> of commenting there). On the other hand, this is extremely relevant for the
> Wikidata mailing list and I am really grateful to Dario for posting about
> it, because I had no idea. I stopped following that "2017 state of affairs"
> thing when it first got ugly back in January. I suggest that in cases where
> (as Dario suggests) highly structured and superior data from Wikidata
> *could* be used in Wikipedia, that we create some sort of property to
> indicate this on Wikidata, along the lines of the P31->Q17362920 we use to
> show that a certain Wikipedia has a pending merge problem. If the
> information is ever used on that Wikipedia (either with or without that
> "Cite-Q" template) then the property for that specific Wikipedia should be
> removed. Ideally this property could be used as a qualifier at the
> statement level (so e.g. for paintings, a statement on a collection
> property for a painting that it was stolen and rediscovered, or on a
> significant event property that it was restored and reattributed, or that
> it was owned by the Hitler museum and stored in the depot in Linz during
> WWII, etc).
>
> On Wed, Sep 20, 2017 at 8:58 AM, Yaroslav Blanter <ymb...@gmail.com>
> wrote:
>
>> Thanks Dario.
>>
>> May I please add that whereas the deletion discussion is of course open
>> to everyone, a sudden influx of users who are not regular editors of the
>> English Wikipedia will be looked at extremely negatively. Please be
>> considerate.
>>
>> Cheers
>> Yaroslav
>>
>> On Tue, Sep 19, 2017 at 8:18 PM, Dario Taraborelli <
>> dtarabore...@wikimedia.org> wrote:
>>
>>> Hey folks,
>>>
>>> I wanted to draw your attention to a deletion nomination discussion for
>>> an experimental template – {{Cite Q}}
>>> <https://en.wikipedia.org/wiki/Template:Cite_Q> – pulling bibliographic
>>> data from Wikidata:
>>>
>>> https://en.wikipedia.org/wiki/Wikipedia:Templates_for_discus
>>> sion/Log/2017_September_15#Template:Cite_Q
>>>
>>> As you'll see, there is significant resistance against the broader usage
>>> of a template which exemplifies how structured bibliographic data in
>>> Wikidata could be reused across Wikimedia projects.
>>>
>>> I personally think many of the concerns brought up by editors who
>>> support the deletion request are legitimate. As the editor who nominated
>>> the template for deletion notes: "The existence of the template is one
>>> thing; the advocacy to use this systematically is another one altogether.
>>> Anybody seeking that kind of systematic, radical change in Wikipedia must
>>> get consensus for that in Wikipedia first. Being BOLD is fine but has its
>>> limits, and this kind of thing is one of them."
>>>
>>> I find myself in agreement with this statement, which I believe applies
>>> to much more than just bibliographic data from Wikidata: it's about
>>> virtually any kind of data and contents reused across projects governed by
>>> different policies and expectations. I think what's happening is that an
>>> experimental template – primarily meant to showcase how data reuse from
>>> Wikidata *might* work – is perceived as a norm for how references *will*
>>> or *should* work in the future.
>>>
>>> If you're involved in the WikiCite initiative, and are considering
>>> participating in the deletion discussion, I encourage you to keep a
>>> constructive tone and understand the perspective of people who are
>>> concerned about the use and misuse of this template.
>>>
>>> As one of the WikiCite organizers, I see the success of the initiative as
>>> coming from rich, highly curated data that other projects will want to
>>> reuse, and from technical and usability advances for all contributors, not
>>> from giving an impression that the goal is to use Wikidata to subvert how
>>> other Wikimedia communities do their job.

[Wikidata] Deletion nomination of Template:Cite Q on English Wikipedia

2017-09-19 Thread Dario Taraborelli
Hey folks,

I wanted to draw your attention to a deletion nomination discussion for an
experimental template – {{Cite Q}}
 – pulling bibliographic
data from Wikidata:

https://en.wikipedia.org/wiki/Wikipedia:Templates_for_discussion/Log/2017_September_15#Template:Cite_Q

As you'll see, there is significant resistance against the broader usage of
a template which exemplifies how structured bibliographic data in Wikidata
could be reused across Wikimedia projects.

I personally think many of the concerns brought up by editors who support
the deletion request are legitimate. As the editor who nominated the
template for deletion notes: "The existence of the template is one thing;
the advocacy to use this systematically is another one altogether. Anybody
seeking that kind of systematic, radical change in Wikipedia must get
consensus for that in Wikipedia first. Being BOLD is fine but has its
limits, and this kind of thing is one of them."

I find myself in agreement with this statement, which I believe applies to
much more than just bibliographic data from Wikidata: it's about virtually
any kind of data and contents reused across projects governed by different
policies and expectations. I think what's happening is that an experimental
template – primarily meant to showcase how data reuse from Wikidata
*might* work – is perceived as a norm for how references *will* or *should*
work in the future.

If you're involved in the WikiCite initiative, and are considering
participating in the deletion discussion, I encourage you to keep a
constructive tone and understand the perspective of people who are
concerned about the use and misuse of this template.

As one of the WikiCite organizers, I see the success of the initiative as
coming from rich, highly curated data that other projects will want to
reuse, and from technical and usability advances for all contributors, not
from giving an impression that the goal is to use Wikidata to subvert how
other Wikimedia communities do their job. I'll post a note explaining my
perspective.

Dario
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Fwd: [Analytics] Research Showcase Wednesday, August 23, 2017 at 11:30 AM (PST) 18:30 UTC

2017-08-23 Thread Dario Taraborelli
For those of you who are not familiar with the Gene Wiki project
<https://en.wikipedia.org/wiki/Gene_Wiki>, it's one of the most
mature structured data projects in Wikimedia. The project started first on
English Wikipedia (seeding all articles on genes and related topics you can
read today) and is now leveraging Wikidata.

The showcase starts in 30 minutes, you can watch it on YouTube
https://www.youtube.com/watch?v=Fa0Ztv2iF4w or hop on #wikimedia-research
on irc.freenode.net for a live discussion, hosted by Jonathan.

Dario

On Mon, Aug 21, 2017 at 4:06 PM, Jonathan Morgan <jmor...@wikimedia.org>
wrote:

> One of this month's WMF research showcase presentations is by Andrew Su of
> Scripps Institute, the coordinator of Gene Wiki
> <https://en.wikipedia.org/wiki/Portal:Gene_Wiki>.
>
> -- Forwarded message --
> From: Sarah R <srodl...@wikimedia.org>
> Date: Mon, Aug 21, 2017 at 3:22 PM
> Subject: [Analytics] Research Showcase Wednesday, August 23, 2017 at 11:30
> AM (PST) 18:30 UTC
> To: wikimedi...@lists.wikimedia.org, analyt...@lists.wikimedia.org,
> wiki-researc...@lists.wikimedia.org
>
>
> Hi Everyone,
>
> The next Research Showcase will be live-streamed this Wednesday, August
> 23, 2017 at 11:30 AM (PST) 18:30 UTC.
>
> YouTube stream: https://www.youtube.com/watch?v=Fa0Ztv2iF4w
>
> As usual, you can join the conversation on IRC at #wikimedia-research.
> And, you can watch our past research showcases here
> <https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#August_2017>.
>
> This month's presentation:
>
> Sneha Narayan (Northwestern University)
>
> *The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for
> New Users*
>
> Integrating new users into a community with complex norms presents a
> challenge for peer production projects like Wikipedia. We present The
> Wikipedia Adventure (TWA): an interactive tutorial that offers a structured
> and gamified introduction to Wikipedia. In addition to describing the
> design of the system, we present two empirical evaluations. First, we
> report on a survey of users, who responded very positively to the tutorial.
> Second, we report results from a large-scale invitation-based field
> experiment that tests whether using TWA increased newcomers' subsequent
> contributions to Wikipedia. We find no effect of either using the tutorial
> or of being invited to do so over a period of 180 days. We conclude that
> TWA produces a positive socialization experience for those who choose to
> use it, but that it does not alter patterns of newcomer activity. We
> reflect on the implications of these mixed results for the evaluation of
> similar social computing systems.
>
> Andrew Su (Scripps Research Institute)
>
> *The Gene Wiki: Using Wikipedia and Wikidata to organize biomedical
> knowledge*
>
> The Gene Wiki project began in 2007 with the goal of creating a
> collaboratively-written, community-reviewed, and continuously-updated
> review article for every human gene within Wikipedia.  In 2013, shortly
> after the creation of the Wikidata project, the project expanded to include
> the organization and integration of structured biomedical data.  This talk
> will focus on our current and future work, including efforts to encourage
> contributions from biomedical domain experts, to build custom applications
> that use Wikidata as the back-end knowledge base, and to promote
> CC0-licensing among biomedical knowledge resources.  Comments, feedback and
> contributions are welcome at https://github.com/SuLab/genewikicentral and
> https://www.wikidata.org/wiki/WD:MB.
>
> Kindly,
>
> Sarah R. Rodlund
> Senior Project Coordinator-Product & Technology, Wikimedia Foundation
> srodl...@wikimedia.org
>
>
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
>
>
> --
> Jonathan T. Morgan
> Senior Design Researcher
> Wikimedia Foundation
> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>


-- 

*Dario Taraborelli  *Director, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Applications open now: WikiCite 2017 • Vienna, May 23-25, 2017

2017-02-09 Thread Dario Taraborelli
Dear all,

I am happy to announce that applications to attend WikiCite ‘17 officially open
today <https://goo.gl/forms/Kb9Wl6Xfw2EmFqEr2>.

About the event

WikiCite 2017 <https://meta.wikimedia.org/wiki/WikiCite_2017> is a 3-day
conference, summit and hack day to be hosted in Vienna, Austria, on May
23-25, 2017. It expands on efforts started last year at WikiCite 2016
<https://meta.wikimedia.org/wiki/WikiCite_2016/Report> to design a central
bibliographic repository, as well as tools and strategies to improve
information quality and verifiability in Wikimedia projects.

Our goal is to bring together Wikimedia contributors, data modelers,
information and library science experts, software engineers, designers and
academic researchers who have experience working with Wikipedia's citations
and bibliographic data.

WikiCite 2017 will be a venue to:

   - Day 1 (Conference): present progress on existing work and initiatives
   for citations and bibliographic data across Wikimedia projects
   - Day 2 (Summit): discuss technical, social, outreach and policy
   directions
   - Day 3 (Hack day): get together to hack on new ideas and applications



More information on the event can be found here
<https://meta.wikimedia.org/wiki/WikiCite_2017>.

How to apply

Participation for this year's event is limited to 100 individuals. In order
to be considered for participation, please fill out the following form
<https://goo.gl/forms/Kb9Wl6Xfw2EmFqEr2> and provide us with some
information about yourself, your interests, and expected contribution.
PLEASE NOTE THIS IS NOT THE FINAL REGISTRATION FORM. Your application will
be reviewed and the organizing committee will extend an invitation by March
10, 2017. This application form is to determine the best mix of attendees.
Not everyone who applies will receive an invitation, but there will be a
waitlist.

Important dates


   - February 9, 2017: applications open
   - February 27, 2017: applications close, waitlist opens
   - March 10, 2017: all final notifications of acceptance are issued,
   waitlist processing begins
   - March 31, 2017: attendee list is finalized


Travel support


Like last year, limited funding to cover travel costs of prospective
participants will be available. Requests for travel support should be
submitted via the application form <https://goo.gl/forms/Kb9Wl6Xfw2EmFqEr2>.
We will confirm by March 10 whether we can provide you with travel support.

Contact

For any questions, you can contact the organizing committee via:
wikic...@wikimedia.org

We look forward to seeing you in Vienna!

The WikiCite 2017 organizing committee

Dario Taraborelli

Jonathan Dugan

Lydia Pintscher

Daniel Mietchen

Cameron Neylon



*Dario Taraborelli  *Director, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Save the dates: WikiCite 2017 • Vienna, May 23-25, 2017

2017-02-01 Thread Dario Taraborelli
It's been a while since the last official WikiCite update but I am thrilled
to announce that we have dates confirmed for *WikiCite 2017
<https://meta.wikimedia.org/wiki/WikiCite_2017>*.

*WikiCite 2017* is a 3-day conference, summit and hack day hosted in
*Vienna* on *May 23-25, 2017* (back to back with the Wikimedia Hackathon
<https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2017>).

It expands efforts <https://meta.wikimedia.org/wiki/WikiCite/Newsletter>
that started last year in Berlin with WikiCite 2016
<https://meta.wikimedia.org/wiki/WikiCite_2016> towards the creation of a
bibliographic repository to serve open knowledge.

WikiCite 2017 will be a venue to:

   1. present progress on existing and new initiatives around citations
   and bibliographic data across Wikimedia projects (day 1: conference)
   2. discuss technical, social, outreach and policy directions (day 2:
   summit)
   3. get together to hack on new ideas and applications (day 3: hack day)

For a summary of what was accomplished last year, you can read our report
<https://meta.wikimedia.org/wiki/WikiCite_2016/Report>.

Additional details on the event, the application process for prospective
participants, travel support requests, and information about the venue will
be posted shortly on Meta <https://meta.wikimedia.org/wiki/WikiCite_2017>
and via the mailing lists, but we wanted to share the dates as early as
possible so you can save them in your calendar.

Looking forward to seeing you there.

Dario
on behalf of the WikiCite 2017 organizers




*Dario Taraborelli  *Director, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [wikicite-discuss] Entity tagging and fact extraction (from a scholarly publisher perspective)

2016-11-16 Thread Dario Taraborelli
+Ruben

On Wed, Nov 16, 2016 at 11:06 AM, Stas Malyshev <smalys...@wikimedia.org>
wrote:

>
> > Perhaps add a few to use as examples to start (uniprot and wikipathways
> > would be good starters for our group).  And then start an endpoint
> > inclusion process accessible to the community through something akin to
> > the property proposal pattern
>
> It's a bit more complicated since WDQS and wikidata are different
> systems, but I think it can be worked out. So, I'd like some suggestions
> for:
>
> 1. Initial endpoint nomination process (who runs it, how, etc.)
> 2. Subsequent mechanism for requesting to add an endpoint.
>
> The technical part I can manage (probably (2) would result in creating a
> phabricator ticket, then I or somebody else will make a patch that I
> approve, etc.



> > Dario, I would love for the WDQS to support federated queries.



It is possible, technically, but we need a whitelist of the servers that
> are allowed. Any ideas about how to produce such list?
>

--
> Stas Malyshev
> smalys...@wikimedia.org
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>



-- 

*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Refactoring WP:Source Metadata and helping new contributors

2016-11-11 Thread Dario Taraborelli
On Nov 11, 2016, at 10:08 AM, Federico Leva (Nemo) <nemow...@gmail.com> wrote:
> 
> Dario Taraborelli, 11/11/2016 19:00:
>> Recent action is definitely not about journals only: it’s about
>> preprints; articles; conference proceedings, academic conferences and
>> conference series; books; publishers; provenance of citation data.
> 
> Most of these fall under "papers" perhaps? Just saying that if I hear 
> “Source Metadata" I have no idea what you're talking about (though I see I 
> watchlisted the page eons ago and after re-reading it for a few minutes I 
> recollect some background), then I go to the talk page and I see discussions 
> that seem completely unrelated, then I remember the recent WikiCite-related 
> bot imports and I understand that all this has some people in common, though 
> everything else seems disconnected.

maybe the project should be reframed as an umbrella WikiProject:Bibliography 
(more self-explanatory?) that spawns individual task forces / groups for 
different publication types. Open to suggestions.


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Refactoring WP:Source Metadata and helping new contributors

2016-11-11 Thread Dario Taraborelli
Hey all,

a conversation that happened on Twitter the other day suggests that we're
not doing a good enough job at structuring documentation on wiki.

While I think the subpages of [[m:WikiCite]] are fairly well organized,
[[WD:WikiProject Source]] badly needs some love.

In particular:

   - the WikiProject's *landing page* contains a lot of unstructured and
   fairly outdated information: some of this information could be moved to
   subpages, and we could use a navigation template like other WikiProjects
   do.
   - *data modeling proposals* for different types of works are currently
   only captured via a template (Template:Bibliographical_properties) or
   buried on the WP's talk page: we should have a dedicated page to host data
   models, ideally a big table listing and annotating properties for different
   types of works, as well as their mappings to existing bibliographic models.
   - other important proposals – such as the use of *stated in* (P248) to
   represent *provenance of citation data* for statements using *cites*
   (P2860), or the documentation of specific *data import strategies* (the
   Zika corpus, the OA review literature, PMCID references in enwiki) – are
   similarly buried in the talk page and hard to find: this may raise a few
   eyebrows in the Wikidata community if we don't make it clear how and why
   this data is being imported or represented (see the query sketch below).
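For what it's worth, provenance modeled this way is already queryable; a
minimal sketch, assuming standard WDQS prefixes (p:, ps:, pr:, prov:):

  # List "cites" (P2860) statements whose reference records "stated in"
  # (P248), i.e. where each citation claim was imported from.
  SELECT ?citing ?cited ?source WHERE {
    ?citing p:P2860 ?statement .
    ?statement ps:P2860 ?cited ;
               prov:wasDerivedFrom ?reference .
    ?reference pr:P248 ?source .
  }
  LIMIT 100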

If someone on this list is willing to spend some time and help with some
documentation/design effort, it would be tremendously useful, especially to
people who are not yet regularly following WikiCite and WP:Source Metadata:
we need to create an inclusive environment, and readability/navigability for
newbies is the first important step.

Dario
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [wikicite-discuss] Entity tagging and fact extraction (from a scholarly publisher perspective)

2016-11-11 Thread Dario Taraborelli
Benjamin – agreed, I too see Wikidata as mainly a place to hold all the
mappings. Once we support federated queries in WDQS, the benefit of ID
mapping (over extensive data ingestion) will become even more apparent.
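To make that concrete, here is a minimal sketch of what a federated lookup
could look like, assuming the UniProt endpoint were on the eventual WDQS
whitelist (the up: vocabulary and IRI pattern are UniProt's, not Wikidata's;
P01308 is the accession for human insulin):

  # Map a Wikidata item to UniProt via its UniProt ID (P352), then fetch
  # the entry's mnemonic from the remote endpoint.
  PREFIX up: <http://purl.uniprot.org/core/>
  SELECT ?item ?mnemonic WHERE {
    ?item wdt:P352 "P01308" .
    SERVICE <https://sparql.uniprot.org/sparql> {
      <http://purl.uniprot.org/uniprot/P01308> up:mnemonic ?mnemonic .
    }
  }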

Hope Andrew and other interested parties can pick up this thread.

On Wed, Nov 2, 2016 at 12:11 PM, Benjamin Good <ben.mcgee.g...@gmail.com>
wrote:

> Dario,
>
> One message you can send is that they can and should use existing
> controlled vocabularies and ontologies to construct the metadata they want
> to share.  For example, MeSH descriptors would be a good way for them to
> organize the 'primary topic' assertions for their articles and would make
> it easy to find the corresponding items in Wikidata when uploading.  Our
> group will be continuing to expand coverage of identifiers and concepts
> from vocabularies like that in Wikidata - and any help there from
> publishers would be appreciated!
>
> My view here is that Wikidata can be a bridge to the terminologies and
> datasets that live outside it - not really a replacement for them.  So, if
> they have good practices about using shared vocabularies already, it should
> (eventually) be relatively easy to move relevant assertions into the
> WIkidata graph while maintaining interoperability and integration with
> external software systems.
>
> -Ben
>
> On Wed, Nov 2, 2016 at 8:31 AM, 'Daniel Mietchen' via wikicite-discuss <
> wikicite-disc...@wikimedia.org> wrote:
>
>> I'm traveling ( https://twitter.com/EvoMRI/status/793736211009536000
>> ), so just in brief:
>> In terms of markup, some general comments are in
>> https://www.ncbi.nlm.nih.gov/books/NBK159964/ , which is not specific
>> to Hindawi but partly applies to them too.
>>
>> A problem specific to Hindawi (cf.
>> https://commons.wikimedia.org/wiki/Category:Media_from_Hindawi) is the
>> bundling of the descriptions of all supplementary files, which
>> translates into uploads like
>> https://commons.wikimedia.org/wiki/File:Evolution-of-Coronary-Flow-in-an-Experimental-Slow-Flow-Model-in-Swines-Angiographic-and-623986.f1.ogv
>> (with descriptions for nine files)
>> and eight files with no description, e.g.
>> https://commons.wikimedia.org/wiki/File:Evolution-of-Coronary-Flow-in-an-Experimental-Slow-Flow-Model-in-Swines-Angiographic-and-623986.f2.ogv
>> .
>>
>> There are other problems in their JATS, and it would be good if they
>> would participate in
>> http://jats4r.org/ . Happy to dig deeper with Andrew or whoever is
>> interested.
>>
>> Where they are ahead of the curve is licensing information, so they
>> could help us set up workflows to get that info into Wikidata.
>>
>> In terms of triple suggestions to Wikidata:
>> - as long as article metadata is concerned, I would prefer to
>> concentrate on integrating our workflows with the major repositories
>> of metadata, to which publishers are already posting. They could help
>> us by using more identifiers (e.g. for authors, affiliations, funders
>> etc.), potentially even from Wikidata (e.g. for keywords/ P921, for
>> both journals and articles) and by contributing to the development of
>> tools (e.g. a bot that goes through the CrossRef database every day
>> and creates Wikidata items for newly published papers).
>> - if they have ways to extract statements from their publication
>> corpus, it would be good if they would let us/ ContentMine/ StrepHit
>> etc. know, so we could discuss how to move this forward.
>> d.
>>
>> On Wed, Nov 2, 2016 at 1:42 PM, Dario Taraborelli
>> <dtarabore...@wikimedia.org> wrote:
>> > I'm at the Crossref LIVE 16 event in London where I just gave a
>> presentation
>> > on WikiCite and Wikidata targeted at scholarly publishers.
>> >
>> > Beside Crossref and Datacite people, I talked to a bunch of folks
>> interested
>> > in collaborating on Wikidata integration, particularly from PLOS,
>> Hindawi
>> > and Springer Nature. I started an interesting discussion with Andrew
>> Smeall,
>> > who runs strategic projects at Hindawi, and I wanted to open it up to
>> > everyone on the lists.
>> >
>> > Andrew asked me if – aside from efforts like ContentMine and StrepHit –
>> > there are any recommendations for publishers (especially OA publishers)
>> to
>> > mark up their contents and facilitate information extraction and entity
>> > matching or even push triples to Wikidata to be considered for
>> ingestion.
>> >
>> > I don't think we have a recommended workflow for data providers for

[Wikidata] Entity tagging and fact extraction (from a scholarly publisher perspective)

2016-11-02 Thread Dario Taraborelli
I'm at the Crossref LIVE 16 event
<https://www.eventbrite.com/e/crossref-live16-registration-25928526922> in
London where I just gave a presentation
<https://dx.doi.org/10.6084/m9.figshare.4175343.v2> on WikiCite and
Wikidata targeted at scholarly publishers.

Besides Crossref and Datacite people, I talked to a bunch of folks
interested in collaborating on Wikidata integration, particularly from
PLOS, Hindawi and Springer Nature. I started an interesting discussion with
Andrew Smeall, who runs strategic projects at Hindawi, and I wanted to open
it up to everyone on the lists.

Andrew asked me if – aside from efforts like ContentMine and StrepHit –
there are any recommendations for publishers (especially OA publishers) to
mark up their contents and facilitate information extraction and entity
matching or even push triples to Wikidata to be considered for ingestion.

I don't think we have a recommended workflow for data providers for
facilitating triple suggestions to Wikidata, other than leveraging the Primary
Sources Tool <https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool>.
However, aligning keywords and terms with the corresponding Wikidata items
via ID mapping sounds like a good first step. I pointed Andrew to
Mix'n'Match <https://meta.wikimedia.org/wiki/Mix%27n%27match> as a handy
way of mapping identifiers, but if you have other ideas on how to best
support 2-way integration of Wikidata with scholarly contents, please chime
in.
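As a small example of what such ID mapping enables today, a minimal SPARQL
sketch (the DOI below is a made-up placeholder; note that Wikidata stores
DOIs uppercased; standard WDQS prefixes assumed):

  # Match a publisher's DOI to its Wikidata item and pull the item's main
  # subjects (P921), which could feed back into article keyword markup.
  SELECT ?doi ?item ?subject WHERE {
    VALUES ?doi { "10.1234/PLACEHOLDER.5678" }
    ?item wdt:P356 ?doi .                     # P356 = DOI
    OPTIONAL { ?item wdt:P921 ?subject . }    # P921 = main subject
  }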

Dario

-- 

*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] WikiCite 2016 report

2016-10-20 Thread Dario Taraborelli
The final report for WikiCite 2016 is now available:

D. Taraborelli, J. Dugan, L. Pintscher, D. Mietchen, C. Neylon (2016) *WikiCite
2016 Report*. doi.org/10.6084/m9.figshare.4042530 •
commons.wikimedia.org/wiki/File:WikiCite_2016_report.pdf • CC BY

The report includes an overview of the event, its outcomes and impact,
results from a participant survey, financial data, as well as a list of
ongoing initiatives.

The impact of the event, and the number and importance of the initiatives
it spawned, far exceeded our expectations. We're planning to host a
follow-up event in the early
summer of 2017 to continue these efforts. Thanks to all the participants
and organizations that supported or contributed to the event.

Dario, on behalf of the event organizers

*Why WikiCite?* A 10-minute introduction. https://t.co/egVUIqClbJ
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] A property to exemplify SPARQL queries associated witha property

2016-08-24 Thread Dario Taraborelli
Stas:

I think it may be a good idea to start thinking about some way of
> storing queries on Wikidata maybe? On one hand, they are just strings,
> on the other hand, they are code - like CSS or Javascript - and storing
> them just as strings may be inconvenient. Maybe .sparql file extension
> handler like we have for .js and .json and so on?


would that imply creating a new data type? This idea could be implemented
today with direct links to WDQS, but storing the SPARQL query in the clear as a
property value and using a URI formatter to point to the corresponding
services sounds like the right approach.
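A minimal sketch of that idea, assuming a hypothetical string property
(P9999999 below) that stores raw SPARQL and carries a formatter URL (P1630)
such as "https://query.wikidata.org/#$1":

  # Turn stored query strings into clickable WDQS links by applying the
  # property's formatter URL pattern. P9999999 is hypothetical; P1630
  # (formatter URL) is real. Standard WDQS prefixes assumed.
  SELECT ?prop ?query ?link WHERE {
    wd:P9999999 wdt:P1630 ?pattern .      # e.g. "https://query.wikidata.org/#$1"
    ?prop wdt:P9999999 ?query .           # raw SPARQL stored as a string
    BIND(REPLACE(?pattern, "\\$1", ENCODE_FOR_URI(?query)) AS ?link)
  }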



On Wed, Aug 24, 2016 at 6:18 AM, Dimitris Kontokostas <jimk...@gmail.com>
wrote:

> example SPARQL queries alone can be very helpful, but I would suggest that
> they be accompanied by a short description explaining what the query
> does
>
> On Wed, Aug 24, 2016 at 3:21 PM, Navino Evans <nav...@histropedia.com>
> wrote:
>
>> If you could store queries, you could also store queries for each item
>>> that is about a list of things, so that the query returns exactly the
>>> things that should be in the list ... could be useful.
>>
>>
>> This also applies to a huge number of Wikipedia categories (the non
>> subjective ones). It would be extremely useful to have queries describing
>> them attached to the Wikidata items for the categories.
>>
>> On 24 August 2016 at 02:31, Ananth Subray <ananth.sub...@gmail.com>
>> wrote:
>>
>>> --
>>> From: Stas Malyshev <smalys...@wikimedia.org>
>>> Sent: ‎24-‎08-‎2016 12:33 AM
>>> To: Discussion list for the Wikidata project.
>>> <wikidata@lists.wikimedia.org>
>>> Subject: Re: [Wikidata] A property to exemplify SPARQL queries
>>> associated witha property
>>>
>>> Hi!
>>>
>>> > Relaying a question from a brief discussion on Twitter [1], I am
>>> curious
> to hear how people feel about the idea of creating a "SPARQL query
>>> > example" property for properties, modeled after "Wikidata property
>>> > example" [2]?
>>>
>>> Might be nice, but we need a good way to present the query in the UI
>>> (see below).
>>>
>>> > This would allow people to discover queries that exemplify how the
>>> > property is used in practice. Does the approach make sense or would it
>>> > stretch too much the scope of properties of properties? Are there
>>> better
>>> > ways to reference SPARQL examples and bring them closer to their
>>> source?
>>>
>>> I think it may be a good idea to start thinking about some way of
>>> storing queries on Wikidata maybe? On one hand, they are just strings,
>>> on the other hand, they are code - like CSS or Javascript - and storing
>>> them just as strings may be inconvenient. Maybe .sparql file extension
>>> handler like we have for .js and .json and so on?
>>>
>>> --
>>> Stas Malyshev
>>> smalys...@wikimedia.org
>>>
>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>>>
>>
>>
>> --
>> ___
>>
>> The Timeline of Everything
>>
>> www.histropedia.com
>>
>> Twitter <https://twitter.com/Histropedia> • Facebook
>> <https://www.facebook.com/Histropedia> • Google+
>> <https://plus.google.com/u/0/b/104484373317792180682/104484373317792180682/posts>
>> • LinkedIn <http://www.linkedin.com/company/histropedia-ltd>
>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>>
>
>
> --
> Kontokostas Dimitris
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>


-- 

*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] A property to exemplify SPARQL queries associated with a property

2016-08-23 Thread Dario Taraborelli
>
> > One can always use the talk page, like with all the templates
> > documenting usage and monitoring.
>
> True. With additional benefit that you can use nice templates like
> SPARQL and SPARQL2 to display the queries.
>

talk page templates are definitely a valid alternative. The main benefit I
see in storing SPARQL query examples via a property of a property is that
they'd become discoverable via SPARQL themselves, so for example you could
allow users to find interesting queries limited to a set of properties with
given characteristics (basically allowing filtering on all properties of
properties).
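As a sketch of that discoverability, this is how the existing "Wikidata
property example" (P1855) can already be traversed – a "SPARQL query example"
property would be queried the same way (using *cites*, P2860, as an arbitrary
anchor property; whether it actually carries a P1855 statement is an
assumption):

  # Fetch usage examples attached to a property: P1855 statements on the
  # property item, with the exemplified value held in a qualifier that
  # reuses the property itself. Standard WDQS prefixes assumed.
  SELECT ?exampleSubject ?exampleValue WHERE {
    wd:P2860 p:P1855 ?statement .
    ?statement ps:P1855 ?exampleSubject ;
               pq:P2860 ?exampleValue .
  }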

Might be nice, but we need a good way to present the query in the UI (see
> below).
>

Agreed. I imagine, from a design perspective, these "helper" properties
(like P1855) that are not really part of the data model could be grouped
together and presented via a dedicated UI element, like we do for
identifiers.


> > I think it may be a good idea to start thinking about some way of
> > storing queries on Wikidata maybe? On one hand, they are just strings,
> > on the other hand, they are code - like CSS or Javascript - and storing
> > them just as strings may be inconvenient. Maybe .sparql file extension
> > handler like we have for .js and .json and so on?



If you could store queries, you could also store queries for each item that
> is about a list of things, so that the query returns exactly the things
> that should be in the list ... could be useful.


I hadn't thought of this use case, but I too see how it might be useful.
Are there other good examples of properties that take code snippets as a
value?
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] A property to exemplify SPARQL queries associated with a property

2016-08-23 Thread Dario Taraborelli
Relaying a question from a brief discussion on Twitter [1], I am curious to
hear how people feel about the idea of creating a "SPARQL query example"
property for properties, modeled after "Wikidata property example" [2]?

This would allow people to discover queries that exemplify how the property
is used in practice. Does the approach make sense, or would it stretch the
scope of properties of properties too much? Are there better ways to
reference SPARQL examples and bring them closer to their source?

Dario

[1] https://twitter.com/ReaderMeter/status/768101464572997632
[2] https://www.wikidata.org/wiki/Property:P1855
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] What's the one thing you wish Wikidata had or would do but doesn't yet?

2016-08-02 Thread Dario Taraborelli
Hi folks,

the use of AllOurIdeas for prioritizing community requests came up in a
conversation about the Community Tech wishlist this morning. I realized I'd
never followed up on this thread to report the results of the informal poll
I set up in March.

As of this morning, nearly 1,000 votes have been cast. This is the full
ranked list of ideas that were submitted and evaluated:
http://allourideas.org/wikidata/results

I copy here the top 10 results and their score*:

   1. Wiktionary data 89
   2. Use of data on Wikipedia 83
   3. Support in-context editing of wikidata content shown on Wikipedia
   articles so that 'edit-this-page' keeps working seamlessly 75
   4. Statistics of re-use of data from Wikidata 72
   5. Automatically generated list pages on Wikipedia 64
   6. Federation: Re-use of wikidata items and properties on other Wikibase
   instances. 64
   7. Unit conversion 64
   8. Wizard-style dialog for entering references 64
   9. Tools to monitor the changes in a query 63
   10. Automatically suggest references from high quality sources that we
   as a community have compiled 57


(* "the score of an idea is the estimated chance that it will win against a
randomly chosen idea. For example, a score of 100 means the idea is
predicted to win every time and a score of 0 means the idea is predicted to
lose every time.")

The tool is still accepting new votes and new ideas, so I encourage you to
check it out if you haven't yet: http://allourideas.org/wikidata

Dario


On Sat, Mar 12, 2016 at 4:16 PM, Dario Taraborelli <
dtarabore...@wikimedia.org> wrote:

> Hey all,
>
> Lydia (or somebody operating the @Wikidata handle :) posted this question
> on Twitter and a few great ideas started trickling in
> <https://twitter.com/wikidata/status/708384895375163392>.
>
> I went ahead and created an AllOurIdeas poll <https://t.co/IbsBmY6Kpg>,
> seeded with the first ideas posted on Twitter, to crowdsource the
> generation of new ideas and produce a robust ranking.
>
> If you're unfamiliar with AllOurIdeas <http://www.allourideas.org>, it's an open
> consultation engine allowing people to choose which idea they like best via
> pairwise comparisons (I am cc'ing Matt Salganik, the project lead). It's
> very simple on the surface but it uses algorithms such as the Condorcet
> method <https://en.wikipedia.org/wiki/Condorcet_method> to test how
> strongly each idea performs against another, reducing the weighing of the
> oldest ideas to create a level playing field for newly created ideas and
> preventing gaming or self-promotion of one's own ideas.
>
> Try it out or post new ideas: the more votes it gets, the higher the
> confidence of the ranking. Real-time results and statistics are here
> <http://www.allourideas.org/wikidata/results>.
>
> Dario
>
>


-- 


*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] WDQS URL shortener

2016-06-01 Thread Dario Taraborelli
On Wed, Jun 1, 2016 at 2:00 PM, Federico Leva (Nemo) <nemow...@gmail.com>
wrote:

> Dario Taraborelli, 01/06/2016 10:33:
>
>> I don't, it probably depends on what shorteners are most used for spam
>> purposes across Wikimedia projects. Maybe someone familiar with URL
>> blacklisting from major wikis can comment?
>>
>
> Nearly all URL shorteners get blacklisted, eventually.


that makes sense. It sounds like in the short term (and until we have a
Wikimedia-operated shortener), using full URLs from WDQS is, alas, the only
way to go. One option we haven't mentioned would be for WDQS itself to
support URL shortening; I have no idea where that would sit in terms of
priorities.
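
(A minimal sketch of the full-URL workaround: WDQS permalinks embed the
percent-encoded query after the "#", so a link can be built without any
third-party shortener.)

    from urllib.parse import quote

    query = "SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 10"
    url = "https://query.wikidata.org/#" + quote(query)
    print(url)  # long, but immune to URL-shortener blacklists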

Dario



>
>
> Nemo
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>



-- 


*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] An early preview from WikiCite

2016-06-01 Thread Dario Taraborelli
Hey all,

Wikimedia Deutschland and the Wikimedia Foundation hosted the WikiCite
event in Berlin last week, bringing together a large group of Wikidatans,
Wikipedians, librarians, developers and researchers from all over the world.

The event built a lot of momentum around the definition of data models,
workflows and technology needed to better represent source and citation
data from Wikimedia projects, Wikidata in particular.

While we're still drafting a human-readable report, I thought I'd share a
preview of the notes from the various workgroups, to give you a sense of
what we worked on and to let everyone join the discussion:
Main workgroups

- Modeling bibliographic source metadata: discuss and draft data models to
  represent different types of sources as Wikidata items
- Reference extraction and metadata lookup tools: design or improve tools to
  extract identifiers and bibliographic data from Wikipedia citation
  templates, and to look up and retrieve metadata
- Representing citations and citation events: discuss how to express the
  citation of a source in a Wikimedia artifact (such as a Wikipedia article
  or a Wikidata statement) and review alternative ways to represent them
- (Semi-)automated ways to add references to Wikidata statements: improve
  tools for semi-automated statement and reference creation (StrepHit,
  ContentMine)
- Use cases for source-related queries: identify use cases for SPARQL
  queries involving source metadata, and obtain a small openly licensed
  bibliographic and citation graph dataset to build a proof of concept of
  the querying and visualization potential of source metadata in Wikidata

Additional workgroups

- Wikidata as the central hub on license information on databases: add
  license information to Wikidata to make it the central hub on license
  information on databases
- Using citations and bibliographic source metadata: merge the groups
  working on citation structure and on source metadata models, and
  integrate their recommendations
- Citoid-Wikidata integration: extend Citoid to write source metadata into
  Wikidata

We're opening up the wikicite-disc...@wikimedia.org mailing list to anyone
interested in interacting with the participants in the event (we encouraged
them to use the official wikidata list for anything of interest to the
broader community). Phabricator also has a dedicated tag for related
initiatives.

The event was generously funded by the Alfred P. Sloan Foundation, the
Gordon and Betty Moore Foundation, and Crossref. We'll be exploring the
feasibility of a follow-up event in the next 6-12 months to continue the
work we started in Berlin and to bring in more people than we could host
this time due to funding and capacity constraints.

Best,

Dario
on behalf of the organizers
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] WDQS URL shortener

2016-06-01 Thread Dario Taraborelli
Hey all,

while using shortened URLs from WDQS in the WikiCite report draft, it
occurred to me that tinyurl.com is blacklisted on Meta.

This is a major problem as it prevents concise URLs for gigantic queries
from being linked from other Wikimedia wikis. Has anyone thought of this
issue (Stas, Jonas?), in particular: should we ask Meta to remove the
domain from the blacklist or potentially consider another URL shortening
solution?

Dario
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Wiki Workshop at ICWSM '16: Accepted papers and invited speakers

2016-05-14 Thread Dario Taraborelli
We are glad to announce our invited speaker lineup and 19 papers accepted
at the wiki research workshop <http://snap.stanford.edu/wikiworkshop2016/> we
will be hosting on May 17, 2016 at the *10th International AAAI Conference
on Web and Social Media* (ICWSM '16 <http://www.icwsm.org/2016/>) in
Cologne, Germany. If you're attending the conference and interested in
Wikipedia, Wikidata, Wikimedia research, please consider registering for
the workshop. This is the second part of a workshop previously hosted at
WWW '16 in Montréal, Canada, in April. For more
information, you can visit the workshop's website
<http://snap.stanford.edu/wikiworkshop2016/> or follow us on Twitter (
@wikiworkshop16 <https://twitter.com/wikiworkshop16>).
Invited speakers <http://snap.stanford.edu/wikiworkshop2016/#speakers-icwsm>

   - *Ofer Arazy* (*University of Haifa*) Emergent Work in Wikipedia
   - *Jürgen Pfeffer* (*TU Munich*) Applying Social Network Analysis
   Metrics to Large-Scale Hyperlinked Data
   - *Martin Potthast* (*Universität Weimar*) Wikipedia Text Mining:
   Uncovering Quality and Reuse
   - *Fabian Suchanek* (*Télécom ParisTech*) A Hitchhiker's Guide to
   Ontology
   - *Claudia Wagner* (*GESIS*) Gender Inequalities in Wikipedia

Accepted papers <http://snap.stanford.edu/wikiworkshop2016/#papers-icwsm>


   - *Yashaswi Pochampally, Kamalakar Karlapalem and Navya Yarrabelly*
   Semi-Supervised Automatic Generation of Wikipedia Articles for Named
   Entities
   - *Joan Guisado-Gámez, Josep Lluís Larriba-Pey, David Tamayo and Jordi
   Urmeneta*
   ENRICH: A Query Expansion Service Powered by Wikipedia Graph Structure
   - *Ioannis Protonotarios, Vasiliki Sarimpei and Jahna Otterbacher*
   Similar Gaps, Different Origins? Women Readers and Editors at Greek
   Wikipedia
   - *Sven Heimbuch and Daniel Bodemer*
   Wiki Editors' Acceptance of Additional Guidance on Talk Pages
   - *Yerali Gandica, Renaud Lambiotte and Timoteo Carletti*
   What Can Wikipedia Tell Us about the Global or Local Character of
   Burstiness?
   - *Andreas Spitz, Vaibhav Dixit, Ludwig Richter, Michael Gertz and
   Johanna Geiß*
   State of the Union: A Data Consumer's Perspective on Wikidata and Its
   Properties for the Classification and Resolution of Entities
   - *Finn Årup Nielsen*
   Literature, Geolocation and Wikidata
   - *Ana Freire, Matteo Manca, Diego Saez-Trumper, David Laniado, Ilaria
   Bordino, Francesco Gullo and Andreas Kaltenbrunner*
   Graph-Based Breaking News Detection on Wikipedia
   - *Alexander Dallmann, Thomas Niebler, Florian Lemmerich and Andreas
   Hotho*
   Extracting Semantics from Random Walks on Wikipedia: Comparing Learning
   and Counting Methods
   - *Arpit Merchant, Darshit Shah and Navjyoti Singh*
   In Wikipedia We Trust: A Case Study
   - *Thomas Palomares, Youssef Ahres, Juhana Kangaspunta and Christopher
   Ré*
   Wikipedia Knowledge Graph with DeepDive
   - *Lu Xiao*
   Hidden Gems in the Wikipedia Discussions: The Wikipedians' Rationales
   - *Sooyoung Kim and Alice Oh*
   Topical Interest and Degree of Involvement of Bilingual Editors in
   Wikipedia
   - *Lambert Heller, Ina Blümel, Simone Cartellieri and Christian Wartena*
   A Proposed Solution for Discovery of Reusable Technology Pictures Using
   Textmining of Surrounding Article Text, Based on the Infrastructure of
   Wikidata, Wikisource and Wikimedia Commons
   - *Behzad Tabibian, Mehrdad Farajtabar, Isabel Valera, Le Song, Bernhard
   Schölkopf and Manuel Gomez Rodriguez*
   On the Reliability of Information and Trustworthiness of Web Sources in
   Wikipedia
   - *Ruth Garcia Gavilanes, Milena Tsvetkova and Taha Yasseri*
   Collective Remembering in Wikipedia: The Case of Aircraft Crashes
   - *Elena Labzina*
   The Political Salience Dynamics and Users' Interaction Using the Example
   of Wikipedia within the Authoritarian Regime Context
   - *Fabian Flöck and Maribel Acosta*
   WikiLayers – A Visual Platform for Analyzing Content Evolution and
   Editing Dynamics in Wikipedia
   - *Olga Zagarova, Tatiana Sennikova, Claudia Wagner and Fabian Flöck*
   Cultural Relation Mining on Wikipedia: Beyond Culinary Analysis

Organizers
Bob West, *Stanford University & Wikimedia Foundation*
Leila Zia, *Wikimedia Foundation*
Dario Taraborelli, *Wikimedia Foundation*
Jure Leskovec, *Stanford University*
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Applications open for WikiCite (Berlin, May 25-26, 2016)

2016-03-29 Thread Dario Taraborelli
Citations and references are the building blocks of Wikimedia projects.
However, as of today, they are still treated as second-class citizens.
Structured databases such as Wikidata offer a unique opportunity
<https://www.wikidata.org/wiki/Wikidata:WikiProject_Source_MetaData> to
realize over a decade of endeavors to build the sum of all citations and
bibliographic metadata into a centralized repository. To
coordinate upcoming work in this space, we're organizing a technical event
in late May and opening up applications for prospective participants.

*WikiCite 2016 <https://meta.wikimedia.org/wiki/WikiCite_2016>* is a
hands-on event focused on designing data models and technology to *improve
the coverage, quality, standards-compliance and machine-readability of
citations and source metadata in Wikipedia, Wikidata and other Wikimedia
projects*. Our goal, in particular, is to define a technical roadmap for
building a repository of all Wikimedia references in Wikidata.

We are bringing together Wikidatans, Wikipedians, software engineers, data
modelers, and information and library science experts from organizations
including *Crossref*, *Zotero*, *CSL*, *ContentMine*, *Google*, *Datacite*,
*NISO*, *OCLC* and the *NIH*. We are also inviting academic researchers
with experience working with Wikipedia's citations and bibliographic data.

WikiCite will be hosted in *Berlin* on *May 25-26, 2016*. Participation in
the event is capped at about 50 people, and we expect to have a number of
open slots for applicants:

   - if you were pre-invited and have already filled in a form, you will
   receive a separate note from the organizers
   - if you have not been invited but you would like to participate, please
   fill in this application form <http://goo.gl/forms/Yv6rve2wCt> to give
   us some information about you and your interest and expected contribution
   to the event.

Please help us pass this on to anyone who has done important technical work
on Wikimedia references and citations.

*Important dates*

   - *March 29, 2016*: applications open
   - *April 11, 2016*: applications close
   - *April 15, 2016*: notifications of acceptance are issued (if you
   applied for a travel grant, we'll be able to confirm by this date if we can
   cover the costs of your trip)


For any question, you can contact the organizing committee:
wikic...@wikimedia.org

The organizers,

Dario Taraborelli
Jonathan Dugan
Lydia Pintscher
Daniel Mietchen
Cameron Neylon


*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Fwd: WWW 2016 Wiki Workshop: accepted papers

2016-03-19 Thread Dario Taraborelli
We have several contributions of relevance to the Wikidata community at our
wiki research workshop hosted at WWW'16 (more papers will be announced soon
from our second workshop at ICWSM '16).

Dario

-- Forwarded message --
From: Dario Taraborelli <dtarabore...@wikimedia.org>
Date: Sat, Mar 19, 2016 at 9:00 AM
Subject: WWW 2016 Wiki Workshop: accepted papers
To: Research into Wikimedia content and communities <
wiki-researc...@lists.wikimedia.org>


We're thrilled to announce the list of papers accepted at the WWW 2016 Wiki
Workshop <http://snap.stanford.edu/wikiworkshop2016/>. You can follow
@wikiworkshop16 <https://twitter.com/wikiworkshop16> for updates.

Dario
(on behalf of the organizers)


Johanna Geiß and Michael Gertz
With a Little Help from my Neighbors: Person Name Linking Using the
Wikipedia Social Network
Ramine Tinati, Markus Luczak-Roesch and Wendy Hall
Finding Structure in Wikipedia Edit Activity: An Information Cascade
Approach
Paolo Boldi and Corrado Monti
Cleansing Wikipedia Categories using Centrality
Thomas Steiner
Wikipedia Tools for Google Spreadsheets
Yu Suzuki and Satoshi Nakamura
Assessing the Quality of Wikipedia Editors through Crowdsourcing
Vikrant Yadav and Sandeep Kumar
Learning Web Queries For Retrieval of Relevant Information About an Entity
in a Wikipedia Category
Haggai Roitman, Shay Hummel, Ella Rabinovich, Benjamine Sznajder, Noam
Slonim and Ehud Aharoni
On the Retrieval of Wikipedia Articles Containing Claims on Controversial
Topics
Tanushyam Chattopadhyay, Santa Maiti and Arindam Pal
Automatic Discovery of Emerging Trends using Cluster Name Synthesis on User
Consumption Data
Freddy Brasileiro, João Paulo A. Almeida, Victorio A. Carvalho and
Giancarlo Guizzardi
Applying a Multi-Level Modeling Theory to Assess Taxonomic Hierarchies in
Wikidata



*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Wiki Workshop 2016 @ ICWSM: deadline extended to March 3

2016-02-23 Thread Dario Taraborelli
Hi all – heads up that we extended the submission deadline for the Wiki
Workshop at ICWSM '16 to *Wednesday, March 3, 2016*. (The second deadline
remains unchanged: March 11, 2016).

You can check the workshop's website
 for submission instructions or
follow us at @wikiworkshop16  for live
updates.

Looking forward to your contributions.

Dario
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [Analytics] [Wiki-Medicine] Zika

2016-02-18 Thread Dario Taraborelli
t some
>> more
>> >> information.
>> >> Depending on what complementary knowledge we want to produce, working
>> with
>> >> WikiProject Medicine can be helpful, too.
>> >
>> >
>> > Cool, yeah, I'm nowhere close to knowledgeable on this, I can data-dog
>> > though :)
>> >
>> >
>> > [1] www.cbc.ca/news/health/microcephaly-brazil-zika-reality-1.3442580
>> >
>> > ___
>> > Wikimedia-Medicine mailing list
>> > wikimedia-medic...@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wikimedia-medicine
>> >
>>
>> ___
>> Wikimedia-Medicine mailing list
>> wikimedia-medic...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikimedia-medicine
>>
>> ___
>> Analytics mailing list
>> analyt...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>
>
>
> --
> Thank you.
>
> Alex Druk
> alex.d...@gmail.com
> (775) 237-8550 Google voice
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>


-- 


*Dario Taraborelli  *Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-27 Thread Dario Taraborelli
Magnus, this is fantastic and works as expected, thanks a lot.

One last note regarding the use of different from (P1889 
<https://www.wikidata.org/wiki/Property:P1889>). While I agree with you that it 
would be overkill to generate all these relations for common homonyms, for new 
items created by Mix’n’match with the above tweak, where a single other notable 
individual was previously missing from Wikidata (and when no matching label can 
be found), it would be tremendously useful to automatically add a two-way 
relation (see for example Grasulfo (Q3775839
<https://www.wikidata.org/wiki/Q3775839>) <—> different from (P1889
<https://www.wikidata.org/wiki/Property:P1889>) <—> Grasulfo (Q21571734
<https://www.wikidata.org/wiki/Q21571734>)). Having this property added would
save me 2 extra edits and permanently store the disambiguation signal for
future reference.
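
(A minimal sketch, assuming the QuickStatements-style item TAB property TAB
value syntax: a tool could emit the reciprocal "different from" (P1889)
statements for a pair of items, so the signal is stored in both directions.)

    def different_from(qid_a, qid_b):
        # One line per direction: item TAB property TAB value
        return ["%s\tP1889\t%s" % (qid_a, qid_b),
                "%s\tP1889\t%s" % (qid_b, qid_a)]

    print("\n".join(different_from("Q3775839", "Q21571734")))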

Thoughts?

> On Nov 24, 2015, at 9:54 AM, Luca Martinelli <martinellil...@gmail.com> wrote:
> 
> <3
> 
> L.
> 
> Il 23/nov/2015 21:05, "Magnus Manske" <magnusman...@googlemail.com 
> <mailto:magnusman...@googlemail.com>> ha scritto:
> Done.
> 
> On Mon, Nov 23, 2015 at 12:25 PM Asaf Bartov <abar...@wikimedia.org 
> <mailto:abar...@wikimedia.org>> wrote:
> On Sat, Nov 21, 2015 at 10:45 AM, Dario Taraborelli 
> <dtarabore...@wikimedia.org <mailto:dtarabore...@wikimedia.org>> wrote:
> On Nov 21, 2015, at 10:31, Magnus Manske <magnusman...@googlemail.com 
> <mailto:magnusman...@googlemail.com>> wrote:
>> A solution could be to change the "not on Wikidata" button (or link) to a 
>> "create new item" button. The new item would have a label, a description 
>> (maybe), a statement with the catalog ID (if there is an associated Wikidata 
>> property!), and "instance of:human" if the entry is internally marked as 
>> "person", but nothing else.
> 
>> 
>> Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it 
>> would make sense, for catalogs with a Wikidata property at least.
> 
> I would strongly support this, with the restrictions you suggest. 
> 
> +1.  This would be good.
> 
> A.
> 
> -- 
> Asaf Bartov
> Wikimedia Foundation <http://www.wikimediafoundation.org/>
> 
> Imagine a world in which every single human being can freely share in the sum 
> of all knowledge. Help us make it a reality!
> https://donate.wikimedia.org
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata 
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
> 
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata 
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
> 
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata



Dario Taraborelli  Head of Research, Wikimedia Foundation
wikimediafoundation.org <http://wikimediafoundation.org/> • nitens.org 
<http://nitens.org/> • @readermeter <http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-27 Thread Dario Taraborelli
err…point me to the correct item or fix it then? WP:BOLD
 
> On Nov 27, 2015, at 10:33 AM, Gerard Meijssen <gerard.meijs...@gmail.com> 
> wrote:
> 
> Hoi,
> It is highly likely that your Lombard duke already existed. So I think you 
> got it wrong.
> Thanks,
>  GerardM
> 
> On 27 November 2015 at 19:31, Dario Taraborelli <dtarabore...@wikimedia.org 
> <mailto:dtarabore...@wikimedia.org>> wrote:
> Gerard – I think you’re missing my point. I’m not suggesting this as a 
> display feature (which would be welcome and can always be generated by any 
> tool querying Wikidata labels) but as a contribution stored to avoid future 
> errors.
> 
>> On Nov 27, 2015, at 10:29 AM, Gerard Meijssen <gerard.meijs...@gmail.com 
>> <mailto:gerard.meijs...@gmail.com>> wrote:
>> 
>> Hoi,
>> Why not use Reasonator?
>> https://tools.wmflabs.org/reasonator/?find=Grasulfo
>> Thanks,
>>  GerardM
>> 
>> On 27 November 2015 at 19:26, Dario Taraborelli <dtarabore...@wikimedia.org 
>> <mailto:dtarabore...@wikimedia.org>> wrote:
>> Magnus, this is fantastic and works as expected, thanks a lot.
>> 
>> One last note regarding the use of different from (P1889 
>> <https://www.wikidata.org/wiki/Property:P1889>). While I agree with you that 
>> it would be overkill to generate all these relations for common homonyms, 
>> for new items created by Mix’n’match with the above tweak, where a single 
>> other notable individual was previously missing from Wikidata (and when no 
>> matching label can be found), it would be tremendously useful to 
>> automatically add a two-way relation (see for example Grasulfo (Q3775839 
>> <https://www.wikidata.org/wiki/Q3775839>) <—> different from (P1889 
>> <https://www.wikidata.org/wiki/Property:P1889>) <—> Grasulfo (Q21571734 
>> <https://www.wikidata.org/wiki/Q21571734>). Having this property added would 
>> save me 2 extra edits and permanently store disambiguation signal for future 
>> reference.
>> 
>> Thoughts?
>> 
>>> On Nov 24, 2015, at 9:54 AM, Luca Martinelli <martinellil...@gmail.com 
>>> <mailto:martinellil...@gmail.com>> wrote:
>>> 
>>> <3
>>> 
>>> L.
>>> 
>>> Il 23/nov/2015 21:05, "Magnus Manske" <magnusman...@googlemail.com 
>>> <mailto:magnusman...@googlemail.com>> ha scritto:
>>> Done.
>>> 
>>> On Mon, Nov 23, 2015 at 12:25 PM Asaf Bartov <abar...@wikimedia.org 
>>> <mailto:abar...@wikimedia.org>> wrote:
>>> On Sat, Nov 21, 2015 at 10:45 AM, Dario Taraborelli 
>>> <dtarabore...@wikimedia.org <mailto:dtarabore...@wikimedia.org>> wrote:
>>> On Nov 21, 2015, at 10:31, Magnus Manske <magnusman...@googlemail.com 
>>> <mailto:magnusman...@googlemail.com>> wrote:
>>>> A solution could be to change the "not on Wikidata" button (or link) to a 
>>>> "create new item" button. The new item would have a label, a description 
>>>> (maybe), a statement with the catalog ID (if there is an associated 
>>>> Wikidata property!), and "instance of:human" if the entry is internally 
>>>> marked as "person", but nothing else.
>>> 
>>>> 
>>>> Would that be welcomed by "mix'n'matchers", and Wikidata people? I think 
>>>> it would make sense, for catalogs with a Wikidata property at least.
>>> 
>>> I would strongly support this, with the restrictions you suggest. 
>>> 
>>> +1.  This would be good.
>>> 
>>> A.
>>> 
>>> -- 
>>> Asaf Bartov
>>> Wikimedia Foundation <http://www.wikimediafoundation.org/>
>>> 
>>> Imagine a world in which every single human being can freely share in the 
>>> sum of all knowledge. Help us make it a reality!
>>> https://donate.wikimedia.org
>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata 
>>> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
>>> 
>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org

Re: [Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-27 Thread Dario Taraborelli
Gerard – I think you’re missing my point. I’m not suggesting this as a display 
feature (which would be welcome and can always be generated by any tool 
querying Wikidata labels) but as a contribution stored to avoid future errors.

> On Nov 27, 2015, at 10:29 AM, Gerard Meijssen <gerard.meijs...@gmail.com> 
> wrote:
> 
> Hoi,
> Why not use Reasonator?
> https://tools.wmflabs.org/reasonator/?find=Grasulfo
> Thanks,
>  GerardM
> 
> On 27 November 2015 at 19:26, Dario Taraborelli <dtarabore...@wikimedia.org 
> <mailto:dtarabore...@wikimedia.org>> wrote:
> Magnus, this is fantastic and works as expected, thanks a lot.
> 
> One last note regarding the use of different from (P1889 
> <https://www.wikidata.org/wiki/Property:P1889>). While I agree with you that 
> it would be overkill to generate all these relations for common homonyms, for 
> new items created by Mix’n’match with the above tweak, where a single other 
> notable individual was previously missing from Wikidata (and when no matching 
> label can be found), it would be tremendously useful to automatically add a 
> two-way relation (see for example Grasulfo (Q3775839 
> <https://www.wikidata.org/wiki/Q3775839>) <—> different from (P1889 
> <https://www.wikidata.org/wiki/Property:P1889>) <—> Grasulfo (Q21571734 
> <https://www.wikidata.org/wiki/Q21571734>). Having this property added would 
> save me 2 extra edits and permanently store disambiguation signal for future 
> reference.
> 
> Thoughts?
> 
>> On Nov 24, 2015, at 9:54 AM, Luca Martinelli <martinellil...@gmail.com 
>> <mailto:martinellil...@gmail.com>> wrote:
>> 
>> <3
>> 
>> L.
>> 
>> Il 23/nov/2015 21:05, "Magnus Manske" <magnusman...@googlemail.com 
>> <mailto:magnusman...@googlemail.com>> ha scritto:
>> Done.
>> 
>> On Mon, Nov 23, 2015 at 12:25 PM Asaf Bartov <abar...@wikimedia.org 
>> <mailto:abar...@wikimedia.org>> wrote:
>> On Sat, Nov 21, 2015 at 10:45 AM, Dario Taraborelli 
>> <dtarabore...@wikimedia.org <mailto:dtarabore...@wikimedia.org>> wrote:
>> On Nov 21, 2015, at 10:31, Magnus Manske <magnusman...@googlemail.com 
>> <mailto:magnusman...@googlemail.com>> wrote:
>>> A solution could be to change the "not on Wikidata" button (or link) to a 
>>> "create new item" button. The new item would have a label, a description 
>>> (maybe), a statement with the catalog ID (if there is an associated 
>>> Wikidata property!), and "instance of:human" if the entry is internally 
>>> marked as "person", but nothing else.
>> 
>>> 
>>> Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it 
>>> would make sense, for catalogs with a Wikidata property at least.
>> 
>> I would strongly support this, with the restrictions you suggest. 
>> 
>> +1.  This would be good.
>> 
>> A.
>> 
>> -- 
>> Asaf Bartov
>> Wikimedia Foundation <http://www.wikimediafoundation.org/>
>> 
>> Imagine a world in which every single human being can freely share in the 
>> sum of all knowledge. Help us make it a reality!
>> https://donate.wikimedia.org
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
>> https://lists.wikimedia.org/mailman/listinfo/wikidata 
>> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
>> 
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
>> https://lists.wikimedia.org/mailman/listinfo/wikidata 
>> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
>> 
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
>> https://lists.wikimedia.org/mailman/listinfo/wikidata 
>> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
> 
> 
> 
> Dario Taraborelli  Head of Research, Wikimedia Foundation
> wikimediafoundation.org <http://wikimediafoundation.org/> • nitens.org 
> <http://nitens.org/> • @readermeter <http://twitter.com/readermeter>
> 
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata 
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
> 
> 
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata



Dario Taraborelli  Head of Research, Wikimedia Foundation
wikimediafoundation.org <http://wikimediafoundation.org/> • nitens.org 
<http://nitens.org/> • @readermeter <http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-23 Thread Dario Taraborelli
Magnus, is the change live yet? I unmatched Giuseppe Civran (Q3770329 
<https://www.wikidata.org/wiki/Q3770329>) and Giuseppe Civran (DBI 
<http://www.treccani.it/enciclopedia/giuseppe-civran_(Dizionario_Biografico)/>) 
and flagged the latter as “Not on Wikidata”, but no new item was created. 

I am starting from this view of Mix’n’Match:
https://tools.wmflabs.org/mix-n-match/?mode=catalog=55=0_noq=0_autoq=1_userq=0_na=0#the_start

> On Nov 23, 2015, at 12:05 PM, Magnus Manske <magnusman...@googlemail.com> 
> wrote:
> 
> Done.
> 
> On Mon, Nov 23, 2015 at 12:25 PM Asaf Bartov <abar...@wikimedia.org 
> <mailto:abar...@wikimedia.org>> wrote:
> On Sat, Nov 21, 2015 at 10:45 AM, Dario Taraborelli 
> <dtarabore...@wikimedia.org <mailto:dtarabore...@wikimedia.org>> wrote:
> On Nov 21, 2015, at 10:31, Magnus Manske <magnusman...@googlemail.com 
> <mailto:magnusman...@googlemail.com>> wrote:
>> A solution could be to change the "not on Wikidata" button (or link) to a 
>> "create new item" button. The new item would have a label, a description 
>> (maybe), a statement with the catalog ID (if there is an associated Wikidata 
>> property!), and "instance of:human" if the entry is internally marked as 
>> "person", but nothing else.
> 
>> 
>> Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it 
>> would make sense, for catalogs with a Wikidata property at least.
> 
> I would strongly support this, with the restrictions you suggest. 
> 
> +1.  This would be good.
> 
> A.
> 
> -- 
> Asaf Bartov
> Wikimedia Foundation <http://www.wikimediafoundation.org/>
> 
> Imagine a world in which every single human being can freely share in the sum 
> of all knowledge. Help us make it a reality!
> https://donate.wikimedia.org
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata 
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata



Dario Taraborelli  Head of Research, Wikimedia Foundation
wikimediafoundation.org <http://wikimediafoundation.org/> • nitens.org 
<http://nitens.org/> • @readermeter <http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-21 Thread Dario Taraborelli


> On Nov 21, 2015, at 10:31, Magnus Manske <magnusman...@googlemail.com> wrote:
> 
> To address the first point:
> So the auto-matches are just simple label matches. Removing the automatch in 
> mix'n'match just says that this was not the same person etc. and the entry is 
> moved back to the "unmatched" pool.
> 
> This does /not/ mean there isn't a match on Wikidata! You only say that by 
> setting the entry to "not on Wikidata".

Apologies, I was indeed referring to items explicitly flagged as "not on WD", 
not simply unmerged ones. 

> And I do occasionally batch-create items for those, usually when all entries 
> are processed. Which can have other issues, like an item was created in the 
> meantime, and now I create a duplicate.

+1 

> A solution could be to change the "not on Wikidata" button (or link) to a 
> "create new item" button. The new item would have a label, a description 
> (maybe), a statement with the catalog ID (if there is an associated Wikidata 
> property!), and "instance of:human" if the entry is internally marked as 
> "person", but nothing else.
> 
> Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it 
> would make sense, for catalogs with a Wikidata property at least.

I would strongly support this, with the restrictions you suggest. 

> As for the second point, I think in most cases the mere existence of a new, 
> better-fitting item (or at least one equally fitting at first glance) will 
> prevent false assignments. Sure, there are some cases, like the one given as 
> an example, which would profit from a P1889 "different from" statement. We 
> have run into that problem with the "merge game" I'm running, where people do 
> a lot of false merges because the items seem identical at first glance.
> 
> However, I don't think this is prevalent enough to warrant special treatment 
> in mix'n'match itself. For the few cases were it would help, Wikidata can 
> always be edited manually. Besides, where would we draw the line? "John 
> Smith" returns hundreds of search results; that would translate into tens of 
> thousands of "different from" statements.
> 
> I think once your "Giulio Baldigara" example brother is created, and both 
> will show up in search results, that alone will be enough to serve as a 
> "different from" warning in most settings.
> Mix'n'match automatch, for example, will only match entries where the exact 
> label is unique in labels and aliases; two items with a "Giulio Baldigara" 
> label or alias would not automatch an entry with that name.

These are valid concerns, happy to withdraw the second part of the proposal. 
Thanks Maarten for pointing me to the right property. 

> On Sat, Nov 21, 2015 at 5:35 PM Dario Taraborelli 
> <dtarabore...@wikimedia.org> wrote:
>> I finally found the time to play extensively with Mix’n’match and it’s by 
>> far one of the most promising models I’ve come across for Wikidata growth. A 
>> short conversation with Magnus on Twitter got me thinking on how to best 
>> preserve the output of costly human curation.[1]
>> 
>> I spent most of my time manually auditing automatically matched entries from 
>> the Dizionario Biografico degli Italiani [2]. These entries are long, 
>> unstructured biographical entries and it takes quite a lot of effort to 
>> understand if the two individuals referenced by Wikidata and DBI actually 
>> are the same person. This is a great example of a task that’s still pretty 
>> hard for a machine to perform, no matter how sophisticated the algorithm.
>> 
>> My favorite example? Mix’n’ match suggested a match between Giulio Baldigara 
>> (Q1010811) and Giulio Baldigara (DBI) which looked totally legitimate: these 
>> two individuals are both Italian architects from the 16th century with the 
>> same name, they were both born around the same years in the same city, they 
>> were both active in Hungary at the same time: strong indication that they 
>> are the same person, right? It turns out they are brothers and the full name 
>> of the person referenced in Wikidata is Giulio Cesare Baldigara (the least 
>> known in a family of architects). I unmatched the suggestion and flagged the 
>> DBI entry as non existing in Wikidata.
>> 
>> My question at the moment is: the output of a labor-intensive review of a 
>> potential match is currently stored as a volatile flag in a tool hosted on 
>> labs, but is invisible in Wikidata. Should something happen to Mix’n’match 
>> (god forbid) the result of my work would get lost. Which got me thinking:
>>

[Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-21 Thread Dario Taraborelli
I finally found the time to play extensively with Mix’n’match and it’s by far 
one of the most promising models I’ve come across for Wikidata growth. A short 
conversation with Magnus on Twitter got me thinking about how to best preserve
the output of costly human curation.[1]

I spent most of my time manually auditing automatically matched entries from 
the Dizionario Biografico degli Italiani [2]. These entries are long, 
unstructured biographical entries and it takes quite a lot of effort to 
understand if the two individuals referenced by Wikidata and DBI actually are 
the same person. This is a great example of a task that’s still pretty hard for 
a machine to perform, no matter how sophisticated the algorithm.

My favorite example? Mix'n'match suggested a match between Giulio Baldigara
(Q1010811) and Giulio Baldigara (DBI), which looked totally legitimate: these
two individuals are both Italian architects from the 16th century with the
same name, they were both born around the same years in the same city, and
they were both active in Hungary at the same time: a strong indication that
they are the same person, right? It turns out they are brothers, and the full
name of the person referenced in Wikidata is Giulio Cesare Baldigara (the
least known in a family of architects). I unmatched the suggestion and
flagged the DBI entry as not existing in Wikidata.

My question at the moment is: the output of a labor-intensive review of a 
potential match is currently stored as a volatile flag in a tool hosted on 
labs, but is invisible in Wikidata. Should something happen to Mix’n’match (god 
forbid) the result of my work would get lost. Which got me thinking:

- shouldn’t a manually unmatched item be created directly on Wikidata (after 
all DBI is all about notable individuals who would easily pass Wikidata’s 
notability threshold for biographies)
- shouldn’t the relation between Giulio (Cesare) Baldigara (Q1010811 
) and the newly created item for Giulio 
Baldigara be explicitly represented via a not the same as property, to prevent 
future humans or machines from accidentally remerging the two items based on 
some kind of heuristics

Thoughts welcome,

Dario

[1] https://twitter.com/ReaderMeter/status/667214565621432320 

[2] 
https://tools.wmflabs.org/mix-n-match/?mode=catalog=55=0_noq=0_autoq=1_userq=0_na=0
 



___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Mix'n'match: how to preserve manually audited items for posterity?

2015-11-21 Thread Dario Taraborelli


> On Nov 21, 2015, at 10:44, rupert THURNER <rupert.thur...@gmail.com> wrote:
> 
> 
> On Nov 21, 2015 18:35, "Dario Taraborelli" <dtarabore...@wikimedia.org> wrote:
> >
> 
> > My favorite example? Mix’n’ match suggested a match between Giulio 
> > Baldigara (Q1010811) and Giulio Baldigara (DBI) which looked totally 
> > legitimate: these two individuals are both Italian architects from the 16th 
> > century with the same name, they were both born around the same years in 
> > the same city, they were both active in Hungary at the same time: strong 
> > indication that they are the same person, right? It turns out they are 
> > brothers and the full name of the person referenced in Wikidata is Giulio 
> > Cesare Baldigara (the least known in a family of architects). I unmatched 
> > the suggestion and flagged the DBI entry as non existing in Wikidata.
> 
> Hi dario, an interesting example. How did you determine these two are 
> different persons?
> 
> Rupert 
DBI separately references three brothers: Giulio, Giulio Cesare and Ottavio and 
the entry suggested by MixNMatch is about Giulio. The Wikidata item was created 
from the Hungarian article which clearly refers to Giulio Cesare, but the WD 
label was created as Giulio, which resulted in the false positive. 


> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Announcing Wikidata Taxonomy Browser (beta)

2015-10-22 Thread Dario Taraborelli
I’m constantly getting 500 errors.

> On Oct 22, 2015, at 10:25 AM, Thomas Douillard <thomas.douill...@gmail.com> 
> wrote:
> 
> Great tool ! The error detection is precious !
> 
> 2015-10-22 17:31 GMT+02:00 Markus Kroetzsch <markus.kroetz...@tu-dresden.de 
> <mailto:markus.kroetz...@tu-dresden.de>>:
> Hi all,
> 
> I am happy to announce a new tool [1], written by Serge Stratan, which allows 
> you to browse the taxonomy (subclass of & instance of relations) between 
> Wikidata's most important class items. For example, here is the Wikidata 
> taxonomy for Pizza (discussed recently on this list):
> 
> http://sergestratan.bitbucket.org?draw=true=s0=177,2095,7802,28877,35120,223557,386724,488383,666242,736427,746549,2424752,1513,16686448
> 
> 
> == What you see there ==
> 
> Solid green lines mean "subclass of" relations (subclasses are lower), while 
> dashed purple lines are "instance of" relations (instances are lower). Drag 
> and zoom the view as usual. Hover over items for more information. Click on 
> arrows with numbers to display upper or lower neighbours. Right-click on 
> classes to get more options.
> 
> The sidebar on the left shows statistics and presumed problems in the data 
> (redundancies and likely errors). You can select a report type to see the 
> reports, and click on any line to show the error. If you search for a class 
> in the search field, the errors will be narrowed down to issues related to 
> the taxonomy of this class.
> 
> The toolbar at the top has options to show and hide items based on the 
> current selection (left click on any box).
> 
> Edges in red are the wrong way around (top to bottom). This occurs only when 
> there are cycles in the "taxonomy".
> 
> 
> == Micro tutorial ==
> 
> (1) Enter "Unicorn" in the search box, press return.
> (2) Zoom out a bit by scrolling your mouse/touchpad
> (3) Click on the "Unicorn" item box. It becomes blue (selected).
> (4) Click "Expand up" in the toolbar at the top
> (5) Zoom out to see the taxonomy of unicorn
> (6) Find the class "Fictional Horse" (directly above unicorn) and click its 
> downwards arrow labelled "3" to see all three children items of "fictional 
> horse".
> (7) Click the share button on the top right to get a link to this view.
> 
> You can also create your own share link manually by just changing the Qids in 
> the URL as you like.
> 
> 
> == Status and limitations ==
> 
> This is a prototype and it still has some limits:
> 
> * It only shows "proper" classes that have at least one instance or subclass. 
> This is to reduce the overall data size and load time.
> * The data is based on dumps (the date is shown on the right). It is not a 
> live view.
> * The layout is sometimes too dense. You can find a "hidden" option to make 
> it more spacy behind the sidebar (click "Sidebar" to see it). This helps to 
> disentangle larger graphs.
> * There are some minor bugs in the UI. You sometimes need to click more than 
> once until the right thing happens.
> * The help page at http://sergestratan.bitbucket.org/howtouse.html 
> <http://sergestratan.bitbucket.org/howtouse.html> does not explain everything 
> in detail yet.
> 
> It is planned to work on some of these limitations in the future.
> 
> The hope is that this tool will reveal many errors in Wikidata's taxonomy 
> that are otherwise hard to detect. For example, you can see easily that every 
> "Ship" is an "Event" in Wikidata, that every "Hobbit" is a "Fantasy Race", 
> and that every "Monday" is both a "Mathematical object" and a "Unit of 
> measurement".
> 
> Feedback is welcome (on the tool; better start new threads for feedback on 
> the Wikidata taxonomy ;-),
> 
> Markus
> 
> 
> [1] http://sergestratan.bitbucket.org <http://sergestratan.bitbucket.org/>
> 
> -- 
> Markus Kroetzsch
> Faculty of Computer Science
> Technische Universität Dresden
> +49 351 463 38486 <tel:%2B49%20351%20463%2038486>
> http://korrekt.org/ <http://korrekt.org/>
> 
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org <mailto:Wikidata@lists.wikimedia.org>
> https://lists.wikimedia.org/mailman/listinfo/wikidata 
> <https://lists.wikimedia.org/mailman/listinfo/wikidata>
> 
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata



Dario Taraborelli  Head of Research, Wikimedia Foundation
wikimediafoundation.org <http://wikimediafoundation.org/> • nitens.org 
<http://nitens.org/> • @readermeter <http://twitter.com/readermeter>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata-l] Identifier lookup services

2014-12-05 Thread Dario Taraborelli
What are the best practices for representing, in Wikidata, alternate services
that can be used to look up the same identifier?

For example:
- a PDB ID can be looked up via PDBe or RCSB. 
- a PubMed ID can be looked up via PubMed, Europe PMC, etc.

Is the expectation that I should add all relevant services as separate 
Formatter URLs to the corresponding property?
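
(A minimal sketch of how a formatter URL is applied: the identifier replaces
the "$1" placeholder. The two PDB patterns below are illustrative guesses,
not necessarily the ones stored on the property.)

    FORMATTERS = [
        "https://www.ebi.ac.uk/pdbe/entry/pdb/$1",   # PDBe
        "https://www.rcsb.org/structure/$1",         # RCSB
    ]

    pdb_id = "1TUP"
    for pattern in FORMATTERS:
        print(pattern.replace("$1", pdb_id))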

Dario
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata: CSV, Shapefile, etc.

2014-08-13 Thread Dario Taraborelli
There's a proposal I posted a while ago to store generic datasets that can be 
represented in a tabular or JSON format in a dedicated project namespace with 
dedicated handlers:

http://meta.wikimedia.org/wiki/DataNamespace

There's some good discussion on the talk page about the differences between
this type of data and the structured data hosted on Wikidata, and about where
this thing could live (it could live on any Wikimedia wiki, including Commons
or Meta). It looks like this could be a good fit for shapefiles, and I'd love
to hear your thoughts if you have a moment to read this.

Dario

 On Aug 5, 2014, at 8:43, Paul Houle ontolo...@gmail.com wrote:
 
 I'm intensely interested in links to shapefiles from databases such as 
 Wikidata,  DBpedia and Freebase.  In particular I'd like to get Natural Earth 
 hooked up
 
 http://www.naturalearthdata.com/
 
 It's definitely a weakness of current generic databases that they use the 
 'point GIS' model that is so popular in the social media world.
 
 
 On Tue, Aug 5, 2014 at 10:14 AM, Magnus Manske magnusman...@googlemail.com 
 wrote:
 We don't have shapefiles yet, but a lot of property types such as geographic 
 coordinates (as in, one per item, ideally...), external identifiers (e.g. 
 VIAF), dates, etc.
 
 A (reasonably) simple way to mass-add statements to Wikidata is this tool:
 http://tools.wmflabs.org/wikidata-todo/quick_statements.php
 
 A combination of spreadsheet apps, shell commands, and/or a good text editor 
 should allow you to convert many CSVs into the tool's input format.
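
(A minimal sketch of such a conversion, assuming a hypothetical input file
cities.csv with qid, property and value columns, and the tab-separated
item/property/value lines the tool expects, e.g. Q64 P31 Q515.)

    import csv

    with open("cities.csv", newline="", encoding="utf-8") as src, \
            open("statements.tsv", "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            value = row["value"]
            # String values are double-quoted; item values stay bare Q-ids.
            if not value.startswith("Q"):
                value = '"%s"' % value
            dst.write("%s\t%s\t%s\n" % (row["qid"], row["property"], value))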
 
 Cheers,
 Magnus
 
 
 On Tue, Aug 5, 2014 at 3:01 PM, Brylie Christopher Oxley 
 bry...@gnumedia.org wrote:
 I would like to contribute data to Wikidata that is in the form of CSV 
 files,
 geospatial shapefiles, etc.
 
 Is there currently, or planned, functionality to store general structured 
 data
 on Wikidata?
 --
 Brylie Christopher Oxley
 http://gnumedia.org
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 
 -- 
 Paul Houle
 Expert on Freebase, DBpedia, Hadoop and RDF
 (607) 539 6254paul.houle on Skype   ontolo...@gmail.com
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] Tabular data and metadata recommendations from W3C

2014-03-27 Thread Dario Taraborelli
Just published by the “CSV on the Web” working group:

CSV on the Web: Use Cases and Requirements
http://www.w3.org/TR/2014/WD-csvw-ucr-20140327/

Model for Tabular Data and Metadata on the Web 
http://www.w3.org/TR/2014/WD-tabular-data-model-20140327/

Read more: http://www.w3.org/blog/news/archives/3758
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] we're live on all Wikipedias with phase 2

2013-04-24 Thread Dario Taraborelli
Kudos to the team on hitting this big milestone!

On Apr 24, 2013, at 12:08 PM, emijrp emi...@gmail.com wrote:

 Congratulations :)
 
 
 2013/4/24 Lydia Pintscher lydia.pintsc...@wikimedia.de
 Heya folks :)
 
 The start of phase 2 has just been deployed on all 274 remaining Wikipedias 
 \o/
 http://blog.wikimedia.de/2013/04/24/wikidata-all-around-the-world/
 
 
 Cheers
 Lydia
 
 --
 Lydia Pintscher - http://about.me/lydia.pintscher
 Community Communications for Technical Projects
 
 Wikimedia Deutschland e.V.
 Obentrautstr. 72
 10963 Berlin
 www.wikimedia.de
 
 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
 
 Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
 unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
 Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Qualifiers, bug fixes, improved search - all in one night!

2013-04-18 Thread Dario Taraborelli
Hi Lydia and all,

great to hear about this deployment; I am particularly excited about
qualifier support (as per my previous post).

Since you also mention improvements to search, I was wondering whether you had 
specific plans for work on search functionality.
Unless I use the Items by title page, if I type "Berlin" in a regular search
form, the item I am actually looking for (Q64) is ranked #34 in the search
results (i.e. three clicks away on the "more" link).

I'd be curious to hear the team's thoughts on how to make search more effective 
and user friendly.

Dario

On Apr 18, 2013, at 2:26 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de 
wrote:

 Heya folks :)
 
 We have just deployed qualifiers
 (http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer#Qualifiers)
 and bug fixes on wikidata.org. Qualifiers! Bug fixes are especially
 for Internet Explorer 8. Please let me know how it is working for you
 now if you're using IE8 and if there are still any major problems when
 using Wikidata with it.
 
 In addition the script we ran on the database to make search
 case-insensitive has finished running. This should be another huge
 step towards a nice search here. (This change also affects the
 autocomplete for items and properties.)
 
 As usual please let me know what you think and tell me if there are any 
 issues.
 
 
 Cheers
 Lydia
 
 --
 Lydia Pintscher - http://about.me/lydia.pintscher
 Community Communications for Wikidata
 
 Wikimedia Deutschland e.V.
 Obentrautstr. 72
 10963 Berlin
 www.wikimedia.de
 
 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
 
 Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
 unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
 Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Expiration date for data

2013-03-14 Thread Dario Taraborelli
Thanks Denny for the update and everybody else for the feedback.

The cases I am particularly interested in are those of qualifiers to express 
that Elizabeth I was Queen of England between 1558 and 1603, or that the 
city of Vibo Valentia was in the Province of Catanzaro up to 1996, in the 
Province of Vibo Valentia until 2014 and in the Province of 
Catanzaro-Crotone-Vibo Valentia after 2014.

Until these qualifiers become available, the only way to represent that a 
region has changed its governor is to overwrite the old value of head of local 
government with the current one.
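
(A minimal sketch, in plain Python data rather than the Wikibase JSON schema,
of how the Elizabeth I case could look once time qualifiers land: "position
held" (P39) qualified with "start time" (P580) and "end time" (P582). The two
QIDs below are illustrative guesses.)

    statement = {
        "item": "Q7207",       # assumed QID for Elizabeth I
        "property": "P39",     # position held
        "value": "Q9134365",   # hypothetical QID for "Queen of England"
        "qualifiers": {
            "P580": "+1558-00-00T00:00:00Z",  # start time
            "P582": "+1603-00-00T00:00:00Z",  # end time
        },
    }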

Dario

On Mar 14, 2013, at 3:57 AM, Denny Vrandečić denny.vrande...@wikimedia.de 
wrote:

 Hi Dario,
 
 two or three features are still missing to enable that (sorted in order we 
 are probably going to deploy them):
 * qualifiers
 * the time datatype
 * statement ranks
 
 As soon as they are available, this can be modeled in a way that it can be 
 useful for projects accessing the data.
 
 So, progress yet, but it's not there yet :)
 
 Cheers,
 Denny
 
 
 
 
 
 
 2013/3/14 Dario Taraborelli dtarabore...@wikimedia.org
 Has there been any progress on time-based qualifiers since this thread?
 If so, can someone point me to relevant discussions/proposals?
 
 Thanks
 Dario
 
 On Oct 11, 2012, at 8:28 AM, Marco Fleckinger marco.fleckin...@gmail.com 
 wrote:
 
  Hi,
 
  On 11.10.2012 16:12, Lydia Pintscher wrote:
  On Thu, Oct 11, 2012 at 11:13 AM,bene...@zedat.fu-berlin.de  wrote:
  Is there something like VALID_FROM and VALID_TO in your Database?
 
  LB
 
  This is basically what the qualifiers do.
  http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer has
  more details.
 
  Hm, sorry I didn't remember this. Thank you for reminding!
 
  Marco
 
  ___
  Wikidata-l mailing list
  Wikidata-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 
 -- 
 Project director Wikidata
 Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
 Tel. +49-30-219 158 26-0 | http://wikimedia.de
 
 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. 
 Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter 
 der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für 
 Körperschaften I Berlin, Steuernummer 27/681/51985.
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Expiration date for data

2013-03-13 Thread Dario Taraborelli
Has there been any progress on time-based qualifiers since this thread?
If so, can someone point me to relevant discussions/proposals?

Thanks
Dario

On Oct 11, 2012, at 8:28 AM, Marco Fleckinger marco.fleckin...@gmail.com 
wrote:

 Hi,
 
 On 11.10.2012 16:12, Lydia Pintscher wrote:
 On Thu, Oct 11, 2012 at 11:13 AM,bene...@zedat.fu-berlin.de  wrote:
 Is there something like VALID_FROM and VALID_TO in your Database?
 
 LB
 
 This is basically what the qualifiers do.
 http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer has
 more details.
 
 Hm, sorry I didn't remember this. Thank you for reminding!
 
 Marco
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] WikipediaJS

2012-09-10 Thread Dario Taraborelli
OKFN Labs just released a lightweight JS library to pull machine-readable data
on Wikipedia articles from DBpedia:

http://okfnlabs.org/wikipediajs/
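
(A minimal Python sketch of fetching similar data straight from DBpedia; the
/data/<Title>.json endpoint is my assumption about what the library wraps.)

    import requests

    data = requests.get("http://dbpedia.org/data/Berlin.json").json()
    resource = data.get("http://dbpedia.org/resource/Berlin", {})
    for prop, values in sorted(resource.items())[:10]:
        print(prop, values[0].get("value"))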

Dario
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] RDFa, Microdata and Microformats

2012-04-09 Thread Dario Taraborelli
Thanks, I have worked with {{coord}} templates in the past but didn't have a
good handle on the volume of transclusion of other microformat-enabled
templates.
Sad to hear about SmackBot XV; I hope something can be done to resume the
request.

Dario

On Apr 9, 2012, at 2:49 PM, Andy Mabbett wrote:

 By counting instances of the templates emitting microformats. For
 example, on en.WP, {{Coord}} alone emits 757,299 'geo' (coordinates)
 microformats [1]; {{Infobox settlement}} emits 273,300 hCard (place)
 microformats [2] - that's over a million, alone. {{Infobox person}}
 emits 105,623 hCard (biography) microformats [3]; {{Taxobox}} emits
 197,363 species microformats [4], and there are hundreds more
 templates emitting smaller, but not inconsequential, numbers of
 microformats of the above and other types [5].
 
 A further, vast, number of hCalendar (event) microformats are emitted,
 but without the required date metadata, because a long-requested bot
 task\ [6] remains unfulfilled. For example, {{Infobox album}} emits
 110,712  hCalendar microformats - as well as another 110,712 complete
 hAudio microformats [7].
 
 
 [1] 
 http://toolserver.org/~jarry/templatecount/index.php?lang=ennamespace=10name=Coord
 
 [2] 
 http://toolserver.org/~jarry/templatecount/index.php?lang=ennamespace=10name=Infobox+settlement
 
 [3] 
 http://toolserver.org/~jarry/templatecount/index.php?lang=ennamespace=10name=Infobox_person
 
 [4] 
 http://toolserver.org/~jarry/templatecount/index.php?lang=ennamespace=10name=taxobox
 
 [5] http://en.wikipedia.org/wiki/Category:Templates_generating_microformats
 
 [6] 
 http://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/SmackBot_XV
 
 [7] 
 http://toolserver.org/~jarry/templatecount/index.php?lang=ennamespace=10name=Infobox+album
 
 
 On 9 April 2012 20:52, Dario Taraborelli dtarabore...@wikimedia.org wrote:
 Andy,
 
 chiming in late in this thread, can you give me some pointers on how you 
 estimate this figure?
 
 Thanks
 Dario
 
 On Apr 3, 2012, at 5:32 AM, Andy Mabbett wrote:
 
 Yes (I was on my mobile so couldn't conveniently post a link, when I
 sent my last email; apologies.
 
 en-WP already emits over a million microformats.
 
 
 On 3 April 2012 13:22, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote:
 On Tue, Apr 3, 2012 at 1:51 AM, Andy Mabbett a...@pigsonthewing.org.uk 
 wrote:
 I'm the (for want of a better word) project lead for Microformats on
 en-Wikipedia.
 
 How can I help?
 
 Hi Andy,
 
 Is it this project:
 http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Microformats ?
 
 
 Cheers
 Lydia
 
 --
 Lydia Pintscher - http://about.me/lydia.pintscher
 Community Communications for Wikidata
 
 Wikimedia Deutschland e.V.
 Eisenacher Straße 2
 10777 Berlin
 www.wikimedia.de
 
 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
 
 Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
 unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
 Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 
 --
 Andy Mabbett
 @pigsonthewing
 http://pigsonthewing.org.uk
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 
 -- 
 Andy Mabbett
 @pigsonthewing
 http://pigsonthewing.org.uk
 
 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l