Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-21 Thread Pine W
On Sat, Oct 20, 2018 at 4:41 PM Daniel Kinzler wrote:

> Hi Pine, sorry for the misleading wording. Let me clarify below.
>
> On 19.10.18 at 9:51 PM, Pine W wrote:
> > Hi Markus, I seem to be missing something. Daniel said, "And I think the
> best
> > way to achieve this is to start using the ontology as an ontology on
> wikimedia
> > projects, and thus expose the fact that the ontology is broken. This
> gives
> > incentive to fix it, and examples as to what things should be possible
> using
> > that ontology (namely, some level of basic inference)." I think that I
> > understand the basic idea behind structured data on Commons. I also
> think that I
> > understand your statement above. What I'm not understanding is how
> Daniel's
> > proposal to "start using the ontology as an ontology on wikimedia
> projects, and
> > thus expose the fact that the ontology is broken." isn't a proposal to
> add poor
> > quality information from Wikidata onto Wikipedia and, in the process,
> give
> > Wikipedians more problems to fix. Can you or Daniel explain this?
>
> What I meant in concrete terms was: let's start using wikidata items for
> tagging
> on commons, even though search results based on such tags will currently
> not
> yield very good results, due to the messy state of the ontology, and hope
> people
> fix the ontology to get better search results. If people use "poodle" to
> tag an
> image and it's not found when searching for "dog", this may lead to people
> investigating why that is, and coming up with ontology improvements to fix
> it.
>
> What I DON'T mean is "let's automatically generate navigation boxes for
> wikipedia articles based on an imperfect  ontology, and push them on
> everyone".
> I mean, using the ontology to generate navigation boxes for some kinds of
> articles may be a nice idea, and could indeed have the same effect - that
> people
> notice problems in the ontology, and fix them. But that would be something
> the
> local wiki communities decide to do, not something that comes from
> Wikidata or
> the Structured Data project.
>
> The point I was trying to make is: the Wiki communities are rather good in
> creating structures that serve their purpose, but they do so pragmatically,
> along the behavior of the existing tools. So, rather than trying to work
> around
> the quirks of the ontology in software, the software should use very simple
> rules (such as following the subclass relation), and let people adapt the
> data
> to this behavior, if and when they find it useful to do so. This approach,
> over
> time, provides better results in my opinion.
>
> Also, keep in mind that I was referring to an imperfect *improvement* of
> search,
> the alternative being to only return things tagged with "dog" when
> searching for
> "dog". I was not suggesting to degrade user experience in order to
> incentivize
> editors. I'm rather suggesting the opposite: let's NOT give people a
> reason to tag
> images that show poodles with "poodle" and "dog" and "mammal" and "animal"
> and
> "pet" and...
>
> --
> Daniel Kinzler
> Principal Software Engineer, Core Platform
> Wikimedia Foundation
>

Hi Daniel,

Thanks for the explanation. I think that I now better understand what
you're proposing. This explanation of the proposal sounds reasonable to me
in a way that my earlier understanding of the proposal did not.

By the way, I don't know what your normal work schedule is, but I usually
don't expect staff to respond to non-urgent emails over the weekend,
although I appreciate it. :) Waiting until Monday is usually fine.

Pine
( https://meta.wikimedia.org/wiki/User:Pine )
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-20 Thread Peter F. Patel-Schneider
On 10/20/18 11:57 AM, Ettore RIZZA wrote:

> From Peter F. Patel-Schneider
> Hi,
> 
> I see no reason that this [adding subclass relationships sanctioned by 
> corresponding Wikipedia pages]
>  should not be done for other groups of living
> organisms where subclass relationships are missing.  
> 
> 
> It seems very simple to me. Maybe too simple. Perhaps I am intimidated by the
> kilometers of discussions I'm reading about the taxon-centric aspect of
> Wikidata, when I'm not a biologist. So, there is no problem if we add
> that Cetacea is a subclass of aquatic mammals, as indicated by
> its Wikipedia page?
> 
> Cheers,
> 
> Ettore

How can there be any effective counter to adding these relationships?  Many
Wikidata items correspond to Wikipedia pages.   If the true information about
the Wikidata item in the corresponding pages cannot be added to the Wikidata
items, then the correspondence is not correct and should be removed.

peter

PS:  Of course, determining truth may be contentious in some cases, but these
will be a small minority.



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-20 Thread Ettore RIZZA
Hi,

> I see no reason that this should not be done for other groups of living
> organisms where subclass relationships are missing.


It seems very simple to me. Maybe too simple. Perhaps I am intimidated by
the kilometers of discussions I'm reading about the taxon-centric aspect of
Wikidata, when I'm not a biologist. So, there is no problem if we add that
Cetacea is a subclass of aquatic mammals, as indicated by its Wikipedia
page?

Cheers,

Ettore

On Sat, 20 Oct 2018 at 19:20, Peter F. Patel-Schneider <
pfpschnei...@gmail.com> wrote:

> On 10/20/18 6:29 AM, Ettore RIZZA wrote:
> > For most people, ants are insects, not instances of taxon.
>
> Sure, but Wikidata doesn't have ants being instances of taxon.  Instead,
> Formicidae (aka ant) is an instance of taxon, which seems right to me.
>
> Here are some extracts from Wikidata as of a few minutes ago, also showing
> the English Wikipedia page for the Wikidata item.
>
> https://www.wikidata.org/wiki/Q7386 Formicidae  ant
> https://en.wikipedia.org/wiki/Ant
> instance of taxon
> no subclass of statement
>
> https://www.wikidata.org/wiki/Q1390 insect
> https://en.wikipedia.org/wiki/Insect
> subclass of animal
> instance of taxon
>
> What is missing is that Q7386 is a subclass of Q1390, which is sanctioned
> by
> the "Ants are eusocial insects" phrase at the start of
> https://en.wikipedia.org/wiki/Ant.  I added that statement and put as
> source
> English Wikipedia.  (By the way, how can I source a statement to a
> particular
> Wikipedia page?)
>
>
> I see no reason that this should not be done for other groups of living
> organisms where subclass relationships are missing.
>
> peter
>


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-20 Thread Peter F. Patel-Schneider
On 10/20/18 6:29 AM, Ettore RIZZA wrote:
> For most people, ants are insects, not instances of taxon.

Sure, but Wikidata doesn't have ants being instances of taxon.  Instead,
Formicidae (aka ant) is an instance of taxon, which seems right to me.

Here are some extracts from Wikidata as of a few minutes ago, also showing
the English Wikipedia page for the Wikidata item.

https://www.wikidata.org/wiki/Q7386 Formicidae  ant
https://en.wikipedia.org/wiki/Ant
instance of taxon
no subclass of statement

https://www.wikidata.org/wiki/Q1390 insect
https://en.wikipedia.org/wiki/Insect
subclass of animal
instance of taxon

What is missing is that Q7386 is a subclass of Q1390, which is sanctioned by
the "Ants are eusocial insects" phrase at the start of
https://en.wikipedia.org/wiki/Ant.  I added that statement and put as source
English Wikipedia.  (By the way, how can I source a statement to a particular
Wikipedia page?)


I see no reason that this should not be done for other groups of living
organisms where subclass relationships are missing.
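
A minimal sketch of how such gaps could be listed, assuming a toy in-memory
copy of the two extracts above (the record layout and the function name
`taxa_without_superclass` are illustrative, not an existing Wikidata tool):

```python
# Toy records mirroring the two extracts quoted above (state before the
# missing statement was added). The dict layout is hypothetical.
ITEMS = {
    "Q7386": {"label": "Formicidae", "instance_of": ["taxon"], "subclass_of": []},
    "Q1390": {"label": "insect", "instance_of": ["taxon"], "subclass_of": ["animal"]},
}


def taxa_without_superclass(items):
    """Return the ids of items that are taxa but carry no 'subclass of' statement."""
    return sorted(
        qid
        for qid, data in items.items()
        if "taxon" in data["instance_of"] and not data["subclass_of"]
    )


print(taxa_without_superclass(ITEMS))
```

With the toy data, only the Formicidae record is reported, which is the item
that needed the "subclass of insect" statement.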

peter



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-20 Thread Daniel Kinzler
Hi Pine, sorry for the misleading wording. Let me clarify below.

On 19.10.18 at 9:51 PM, Pine W wrote:
> Hi Markus, I seem to be missing something. Daniel said, "And I think the best
> way to achieve this is to start using the ontology as an ontology on wikimedia
> projects, and thus expose the fact that the ontology is broken. This gives
> incentive to fix it, and examples as to what things should be possible using
> that ontology (namely, some level of basic inference)." I think that I
> understand the basic idea behind structured data on Commons. I also think 
> that I
> understand your statement above. What I'm not understanding is how Daniel's
> proposal to "start using the ontology as an ontology on wikimedia projects, 
> and
> thus expose the fact that the ontology is broken." isn't a proposal to add 
> poor
> quality information from Wikidata onto Wikipedia and, in the process, give
> Wikipedians more problems to fix. Can you or Daniel explain this?

What I meant in concrete terms was: let's start using wikidata items for tagging
on commons, even though search results based on such tags will currently not
yield very good results, due to the messy state of the ontology, and hope people
fix the ontology to get better search results. If people use "poodle" to tag an
image and it's not found when searching for "dog", this may lead to people
investigating why that is, and coming up with ontology improvements to fix it.

What I DON'T mean is "let's automatically generate navigation boxes for
wikipedia articles based on an imperfect  ontology, and push them on everyone".
I mean, using the ontology to generate navigation boxes for some kinds of
articles may be a nice idea, and could indeed have the same effect - that people
notice problems in the ontology, and fix them. But that would be something the
local wiki communities decide to do, not something that comes from Wikidata or
the Structured Data project.

The point I was trying to make is: the Wiki communities are rather good in
creating structures that serve their purpose, but they do so pragmatically,
along the behavior of the existing tools. So, rather than trying to work around
the quirks of the ontology in software, the software should use very simple
rules (such as following the subclass relation), and let people adapt the data
to this behavior, if and when they find it useful to do so. This approach, over
time, provides better results in my opinion.
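
As a rough illustration of the "very simple rule" mentioned above, here is a
minimal sketch that follows subclass links upward from a tag. The edge table
and the names `superclasses` and `matches` are assumptions made for the
example; this is not how Commons search is implemented.

```python
from collections import deque

# Toy "subclass of" edges (child -> parents). Illustrative data only,
# not an extract from Wikidata.
SUBCLASS_OF = {
    "poodle": ["dog"],
    "dog": ["mammal", "pet"],
    "mammal": ["animal"],
}


def superclasses(item):
    """Return the item itself plus everything reachable via subclass links."""
    seen = {item}
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for parent in SUBCLASS_OF.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def matches(tag, query):
    """True if an image tagged `tag` should be returned for `query`."""
    return query in superclasses(tag)


print(matches("poodle", "dog"))
```

If the "poodle -> dog" edge were missing from the table, the search for "dog"
would simply miss the image, which is the kind of visible gap that editors
could then fix in the data.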

Also, keep in mind that I was referring to an imperfect *improvement* of search,
the alternative being to only return things tagged with "dog" when searching for
"dog". I was not suggesting to degrade user experience in order to incentivize
editors. I'm rather suggesting the opposite: let's NOT give people a reason to
tag images that show poodles with "poodle" and "dog" and "mammal" and "animal"
and "pet" and...

-- 
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-20 Thread Ettore RIZZA
Hello,

It is interesting to note that what Cparle wants are "is a" relationships
based on common sense. For most people, ants are insects, not instances of
taxon. A clarinet is a woodwind instrument, and woodwind instruments are
musical instruments, not an instance of "first order metaclass".

One of the best sources of "common sense" hypernymy is probably the first
sentence of a Wikipedia page. Whether in English, French, Italian, a woman
is always "a female *human *being."

For "poodle", this would look like (following the links in the English
version of Wikipedia):

- The poodle is a group of formal *dog breeds*

- Dog breeds are *dogs* that...

- The domestic dog (...) is a member of the genus *Canis* (canines)

- Canis is a genus of the *Canidae*

- The biological family Canidae (...) is a lineage of *carnivorans*

- Carnivora (...) is a diverse *scrotiferan *order

- Scrotifera is a clade of *placental mammals*

- Placentalia ("Placentals") is one of the three extant subdivisions of the
class of animals *Mammalia*...

- Mammals are the *vertebrates *within the class Mammalia...


From my point of view, this classification looks much better than the
current relationships in Wikidata's ontology.

The automatic extraction of hypernymic relationships from English texts
(especially Wikipedia) has been studied for a long time and gives good
results, even with simple methods based on hand-crafted rules. In the case
of Wikipedia, the hypernym often has a page itself (and therefore a link to
Wikidata), which could simplify the NLP extraction and the mapping with
Wikidata items.
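
A minimal sketch of such a hand-crafted rule, assuming the input is a first
sentence in simplified wikitext where links are written as [[target]]: take
the first link that follows "is" or "are". The sample sentences and the name
`first_hypernym` are illustrative; real extraction would need more rules.

```python
import re

# Copula that usually introduces the definition in a first sentence.
COPULA = re.compile(r"\b(?:is|are)\b")
# Target of a wikilink, stopping at "]", "|" (display text) or "#" (section).
LINK = re.compile(r"\[\[([^\]|#]+)")


def first_hypernym(sentence):
    """Return the first link target after the copula, or None if there is none."""
    copula = COPULA.search(sentence)
    if copula is None:
        return None
    link = LINK.search(sentence, copula.end())
    return link.group(1).strip() if link else None


# Simplified sample sentences written for this example.
print(first_hypernym("The '''poodle''' is a group of formal [[dog breed]]s."))
print(first_hypernym("'''Ants''' are eusocial [[insect]]s of the family [[Formicidae]]."))
```

Because the extracted target is a page title, it can in principle be mapped
to the Wikidata item linked from that page.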

Of course, the extracted relationships will not always be "subclass of" or
"instance of". But if someone proposed a new property called "Wikipedia
Hypernyms" (and its inverse property "Wikipedia Hyponyms"), I would use
it more willingly and with more confidence than the current system. This
would also better respect the logic of Wikidata's descriptions.

I mean, if the description of Zoroastrianism (Q9601) says this is an
"Ancient Iranian *religion *founded by Zoroaster", one would expect the
class "religion" to appear much earlier in the hierarchy of superclasses of
this item. If there was this property "Wikipedia Hypernyms", we could
mention it in the same page - since Wikipedia describes Zoroastrianism as
"one of the world's oldest *religions *that remains active." And a SPARQL
query looking for 'all items that have "religion" as "Wikipedia hypernyms"
property' would be much much faster.

Note: sorry if this reflection is naive or if it has already been
discussed/tested.

Cheers,

Ettore

On Thu, 27 Sep 2018 at 23:35, James Heald wrote:

> This recent announcement by the Structured Data team perhaps ought to be
> quite a heads-up for us:
>
>
> https://commons.wikimedia.org/wiki/Commons_talk:Structured_data#Searching_Commons_-_how_to_structure_coverage
>
> Essentially the team has given up on the hope of using Wikidata
> hierarchies to suggest generalised "depicts" values to store for images
> on Commons, to match against terms in incoming search requests.
>
> i.e.  if an image is of a German Shepherd dog, and identified as such,
> the team has given up on trying to infer in general from Wikidata that
> 'dog' is also a search term that such an image should score positively
> with.
>
> Apparently the Wikidata hierarchies were simply too complicated, too
> unpredictable, and too arbitrary and inconsistent in their design across
> different subject areas to be readily assimilated (before one even
> starts on the density of bugs and glitches that then undermine them).
>
> Instead, if that image ought to be considered in a search for 'dog', it
> looks as though an explicit 'depicts:dog' statement may be going to be
> needed to be specifically present, in addition to 'depicts:German
> Shepherd'.
>
> Some of the background behind this assessment can be read in
> https://phabricator.wikimedia.org/T199119
> in particular the first substantive comment on that ticket, by Cparle on
> 10 July, giving his quick initial read of some of the issues using
> Wikidata would face.
>
> SDC was considered a flagship end-application for Wikidata.  If the data
> in Wikidata is not usable enough to supply the dogfood that project was
> expected to be going to be relying on, that should be a serious wake-up
> call, a red flag we should not ignore.
>
> If the way data is organised across different subjects is currently too
> inconsistent and confusing to be usable by our own SDC project, are
> there actions we can take to address that?  Are there design principles
> to be chosen that then need to be applied consistently?  Is this
> something the community can do, or is some more active direction going
> to need to be applied?
>
> Wikidata's 'ontology' has grown haphazardly, with little oversight, like
> an untended bank of weeds.  Is some more active gardening now required?
>
>-- James.
>

Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-20 Thread Markus Kroetzsch

Hi Pine,

As I understood Daniel, he did not talk about inserting low quality 
content into any project, Wikipedia or other. What I believe he meant 
with "using the ontology" is to use it for improving search/discovery 
services that help editors to find something (i.e., technical 
infrastructure, not editorial content). Doing so could lead to an 
additional amount of mostly useful results, but it will not yet be 
enough to get all results that a user would intuitively expect. Maybe 
his wording made this sound a bit too dramatic -- I think he just wanted 
to emphasize the point that any actual use will immediately provide 
motivation and guidance for Wikidata editors to improve things that are 
currently imperfect.


I agree with him in that I think we need to identify ways of moving 
gradually forward, offering the small benefits we can already provide 
while creating an environment that allows the community to improve 
things step by step. If we ask for perfection before even starting, we 
will get into a deadlock where we bind editor resources in redundant 
tagging tasks instead of empowering the community to improve the 
situation in a sustainable way.


Cheers,

Markus


On 20/10/2018 06:51, Pine W wrote:
> On Fri, Oct 19, 2018 at 9:47 AM Markus Kroetzsch
> <markus.kroetz...@tu-dresden.de> wrote:
>
> > On 19/10/2018 07:09, Pine W wrote:
> > > I would appreciate clarification what is proposed with regard to
> > > exposing problematic Wikidata ontology on Wikipedia. If the idea
> > > involves inserting poor-quality information onto English Wikipedia in
> > > order to spur us to fix problems with Wikidata, then I am likely to
> > > oppose it. English Wikipedia is not an endless resource for free labor,
> > > and we have too few skilled and good-faith volunteers to handle our
> > > already enormous scope of work.
> >
> > You are right, and thankfully this is not what is proposed. The proposal
> > was to offer people who search for Commons media the (maybe optional)
> > possibility to find more results by letting the search engine traverse
> > the "more-general-than" links stored in Wikidata. People have discovered
> > cases where some of these links are not correct (surprise! it's a wiki
> > ;-), and the suggestion was that such glitches would be fixed with
> > higher priority if there would be an application relying on it. But even
> > with some wrong links, the results a searcher would get would still
> > include mostly useful hits. Also, at least half of the currently
> > observed problems with this approach would lead to fewer results (e.g.,
> > dogs would be hard to include automatically to a search for all
> > mammals), but in such cases the proposed extension would simply do what
> > the baseline approach (ignoring the links) would do anyway, so service
> > would not get any worse. Also, the manual workarounds suggested by some
> > (adding "mammal" to all pictures of some "dog") would be compatible with
> > this, so one could do both to improve search experience on both ends.
> >
> > Best regards,
> >
> > Markus
>
> Hi Markus, I seem to be missing something. Daniel said, "And I think the
> best way to achieve this is to start using the ontology as an ontology
> on wikimedia projects, and thus expose the fact that the ontology is
> broken. This gives incentive to fix it, and examples as to what things
> should be possible using that ontology (namely, some level of basic
> inference)." I think that I understand the basic idea behind structured
> data on Commons. I also think that I understand your statement above.
> What I'm not understanding is how Daniel's proposal to "start using the
> ontology as an ontology on wikimedia projects, and thus expose the fact
> that the ontology is broken." isn't a proposal to add poor quality
> information from Wikidata onto Wikipedia and, in the process, give
> Wikipedians more problems to fix. Can you or Daniel explain this?
>
> Separately, someone wrote to me off list to make the point that
> Wikipedians who are active in non-English Wikipedias also wouldn't
> appreciate having their workloads increased by having a large quantity
> of poor-quality information added to their edition of Wikipedia. I think
> that one of the person's concerns is that my statement could have been
> interpreted as implying something like "it's okay to insert poor-quality
> information on non-English Wikipedias because their standards are
> lower". I apologize if I gave the impression that I would approve of a
> non-English language edition of Wikipedia being on the receiving end of
> an unwelcome large addition of information that requires significant
> effort to clean up. Hopefully my response here will address the concerns
> that I heard off list, and if not then I welcome additional feedback.
>
> Thanks,
>
> Pine
> ( https://meta.wikimedia.org/wiki/User:Pine )


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-20 Thread Thomas Douillard
There is already something to handle this kind of « mutex » on Wikidata:
"disjoint union of"; see for example its usage on
https://www.wikidata.org/wiki/Q180323 . The statements are used on the talk
page by templates that use them to generate queries to find instances that
violate the mutex: https://www.wikidata.org/wiki/Talk:Q180323 (for example
this query, which unsurprisingly does not find anything, because I don’t
expect to find a lot of vertebra instances on Wikidata :) )
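
A minimal sketch of such a violation check, assuming the mutex rules are
stored as unordered pairs of incompatible types (in the spirit of the
ORGANIZATION != SPORT rule quoted below). The rule set and the function name
`violations` are hypothetical; the real templates generate queries instead.

```python
from itertools import combinations

# Hypothetical "type incompatible" rules, each an unordered pair of types.
MUTEX = {frozenset({"organization", "sport"})}


def violations(types):
    """Return the pairs of types on one item that a mutex rule forbids together."""
    unique = sorted(set(types))
    return [pair for pair in combinations(unique, 2) if frozenset(pair) in MUTEX]


print(violations(["sport", "organization", "club"]))
```

An item typed only as "organization" yields an empty list; the same check
could be run per item to produce a report like the one on the talk page.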

On Sat, 20 Oct 2018 at 12:09, Thad Guidry wrote:

> Hi All,
>
> Just to address what Markus was hinting at with inference rules. Both
> positive and negative rules could be stored.  Back in the Freebase days, we
> had those and were called "mutex's".  We used them for "type incompatible"
> hints to users and stored those "type incompatible" mutex rules in the
> knowledge graph. (Freebase being a Type based system along with having
> Properties under each Type)
>
> Such as:  ORGANIZATION != SPORT
>
> You actually have all those type incompatible mutexs in the Freebase dumps
> handed to you where you could start there.  The biggest one was called "Big
> Momma Mutex".
> Here is an archived email thread to give further context:
> https://freebase.markmail.org/thread/z5o7nlnb62n5t22o
>
> Anyways, the point is that those rules worked well for us in Freebase and
> I can see rules also working wonders in various ways in Wikidata as well.
> Maybe its just a mutex at each class ? Where multiple statements could
> hold rules ?
>
> Thad
> +ThadGuidry 
>
>


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Stas Malyshev
Hi!

> data on Commons. I also think that I understand your statement above.
> What I'm not understanding is how Daniel's proposal to "start using the
> ontology as an ontology on wikimedia projects, and thus expose the fact
> that the ontology is broken." isn't a proposal to add poor quality
> information from Wikidata onto Wikipedia and, in the process, give
> Wikipedians more problems to fix. Can you or Daniel explain this?

While I can not pretend to have expert knowledge and do not purport to
interpret what Daniel meant, I think here we must remember that
Wikipedia, while being of course of huge importance, is not the only
Wikimedia project, so "start using it on Wikimedia projects" does not
necessarily mean "start using it on Wikipedia", much less "start adding
bad information to Wikipedia" (there are other ways to use the data,
including imperfect ontologies - e.g. for search, for bot guidance, for
quality assurance and editor support, and many other ways). I am not
prescribing a specific scenario here, just reminding that "using the
ontology on wikimedia projects" can mean a wide variety of things.

> Separately, someone wrote to me off list to make the point that
> Wikipedians who are active in non-English Wikipedias also wouldn't
> appreciate having their workloads increased by having a large quantity of
> poor-quality information added to their edition of Wikipedia. I think

I am sure that would be a bad thing. But I don't think anything we are
discussing here would lead to that happening.
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Pine W
On Fri, Oct 19, 2018 at 9:47 AM Markus Kroetzsch <
markus.kroetz...@tu-dresden.de> wrote:

> On 19/10/2018 07:09, Pine W wrote:
> > I would appreciate clarification what is proposed with regard to
> > exposing problematic Wikidata ontology on Wikipedia. If the idea
> > involves inserting poor-quality information onto English Wikipedia in
> > order to spur us to fix problems with Wikidata, then I am likely to
> > oppose it. English Wikipedia is not an endless resource for free labor,
> > and we have too few skilled and good-faith volunteers to handle our
> > already enormous scope of work.
>
> You are right, and thankfully this is not what is proposed. The proposal
> was to offer people who search for Commons media the (maybe optional)
> possibility to find more results by letting the search engine traverse
> the "more-general-than" links stored in Wikidata. People have discovered
> cases where some of these links are not correct (surprise! it's a wiki
> ;-), and the suggestion was that such glitches would be fixed with
> higher priority if there would be an application relying on it. But even
> with some wrong links, the results a searcher would get would still
> include mostly useful hits. Also, at least half of the currently
> observed problems with this approach would lead to fewer results (e.g.,
> dogs would be hard to include automatically to a search for all
> mammals), but in such cases the proposed extension would simply do what
> the baseline approach (ignoring the links) would do anyway, so service
> would not get any worse. Also, the manual workarounds suggested by some
> (adding "mammal" to all pictures of some "dog") would be compatible with
> this, so one could do both to improve search experience on both ends.
>
> Best regards,
>
> Markus
>
>
Hi Markus, I seem to be missing something. Daniel said, "And I think the
best way to achieve this is to start using the ontology as an ontology on
wikimedia projects, and thus expose the fact that the ontology is broken.
This gives incentive to fix it, and examples as to what things should be
possible using that ontology (namely, some level of basic inference)." I
think that I understand the basic idea behind structured data on Commons. I
also think that I understand your statement above. What I'm not
understanding is how Daniel's proposal to "start using the ontology as an
ontology on wikimedia projects, and thus expose the fact that the ontology
is broken." isn't a proposal to add poor quality information from Wikidata
onto Wikipedia and, in the process, give Wikipedians more problems to fix.
Can you or Daniel explain this?

Separately, someone wrote to me off list to make the point that Wikipedians
who are active in non-English Wikipedias also wouldn't appreciate having
their workloads increased by having a large quantity of poor-quality
information added to their edition of Wikipedia. I think that one of the
person's concerns is that my statement could have been interpreted as
implying something like "it's okay to insert poor-quality information on
non-English Wikipedias because their standards are lower". I apologize if I
gave the impression that I would approve of a non-English language edition
of Wikipedia being on the receiving end of an unwelcome large addition of
information that requires significant effort to clean up. Hopefully my
response here will address the concerns that I heard off list, and if not
then I welcome additional feedback.

Thanks,

Pine
( https://meta.wikimedia.org/wiki/User:Pine )


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Markus Kroetzsch



On 20/10/2018 00:41, Stas Malyshev wrote:
> Hi!
>
> > Cparle wants to make sure that people searching for "clarinet" also get
> > shown images of "piccolo clarinet" etc.
> >
> > To make this possible, where an image has been tagged "basset horn" he
> > is therefore looking to add "clarinet" as an additional keyword, so that
> > if somebody types "clarinet" into the search box, one of the images
> > retrieved by ElasticSearch will be the basset horn one.
>
> Generally if the image is tagged with "basset horn" and the user query
> is "clarinet", we can do one of the following:
>
> 1. Index all upstream-hierarchy for "basset horn" (presumably we would
> have to cut off when it gets too deep or too abstract) and then match
> directly when searching.
>
> 2. Expand hierarchy down-stream from "clarinet" and then match against
> search index.
>
> 3. Have some manual or automatic process that ensures that both
> "clarinet" and "basset horn" are indexed (not necessarily at once) and
> rely on it to discover the matches.
>
> The problem with (1) is that if hierarchy changes, we will have to do
> a huge number of updates which might overwhelm the system, and most of
> these updates would not even be for things people search for, but we
> have no way to know that.
>
> The problem with (2) is that downstream hierarchies explode very fast,
> and if you search for "clarinet" and there are 1 descendants in
> these hierarchies, we can't search for all of them, so you may never get
> a chance to find the basset horn. Also, of course, querying big
> downstream hierarchies takes time too, which means a performance hit.


Is this such a problem? It is what people now commonly do with P31/P279* 
queries. For example, finding 10K instances of (some subclass of) 
building takes 9 secs: http://tinyurl.com/y7e5j5sd (I think this is one 
of the more complex hierarchies; maybe you know larger downstream 
hierarchies one could try?) If you omit the labels, it takes 650ms. 
That's maybe not quite autocompletion speed yet, but seems acceptable 
for a media search.
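
For reference, a minimal sketch of the kind of P31/P279* query meant here,
written as a small Python helper that only builds the SPARQL text. It is not
the query behind the shortened link, which is not reproduced here; the helper
name `instances_query` and its parameters are assumptions for the example.

```python
def instances_query(class_qid, limit=10000, with_labels=False):
    """Build a SPARQL query for instances of a class or of any of its subclasses.

    class_qid is supplied by the caller; the wd:/wdt: prefixes are the ones
    predefined by the Wikidata Query Service.
    """
    select = "?item ?itemLabel" if with_labels else "?item"
    label_service = (
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        if with_labels
        else ""
    )
    return (
        f"SELECT {select} WHERE {{\n"
        f"  ?item wdt:P31/wdt:P279* wd:{class_qid} .\n"
        f"{label_service}"
        f"}}\n"
        f"LIMIT {limit}"
    )


# The QID below is a placeholder chosen for illustration.
print(instances_query("Q41176"))
```

The `with_labels` switch mirrors the observation above that fetching labels
is what makes the query noticeably slower.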


Cheers,

Markus









Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Stas Malyshev
Hi!

> Cparle wants to make sure that people searching for "clarinet" also get
> shown images of "piccolo clarinet" etc.
> 
> To make this possible, where an image has been tagged "basset horn" he
> is therefore looking to add "clarinet" as an additional keyword, so that
> if somebody types "clarinet" into the search box, one of the images
> retrieved by ElasticSearch will be the basset horn one.

Generally if the image is tagged with "basset horn" and the user query
is "clarinet", we can do one of the following:

1. Index all upstream-hierarchy for "basset horn" (presumably we would
have to cut off when it gets too deep or too abstract) and then match
directly when searching.

2. Expand hierarchy down-stream from "clarinet" and then match against
search index.

3. Have some manual or automatic process that ensures that both
"clarinet" and "basset horn" are indexed (not necessarily at once) and
rely on it to discover the matches.

The problem with (1) is that if the hierarchy changes, we will have to do
a huge number of updates which might overwhelm the system, and most of
these updates would not even be for things people search for, but we
have no way to know that.

The problem with (2) is that downstream hierarchies explode very fast,
and if you search for "clarinet" and there are 1 descendants in
these hierarchies, we can't search for all of them, so you may never get
a chance to find the basset horn. Also, of course, querying big
downstream hierarchies takes time too, which means a performance hit.
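
To make options (1) and (2) concrete, here is a toy sketch in Python. The tiny `PARENTS` map, the function names and the depth cut-off are all made up for illustration; this is not a description of how the Commons search index is actually built.

```python
from collections import deque

# Hypothetical toy hierarchy: child -> direct parents ("subclass of").
PARENTS = {
    "basset horn": ["clarinet"],
    "piccolo clarinet": ["clarinet"],
    "clarinet": ["woodwind instrument"],
    "woodwind instrument": ["musical instrument"],
}


def ancestors(term, max_depth):
    """Option 1: extra terms to index when a file is tagged with `term`."""
    found, seen = [], {term}
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # cut off before the hierarchy gets too abstract
        for parent in PARENTS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                found.append(parent)
                queue.append((parent, depth + 1))
    return found


def descendants(term):
    """Option 2: extra terms to search for when the user types `term`."""
    # Invert the child -> parents map into parent -> children.
    children = {}
    for child, parents in PARENTS.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    found, seen = [], {term}
    queue = deque([term])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found
```

With option 1, every change to `PARENTS` means re-indexing each file whose `ancestors` list changed. With option 2, the list returned by `descendants` grows with the size of the subtree under the search term.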

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Stas Malyshev
Hi!

> possibility to find more results by letting the search engine traverse
> the "more-general-than" links stored in Wikidata. People have discovered
> cases where some of these links are not correct (surprise! it's a wiki
> ;-), and the suggestion was that such glitches would be fixed with
> higher priority if there would be an application relying on it. But even

The main problem I see here is not that some links are incorrect. That
may have bad effects, but it's not the most important issue. The most
important one, IMHO, is that there's no way to figure out in any scalable
and scriptable way what "more-general-than" means for any particular case.

It's different for each type of object and often inconsistent within
the same class (e.g. see the confusion over whether "dog" is an animal, a
name of the animal, the name of the taxon, etc.). It's not that navigating
the hierarchy would lead us astray; we're not even at the point of having
this problem, because we don't even have a good way to navigate it.

Using only instance-of/subclass-of does not seem that useful, because
a lot of interesting things are not represented in this way - e.g.
finding out that Donna Strickland (Q56855591) is a woman (Q467) is
impossible using only this hierarchy. We could special-case a bunch of
those, but given how diverse Wikidata is, I don't think this will ever
cover any significant part of the hierarchy unless we find a non-ad-hoc
method of doing this.
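
A toy illustration of this point, using hand-written data. The two statements below are an assumption about what the item carries (instance of human, Q5, and sex or gender, P21, set to female, Q6581072); they are typed in by hand, not fetched from Wikidata, and the helper `reachable` is hypothetical.

```python
# Hand-written stand-in for a few statements: (subject, property, value).
# Assumed for illustration, not fetched from Wikidata.
STATEMENTS = [
    ("Q56855591", "P31", "Q5"),        # Donna Strickland: instance of human
    ("Q56855591", "P21", "Q6581072"),  # Donna Strickland: sex or gender female
]


def reachable(start, properties):
    """Return every item reachable from `start` via the given properties."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for subject, prop, value in STATEMENTS:
            if subject == node and prop in properties and value not in seen:
                seen.add(value)
                stack.append(value)
    return seen
```

Under these assumptions, `reachable("Q56855591", {"P31", "P279"})` never touches anything gender-related, and even adding P21 as a special case lands on a different item from Q467, which is the kind of per-case special handling the paragraph above warns about.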

This also makes it particularly hard to do something like "let's start
using it and fix the issues as we discover them", because the main issue
here is that we don't have a way to start with anything useful beyond a
tiny subset of classes that we can special-case manually. We can't
launch a rocket and figure how to build the engine later - having a
working engine is a prerequisite to launching the rocket!

There are also significant technical challenges in this - indexing
dynamically changing hierarchy is very problematic, and with our
approach to ontology anything can be a class, so we'd have to constantly
update the hierarchy. But this is more of a technical challenge, which
will come after we have some solution for the above.
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Markus Kroetzsch



On 19/10/2018 12:32, Luca Martinelli wrote:
> On Fri, 19 Oct 2018 at 01:09, James Heald wrote:
>> But the taxo project has become such a walled garden, answerable only to
>> itself, that people with comments may need to be quite forceful to get
>> their message through, if we are to deal eg with some of the
>> difficulties Cparle describes in the ticket [...]
>
> Other admins and I are unfortunately aware of this and this is
> exactly what I was referring to in my previous e-mail. I do agree with
> you that the situation there is frankly unbearable, and IMHO it will likely
> be ended also through "removals" of some users who think they should
> be the only ones in charge of deciding what's good and what's not. You
> might easily understand why this situation deteriorated like this, but
> I acknowledge this is no excuse for it to continue.



Re this tricky situation, it might be good that the taxonomy part of 
Wikidata avoid the use of "subclass of" altogether. Doesn't this open up 
a path for compromise? Wikidata could intentionally "overload" taxons to 
also refer to sets of organisms (in some cases). The taxonomic model 
would not be affected by this in any way, since it ignores "subclass 
of". Some (historic or debated) taxons could be ignored for this 
"colloquial" subclass hierarchy, while other merely colloquially defined 
classes of animals could be put in relation to proper species. I think 
such overloading is acceptable as long as there cannot be confusion 
between which statement refers to which facet of the concept. Then no 
use of either facet will be impaired by the presence of the "irrelevant" 
extra data.


The only alternative seems to be to build a "mirror taxonomy" that consists
not of taxon names but of animal classes (and that would include "dog"
somewhere in its hierarchy [1]). But then we will need a community-wide
decision on which of the two (class of organisms vs. scientific name) is
the subject of actual Wikipedia articles, which might be a difficult
topic to discuss.


Alternatively, if the taxons are mostly considered as "names" (syntax)
rather than classes of individual organisms, then it seems we are
actually building a kind of scientific dictionary here that might rather
belong in the lexeme space.


Whatever happens, this problem needs some solution.

Cheers,

Markus

[1] It seems that the strange position of "dog" is mostly due to the 
fact that two taxons are associated with it. In general, this seems an 
important issue (many common names are not clearly specifying a taxon), 
but in the case of dog it seems that the two taxons are synonyms of one 
another, i.e., the taxon for dog simply changed names over time.






Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Luca Martinelli
On Fri, 19 Oct 2018 at 01:09, James Heald wrote:
> But the taxo project has become such a walled garden, answerable only to
> itself, that people with comments may need to be quite forceful to get
> their message through, if we are to deal eg with some of the
> difficulties Cparle describes in the ticket [...]

Other admins and I are unfortunately aware of this and this is
exactly what I was referring to in my previous e-mail. I do agree with
you that the situation there is frankly unbearable, and IMHO it will likely
be ended also through "removals" of some users who think they should
be the only ones in charge of deciding what's good and what's not. You
might easily understand why this situation deteriorated like this, but
I acknowledge this is no excuse for it to continue.

L.



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Markus Kroetzsch

On 19/10/2018 07:09, Pine W wrote:
> I would appreciate clarification of what is proposed with regard to
> exposing problematic Wikidata ontology on Wikipedia. If the idea
> involves inserting poor-quality information onto English Wikipedia in
> order to spur us to fix problems with Wikidata, then I am likely to
> oppose it. English Wikipedia is not an endless resource for free labor,
> and we have too few skilled and good-faith volunteers to handle our
> already enormous scope of work.


You are right, and thankfully this is not what is proposed. The proposal
was to offer people who search for Commons media the (maybe optional)
possibility to find more results by letting the search engine traverse
the "more-general-than" links stored in Wikidata. People have discovered
cases where some of these links are not correct (surprise! it's a wiki
;-), and the suggestion was that such glitches would be fixed with
higher priority if there were an application relying on them. But even
with some wrong links, the results a searcher would get would still
include mostly useful hits. Also, at least half of the currently
observed problems with this approach would lead to fewer results (e.g.,
dogs would be hard to include automatically in a search for all
mammals), but in such cases the proposed extension would simply do what
the baseline approach (ignoring the links) would do anyway, so service
would not get any worse. Also, the manual workarounds suggested by some
(adding "mammal" to all pictures of some "dog") would be compatible with
this, so one could do both to improve search experience on both ends.


Best regards,

Markus





Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Markus Kroetzsch

Hi James,

On 19/10/2018 01:09, James Heald wrote:

> On 18/10/2018 22:33, Markus Kroetzsch wrote:
>> And, on another note, there is also a huge misunderstanding exposed in
>> the discussion on the search-related tracker item [1]: Cparle there
>> speaks about "traversing the subclass hierarchy" but is actually
>> looking at *super*classes of, e.g., "Clarinet", which he mostly finds
>> irrelevant to users who care about clarinets. But surely that's the
>> wrong direction! You have to look for *sub*classes to find special
>> cases of what you are looking for. Looking downwards will often lead
>> to much saner ontologies than when turning your head towards the dizzy
>> heights of upper ontology. Yes, the few of us looking for instances of
>> "logical consequence" will still get clarinets, but those who look for
>> instances of clarinet will merely see instances of alto clarinet,
>> piccolo clarinet, basset horn, Saxonette, and so on [2]. So instead of
>> trying to suggest to Commons editors meaningful "upper concepts", one
>> could simply enable the use of lower concepts in search. It does not
>> work in all cases yet, but it does in many.
>
> Not really.
>
> Cparle wants to make sure that people searching for "clarinet" also get
> shown images of "piccolo clarinet" etc.
>
> To make this possible, where an image has been tagged "basset horn" he
> is therefore looking to add "clarinet" as an additional keyword, so that
> if somebody types "clarinet" into the search box, one of the images
> retrieved by ElasticSearch will be the basset horn one.
>
> I imagine there are pluses and minuses both ways, whether you try to
> make sure one search returns more hits, or try to run multiple searches
> each returning fewer hits.
>
> Your suggestion of the latter approach may not involve so much
> pre-investigation of the top of the tree, which may be terms that people
> are less likely to search for; but on the other hand, the actual
> searching may be less efficient than a single indexed search.


True, but with the Wikidata Query Service we already have infrastructure 
that completes millions of search requests of this kind (involving path 
queries), so that seems doable for Commons as well. WDQS already has 
Wikimedia API bindings that allow it to use Lucene-based results in 
addition, if needed (though this would only make sense if the search 
should use some content that for some reason cannot be imported into a 
query service as graph data, mostly free-text search over longer texts).


I think the approach of completing tags towards the upper classes is not
a good idea in general, since it creates extra work for editors that
requires a million times the resources needed in the other approach. If
the subclass hierarchy is wrong, you only need to fix it once to improve
search for all existing Commons content; if you rely on manual extra
tags, you'd have to add them to every file on Commons and keep them
up to date with changes in the concepts. That is an enormous, redundant
effort that will invariably lead to a very non-uniform search experience
across otherwise similar media. This seems like a huge waste of editors'
time even if it worked, i.e., if we lived in a world where the
superclasses of a class were easy to understand and closely related
to the topic that an editor is working on. That will never happen for
Wikidata or Commons, since both cover such a breadth of topics that
their upper ontology necessarily has to be very general even if modelled
in a clean and fully correct way.


Cheers,

Markus





>> There are still problems (such as the biological taxonomy being
>> modelled as a hierarchy of names rather than animal classes, placing
>> dog far away from mammal), but it is still always much easier to come
>> up with a sane organisation for the *sub*classes of a concrete class.
>
> For what it's worth, there's currently quite a lively discussion on
> Project Chat about issues with the current modelling of biological
> taxonomies,
> https://www.wikidata.org/wiki/Wikidata:Project_chat#Taxonomy:_concept_centric_vs_name_centric
>
> People on this thread might like to comment on some of the less
> fortunate elements of current practice, and the appropriateness of some
> of the thoughts that have been suggested.
>
> But the taxo project has become such a walled garden, answerable only to
> itself, that people with comments may need to be quite forceful to get
> their message through, if we are to deal eg with some of the
> difficulties Cparle describes in the ticket at
>
>   https://phabricator.wikimedia.org/T199119
>
>    -- James.



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-19 Thread Gerard Meijssen
Hoi Pine,
The ontology of Wikidata has nothing to do with English Wikipedia. The
notion that English Wikipedia is the only endless resource of free labour
is pathetic. Its dismissive attitude prevents functional contributions that
will benefit the users of Wikimedia projects.

For authors of "scholarly articles" we have an increasing amount of
information that is impossible for Wikipedia to include. It does not take
much to have a template that shows them (collapsed by default) and links to
"Scholia" information for the paper.

For authors of books we could have a similar template. They could link to
*your local library* where you can check if it is available for reading.
Alternatively we could link to the "Open Library".

What it would do is provide a SERVICE to our readers that is easy enough to
provide, that leverages the data in Wikidata and is of a high quality. The
issue about the ontology has everything to do with the discovery of images
in Commons. It cannot get worse than it is; it is dysfunctional. It only
works for English and I understand that is something you do not really
notice.

Yes, I do recognise Wikidata is a wiki. It is a work in progress and as
such the quality and quantity steadily improve. Just like English
Wikipedia.
Thanks,
   Gerard

On Fri, 19 Oct 2018 at 07:10, Pine W wrote:

> I would appreciate clarification what is proposed with regard to exposing
> problematic Wikidata ontology on Wikipedia. If the idea involves inserting
> poor-quality information onto English Wikipedia in order to spur us to fix
> problems with Wikidata, then I am likely to oppose it. English Wikipedia is
> not an endless resource for free labor, and we have too few skilled and
> good-faith volunteers to handle our already enormous scope of work.
>
>
> Pine
> ( https://meta.wikimedia.org/wiki/User:Pine )


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-18 Thread Pine W
I would appreciate clarification of what is proposed with regard to exposing
problematic Wikidata ontology on Wikipedia. If the idea involves inserting
poor-quality information onto English Wikipedia in order to spur us to fix
problems with Wikidata, then I am likely to oppose it. English Wikipedia is not
an endless resource for free labor, and we have too few skilled and good-faith
volunteers to handle our already enormous scope of work.

Pine
( https://meta.wikimedia.org/wiki/User:Pine )


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-18 Thread James Heald

On 18/10/2018 22:33, Markus Kroetzsch wrote:


> And, on another note, there is also a huge misunderstanding exposed in
> the discussion on the search-related tracker item [1]: Cparle there
> speaks about "traversing the subclass hierarchy" but is actually looking
> at *super*classes of, e.g., "Clarinet", which he mostly finds irrelevant
> to users who care about clarinets. But surely that's the wrong
> direction! You have to look for *sub*classes to find special cases of
> what you are looking for. Looking downwards will often lead to much
> saner ontologies than when turning your head towards the dizzy heights
> of upper ontology. Yes, the few of us looking for instances of "logical
> consequence" will still get clarinets, but those who look for instances
> of clarinet will merely see instances of alto clarinet, piccolo
> clarinet, basset horn, Saxonette, and so on [2]. So instead of trying to
> suggest to Commons editors meaningful "upper concepts", one could simply
> enable the use of lower concepts in search. It does not work in all
> cases yet, but it does in many.


Not really.

Cparle wants to make sure that people searching for "clarinet" also get 
shown images of "piccolo clarinet" etc.


To make this possible, where an image has been tagged "basset horn" he 
is therefore looking to add "clarinet" as an additional keyword, so that 
if somebody types "clarinet" into the search box, one of the images 
retrieved by ElasticSearch will be the basset horn one.


I imagine there are pluses and minuses both ways, whether you try to 
make sure one search returns more hits, or try to run multiple searches 
each returning fewer hits.


Your suggestion of the latter approach may not involve so much 
pre-investigation of the top of the tree, which may be terms that people 
are less likely to search for; but on the other hand, the actual 
searching may be less efficient than a single indexed search.




> There are still problems (such as the biological taxonomy being modelled
> as a hierarchy of names rather than animal classes, placing dog far away
> from mammal), but it is still always much easier to come up with a sane
> organisation for the *sub*classes of a concrete class.


For what it's worth, there's currently quite a lively discussion on 
Project Chat about issues with the current modelling of biological 
taxonomies,

https://www.wikidata.org/wiki/Wikidata:Project_chat#Taxonomy:_concept_centric_vs_name_centric

People on this thread might like to comment on some of the less 
fortunate elements of current practice, and the appropriateness of some 
of the thoughts that have been suggested.


But the taxo project has become such a walled garden, answerable only to 
itself, that people with comments may need to be quite forceful to get 
their message through, if we are to deal eg with some of the 
difficulties Cparle describes in the ticket at

 https://phabricator.wikimedia.org/T199119

  -- James.



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-18 Thread Markus Kroetzsch

+1 to Daniel

And, on another note, there is also a huge misunderstanding exposed in
the discussion on the search-related tracker item [1]: Cparle there
speaks about "traversing the subclass hierarchy" but is actually looking
at *super*classes of, e.g., "Clarinet", which he mostly finds irrelevant
to users who care about clarinets. But surely that's the wrong
direction! You have to look for *sub*classes to find special cases of
what you are looking for. Looking downwards will often lead to much
saner ontologies than when turning your head towards the dizzy heights
of upper ontology. Yes, the few of us looking for instances of "logical
consequence" will still get clarinets, but those who look for instances
of clarinet will merely see instances of alto clarinet, piccolo
clarinet, basset horn, Saxonette, and so on [2]. So instead of trying to
suggest to Commons editors meaningful "upper concepts", one could simply
enable the use of lower concepts in search. It does not work in all
cases yet, but it does in many.


There are still problems (such as the biological taxonomy being modelled 
as a hierarchy of names rather than animal classes, placing dog far away 
from mammal), but it is still always much easier to come up with a sane 
organisation for the *sub*classes of a concrete class.


FYI, I recently gave a talk about ontological modelling in Wikidata that
discussed some of the current issues:
https://iccl.inf.tu-dresden.de/web/Misc3058/en (the audience there
consisted of ontology design pattern researchers).


Cheers,

Markus

[1] https://phabricator.wikimedia.org/T199119
[2] http://tinyurl.com/y7tvkuzk

On 17/10/2018 16:04, Daniel Kinzler wrote:

> My (very belated) thoughts on this issue:
>
> Wiki content grows in a messy way, and it stays messy until the messiness causes
> problems. Once it causes problems, people are motivated to clean it up.
>
> I propose to implement hierarchical search based on very simple, predictable
> rules, e.g. by having a configurable list of transitive relationships that get
> evaluated to a certain depth. I'd go for subclasses, geographical inclusion, and
> subspecies at first.
>
> Doing this will NOT produce good results. You would have to implement a lot of
> special cases and heuristics to work around dirty data. I say: let it produce
> bad results, tell people why the results are bad, and what they can do about it!
>
> The Wikimedia community is AMAZING at making good use of whatever capabilities
> the software offers, and at adapting content to make the software produce the
> results they want. By providing limited but clearly defined software support for
> hierarchical search, we allow the community to optimize the content to work with
> that search. Keeping the rules simple means that other consumers can then follow
> the same rules, and the content will work for them as well.
>
> -- daniel
>
> On 29.09.2018 at 19:25, Gerard Meijssen wrote:
>
>> Hoi,
>> There is also the age-old conundrum where some want to enforce their rules for
>> the good of all because (argument of the day follows).
>>
>> First of all, Wikidata is very much a child of Wikipedia. It has its own
>> structures and people have endeavoured to build those same structures in
>> Wikidata, never mind that it is a very different medium and never mind that
>> there are 280+ Wikipedias that might consider things to be different.  The
>> start of Wikidata was also an auspicious occasion where it was thought to be OK
>> to adopt an external German authority. That proved to be a disaster and there
>> are still residues of this awful decision. It did not take long to show the
>> shortcomings of this schedule and it was replaced by something more sensible.
>>
>> However, we got something really Wiki and it was all too wild. It did not take
>> long for me to ask for someone to explain the current structures and nobody
>> volunteered. So I did what I do best, I largely ignored the results of the
>> classes and subclasses. It does not work for me. It works against me so my
>> current strategy is to ignore this nonsense and concentrate on including data.
>> The reason is simple; once data is included, it is easy to slice it and dice
>> it, and structure it as we see fit at a later date.
>>
>> So when our priority becomes to make our data reusable and more open, we should
>> agree on it. So far we have not because we choose to fight each other. Some
>> have ideas, some have invested too much in what we have at this time. When we
>> are to make our data reusable, we should agree on what it is exactly we aim to
>> achieve. Is it to support Commons, or is it to support some external standard
>> that is academically sound? I would always favour what is practical and easily
>> measured.
>>
>> I would support Commons first. It has the benefit that it will bring our
>> communities together in a clear objective. It has the benefit that changes in
>> the operations of Wikidata support the whole of the Wikimedia universe and
>> consequentially financial, technical and operational needs and investments are
>> easily understood. It also means that all the bureaucracy that has materialised
>> will

Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-18 Thread Peter F. Patel-Schneider
On 10/17/18 7:04 AM, Daniel Kinzler wrote:
> My (very belated) thoughts on this issue:
> 
[...]
> I say: let it produce bad results, tell people why the results are bad, and
> what they can do about it!
[...]
> 
> -- daniel
My view is that there is a big problem with this for industrial use of Wikidata.

I would very much like to use Wikidata more in my company.  However, I view it
as my duty in my company to point out problems with the use of any technology.
So whenever I talk about Wikidata I also have to talk about the problems I
see in the Wikidata ontology and how they will affect use of Wikidata in my
company.

If Wikidata is going to have significant use in my company there needs to be
at least some indication that the problems in Wikidata are being addressed.  I
don't see that happening at the moment.


What is the biggest problem I see in Wikidata?  It is the poor organization of
the Wikidata ontology.  To fix the ontology, beyond doing point fixes, is
going to require some commitment from the Wikidata community.


Peter F. Patel-Schneider
Nuance Communications



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-18 Thread Luca Martinelli
On Wed, 17 Oct 2018, 16:04, Daniel Kinzler wrote:
> I say: let it produce bad results, tell people why the results are bad, and 
> what they can do about it!

TL;DR: let's produce bad results, and let's analyse those results to
find the best practical solution we can come up with.

I totally agree with Daniel here. It is definitely a red flag that we
should tackle head-on, but we need data first. We need to know
*where* ontology fails, *why* it fails, and *how* we can fix it.

Now is probably the best time to talk about this, not just because
we have a potential big application such as Structured Data, but also
because we focused on other not-so-easy problems such as dealing with
isolated sitelinks/projects and trying to establish relations between
items, and between items and other databases.

What we need to do IMHO is to find whatever best practical solution we
have at hand, in order to primarily use it on Wikimedia projects. My
only fear is that such discussions may end up in a swamp because of
"that one user" who doesn't want to apply that particular solution
(not accusing anyone in particular, I've been that user too in some
discussions). Anyway, if we start from data, we can come up with some
solution.

L.



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-10-17 Thread Daniel Kinzler
My (very belated) thoughts on this issue:

Wiki content grows in a messy way, and it stays messy until the messiness causes
problems. Once it causes problems, people are motivated to clean it up.

I propose to implement hierarchical search based on very simple, predictable
rules, e.g. by having a configurable list of transitive relationships that get
evaluated to a certain depth. I'd go for subclasses, geographical inclusion, and
subspecies at first.

Doing this will NOT produce good results. You would have to implement a lot of
special cases and heuristics to work around dirty data. I say: let it produce
bad results, tell people why the results are bad, and what they can do about it!

The Wikimedia community is AMAZING at making good use of whatever capabilities
the software offers, and at adapting content to make the software produce the
results they want. By providing limited but clearly defined software support for
hierarchical search, we allow the community to optimize the content to work with
that search. Keeping the rules simple means that other consumers can then follow
the same rules, and the content will work for them as well.
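
The rule proposed above (a configurable list of transitive relationships, each evaluated to a fixed depth) could look roughly like the sketch below. The configuration format, the `EDGES` sample data and the function name are assumptions for illustration only. Mapping "subclasses", "geographical inclusion" and "subspecies" to P279, P131 and P171 is one possible reading; no implementation is described in this thread.

```python
# Hypothetical configuration: property -> maximum depth to follow.
# P279 = subclass of, P131 = located in the administrative territorial
# entity, P171 = parent taxon.
TRANSITIVE_PROPERTIES = {"P279": 3, "P131": 2, "P171": 1}

# Made-up sample edges: (narrower item, property, broader item).
EDGES = [
    ("basset horn", "P279", "clarinet"),
    ("clarinet", "P279", "woodwind instrument"),
    ("woodwind instrument", "P279", "musical instrument"),
    ("musical instrument", "P279", "instrument"),
    ("Dresden", "P131", "Saxony"),
]


def expand_query(term, config=TRANSITIVE_PROPERTIES, edges=EDGES):
    """Return `term` plus every narrower item within the configured depth.

    Each property is followed on its own; properties are never mixed
    within one path, which keeps the rule easy to predict.
    """
    results = {term}
    for prop, max_depth in config.items():
        frontier = {term}
        for _ in range(max_depth):
            frontier = {
                narrow for narrow, p, broad in edges
                if p == prop and broad in frontier
            } - results
            if not frontier:
                break
            results |= frontier
    return results
```

With these sample edges, a search for "instrument" stops three levels down and therefore misses "basset horn". That is the sort of predictable, explainable miss the proposal accepts in exchange for simple rules.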

-- daniel

On 29.09.2018 at 19:25, Gerard Meijssen wrote:
> Hoi,
> There is also the age-old conundrum where some want to enforce their rules for
> the good of all because (argument of the day follows).
>
> First of all, Wikidata is very much a child of Wikipedia. It has its own
> structures and people have endeavoured to build those same structures in
> Wikidata, never mind that it is a very different medium and never mind that
> there are 280+ Wikipedias that might consider things to be different.  The
> start of Wikidata was also an auspicious occasion where it was thought to be OK
> to adopt an external German authority. That proved to be a disaster and there
> are still residues of this awful decision. It did not take long to show the
> shortcomings of this schedule and it was replaced by something more sensible.
>
> However, we got something really Wiki and it was all too wild. It did not take
> long for me to ask for someone to explain the current structures and nobody
> volunteered. So I did what I do best, I largely ignored the results of the
> classes and subclasses. It does not work for me. It works against me so my
> current strategy is to ignore this nonsense and concentrate on including data.
> The reason is simple; once data is included, it is easy to slice it and dice
> it, and structure it as we see fit at a later date.
>
> So when our priority becomes to make our data reusable and more open, we should
> agree on it. So far we have not because we choose to fight each other. Some
> have ideas, some have invested too much in what we have at this time. When we
> are to make our data reusable, we should agree on what it is exactly we aim to
> achieve. Is it to support Commons, or is it to support some external standard
> that is academically sound? I would always favour what is practical and easily
> measured.
>
> I would support Commons first. It has the benefit that it will bring our
> communities together in a clear objective. It has the benefit that changes in
> the operations of Wikidata support the whole of the Wikimedia universe and
> consequentially financial, technical and operational needs and investments are
> easily understood. It also means that all the bureaucracy that has materialised
> will show to be in the way when it is.
>
> So my question is not if we are a Wiki, my question is are we a Wiki enough
> and willing to change our way for our own good.
> Thanks,
>       GerardM
> Thanks,
>       GerardM
> 
> On Sat, 29 Sep 2018 at 16:38, Thad Guidry  > wrote:
> 
> Ettore,
> 
> Wikidata has the ability of crowdsourcing...unfortunately, it is not
> effectively utilized.
> 
> Its because Wikidata does not yet provide a voting feature on
> statements...where as the vote gets higher...more resistance to change the
> statement is required.
> But that breaks the notion of a "wiki" for some folks.
> And there we circle back to Gerard's age old question of ... should 
> Wikidata
> really be considered a wiki at all for the benefit of society ?  or should
> it apply voting/resistance to keep it tidy, factual and less messy.
> 
> We have the technology to implement voting/resistance on statements.  I
> personally would utilize that feature and many others probably would as
> well.  Crowdsourcing the low voted facts back to applications like
> OpenRefine, or the recently sent out Survey vote mechanism for spam 
> analysis
> on the low voted statements could highlight where things are untidy and
> implement vote casting to clean them up.
> 
> "...the burden of proof has to be placed on authority, and it should be
> dismantled if that burden cannot be met..."
> 
> -Thad
> +ThadGuidry 
> 
> 
> On Sat, Sep 29, 2018 at 2:49 AM Ettore 

Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-09-29 Thread Gerard Meijssen
Hoi,
There is also the age-old conundrum where some want to enforce their rules
for the good of all because (argument of the day follows).

First of all, Wikidata is very much a child of Wikipedia. Wikipedia has its
own structures and people have endeavoured to build those same structures in
Wikidata, never mind that it is a very different medium and never mind that
there are 280+ Wikipedias that might consider things to be different. The
start of Wikidata was also an auspicious occasion where it was thought to
be OK to adopt an external German authority. That proved to be a disaster
and there are still residues of this awful decision. It did not take long to
show the shortcomings of this scheme and it was replaced by something
more sensible.

However, we got something really Wiki and it was all too wild. It did not
take long for me to ask for someone to explain the current structures and
nobody volunteered. So I did what I do best, I largely ignored the results
of the classes and subclasses. It does not work for me. It works against me,
so my current strategy is to ignore this nonsense and concentrate on
including data. The reason is simple; once data is included, it is easy to
slice it and dice it, and to structure it as we see fit at a later date.

So when our priority becomes to make our data reusable, more open, we should
agree on it. So far we have not because we choose to fight each other. Some
have ideas, some have invested too much in what we have at this time. When
we are to make our data reusable, we should agree on what it is exactly we
aim to achieve. Is it to support Commons, or is it to support some external
standard that is academically sound? I would always favour what is
practical and easily measured.

I would support Commons first. It has the benefit that it will bring our
communities together in a clear objective. It has the benefit that changes
in the operations of Wikidata support the whole of the Wikimedia universe
and consequently financial, technical and operational needs and
investments are easily understood. It also means that all the bureaucracy
that has materialised will be shown to be in the way when it is.

So my question is not if we are a Wiki; my question is are we a Wiki enough
and willing to change our way for our own good.
Thanks,
  GerardM

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-09-29 Thread Ettore RIZZA
Hi Thad,

I understand that an open Wiki has its advantages and disadvantages (I
sometimes prefer a system like StackOverflow, where you need a certain
reputation to do some things). I am afraid that a voting system simply
favors the opinions shared by the majority of Wikidata editors, namely a
Western worldview. And even within this subgroup opinions may legitimately
differ.

But there may be ways to avoid messing up the ontology while respecting the
wiki spirit. For example, a warning pop-up every time you edit an
ontological property (P31, P279, P361...). Something like: "OK, you added
the statement 'a poodle is an instance of toy'. Do you agree with the fact
that a poodle is now a goods, a work, an artificial physical object?"

But that would only work for manual edits...
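
A minimal sketch of how such a warning could work out what an edit newly
implies, assuming a small hand-made subclass-of (P279) graph. The graph
contents and the function names (superclasses, implied_by_edit) are
hypothetical and only mirror the poodle/toy example; this is not an existing
Wikidata or Wikibase API.

```python
# Toy "subclass of" (P279) graph. Hypothetical data, not real Wikidata statements.
SUBCLASS_OF = {
    "toy": ["goods", "work"],
    "goods": ["artificial physical object"],
    "dog": ["mammal"],
}


def superclasses(cls, graph=SUBCLASS_OF):
    """Return every class reachable from `cls` via subclass-of links."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in graph.get(stack.pop(), []):
            if parent not in seen:  # the `seen` set also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_by_edit(current_classes, new_class, graph=SUBCLASS_OF):
    """Classes the item would newly belong to if `new_class` were added."""
    before = set()
    for cls in current_classes:
        before |= {cls} | superclasses(cls, graph)
    after = {new_class} | superclasses(new_class, graph)
    return sorted(after - before)


def warning_text(item, new_class, current_classes):
    """Build the text such a pop-up might show (wording is illustrative)."""
    implied = implied_by_edit(current_classes, new_class)
    return f"You added '{item} is an instance of {new_class}'. {item} is now also: " + ", ".join(implied)


print(warning_text("poodle", "toy", ["dog"]))
```

Comparing the closure before and after the edit keeps the warning limited to
what the edit changes, rather than listing everything the item already was.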



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-09-29 Thread Thad Guidry
Ettore,

Wikidata has the ability of crowdsourcing...unfortunately, it is not
effectively utilized.

It's because Wikidata does not yet provide a voting feature on
statements, where, as the vote on a statement gets higher, more resistance
is required to change that statement.
But that breaks the notion of a "wiki" for some folks.
And there we circle back to Gerard's age-old question of ... should
Wikidata really be considered a wiki at all for the benefit of society?
Or should it apply voting/resistance to keep it tidy, factual and less
messy?

We have the technology to implement voting/resistance on statements.  I
personally would utilize that feature and many others probably would as
well.  Crowdsourcing the low-voted facts back to applications like
OpenRefine, or the recently sent out Survey vote mechanism for spam
analysis on the low-voted statements could highlight where things are
untidy and implement vote casting to clean them up.
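
To make the voting/resistance idea concrete, here is a minimal sketch under
stated assumptions: the Statement class, the "one extra supporter per
existing vote" rule, and the threshold value are all hypothetical choices for
illustration, not a feature of Wikidata or a proposal from this thread.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """A claim plus the number of votes cast for it (hypothetical model)."""
    claim: str
    votes: int = 0


def required_support(statement, base=1):
    """Assumed rule: every existing vote raises the bar for a change by one."""
    return base + statement.votes


def can_change(statement, supporters):
    """True if enough editors back the change to overcome the resistance."""
    return supporters >= required_support(statement)


def low_voted(statements, threshold=2):
    """Claims with few votes, i.e. candidates to send out for review."""
    return [s.claim for s in statements if s.votes < threshold]


well_supported = Statement("poodle subclass of dog", votes=5)
unreviewed = Statement("ship instance of ship type")
print(can_change(well_supported, supporters=3))
print(low_voted([well_supported, unreviewed]))
```

A statement with no votes behaves like an ordinary wiki edit (one supporter
is enough), which is one way to keep the open default while adding friction
only where editors have already expressed agreement.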

"...the burden of proof has to be placed on authority, and it should be
dismantled if that burden cannot be met..."

-Thad
+ThadGuidry 




Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-09-29 Thread Ettore RIZZA
Hi,

Wikidata's ontology is a mess, and I do not see how it could be
otherwise. While the creation of new properties is controlled, any fool can
decide that a woman is no longer a human or is part of family. Maybe I'm a
fool too? I wanted to remove the claim that a ship is an instance of
"ship type" because it produces weird circular inferences in my
application; but maybe that makes sense to someone else.

There will never be a universal ontology on which everyone agrees. I wonder
(sorry to think aloud) if Wikidata should not rather facilitate the use of
external classifications. Many external ids are knowledge organization
systems (ontologies, thesauri, classifications ...). I dream of a simple
query that could search, in Wikidata, "all elements of the same class as
'poodle' according to the classification of ImageNet".



Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-09-27 Thread Thad Guidry
James,

It looks like a lot of that Phabricator issue was around taxons?  For the
Poodle to show a class of Mammal...

Seems like many of these could be answered if someone responded to
https://www.wikidata.org/wiki/User:Danyaljj on their last question about
whether an "OR" could be used with linkType in gas:service. No one gave
an answer to their final comment here:
https://www.wikidata.org/wiki/Wikidata:Request_a_query/Archive/2017/01#Timeout_when_finding_distance_between_two_entities

I tried myself to answer that question and find either Parent Taxon OR
Subclass of a Poodle, but couldn't pull it off using gas:service
after 1 hour of trial and error in many forms, even duplicating the program
twice ...

http://tinyurl.com/yb7wfpwh

#defaultView:Graph
PREFIX gas: <http://www.bigdata.com/rdf/gas#>

SELECT ?item ?itemLabel
WHERE {
  # First traversal: follow "subclass of" (P279) links from the start item.
  SERVICE gas:service {
    gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.SSSP" ;
                gas:in wd:Q38904 ;
                gas:traversalDirection "Forward" ;
                gas:out ?item ;
                gas:out1 ?depth ;
                gas:maxIterations 10 ;
                gas:linkType wdt:P279 .
  }
  # Second traversal: same program, following "parent taxon" (P171) links.
  # Both blocks bind ?item and ?depth, so their results are joined, not OR-ed.
  SERVICE gas:service {
    gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.SSSP" ;
                gas:in wd:Q38904 ;
                gas:traversalDirection "Forward" ;
                gas:out ?item ;
                gas:out1 ?depth ;
                gas:maxIterations 10 ;
                gas:linkType wdt:P171 .
  }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
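
Independent of gas:service, the intended "OR" over link types can be
illustrated with a small breadth-first walk that follows either P279 or P171
at every step. This is only a sketch over a hand-made toy graph: the edges
and item names are assumptions for illustration, not actual Wikidata
statements, and ancestors() is a hypothetical helper.

```python
from collections import deque

# Toy edges keyed by (item, property). Hypothetical data, not real Wikidata statements.
EDGES = {
    ("poodle", "P279"): ["dog"],
    ("dog", "P279"): ["pet"],
    ("dog", "P171"): ["Canis"],
    ("Canis", "P171"): ["Canidae"],
    ("Canidae", "P171"): ["Mammalia"],
}


def ancestors(start, link_types=("P279", "P171"), max_depth=10):
    """Breadth-first walk that follows ANY of `link_types`; returns {item: depth}."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] >= max_depth:
            continue  # do not expand beyond the depth limit
        for prop in link_types:
            for parent in EDGES.get((node, prop), []):
                if parent not in depths:  # first visit gives the shortest depth
                    depths[parent] = depths[node] + 1
                    queue.append(parent)
    del depths[start]
    return depths


print(ancestors("poodle"))
print(ancestors("poodle", link_types=("P279",)))
```

With both link types allowed the walk reaches Mammalia from poodle; with
P279 alone it stops at the classes reachable by subclass-of only, which is
the gap the question above was trying to close.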




Re: [Wikidata] Wikidata considered unable to support hierarchical search in Structured Data for Commons

2018-09-27 Thread Stas Malyshev
Hi!

> Apparently the Wikidata hierarchies were simply too complicated, too
> unpredictable, and too arbitrary and inconsistent in their design across
> different subject areas to be readily assimilated (before one even
> starts on the density of bugs and glitches that then undermine them).

The main problem is that there is no standard way (or even a defined small
number of ways) to get the hierarchy that is relevant for "depicts" from
current Wikidata data. It may even be that for a specific type or class
the hierarchy is well defined, but the sheer number of different ways it
is done in different areas is overwhelming and ill-suited for automatic
processing. Of course, questions like "is 'cat' a common name of an animal
or a taxon, and which one of these will be used in depicts" add
complexity too.

One way of solving it is to create a special hierarchy for "depicts"
purposes that would serve this particular use case. Another way is to
amend existing hierarchies and meta-hierarchies so that there would be
an algorithmic way of navigating them in a common case. This is
something that would be nice to hear about from people that are
experienced in ontology creation and maintenance.
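
As a rough illustration of the first option (a single dedicated hierarchy
for "depicts"), a search could expand each tag to its ancestors and match the
query against that set. Everything below is a toy assumption: the hierarchy,
the file names, and the search() helper are hypothetical and do not describe
how Commons search is implemented.

```python
# Hypothetical dedicated "depicts" hierarchy: child -> list of parents.
DEPICTS_PARENT = {
    "poodle": ["dog"],
    "dog": ["mammal"],
    "cat": ["mammal"],
}

# Hypothetical files and their depicts tags.
DEPICTS = {
    "File:A.jpg": ["poodle"],
    "File:B.jpg": ["cat"],
    "File:C.jpg": ["dog"],
}


def ancestors_or_self(tag, graph=DEPICTS_PARENT):
    """The tag itself plus everything reachable by following parent links."""
    seen = {tag}
    stack = [tag]
    while stack:
        for parent in graph.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def search(target, depicts=DEPICTS, graph=DEPICTS_PARENT):
    """Files tagged with `target` or with anything below it in the hierarchy."""
    return sorted(
        name
        for name, tags in depicts.items()
        if any(target in ancestors_or_self(tag, graph) for tag in tags)
    )


print(search("dog"))
```

The point of the sketch is that one uniform parent relation makes the
traversal trivial; the difficulty described above is that current Wikidata
data offers many different relations depending on the subject area.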

> to be chosen that then need to be applied consistently?  Is this
> something the community can do, or is some more active direction going
> to need to be applied?

I think this is very much something that the community can do.

-- 
Stas Malyshev
smalys...@wikimedia.org
