Re: [Wikidata-l] Wikidata for Wiktionary

Paul Houle Fri, 08 May 2015 09:32:07 -0700

Concepts and words are different things,  or better yet,  words (word
senses,  ...) are a special kind of concept.


I was looking at what the data model would be for a system that supports
logical representation of 100% of the critical knowledge in business and
technical documents over narrow domains.

One thing I tried was (more or less) Wikidata+Wordnet,  and I found the
Wordnet part was difficult to apply.  Where Wikidata concepts match text
chunks it works OK,  but trying to deal with the verbs,  prepositions and
all that stuff is labor-intensive,  hard to do correctly,  and doesn't
contribute much to machine-readable semantics.  It is more useful to model
verb functions in terms of discontinuous chunks which form templates,  i.e.
the verb and its associated prepositions together are often a good unit of
modelling.
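
A minimal sketch of what such a template could look like,  purely as an
illustration: the template names,  the slot names and the matching strategy
below are all assumptions made up for this example,  not part of Wikidata,
Wordnet or any existing system.  Each template treats a verb and its
preposition as one discontinuous unit with named slots in between.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical templates: literal tokens (verb, preposition) plus named
# slots in braces for the text chunks that sit between and around them.
TEMPLATES: List[Tuple[str, List[str]]] = [
    ("transfer", ["{agent}", "transfers", "{thing}", "to", "{recipient}"]),
    ("depend", ["{thing}", "depends", "on", "{condition}"]),
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def match(pattern: List[str], tokens: List[str]) -> Optional[Dict[str, str]]:
    """Match tokens against a pattern; each slot takes one or more tokens."""
    if not pattern:
        return {} if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head.startswith("{"):
        name = head[1:-1]
        # Try the shortest slot filler first, then back off to longer ones.
        for end in range(1, len(tokens) + 1):
            tail = match(rest, tokens[end:])
            if tail is not None:
                return {name: " ".join(tokens[:end]), **tail}
        return None
    if tokens and tokens[0] == head:
        return match(rest, tokens[1:])
    return None


def interpret(text: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return the first template that fits the sentence, with its slots."""
    tokens = tokenize(text)
    for name, pattern in TEMPLATES:
        slots = match(pattern, tokens)
        if slots is not None:
            return name, slots
    return None


print(interpret("The seller transfers the title to the buyer"))
```

The slot fillers here are plain text chunks;  in the setup described above
they are the pieces one would then try to match against Wikidata concepts.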

Super-Wordnet,  however,  will still be interesting to humans who might
want to pin down exact word senses in a contract.

On Fri, May 8, 2015 at 11:35 AM, Luca Martinelli <martinellil...@gmail.com>
wrote:

> 2015-05-08 15:33 GMT+02:00 Federico Leva (Nemo) <nemow...@gmail.com>:
> > +1. The Wikimedia community has long been able to think of all the
> > Wikimedia projects as an organic whole. Software, on the other hand,
> > has too often forced unnatural divisions.
> >
> > Wiktionary, Wikipedia, Commons and Wikiquote (to name the main cases)
> > link to each other all the time in a constructive division of labour.
> > It makes no sense to make connections between them harder.
>
>
> I start from here, since Nemo got the point IMHO: the fact that every
> project has its own scope doesn't imply that the whole of the
> community works on different scopes - we just decided to split up our
> duties among ourselves. But it's not just that.
>
> TL;DR: Wikidata and Wiktionary deal with the same things (concepts),
> therefore are best-suited for each other, given some needed
> adaptations. Structured Data and Structured Wikiquote deal with
> different things (objects), therefore are not to be considered good
> examples.
>
> Long version here:
>
> In theory, one might just agree that a separate instance of Wikibase
> might be the best solution for Wiktionary, but Structured Data and
> Structured Wikiquote are different from a theoretical "Structured
> Wiktionary", because they respectively deal with images, quotes and
> words.
>
> Images and quotes are describable *objects*, as the Wiki*
> articles/pages are, and there are billions and billions of those
> objects out there. This is the main, if not the only, reason why
> we *have* to put up a separate instance of Wikibase to deal with them:
> thinking that Wikidata might deal with such an infinite task is just
> nuts.
>
> Words, on the other hand, are describable *concepts*, not objects.
> They can be linked to one another by relations, they have synonyms and
> opposites, they can be regrouped or separated, etcetera, which is
> exactly what we're currently doing with Wikidata items.
>
> I know, there are even more words than images and quotes, so it would
> be even more nuts to think of dealing with them with Wikidata alone -
> but Wikidata is *already* structured for dealing with concepts, making
> it the best choice for integrating data from Wiktionary.
>
> In other words, Wikidata and Wiktionary both work with *concepts*,
> while all the other projects work with *objects*. From a more
> practical point of view, why should I have a Wikidata item about, say,
> present tense[1] *AND* a completely similar item on "Structured
> Wiktionary"? It's the same concept, why should I have it in two
> different-yet-linked databases, belonging to and maintained by the
> very same community? Why can't we work something out to keep all the
> information in just one database?
>
> This is why I think that setting up a separate Wikibase for Wiktionary
> might end up doubling our efforts and splitting our communities,
> which is exactly the opposite of what we need to do (halving the
> efforts and doubling the community).[2]
>
> Sorry for the long post. :)
>
>
> [1] https://www.wikidata.org/wiki/Q192613
> [2] Not sure if I have to remark this, but please, PLEASE, note this
> is just an exaggeration for argument's sake, I have of course no data
> that might confirm factually that the WD community will surge by 100%.
> I just want to make clear my concept (heh).
>
> --
> Luca "Sannita" Martinelli
> http://it.wikipedia.org/wiki/Utente:Sannita
>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>



-- 
Paul Houle

*Applying Schemas for Natural Language Processing, Distributed Systems,
Classification and Text Mining and Data Lakes*

(607) 539 6254    paul.houle on Skype   ontolo...@gmail.com
https://legalentityidentifier.info/lei/lookup
<http://legalentityidentifier.info/lei/lookup>
