[Wikidata-bugs] [Maniphest] [Commented On] T204713: Provide a Wikibase instance where we can import Wikitionaries materials and that can be queried from Wiktionaries

2018-09-20 Thread Noe
Noe added a comment.
I support the idea of experimenting with a separate Wikibase instance for Wiktionary texts.

It would hold material that is not under CC0 and therefore cannot go into Wikidata: full-text content such as definitions, examples and etymologies, as they are carefully written in the Wiktionaries, that is, all the parts protected by copyright law (or droit d'auteur or equivalent laws). Some people outside the wikiverse want to reuse definitions and etymologies in prose (for instance websites that compile dictionaries, such as encyclopedie.fr), and there will be none of these in Wikidata (only glosses). A separate instance would resolve this issue and help show the power of Wikibase for more than data.

It could support Wiktionary and Wiktionarians, whereas Wikidata does not yet support the Wiktionaries and is rather a separate project with other purposes and other users. The kind of support Psychoslave mentioned in the first message is not something planned by the Wikidata product team, so both instances could be complementary. I know there is no Wiktionary product manager, which saddens me, but if one exists now, this proposal should be directed to that person.

Finally, I know for sure that Wiktionary data can be integrated into databases easily, as shown by GLAWI (for French) and Dbnary (for twelve languages). So a separate instance under CC BY-SA could be filled without much investment of time, and there will be volunteers to do so, as it is the material they have made and cared for over the years.

I am convinced that WikidataLex (or whatever name this instance would have) is a robust project with a clear scope.

TASK DETAIL
https://phabricator.wikimedia.org/T204713

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Noe
Cc: Noe, Lydia_Pintscher, Lea_Lacroix_WMDE, Aklapper, Psychoslave, Lahi, Gq86, Cinemantique, GoranSMilovanovic, QZanden, LawExplorer, jberkel, Wikidata-bugs, aude, GPHemsley, Shizhao, Nemo_bis, Darkdadaah, Mbch331, Krenair

Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs



2018-09-20 Thread Psychoslave
Psychoslave added a comment.
Hi @Lydia_Pintscher, I share your concern about not wasting valuable resources.

I would be happy to know how we could achieve the stated goals with the lexicographical data support available on Wikidata, including the very first point, "allow to import definitions from Wiktionaries".

I hope we can all agree that the Wiktionary communities and the work they have achieved so far, including the set of definitions they collectively created, are also valuable resources that we shouldn't waste.

I would like us to leverage these resources to make the best of the synergy between the Wikibase technologies and the Wiktionary communities and their work. If this is not done through the launch of a distinct Wikibase instance, where we could better coordinate existing valuable resources, any other implementable proposal is of course welcome.



2018-09-19 Thread Psychoslave
Psychoslave added a comment.
As an example of the limits of our current infrastructure, even within a single Wiktionary, consider creating glossaries from existing definitions.

Extension:TextExtracts might help somewhat here, but at least according to its documentation it does not appear to be usable through wikicode calls.

One way to do it would be to use external bots that browse lexical categories, fetch each matching article, extract the matching definitions (if they are tagged in the article) and generate distinct pages, for example in a dedicated namespace.
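
The bot approach described above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the endpoint URL is just one possible Wiktionary, the rule "a definition is a line starting with a single #" is a simplification of how many Wiktionaries mark definitions, and the function names are hypothetical. The network calls themselves are left out; only the query parameters for the MediaWiki API `list=categorymembers` module and the text processing are shown.

```python
import re
from typing import Dict, List

# Assumption: any Wiktionary API endpoint could be used here.
API_URL = "https://fr.wiktionary.org/w/api.php"


def category_members_params(category: str, limit: int = 50) -> Dict[str, str]:
    """Build query parameters for the MediaWiki API list=categorymembers module."""
    return {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": str(limit),
        "format": "json",
    }


def extract_definitions(wikitext: str) -> List[str]:
    """Return the text of lines that look like definitions.

    Assumption: definitions are list lines starting with a single '#';
    lines starting with '##', '#*' or '#:' (sub-senses, quotations,
    examples) are skipped.
    """
    definitions = []
    for line in wikitext.splitlines():
        match = re.match(r"^#(?![#*:])\s*(.+)$", line)
        if match:
            definitions.append(match.group(1).strip())
    return definitions


def glossary_page(entries: Dict[str, List[str]]) -> str:
    """Render a glossary as wikitext definition-list markup ('; term' / ': definition')."""
    lines = []
    for term in sorted(entries):
        lines.append(f"; {term}")
        lines.extend(f": {definition}" for definition in entries[term])
    return "\n".join(lines)
```

A bot would send `category_members_params(...)` to `API_URL`, fetch the wikitext of each returned title, pass it through `extract_definitions`, and save the output of `glossary_page` to a page in the dedicated namespace.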

As Lua modules can't fetch the list of pages in a category, the previous transformation is not possible through modules alone, and it would probably be too resource-intensive anyway. What would be possible is to parse all articles, as mentioned above, and put their definitions into data modules. That would at least make this data easy to provide for in-wiki consumption in various cases, potentially avoiding data duplication within a single instance, as well as for external queries, since the module could generate various output formats.
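
To make the "definitions into data modules" idea more concrete, here is a minimal Python sketch of the export step, assuming the definitions have already been extracted into a dictionary mapping each term to its list of definitions. The function names and the table layout are hypothetical; the only thing taken for granted is that a Lua data module is a source file that returns a table.

```python
from typing import Dict, List


def lua_string(value: str) -> str:
    """Quote a Python string as a Lua double-quoted string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_data_module(definitions: Dict[str, List[str]]) -> str:
    """Serialize {term: [definition, ...]} as the source of a Lua module returning a table."""
    lines = ["return {"]
    for term in sorted(definitions):
        items = ", ".join(lua_string(d) for d in definitions[term])
        lines.append(f"\t[{lua_string(term)}] = {{ {items} }},")
    lines.append("}")
    return "\n".join(lines)
```

The generated text could be saved as a module page and read from other modules, for instance with Scribunto's `mw.loadData`.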

That said, there is no guarantee that the current community would want to use these data modules: it might propose deleting the transformed material, simply ignore it and recommend not using it in the main space, or just as well embrace it and migrate massively to such an approach. Such an approach should also come with a rethought UX that enables editing data modules without editing the LSON, in order to gain the support of both current communities and newcomers.

But even with all that, it wouldn't allow data sharing across linguistic versions, as we currently have no way to share modules across instances.

The question of community endorsement is the same whether the data is stored in a data module or in an external Wikibase instance. Sharing data across wikis can't currently be solved with data modules, and Wikidata naturally comes to mind. However, since this is about sharing existing Wiktionary definitions, which are covered by CC-by-sa-3.0-unported, Wikidata, which accepts only CC-0 compatible material, can't host this data.

This hopefully exposes the reasons for this request and makes it obvious how it supports the aims of our movement.



2018-09-19 Thread Psychoslave
Psychoslave added a comment.
So, to answer a few questions I received as feedback on this ticket:


The reasons for this ticket:
- enable better sharing and coordination between linguistic versions, including
  - translations of definitions
  - sharing the base of samples showing uses of words linked to a specific definition
  - sharing etymologies written in prose (as opposed to explicit etymological relational trees)

- bring more flexibility to reuse definitions inside Wikimedia wikis, allowing the same definition to be shown in several places and in different manners
  - One example would be to create glossaries that gather on a single page the terms and definitions of a topic, reusing the same definitions that are on each term page, restricted to those pertaining to the topic.
  - Another possible use would be to have definitions not only on the lemma article but also on each inflection, as long as the definition holds, while only displaying the examples of use that pertain to the queried form.
  - These are only illustrative examples and the possibilities of use are open; in any case, communities decide what they want to use and how.

- ease the use by external projects (through a query service, a unified structured downloadable dump, etc.)
  - we can't import the definitions of Wiktionaries into Wikidata due to license incompatibility, but this won't be a problem with a separate instance using an appropriate license, so we can benefit from both the power and flexibility of Wikibase and the already large knowledge base of Wiktionaries together

Focus of efforts and resources
- with such a project, the existing community wouldn't have to change anything if it doesn't want to, but would have the possibility to do so
- it would make it easier to share data across linguistic versions
- it would allow nicer interfaces to be developed on top of such a Wikibase instance, including within the Wiktionaries, but also in Toolforge for example, while having the whole resulting database accessible from everywhere in the Wikimedia infrastructure through query facilities
  - that means possibilities to create easier-to-use interfaces, lowering the barrier to start contributing

- the ticket is about a single Wikibase instance, dedicated to definition materials
  - it's not about one instance for each existing Wiktionary

- the same result cannot be reached through downstream structuring like Dbnary and GLAWI, but both could possibly be used to populate the Wikibase instance
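
The two reuse cases listed above (topic glossaries, and inflected forms that reuse the lemma's definitions with only the matching examples) can be illustrated with a small Python sketch. This is not a proposed Wikibase data model: the classes, field names and functions are all hypothetical, and a plain in-memory list stands in for whatever query facility the instance would offer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Example:
    form: str  # the exact word form used in the sentence
    text: str


@dataclass
class Definition:
    lemma: str
    gloss: str
    topics: List[str] = field(default_factory=list)
    forms: List[str] = field(default_factory=list)  # inflections the definition holds for
    examples: List[Example] = field(default_factory=list)


def glossary(store: List[Definition], topic: str) -> Dict[str, List[str]]:
    """Group the glosses tagged with `topic` by lemma."""
    result: Dict[str, List[str]] = {}
    for d in store:
        if topic in d.topics:
            result.setdefault(d.lemma, []).append(d.gloss)
    return result


def entry_for_form(store: List[Definition], form: str) -> List[Tuple[str, List[str]]]:
    """Return (gloss, examples using exactly `form`) for every definition covering `form`."""
    out = []
    for d in store:
        if form == d.lemma or form in d.forms:
            out.append((d.gloss, [e.text for e in d.examples if e.form == form]))
    return out
```

Because each definition is stored once, the glossary page and the page for an inflected form are two views over the same data rather than two copies of it.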



Thank you to those who already pointed out gaps in my initial request; I hope this helps to clarify a bit.