On 12/21/2014 07:05 PM, Axel Hecht wrote:
> Can you provide some use cases and/or purpose? That's my stock reply
> someone asks me about a data model. Gandalf and stas can sing that song
> forwards and backwards ;-)

Well, this is probably dissimilar to the data models you're used to, in that our primary goal is merely to encode the data rather than to fulfill any particular use case. Providing this type of structured data is more a way of encoding the data in the most accessible form and letting the use cases fall out of that.

That being said, the Introduction of the proposal gives a general overview of the expected uses of the data: dictionary, reverse dictionary, thesaurus, rhyming dictionary, etc. It should be easy to turn into any number of word/definition-related documents for intuitive consumption by humans. Of particular note to the Intellego project, however, is its benefit for machine translation and other fields of computational linguistics.

The various Wiktionary projects are already human-consumable; what they generally are not is machine-consumable. So the goal here is to create a data model that will make the human-consumable output more consistent while also providing a much more accessible avenue of use for various computational processes.

> Also, I'm wondering if some pieces in particular in the implementation
> section should/could/need to be language dependent?

I'm curious as to what you have in mind. Could you provide some examples?

Wikidata is a multilingual place to interconnect the various language-dependent Wikimedia sites (and others), and this proposal is based on the same principle: provide a centralized location for words and definitions so that content is not unevenly distributed or unnecessarily duplicated. This idea takes it a step further, however, in that the end goal (the way I see it) is not to maintain separate wikis for separate languages, as Wikipedia does. Given the nature of the information being encoded, it makes more sense to me for it all to live in a single location and be localized in place.

> Axel
>
> On 12/22/14 12:38 AM, Gordon P. Hemsley wrote:
>> Hey all,
>>
>> One of the ideas that came out of our visit to LREC back in May was the
>> need for an Open dictionary with a machine-readable data structure.
>> Wiktionary seemed like a natural source of the data itself, so I've set
>> about investigating how to get it converted and implemented as
>> structured data.
>>
>> Many people have iterated on the idea of leveraging the Wikibase code
>> used by Wikidata to store the information, and a number of proposals
>> have been put forth over the years. I have put forward my own proposal,
>> based on reading the previous proposals and my own knowledge of
>> linguistics, and I would love to get more feedback on it:
>>
>> https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2014-10
>>
>>
>> I introduced myself to Lydia Pintscher, the product manager of Wikidata
>> and Wikimedia Deutschland, at Wikimania in August, and told her that
>> we're interested in helping out with the implementation of this idea,
>> and I've had further online interactions with her since. Unfortunately,
>> though, this project is seen as a low priority by the Wikimedia folks,
>> and it will need a grassroots effort to get off the ground any time soon.
>>
>> I'd be happy to hear from anyone who is interested in helping out. There
>> are some blockers in the Wikibase codebase that will need (PHP)
>> development before we can really move ahead on this, but I'm also
>> interested in simply hearing other people's ideas. Feel free to drop by
>> #intellego on Mozilla IRC or #wiktionary or #wikidata on FreeNode if you
>> want to talk in real-time.
>>
>> Regards,
>> Gordon
>>
>>
>

-- 
Gordon P. Hemsley
http://gphemsley.org/
