Hi Robert,
For a proposal using properties for storing multilingual topic values
see also jri's reply to my posting to this list from June 2015.
http://lists.deepamehta.de/pipermail/devel-lists.deepamehta.de/2015-June/000599.html
As I understood jri's reply, we could simply
- declare a topic's value (the thing the dm4 storage stores in "value")
as the value in some default language, e.g. "en_UK"
- and introduce a "value_de_DE" property for each topic's translated value.
As jri points out, this approach would have some advantages over the
way I explored when trying to integrate multilingual topic values for
items (and properties) stored in Wikidata.
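To make the property-based idea concrete, here is a minimal sketch in plain Python. It does not use the DM4 API: the dict layout, the `localized_value` helper and the sample values are assumptions made up for illustration. Only the "value" field and the "value_de_DE" property name come from the proposal above.

```python
# Hypothetical stand-in for a topic: "value" holds the default-language
# value, translations live in "value_<locale>" properties.
DEFAULT_LOCALE = "en_UK"  # the default language named in the proposal


def localized_value(topic, locale):
    """Return the value for `locale`, falling back to the default value."""
    if locale == DEFAULT_LOCALE:
        return topic["value"]
    return topic["properties"].get("value_" + locale, topic["value"])


# Illustrative sample data, not taken from a real DM4 instance.
topic = {
    "value": "Algeria",
    "properties": {"value_de_DE": "Algerien"},
}

print(localized_value(topic, "de_DE"))  # Algerien
print(localized_value(topic, "fr_FR"))  # Algeria (no translation, falls back)
```

The fallback to the default value is a design choice of this sketch; a real plugin might prefer to report a missing translation instead.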
I hope this helps you identify the best track in advance.
Kind Regards,
Malte
--
What I can now say about the decisions made for the two Wikidata plugins is:
1. About representing languages in DM4:
* Wikipedia, and thus Wikidata, uses a two-letter code to identify its
Wikipedias in various languages. In the dm4-wikidata plugin you can find
a migration that creates 43 curated language identifier topics, each
representing a Wikipedia languageCode (with its human-readable label).
I think their language identifier system is based on the (now obsolete)
ISO-639-1 (https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes), but I
am not completely sure and I could not confirm this with a quick Google
search on Wikimedia or Wikidata API pages.
* The latest IETF recommendation for identifying languages by strings
is, to my knowledge, RFC 5646, though I have seen some
developers/systems implementing it with an underscore (like "de_DE"
instead of "de-DE").
2. About modeling multi-lingual values in DM 4:
* In the dm4-wikidata-toolkit plugin I developed an "Entity Value" topic
type which aggregates the above-mentioned Wikipedia language identifier
topics. A "Wikidata Item" or "Wikidata Property" then has many "Entity
Value" topics (in various languages).
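The two points above can be sketched together in plain Python. This is only an illustration under assumptions: the class names mirror the topic types mentioned above, but the fields, the `value_in` lookup, the `normalize_language_tag` helper and the placeholder item id are hypothetical and are not the plugin's actual code.

```python
from dataclasses import dataclass, field


def normalize_language_tag(tag):
    """Turn underscore variants such as "de_DE" into the hyphenated "de-DE"."""
    return tag.replace("_", "-")


@dataclass
class EntityValue:
    text: str
    language_code: str  # two-letter code of a language identifier topic


@dataclass
class WikidataItem:
    item_id: str
    values: list = field(default_factory=list)  # many EntityValue entries

    def value_in(self, language_tag):
        """Return the text matching the tag's primary subtag, else None."""
        primary = normalize_language_tag(language_tag).split("-")[0].lower()
        for value in self.values:
            if value.language_code == primary:
                return value.text
        return None


# "Q_EXAMPLE" is a placeholder id; the labels are illustrative sample data.
item = WikidataItem(
    "Q_EXAMPLE",
    [EntityValue("Algeria", "en"), EntityValue("Algerien", "de")],
)

print(normalize_language_tag("de_DE"))  # de-DE
print(item.value_in("de_DE"))           # Algerien
```

Matching only on the primary subtag is an assumption that fits two-letter language identifiers; region-specific values would need a stricter comparison.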
On 13.12.2016 09:05, Robert Schuster wrote:
Hi all,
soon I'll have the task of providing translation support to DM-managed.
In a first shot the project was realized without translation support, and
in a second phase this is going to be added. Of course, some design
decisions were made to make this transition smooth (e.g. not relying on
data that is later translated as keys to data).
The data that needs to be translated can have the following form:
(I write topics like table entries)
a) Algeria | http://link-to-a-german-article.html | 3 | 12 | 3432
or
b) Benin | Die Geschichte des Landes beginnt [...]
So what I want to indicate is that I have basically 3 types of data:
- facts, figures, statistics that do not need to be translated (a date
stays a date, some statistics value too)
- links to external resources that need to be provided in translated
form (e.g. a link to the English variant of an article)
- text in one language that needs to be provided in another directly in DM
Has anyone done something similar and can share the approach?
Is there already consensus about this?
What I have in mind is introducing a custom association
"translation" which hosts a language specifier and which I then
associate manually with all the data that is translated. My goal is to
provide the data in DM via a REST interface. During the transformation
of the DM-data I'll follow the translation associations to provide the
translated value of an item.
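A rough sketch of that idea in plain Python, independent of the DM API: the `Translation` tuple, the topic ids, the helper names and the English link are hypothetical and only stand in for real topics and "translation" associations.

```python
from typing import NamedTuple


class Translation(NamedTuple):
    """Hypothetical "translation" association carrying a language specifier."""
    source_id: int
    target_id: int
    language: str


# In-memory stand-ins for topics; ids and the English link are made up.
topics = {
    1: "http://link-to-a-german-article.html",
    2: "http://link-to-an-english-article.html",
    3: "3432",  # a statistic that needs no translation
}
translations = [Translation(source_id=1, target_id=2, language="en")]


def translated_value(topic_id, language):
    """Follow a matching translation association, else keep the original."""
    for assoc in translations:
        if assoc.source_id == topic_id and assoc.language == language:
            return topics[assoc.target_id]
    return topics[topic_id]


def export_row(topic_ids, language):
    """Build the row a REST response could carry for the given language."""
    return [translated_value(topic_id, language) for topic_id in topic_ids]


print(export_row([1, 3], "en"))
# ['http://link-to-an-english-article.html', '3432']
```

Values without a translation association, such as the statistic, pass through unchanged, which matches the first of the three data types listed above.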
Does that sound good?
All the best,
Robert