[Wikidata-bugs] [Maniphest] [Edited] T167166: Specify the use of extended language codes in Lexemes

2018-05-17 Thread Pablo-WMDE
Pablo-WMDE updated the task description.
CHANGES TO TASK DESCRIPTION
...
* Does the user select an Item? Do we allow any Item, or just ones that have a value for [[ https://www.wikidata.org/wiki/Property:P424 | P424 ("Wikimedia language code") ]]?
...

TASK DETAIL
https://phabricator.wikimedia.org/T167166

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Pablo-WMDE
Cc: WMDE-leszek, Denny, Lydia_Pintscher, Aklapper, daniel, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, Wikidata-bugs, aude, Darkdadaah, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Edited] T167166: Specify the use of extended language codes in Lexemes

2017-06-08 Thread daniel
daniel updated the task description.
CHANGES TO TASK DESCRIPTION
...
However, this may fail because Item IDs may have more than 8 characters, and RFC 5646 allows at most 8 characters per section of the code. The relevant production for language tag extensions according to RFC 5646 is `singleton 1*("-" (2*8alphanum))` in ABNF. In PCRE that would be `\w(-\w{2,8})+`.

TASK DETAIL
https://phabricator.wikimedia.org/T167166

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel
Cc: WMDE-leszek, Denny, Lydia_Pintscher, Aklapper, daniel, Cinemantique, GoranSMilovanovic, QZanden, Izno, Wikidata-bugs, aude, Darkdadaah, Mbch331


[Wikidata-bugs] [Maniphest] [Edited] T167166: Specify the use of extended language codes in Lexemes

2017-06-08 Thread daniel
daniel updated the task description.
CHANGES TO TASK DESCRIPTION
...
However, this may fail because Item IDs may have more than 8 characters, and RFC 5646 allows at most 8 characters per section of the code. The relevant production for language tag extensions according to RFC 5646 is `singleton 1*("-" (2*8alphanum))` in ABNF. In PCRE that would be `\w(-\w{2,8})+`.

Of course, Wikimedia could apply to IANA for a "q" singleton to be registered for Wikidata, so we could use "de-q-1205". But we would still run into issues with the length of the decimal item ID. Base 48 could help, but would be ugly.
Or the ID could be split, as in Q1234-5678. But that may cause confusion with structured entity IDs, which also use dashes as separators, as in L234243-F5.
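
As a rough illustration of the length constraint described above, the following Python sketch checks candidate extensions against the RFC 5646 production `singleton 1*("-" (2*8alphanum))`. The helper names `is_valid_extension` and `item_extension` are hypothetical, and the "q" singleton is an assumption taken from the proposal; no such singleton is registered. The pattern uses explicit ASCII ranges rather than `\w`, which would also admit underscores.

```python
import re

# RFC 5646: extension = singleton 1*("-" (2*8alphanum))
# A singleton is one ASCII letter or digit other than "x"/"X",
# which is reserved for private use.
EXTENSION = re.compile(r"[0-9A-WY-Za-wy-z](?:-[A-Za-z0-9]{2,8})+")


def is_valid_extension(ext: str) -> bool:
    """Return True if ext is a well-formed RFC 5646 extension."""
    return EXTENSION.fullmatch(ext) is not None


def item_extension(item_id: str, singleton: str = "q") -> str:
    """Hypothetical mapping of an Item ID such as "Q1205" to "q-1205"."""
    return f"{singleton}-{item_id.lstrip('Qq')}"


if __name__ == "__main__":
    for item in ("Q1205", "Q123456789", "Q1"):
        ext = item_extension(item)
        print(item, ext, is_valid_extension(ext))
```

Under these assumptions, `q-1205` is well-formed, while a nine-digit Item ID exceeds the 8-character limit. A one-digit ID such as Q1 also fails, because each subtag after the singleton needs at least two characters.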

=== Representation of Language Variants in Output Formats ===
...

TASK DETAIL
https://phabricator.wikimedia.org/T167166

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel
Cc: WMDE-leszek, Denny, Lydia_Pintscher, Aklapper, daniel, Cinemantique, GoranSMilovanovic, QZanden, Izno, Wikidata-bugs, aude, Darkdadaah, Mbch331