[
https://issues.apache.org/jira/browse/JENA-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238274#comment-14238274
]
Andy Seaborne edited comment on JENA-827 at 12/8/14 9:07 PM:
-------------------------------------------------------------
Another approach is to not check language tags against the registry - instead,
just do syntactic checking of the string for the language tag. (I think this is
what happens elsewhere in Jena.)
If the langtag package is removed, the code fix-up is quite natural.
Only one test fails: a tainting test for RDF/XML (tainting excludes
bad RDF from the output). See {{jena-core/testing/arp/tainting/lang.rdf}}.
The test is for the specifically illegal language tag UND (Undetermined).
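As a rough illustration of what "syntactic checking only" could mean, here is a minimal sketch. It is not the Jena code: the class and method names are hypothetical, and the pattern is just the RFC 3066 shape (a primary subtag of 1-8 letters, then "-"-separated subtags of 1-8 letters or digits), with no registry lookup at all.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jena: checks only the shape of a tag.
final class LangTagSyntax {
    // RFC 3066 shape: 1*8ALPHA *( "-" 1*8(ALPHA / DIGIT) ).
    // No check against ISO 639 or the IANA subtag registry.
    private static final Pattern LANG_TAG =
        Pattern.compile("[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*");

    // Returns true if the string is shaped like a language tag.
    static boolean isWellFormed(String tag) {
        return tag != null && LANG_TAG.matcher(tag).matches();
    }

    public static void main(String[] args) {
        // Print the result of the shape check for a few sample tags.
        for (String tag : new String[] {"vls", "en-GB", "und", "en_GB", ""}) {
            System.out.println("'" + tag + "' -> " + isWellFormed(tag));
        }
    }
}
```

With a check of this kind, {{vls}} passes without any ISO 639 table being consulted, and so does {{und}}, which is presumably why the one tainting test would no longer see a warning.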
> Include all ISO 639-3 languages
> -------------------------------
>
> Key: JENA-827
> URL: https://issues.apache.org/jira/browse/JENA-827
> Project: Apache Jena
> Issue Type: Improvement
> Components: RDF/XML
> Affects Versions: Jena 2.12.1
> Reporter: Stian Soiland-Reyes
> Priority: Minor
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> {code}
> WARN 2014-12-05 14:21:24,085
> (com.hp.hpl.jena.rdf.model.impl.RDFDefaultErrorHandler:47) -
> http://www.w3.org/ns/oa#(line 42 column 36):
> {W116}
> ISO-639 does not define language: 'vls'.
> {code}
> http://www.w3.org/ns/oa.rdf says
> {code}
> <dc:creator xml:lang="vls">Herbert Van de Sompel</dc:creator>
> {code}
> but it does: http://www-01.sil.org/iso639-3/documentation.asp?id=vls
> The complete list of ISO639-3 is not included in
> https://github.com/apache/jena/blob/master/jena-core/src/main/java/com/hp/hpl/jena/rdfxml/xmlinput/lang/Iso639.java
> - only ISO639-1 and ISO639-2.
> The new lists can be found at http://www-01.sil.org/iso639-3/download.asp -
> e.g. http://www-01.sil.org/iso639-3/iso-639-3.tab (UTF-8, although the browser
> disagrees).
> I can work on the script to update this. One question is if Iso639.java needs
> a new field for the identifier for all those languages which are not in -1
> and -2 (e.g. "vls"). Another is if we should include the proper UTF-8 names
> of the languages to get the accents correct, e.g.
> {quote}
> bbj I L Ghomálá'
> {quote}
> I'm not sure if the permissions are compatible with Apache license:
> {quote}
> ISO 639-3 Code Tables Terms of Use
> The ISO 639-3 code set may be downloaded and incorporated into software
> products, web-based systems, digital devices, etc., either commercial or
> non-commercial, provided that:
> attribution is given www.sil.org/iso639-3/ as the source of the codes;
> the identifiers of the code set are not modified or extended except as
> may be privately agreed using the Private Use Area (range qaa to qtz), and
> then such extensions shall not be distributed publicly;
> the product, system, or device does not provide a means to redistribute
> the code set.
> {quote}
> The last bit might mean we should not include the *.tab files directly - but
> would the listing in Iso639.java constitute a "means to redistribute the code
> set"?
> Is "the identifiers of the code set are not modified" compatible with Apache
> License which presumably allows you to modify anything?