Martin Desruisseaux created SIS-137:
---------------------------------------
Summary: <gmd:LocalisedCharacterString> locale shall be a URI
Key: SIS-137
URL: https://issues.apache.org/jira/browse/SIS-137
Project: Spatial Information Systems
Issue Type: Bug
Components: Metadata, Utilities
Affects Versions: 0.3
Reporter: Martin Desruisseaux
Assignee: Martin Desruisseaux
The {{locale}} attribute in the {{<gmd:LocalisedCharacterString>}} element is
defined by the XML schema as a value of type {{xs:anyURI}}. However, SIS 0.3
handles it as a plain string that directly contains the language code, ignoring
the "{{#locale-}}" prefix if present. This handling is incomplete and may be
wrong in some circumstances.
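The handling described above can be approximated by the following minimal Java sketch. It is not the actual SIS code: the {{LenientLocaleAttribute}} class is hypothetical, and the use of {{Locale.forLanguageTag}} for the final parsing step is an assumption made for illustration.

```java
import java.util.Locale;

/** Hypothetical helper mimicking the lenient handling described above; not SIS code. */
final class LenientLocaleAttribute {
    private static final String PREFIX = "#locale-";

    private LenientLocaleAttribute() {
    }

    /** Drops an optional "#locale-" prefix and treats the remainder as a language code. */
    static Locale parse(String attribute) {
        String code = attribute.trim();
        if (code.startsWith(PREFIX)) {
            code = code.substring(PREFIX.length());
        }
        // No <PT_Locale> definition is looked up: the URI is never dereferenced.
        // Assumption: the remainder is parsed as a BCP 47 language tag.
        return Locale.forLanguageTag(code);
    }
}
```

Under these assumptions, {{#locale-fr}} and {{en-GB}} give the expected locales, while any value that is not a well-formed language tag after prefix removal silently produces an empty locale.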
ISO 19139:2007, on pages 105 and 106, defines the management of multilingual
metadata. A character string is localized as in the example below:
{code:xml}
<PT_FreeText>
<textGroup>
<LocalisedCharacterString locale="#locale-fr">Résumé succinct du contenu de
la ressource</LocalisedCharacterString>
</textGroup>
</PT_FreeText>
{code}
The {{locale="#locale-fr"}} attribute references a locale definition provided
elsewhere, typically as an element of the root metadata:
{code:xml}
<MD_Metadata>
<locale>
<PT_Locale id="locale-fr">
<languageCode>
<LanguageCode
codeList="resources/Codelist/gmxcodelists.xml#LanguageCode"
codeListValue="fra"> French </LanguageCode>
</languageCode>
</PT_Locale>
</locale>
</MD_Metadata>
{code}
Since the {{locale}} attribute is a URI referencing another element, its
value typically begins with the {{#}} character, while the {{id}}
attribute in {{<PT_Locale>}} does not. However, nothing in the
specification says that the locale ID shall be prefixed by "{{locale-}}",
nor that the text after that prefix shall be the language code. It just
happens to be the convention followed in the examples given by the ISO
specification.
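As an illustration of resolving the reference through the {{id}} attribute rather than through the naming convention, here is a minimal Java sketch based on the JDK DOM parser. The {{LocaleDefinitions}} class and its methods are hypothetical. The sketch assumes unprefixed element names as in the ISO example above, a {{codeListValue}} holding a three-letter ISO 639-2 code, and a same-document reference of the form {{#id}}.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical index of <PT_Locale> definitions keyed by their "id" attribute. */
final class LocaleDefinitions {
    private final Map<String, Locale> byId = new HashMap<>();

    /** Collects every <PT_Locale> of the document (unprefixed names assumed). */
    static LocaleDefinitions parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Cannot parse metadata", e);
        }
        LocaleDefinitions defs = new LocaleDefinitions();
        NodeList locales = doc.getElementsByTagName("PT_Locale");
        for (int i = 0; i < locales.getLength(); i++) {
            Element ptLocale = (Element) locales.item(i);
            NodeList codes = ptLocale.getElementsByTagName("LanguageCode");
            if (codes.getLength() != 0) {
                String value = ((Element) codes.item(0)).getAttribute("codeListValue");
                defs.byId.put(ptLocale.getAttribute("id"), fromIso3(value));
            }
        }
        return defs;
    }

    /** Maps a three-letter code such as "fra" to the matching two-letter locale. */
    static Locale fromIso3(String code) {
        for (String iso2 : Locale.getISOLanguages()) {
            Locale candidate = Locale.forLanguageTag(iso2);
            if (candidate.getISO3Language().equalsIgnoreCase(code)) {
                return candidate;
            }
        }
        return Locale.forLanguageTag(code);  // Fallback: use the value as given.
    }

    /** Resolves "#some-id" by looking up the ID; returns null if unknown or not a fragment. */
    Locale resolve(String attribute) {
        return attribute.startsWith("#") ? byId.get(attribute.substring(1)) : null;
    }
}
```

With the {{<MD_Metadata>}} example above as input, {{resolve("#locale-fr")}} would return the French locale because the ID matches, not because the ID happens to end with "fr".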
A search on the Internet shows that this attribute is used in various ways:
h3. French mapping agency (IGN)
Extract from
[ML_gmxCrs.xml|http://eden.ign.fr/xsd/isotc211/isofull/20090316/resources/crs/ML_gmxCrs.xml/view]:
{code:xml}
<gmd:PT_FreeText>
<gmd:textGroup>
<gmd:LocalisedCharacterString locale="#xpointer(//*[@id='fra'])">Catalogue
des paramètres géodésiques pour la description de jeux de métadonnées conformes
aux schémas gmx</gmd:LocalisedCharacterString>
</gmd:textGroup>
</gmd:PT_FreeText>
{code}
{code:xml}
<locale>
<gmd:PT_Locale id="fra">
<gmd:languageCode>
<gmd:LanguageCode codeList="../codelist/ML_gmxCodelists.xml#LanguageCode"
codeListValue="french">French</gmd:LanguageCode>
</gmd:languageCode>
</gmd:PT_Locale>
</locale>
{code}
Observations:
* The {{locale}} attribute value is an XPointer expression wrapping an XPath.
* The {{codeListValue}} attribute value in {{LanguageCode}} is "french" rather
than an ISO language code.
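To make the difference with a plain {{#id}} reference concrete, here is a minimal Java sketch that extracts the target ID from either form. The {{LocaleReference}} class is hypothetical, and the regular expression only recognizes the exact {{#xpointer(//*[@id='...'])}} shape shown in the extract above. A general XPointer would need a real XPath evaluation instead.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper extracting the ID targeted by a "locale" attribute value. */
final class LocaleReference {
    // Recognizes only the shape #xpointer(//*[@id='...']), with single or double quotes.
    private static final Pattern XPOINTER_ID =
            Pattern.compile("#xpointer\\(//\\*\\[@id=(['\"])(.+?)\\1\\]\\)");

    private LocaleReference() {
    }

    /** Returns the referenced ID, or an empty value if the attribute is not a local reference. */
    static Optional<String> targetId(String attribute) {
        Matcher m = XPOINTER_ID.matcher(attribute);
        if (m.matches()) {
            return Optional.of(m.group(2));
        }
        // Bare fragment such as "#locale-fr": the ID is everything after '#'.
        if (attribute.length() > 1 && attribute.startsWith("#") && attribute.indexOf('(') < 0) {
            return Optional.of(attribute.substring(1));
        }
        return Optional.empty();
    }
}
```

The returned ID still has to be matched against a {{<PT_Locale>}} definition. As the second observation shows, even then the {{codeListValue}} may not be a standard code.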
h3. NOAA
Extract from [Cruise2ISO on
geo-ide|https://geo-ide.noaa.gov/wiki/images/f/fc/Cruise2ISOSample-20100618-xml.pdf]:
{code:xml}
<gmd:LocalisedCharacterString id="PERSON_NAME_ID"
locale="http://www.rvdata.us/person#6708">PERSON/NAME</gmd:LocalisedCharacterString>
{code}
Observations:
* The {{locale}} attribute is a URL to a remote resource. However, in this
particular case, attempts to fetch that resource return an HTTP 404 error.
Consequently, there is no obvious way to find the locale for that example.
h3. INSPIRE
Extract from [Google
code|http://inspire-foss.googlecode.com/svn-history/r215/trunk/etl/NL.Kadaster/GeographicalNames/test/gn-example-finland.xml]:
{code:xml}
<gmd:LocalisedCharacterString
locale="en-GB">House</gmd:LocalisedCharacterString>
{code}
Observations:
* The {{locale}} attribute directly contains a parseable ISO language and
country code. The absence of a leading {{#}} suggests that there is no need to
search for a definition elsewhere. This is the easiest case and is supported by
current Apache SIS.
h3. Other
Extract from [a mailing
list|http://www.mail-archive.com/[email protected]/msg02889.html]:
{code:xml}
<gmd:LocalisedCharacterString
locale="#frFR">Montréal</gmd:LocalisedCharacterString>
{code}
Observations:
* This is not really a parseable ISO code because of the missing {{-}}
character. Because of the {{#}} prefix, we would expect a definition to be
provided elsewhere, but the extract from the mail archive does not show it.
h2. Work needed in SIS
The fact that the {{<PT_Locale>}} elements providing locale definitions may
appear after the localized string complicates the handling. One possible
approach would be to create an internal object that keeps a reference to a
{{DefaultInternationalString}}, a {{String}} and a locale ID, then invoke the
{{DefaultInternationalString.add(Locale, String)}} method at some later time,
once the {{Locale}} becomes known.
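A minimal Java sketch of this deferred approach follows. It is only an illustration under stated assumptions: the {{DeferredLocalizations}} class and its methods are hypothetical, and a plain {{Map<Locale, String>}} stands in for {{DefaultInternationalString}} so that the example compiles without SIS on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical registry that delays adding a localized text until its locale ID is defined. */
final class DeferredLocalizations {
    /** A text waiting for its locale; "target" stands in for DefaultInternationalString. */
    private static final class Pending {
        final Map<Locale, String> target;
        final String localeId;
        final String text;

        Pending(Map<Locale, String> target, String localeId, String text) {
            this.target = target;
            this.localeId = localeId;
            this.text = text;
        }
    }

    private final Map<String, Locale> definitions = new HashMap<>();
    private final List<Pending> pending = new ArrayList<>();

    /** Called when a localized string is unmarshalled; the locale ID may not be known yet. */
    void addText(Map<Locale, String> target, String localeId, String text) {
        Locale locale = definitions.get(localeId);
        if (locale != null) {
            target.put(locale, text);  // Definition already seen: add immediately.
        } else {
            pending.add(new Pending(target, localeId, text));
        }
    }

    /** Called when a <PT_Locale> definition is unmarshalled; flushes the matching entries. */
    void define(String localeId, Locale locale) {
        definitions.put(localeId, locale);
        for (Iterator<Pending> it = pending.iterator(); it.hasNext();) {
            Pending p = it.next();
            if (p.localeId.equals(localeId)) {
                p.target.put(locale, p.text);
                it.remove();
            }
        }
    }

    /** Number of texts whose locale ID is still undefined, e.g. for a warning at the end. */
    int unresolvedCount() {
        return pending.size();
    }
}
```

In a real implementation, the {{target.put(locale, text)}} calls would presumably be replaced by {{DefaultInternationalString.add(Locale, String)}}, and entries still pending at the end of unmarshalling would need a fallback or a warning.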