At 17:47 05/01/17, David Powell wrote:
>Reading the XML spec, I'm not clear that we're allowed to restrict the
>inheritance of xml:lang?
>
>From the spec:
>
>> The intent declared with xml:lang is considered to apply to all
>> attributes and content of the element where it is specified, unless
>> overridden with an instance of xml:lang on another element within
>> that content.
From this text, it is indeed not clear. There is an erratum that makes this clearer. Please see http://www.w3.org/XML/xml-V10-3e-errata#E01.
>If we allow it to inherit inappropriately, and we restrict it to
>certain elements, then this makes the situation even worse, as we'll
>have no way of re-declaring or un-declaring xml:lang on elements that
>it shouldn't apply to.
I don't understand this. If the spec says that it doesn't apply to some element, there is no reason to un-declare it in the content, and of course even less reason to re-declare it.
Please also note that the current version of the XML spec allows xml:lang="", i.e. the empty string as a value to say that you don't have any information about the language. This is handy for cases like 1) the element is natural language text, but the actual content is e.g. only math or some other special notation that doesn't belong to a natural language, and 2) the actual content is natural language text, but the software doesn't know which language it is.
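To illustrate the inheritance and the empty-string reset described above, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. The function name effective_langs and the sample entry are made up for illustration; they are not taken from the XML or Atom specs.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_langs(elem, inherited=None, out=None):
    """Collect (tag, language) pairs in document order.

    The language is None when nothing is known, either because no
    xml:lang is in scope or because xml:lang="" reset it.
    (Hypothetical helper, for illustration only.)
    """
    if out is None:
        out = []
    # An element's own xml:lang overrides whatever it inherited.
    lang = elem.get(XML_LANG, inherited)
    # The empty string means "no language information available".
    if lang == "":
        lang = None
    out.append((elem.tag, lang))
    for child in elem:
        effective_langs(child, lang, out)
    return out

# Made-up sample: an English entry with a formula-only summary
# and Japanese content.
entry = ET.fromstring(
    "<entry xml:lang='en'>"
    "<title>My entry title.</title>"
    "<summary xml:lang=''>E = mc^2</summary>"
    "<content xml:lang='ja'><p>...</p></content>"
    "</entry>"
)

for tag, lang in effective_langs(entry):
    print(tag, lang)
```

In this sketch the title inherits 'en' from entry, the summary ends up with no language at all, and the p inside content inherits 'ja'.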
As an aside: We should make sure that the Atom spec is very clear for each element that inherits (e.g. copyright,...) how to define the absence of such information. As experience with xml:lang and other things has shown, having something like a 'reset' or 'neutral element' is extremely important every time there is inheritance, in particular for information that comes from different sources.
>I think it is very unlikely that a user would want to include a title,
>summary, and content in a different language.
Well, somewhat unlikely, but not impossible. What about e.g. a service in Japan that takes some English feed and just changes the titles to Japanese for quicker browsing by a Japanese audience?
>If we only allow xml:lang on Text constructs and atom:content, then
>this means that the user would typically have to include it on titles,
>summaries, and content separately. Would that be ok?
Why should anyone have to do that for feeds that are completely or mostly monolingual?
>Would it be any better than a dc:language style atom:language?
Yes. It would still be better.
>> As for RDF's support of xml:lang, there is a very RDF-specific,
>> unfortunate issue that has to be taken into account when
>> converting from atom to RDF/XML: RDF/XML at certain points
>> cuts the inheritance of xml:lang.
>
>I didn't realize this - where is it?
Ok, here it is: The RDF Graph model doesn't deal with mixed content (such as typical XHTML), but the data model has a special datatype that in RDF/XML is indicated by the rdf:parseType="Literal" attribute. The problem is that whenever you use this, RDF/XML requires you to cut language inheritance.
From http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Literals:
>>>>
For text that may contain markup, use typed literals with type rdf:XMLLiteral. If language annotation is required, it must be explicitly included as markup, usually by means of an xml:lang attribute. [XHTML] may be included within RDF in this way. Sometimes, in this latter case, an additional span or div element is needed to carry an xml:lang or lang attribute.
>>>>
What this means in practice for Atom is that if it were not for this, you could just use a DTD to declare a default attribute of rdf:parseType="Literal" on things like <title>.
However, this won't work, because if you say
<entry xml:lang='en'>
...
<title>My entry title.</title>
...
</entry>
which will look to a parser like
<entry xml:lang='en'>
...
<title rdf:parseType="Literal">My entry title.</title>
...
</entry>
The XML Literal won't carry the language information.
Even if you put the language information right on title, like so:
<entry>
...
<title xml:lang='en' rdf:parseType="Literal">My entry title.</title>
...
</entry>
it still isn't picked up by RDF/XML. What RDF/XML requires you to do is something like
<entry xml:lang='en'>
...
<title rdf:parseType="Literal"><span xml:lang='en'>My entry title.</span></title>
...
</entry>
i.e. in order to give the language information to the title in RDF/XML, you have to introduce a 'dummy' element.
This means that the simple DTD approach won't work, but such a transform of course can be done by XSLT. It also means that Henry and friends have some more work to define how Atom converts/relates to RDF, because they have to define when/how to introduce dummy elements.
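As a rough sketch of what such a transform has to do (written here in Python with xml.etree.ElementTree rather than XSLT, purely for illustration), the following wraps the content of each title in a dummy span that carries the language in scope. The function name wrap_with_lang, the default target list, and the un-namespaced span are assumptions of this sketch, not anything defined by Atom or RDF/XML.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def wrap_with_lang(elem, targets=("title",), inherited=None):
    """Wrap the content of target elements in a dummy span with xml:lang.

    Hypothetical helper: 'targets' and the plain 'span' element are
    illustrative choices only. Modifies the tree in place.
    """
    # Language in scope: the element's own xml:lang, else the inherited one.
    lang = elem.get(XML_LANG, inherited)
    if elem.tag in targets:
        if lang:  # nothing to carry over when the language is unknown or ""
            span = ET.Element("span", {XML_LANG: lang})
            # Move the leading text and all child elements into the span.
            span.text = elem.text
            for child in list(elem):
                elem.remove(child)
                span.append(child)
            elem.text = None
            elem.append(span)
        return
    for child in list(elem):
        wrap_with_lang(child, targets, lang)

entry = ET.fromstring(
    "<entry xml:lang='en'><title>My entry title.</title></entry>"
)
wrap_with_lang(entry)
print(ET.tostring(entry, encoding="unicode"))
```

A real conversion would also have to decide which elements count as targets and which namespace the dummy element lives in; this sketch leaves both open.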
Regards, Martin.
