At 17:47 05/01/17, David Powell wrote:

>Reading the XML spec, I'm not clear that we're allowed to restrict the
>inheritance of xml:lang?
>
>From the spec:
>
>> The intent declared with xml:lang is considered to apply to all
>> attributes and content of the element where it is specified, unless
>> overridden with an instance of xml:lang on another element within
>> that content.

From this text, it is indeed not clear. There is an erratum that
makes this clearer. Please see
http://www.w3.org/XML/xml-V10-3e-errata#E01.


>If we allow it to inherit inappropriately, and we restrict it to
>certain elements, then this makes the situation even worse, as we'll
>have no way of re-declaring or un-declaring xml:lang on elements that
>it shouldn't apply to.

I don't understand this. If the spec says that it doesn't apply to
some element, there is no reason to un-declare it in the content,
and of course even less reason to re-declare it.

Please also note that the current version of the XML spec allows
xml:lang="", i.e. the empty string as a value to say that you don't
have any information about the language. This is handy for cases
like 1) the element is natural language text, but the actual content
is e.g. only math or some other special notation that doesn't
belong to a natural language, and 2) the actual content is
natural language text, but the software doesn't know which
language it is.
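
To illustrate the reset behaviour, here is a minimal sketch in Python,
assuming the standard library's xml.etree.ElementTree. The helper name
effective_lang and the sample <summary> element are hypothetical and only
serve the illustration; they come from neither the XML spec nor Atom.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target):
    """Return the language in scope for `target`, or None if unknown.

    Hypothetical helper: walks from `target` up to `root` and stops at
    the nearest xml:lang attribute.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        value = node.get(XML_LANG)
        if value is not None:
            # xml:lang="" stops inheritance: no language information.
            return value or None
        node = parents.get(node)
    return None


doc = ET.fromstring(
    "<entry xml:lang='en'>"
    "<title>My entry title.</title>"
    "<summary xml:lang=''>E = mc^2</summary>"
    "</entry>"
)
print(effective_lang(doc, doc.find("title")))    # en
print(effective_lang(doc, doc.find("summary")))  # None
```

The title inherits 'en' from the entry, while the empty xml:lang on the
summary cuts the inheritance and leaves the language unknown.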

As an aside: We should make sure that the Atom spec is very
clear for each element that inherits (e.g. copyright,...)
how to define the absence of such information. As experience
with xml:lang and other things has shown, having something
like a 'reset' or 'neutral element' is extremely important
every time there is inheritance, in particular for information
that comes from different sources.

>I think it is very unlikely that a user would want to include a title,
>summary, and content in a different language.

Well, somewhat unlikely, but not impossible. What about e.g. a service
in Japan that takes some English feed and just changes the titles
to Japanese for quicker browsing by a Japanese audience?


>If we only allow xml:lang on Text constructs and atom:content, then
>this means that the user would typically have to include it on titles,
>summaries, and content separately. Would that be ok?

Why should one have to do that for feeds that are completely or mostly
monolingual?

>Would it be any better than a dc:language style atom:language?

Yes. It would still be better.

>> As for RDF's support of xml:lang, there is a very RDF-specific,
>> unfortunate issue that has to be taken into account when
>> converting from atom to RDF/XML: RDF/XML at certain points
>> cuts the inheritance of xml:lang.
>
>I didn't realize this - where is it?

Ok, here it is: The RDF Graph model doesn't deal with mixed content
(such as typical XHTML), but the data model has a special datatype
that in RDF/XML is indicated by the rdf:parseType="Literal" attribute.
The problem is that whenever you use this, RDF/XML requires you to
cut language inheritance.

From
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Literals:

>>>>
For text that may contain markup, use typed literals with type rdf:XMLLiteral. If language annotation is required, it must be explicitly included as markup, usually by means of an xml:lang attribute. [XHTML] may be included within RDF in this way. Sometimes, in this latter case, an additional span or div element is needed to carry an xml:lang or lang attribute.
>>>>



What this means in practice for Atom is that if it were not for this, you could just use a DTD to declare a default attribute of rdf:parseType="Literal" on things like <title>.

However, this won't work, because if you say

<entry xml:lang='en'>
    ...
    <title>My entry title.</title>
    ...
</entry>

which will look to a parser like

<entry xml:lang='en'>
    ...
    <title rdf:parseType="Literal">My entry title.</title>
    ...
</entry>

the XML Literal won't carry the language information.

Even if you put the language information right on title, like so:

<entry>
    ...
    <title xml:lang='en' rdf:parseType="Literal">My entry title.</title>
    ...
</entry>

it still isn't picked up by RDF/XML. What RDF/XML requires you to do
is something like

<entry xml:lang='en'>
    ...
    <title rdf:parseType="Literal"><span xml:lang='en'>My entry title.</span></title>
    ...
</entry>

i.e. in order to give the language information to the title in
RDF/XML, you have to introduce a 'dummy' element.

This means that the simple DTD approach won't work, but such
a transform of course can be done by XSLT. It also means that
Henry and friends have some more work to define how Atom
converts/relates to RDF, because they have to define when/how
to introduce dummy elements.
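
As a rough illustration of what such a transform has to do, here is a
minimal sketch in Python rather than XSLT, assuming the standard library's
xml.etree.ElementTree and the un-namespaced <entry>/<title> elements from
the examples above. The function name wrap_title is hypothetical, and the
sketch only looks at xml:lang on the title itself and on the entry, not on
more distant ancestors.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_PARSETYPE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}parseType"


def wrap_title(entry):
    """Mark <title> as an XML Literal and carry its language inside it.

    Hypothetical helper: the language in scope is moved onto a 'dummy'
    <span> placed inside the literal.
    """
    title = entry.find("title")
    # Simplification: use the title's own xml:lang, else the entry's.
    lang = title.get(XML_LANG, entry.get(XML_LANG))
    title.set(RDF_PARSETYPE, "Literal")
    if lang:
        span = ET.Element("span", {XML_LANG: lang})
        # Move the title's text and child elements into the span.
        span.text = title.text
        children = list(title)
        for child in children:
            title.remove(child)
        span.extend(children)
        title.text = None
        title.append(span)
    return entry


entry = ET.fromstring(
    "<entry xml:lang='en'><title>My entry title.</title></entry>"
)
wrap_title(entry)
print(ET.tostring(entry, encoding="unicode"))
```

When no language is in scope, or it has been reset with xml:lang="", the
sketch adds no span at all, which is one possible answer to the when/how
question above.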


Regards, Martin.



