Hi,
This feedback is related to the autodiscovery draft.
Before reading on, I suggest anyone writing a specification of any kind
actually learn a little about how to write good conformance criteria.
http://ln.hixie.ch/?start=1140242962&count=1
I do not believe it is at all useful for this spec to continue as either
normative or informational. If it were to be published as
informational, who would its target audience be? What benefit would it
provide to anyone? What purpose would it serve?
James M Snell wrote:
| To document best practice as it relates specifically to syndication
| feeds.
It's not entirely clear what that actually means. How would it be any
different from, or more useful than, existing documentation on the
subject that has been around for the past 3 or 4 years?
What we do need is a normative specification that clearly defines both
document and user agent conformance requirements, and that really has to
be in a normative specification. The only issue that then remains is
where this should take place and, for reasons documented later in this
e-mail, I strongly believe that HTML5 is the correct place for this to
be defined.
For example, HTML5 says nothing about whether the relative order
of feed autodiscovery links within a document is significant. The Atom
autodiscovery draft, however, defines that the order is significant.
That can be considered a limitation of the HTML5 spec which can be
addressed there. In fact, at the time of writing this, you've already
raised the issue on the WHATWG list and it looks like it's been resolved.
Note: The rest of this feedback is written as though this spec were
still going to be published as a normative RFC, despite the suggestion
that it be published as an informational item only or not at all.
That's because I had most of it written before that suggestion and it's
useful feedback anyway.
Feed autodiscovery should ideally be defined independently of the
syndication feed format. It is illogical to have separate
autodiscovery specs for Atom [1] and RSS [2]. As far as autodiscovery is
concerned, the only difference between these and any other format is the
MIME type. But, if this spec is to continue, it should at least be
renamed to "Syndication Feed Autodiscovery" or similar.
*Introduction*
The introduction should discuss the use of Atom, RSS and RDF Site
Summary because they're all widely used and are relevant to anyone
implementing autodiscovery. I suggest it also talk about the generic
concept of what a syndication feed is (independent of the syntax) and
only refer to Atom, RSS and RDF as examples.
*Notational Conventions*
This section should be titled "Conformance Requirements". It should
make a clear distinction between user agent conformance and document
conformance, and clearly explain the requirements for each.
If there are separate categories of user agents, then they should be
defined here. For instance, a conformance checker would have different
requirements from a web browser, e.g. a conformance checker must report
errors to a user, whereas a web browser isn't required to do so and
may recover gracefully, in the way defined by the specification (where
applicable).
It should state something like the following to define which sections
are normative and non-normative.
All examples and notes in this specification are non-normative, as are
all sections explicitly marked non-normative. Everything else in this
specification is normative.
*Definition of an autodiscovery element*
This should be moved to a separate definitions section (perhaps within
the previous conformance requirements section). It does not belong in
the Relationship to HTML and XHTML section. The definitions should also
include other terms used throughout the spec, which are then used in the
conformance requirements (see the writing specifications article linked
above).
| An Atom autodiscovery element is a link element, as defined in
| section 12.3 of HTML 4 [W3C.REC-html401-19991224].
Assuming this section is normative, that reference should be normative
also. Throughout the spec, it should also refer to it as just an
"autodiscovery element" (see above about it not just being for Atom).
I do not agree that only <link> elements should be used for
autodiscovery. Since visible metadata is always better than invisible
metadata, documents should be allowed to use the <a> element as well.
| As with other types of link elements, an autodiscovery element MAY
| appear within the <head> element of an HTML or XHTML document,
Why is that requirement only stated as a *MAY*? It should be a *MUST*
requirement and it should be made clear that this is a document
conformance requirement only.
| but it MUST NOT appear within the <body>.
For document conformance, I agree. But, UA conformance requirements
also need to be defined. What must a UA do if it finds a link element
in the body? This error is actually far more common than you may think.
As part of a study of several billion pages done by Ian Hickson in
September this year, it was found that "Parse error: link element start
tag out of place." was the 32nd most common error and happened for about
1 in 8 documents, on average. That means in roughly 12.5% of pages, the
link element occurred in the body.
The study was similar to the Web Authoring Statistics [3] published by
Google in January (also done by Ian Hickson), but with a significantly
larger sample and much more data collected.
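As one possible answer to the UA question above, a UA could simply
accept the link wherever its tag occurs. The following is a minimal
Python sketch of that behaviour, not text from either spec. It uses the
standard library's html.parser, which is only a tokeniser and does not
implement any tree construction rules, and the names LinkCollector and
find_links are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the attributes of every link start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        # Self-closing tags (<link ... />) also end up here by default.
        if tag == "link":
            self.links.append(dict(attrs))


def find_links(markup):
    """Return attribute dicts for all link tags, wherever they occur."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links
```

With this approach a link tag that the author put after the start of
the body is found just like one in the head, so the misplaced tags
counted in the study would still be usable for autodiscovery.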
In HTML5 (which is based upon the way several browsers have already
implemented HTML), regardless of where the link tag occurs in the
serialisation, each link element is still inserted into the head. So,
strictly speaking, it is impossible for a link element to appear in the
body in HTML, though the tag could appear anywhere the author put it.
For example, if that were not the case and the link was not inserted
into the head regardless of where it occurred, consider the following
test case. Is the link considered to be in the head or not?
<head>
<title>Autodiscovery</title>
<script type="text/javascript">
document.write("<p>test<\/p>");
</script>
<link rel="alternate" type="application/atom+xml" href="/feed" />
</head>
The answer actually depends upon whether or not script is enabled. If
it's disabled, then the answer is yes. Otherwise, when the p element is
written to the serialisation, it implies the end of the head and the
beginning of the body, and so the answer is no. That is actually the
behaviour of IE7, but not Firefox, Opera or Safari.
*Relationship to HTML and XHTML*
*Syntax rules inherited from HTML*
This significantly limited, informative list of syntax requirements, if
included, should be non-normative. Instead, the spec should normatively
refer to an HTML spec which clearly defines the syntax and parsing
requirements for conforming user agents.
However, if HTML 4.01 is chosen, then as far as the SGML parsing is
concerned, HTML 4.01 cannot be implemented in the real world. No web
browser does and I seriously doubt any existing tools that implement
autodiscovery do so either.
I would strongly recommend that you normatively reference HTML5 for the
parsing requirements, which is far more relevant than HTML 4.01 is.
However, even though the relevant parts of the parsing section are
relatively stable, the spec itself is not, which is a problem because
it's not usually a good idea to normatively reference a moving
target.
To do so would make HTML5 a dependency, but it's difficult to progress
any specification that has such unstable dependencies. In other words,
this would technically be held up by the progress of HTML5 anyway. So
it doesn't really make sense for this to be defined separately
from HTML, particularly when it actually is an HTML feature itself.
*Syntax rules inherited from XHTML*
Again, this significantly limited, informative list of syntax
requirements, if included, should be non-normative. Instead, the spec
should normatively reference the XML 1.0 and XHTML 1.0 specs which
clearly define the syntax and parsing requirements.
*The rel attribute*
| The rel attribute MUST be present in an Atom autodiscovery element.
| As defined in section 6.12 of HTML 4 [W3C.REC-html401-19991224], the
| value of the rel attribute is a space-separated list of keywords.
That's another example of a normative reference to HTML4. In this case,
it's ok to reference HTML4 for the definition of the rel attribute, but
HTML5's definition would be better.
| The list of keywords MUST include the keyword "alternate" in
| uppercase, lowercase, or mixed case.
That's a reasonable example of a document conformance requirement,
though I'd suggest it be rephrased:
The list of keywords MUST include the keyword "alternate". The value
is case-insensitive.
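As a rough illustration of what the matching UA requirement might look
like, the rel value can be split on HTML space characters and each
keyword compared case-insensitively. This is only a Python sketch under
that assumption, and the function names are hypothetical.

```python
import re

# Space, tab, line feed, form feed and carriage return separate keywords.
_KEYWORD = re.compile(r"[^ \t\n\f\r]+")


def rel_keywords(rel_value):
    """Return the set of lowercased keywords in a rel attribute value."""
    return {keyword.lower() for keyword in _KEYWORD.findall(rel_value)}


def has_alternate(rel_value):
    """True if the keyword list contains "alternate" in any case."""
    return "alternate" in rel_keywords(rel_value)
```

Note that a whole keyword has to match, so a value such as
"alternative" does not count.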
It is also missing user agent conformance requirements, but they would
be covered by a normative reference to the HTML spec that defines how to
process the rel attribute. There's an edge case that should be covered.
e.g.
<link rel="alternate stylesheet" type="application/atom+xml" href="/feed">
In HTML, the combination of alternate and stylesheet has special
meaning. Yet the type attribute still has the atom MIME type. Does
that still represent an autodiscovery link? If so, that should be
defined and it is currently an interoperability issue. (Note: This
looks like it's also a problem with the HTML5 spec at the moment)
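To make the ambiguity concrete, here is a small Python sketch of the two
possible readings. The stylesheet_wins parameter is hypothetical. It
stands in for whichever behaviour a spec eventually defines, and the
function name is my own.

```python
ATOM_TYPE = "application/atom+xml"


def is_atom_autodiscovery(rel, type_value, stylesheet_wins):
    """Classify a link under one of two readings of "alternate stylesheet"."""
    keywords = {keyword.lower() for keyword in rel.split()}
    if "alternate" not in keywords:
        return False
    if "stylesheet" in keywords and stylesheet_wins:
        # Reading 1: the combination only ever means an alternative
        # style sheet, so the type attribute is not consulted.
        return False
    # Reading 2 (or no stylesheet keyword): the type attribute decides.
    return type_value.strip().lower() == ATOM_TYPE
```

Two UAs that pick different readings will disagree about the link in
the example above, which is exactly the interoperability problem.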
I believe the feed value should also be specified in this section, as it
is in the WHATWG spec. Primarily because a syndication feed isn't
necessarily an alternative representation of the page, as clearly
demonstrated in the mozilla.org example shown in an earlier post today.
*The type attribute*
The definition of this should also include the value
"application/rss+xml", both for the reasons above about autodiscovery
not just being for Atom and because UAs already have to support it.
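A UA check for the type attribute could then be sketched as follows, in
Python. Ignoring case and any parameters after a semicolon is my own
assumption about how the value should be compared, and FEED_TYPES and
is_feed_type are illustrative names only.

```python
# The two values discussed in this section.
FEED_TYPES = frozenset({"application/atom+xml", "application/rss+xml"})


def is_feed_type(type_value):
    """True if the type attribute names one of the known feed types."""
    # Drop parameters such as "; charset=utf-8", then normalise.
    essence = type_value.split(";", 1)[0].strip().lower()
    return essence in FEED_TYPES
```

Keeping the set of types in one place also shows why the only
difference between the formats, as far as autodiscovery is concerned,
is the MIME type.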
*The href attribute*
The definition of this is very good. It contains both document and user
agent conformance requirements. No further comments.
*Multiple autodiscovery elements*
| * Each autodiscovery element SHOULD point to a different Atom feed.
What must a UA do if multiple links point to the same feed?
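One possible answer, sketched in Python: resolve each href against the
document's base URL and keep only the first link for each resulting
feed. This is an assumption about reasonable UA behaviour, not something
the draft says, and unique_feeds is a made-up name.

```python
from urllib.parse import urljoin


def unique_feeds(base_url, hrefs):
    """Resolve hrefs against base_url and drop later duplicates."""
    seen = set()
    feeds = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            feeds.append(absolute)  # document order is preserved
    return feeds
```

For a document at http://example.org/blog/, the hrefs "/feed" and
"http://example.org/feed" resolve to the same URL, so only the first
would be listed.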
| * Each autodiscovery element SHOULD include a title attribute that
| gives a human-readable label for the feed that the element points
| to. Clients MAY use these titles to present a list of available
| Atom feeds to the end user.
That "MAY" in the last sentence should be at least a "SHOULD", but
probably not a "MUST" in case a UA has a good reason not to show it.
| * The order of the autodiscovery elements is significant. The first
| element SHOULD point to the publisher's preferred feed for the
| document.
| * Clients who present a list of autodiscovered feeds to the end user
| SHOULD present them in the same order as the autodiscovery
| elements appear in the document.
| * Clients who wish to choose exactly one feed without user input
| SHOULD choose the one pointed to by the first autodiscovery
| element.
This section is quite good, but there are a few issues. What if the
first one is an unsupported format? e.g. the first is RSS, the second
is Atom, and the UA only supports Atom. Since it's a SHOULD, UAs can
technically skip to the second one, but this stuff should be explicitly
defined.
What if the link elements have hreflang attributes, indicating
alternate languages for the feeds? Say the first is "en", the second
"fr", and the user has configured "fr" as their preferred language.
UAs should choose the first one provided in the user's preferred language.
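The selection rules I'm suggesting could be sketched like this in
Python. The dict keys, the choose_feed name and the exact-match
comparison of language tags are all assumptions made for the example. A
real definition would need to say how tags like "fr-CA" are matched.

```python
def choose_feed(links, supported_types, preferred_langs=()):
    """Pick one feed link without user input.

    links: dicts with "type", "href" and optionally "hreflang",
    listed in document order.
    """
    # Skip formats the UA cannot handle.
    candidates = [
        link for link in links
        if link["type"].strip().lower() in supported_types
    ]
    # Prefer the first supported link in the user's preferred language.
    for lang in preferred_langs:
        for link in candidates:
            if link.get("hreflang", "").lower() == lang.lower():
                return link
    # Otherwise fall back to the first supported link in document order.
    return candidates[0] if candidates else None
```

With an RSS link first, then Atom links in "en" and "fr", a UA that only
supports Atom gets the "en" Atom feed by default and the "fr" one when
the user prefers French.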
*Examples*
I left comments about the examples section before [4]. Those comments
still stand.
[1] http://www.ietf.org/internet-drafts/draft-snell-atompub-autodiscovery-00.txt
[2] http://www.rssboard.org/rss-autodiscovery
[3] http://code.google.com/webstats/
[4] http://www.imc.org/atom-syntax/mail-archive/msg19103.html
--
Lachlan Hunt
http://lachy.id.au/