Hi,
  This feedback is related to the autodiscovery draft.

Before reading on, I suggest anyone writing a specification of any kind actually learn a little about how to write good conformance criteria.

http://ln.hixie.ch/?start=1140242962&count=1

I do not believe it is at all useful for this spec to continue as either normative or informational. If it were to be published as informational, who would its target audience be? What benefit would it provide to anyone? What purpose would it serve?

James M Snell wrote:
To document best practice as it relates specifically to syndication
feeds.

It's not entirely clear what that actually means. How would it be any different from, or more useful than, existing documentation on the subject that has been around for the past 3 or 4 years?

What we do need is a normative specification that clearly defines both document and user agent conformance requirements, and that really has to be in a normative specification. The only issue that then remains is where this should take place and, for reasons documented later in this e-mail, I strongly believe that HTML5 is the correct place for this to be defined.

For example, HTML5 says nothing about whether the relative order of feed autodiscovery links within a document is significant. The Atom autodiscovery draft, however, defines that the order is significant.

That can be considered a limitation of the HTML5 spec which can be addressed there. In fact, at the time of writing this, you've already raised the issue on the WHATWG list and it looks like it's been resolved.


Note: The rest of this feedback is written as though this spec were still going to be published as a normative RFC, despite the suggestion that it be published as an informational item only or not at all. That's because I had most of it written before that suggestion and it's useful feedback anyway.


Feed autodiscovery should ideally be defined independent of the syndication feed format. It is illogical to have a separate autodiscovery spec for Atom [1] and RSS [2]. As far as autodiscovery is concerned, the only difference between these and any other format is the MIME type. But, if this spec is to continue, it should at least be renamed to "Syndication Feed Autodiscovery" or similar.


*Introduction*

The introduction should discuss the use of Atom, RSS and RDF Site Summary because they're all widely used and are relevant to anyone implementing autodiscovery. I suggest it also talk about the generic concept of what a syndication feed is (independent of the syntax) and only refer to Atom, RSS and RDF as examples.


*Notational Conventions*

This section should be titled "Conformance Requirements". It should make a clear distinction between user agent conformance and document conformance, and clearly explain the requirements for each.

If there are separate categories of user agents, then they should be defined here. For instance, a conformance checker would have different requirements from a web browser: a conformance checker must report errors to a user, whereas a web browser isn't required to do so and may recover gracefully, in the way defined by the specification (where applicable).

It should state something like the following to define which sections are normative and non-normative.

  All examples and notes in this specification are non-normative, as are
  all sections explicitly marked non-normative. Everything else in this
  specification is normative.


*Definition of an autodiscovery element*

This should be moved to a separate definitions section (perhaps within the previous conformance requirements section). It does not belong in the Relationship to HTML and XHTML section. The definitions should also include other terms used throughout the spec, which are then used in the conformance requirements (see the writing specifications article linked above).

| An Atom autodiscovery element is a link element, as defined in
| section 12.3 of HTML 4 [W3C.REC-html401-19991224].

Assuming this section is normative, that reference should be normative also. Throughout the spec, it should also refer to it as just an "autodiscovery element" (see above about it not just being for Atom).

I do not agree that only <link> elements should be used for autodiscovery. Since visible meta data is always better than invisible meta data, documents should be allowed to use the <a> element as well.

| As with other types of link elements, an autodiscovery element MAY
| appear within the <head> element of an HTML or XHTML document,

Why is that requirement only stated as a *MAY*? It should be a *MUST* requirement and it should be made clear that this is a document conformance requirement only.

| but it MUST NOT appear within the <body>.

For document conformance, I agree. But, UA conformance requirements also need to be defined. What must a UA do if it finds a link element in the body? This error is actually far more common than you may think.

As part of a study of several billion pages done by Ian Hickson in September this year, it was found that "Parse error: link element start tag out of place." was the 32nd most common error and happened for about 1 in 8 documents, on average. That means in roughly 12.5% of pages, the link element occurred in the body.

The study was similar to the Web Authoring Statistics [3] published by Google in January (also done by Ian Hickson), but with a significantly larger sample and much more data collected.

In HTML5 (which is based upon the way several browsers have already implemented HTML), regardless of where the link tag occurs in the serialisation, each link element is still inserted into the head. So, strictly speaking, it is impossible for a link element to appear in the body in HTML, though the tag could appear anywhere the author put it.

For example, if that were not the case and the link was not inserted into the head regardless of where it occurred, consider the following test case. Is the link considered to be in the head or not?

<head>
<title>Autodiscovery</title>
<script type="text/javascript">
document.write("<p>test<\/p>");
</script>
<link rel="alternate" type="application/atom+xml" href="/feed" />
</head>

The answer actually depends upon whether or not script is enabled. If it's disabled, then the answer is yes. Otherwise, when the p element is written to the serialisation, it implies the end of the head and the beginning of the body, and so the answer is no. That is actually the behaviour of IE7, but not Firefox, Opera or Safari.
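
To illustrate the tolerant behaviour described above, here is a minimal Python sketch of a UA that collects candidate autodiscovery links wherever the link tag occurs in the serialisation, rather than only inside the head. It is an illustration under assumptions, not something defined by the draft or by HTML5: the names FeedLinkCollector, find_feed_links and FEED_TYPES are made up for this example, and it relies on the tokenizer in Python's standard html.parser module, which neither builds a tree nor runs script.

```python
from html.parser import HTMLParser

# Assumption for this sketch: only these two MIME types count as feeds.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkCollector(HTMLParser):
    """Collect hrefs of candidate autodiscovery links, ignoring tag position."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also reached for <link ... />, because the default
        # handle_startendtag() delegates to handle_starttag().
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rel = attributes.get("rel", "").lower().split()
        mime = attributes.get("type", "").strip().lower()
        href = attributes.get("href", "")
        if "alternate" in rel and mime in FEED_TYPES and href:
            self.links.append(href)


def find_feed_links(markup):
    """Return feed hrefs in document order, from head or body alike."""
    collector = FeedLinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links
```

Because script is never executed here, the document.write() call in the test case above has no effect on the result, so the link is found either way. A UA that does run script would need the tree-building rules to be defined by the spec it references.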


*Relationship to HTML and XHTML*

*Syntax rules inherited from HTML*

This significantly limited, informative list of syntax requirements, if included, should be non-normative. Instead, the spec should normatively refer to an HTML spec which clearly defines the syntax and parsing requirements for conforming user agents.

However, if HTML 4.01 is chosen, then as far as the SGML parsing is concerned, HTML 4.01 cannot be implemented in the real world. No web browser does and I seriously doubt any existing tools that implement autodiscovery do so either.

I would strongly recommend that you normatively reference HTML5 for the parsing requirements, which is far more relevant than HTML 4.01 is. However, even though the relevant parts of the parsing section are relatively stable, the spec itself is not, which is a problem because it's not usually a good idea to normatively reference a moving target.

To do so would make HTML5 a dependency, but it's difficult to progress any specification that has such unstable dependencies. In other words, this would technically be held up by the progress of HTML5 anyway. So, therefore, it doesn't really make sense for this to be defined separately from HTML, particularly when it actually is an HTML feature itself.


*Syntax rules inherited from XHTML*

Again, this significantly limited, informative list of syntax requirements, if included, should be non-normative. Instead, the spec should normatively reference the XML 1.0 and XHTML 1.0 specs which clearly define the syntax and parsing requirements.


*The rel attribute*

| The rel attribute MUST be present in an Atom autodiscovery element.
| As defined in section 6.12 of HTML 4 [W3C.REC-html401-19991224], the
| value of the rel attribute is a space-separated list of keywords.

That's another example of a normative reference to HTML4. In this case, it's ok to reference HTML4 for the definition of the rel attribute, but HTML5's definition would be better.

| The list of keywords MUST include the keyword "alternate" in
| uppercase, lowercase, or mixed case.

That's a reasonable example of a document conformance requirement, though I'd suggest it be rephrased:

  The list of keywords MUST include the keyword "alternate".  The value
  is case-insensitive.

It is also missing user agent conformance requirements, but they would be covered by a normative reference to the HTML spec that defines how to process the rel attribute. There is also an edge case that should be covered, for example:

<link rel="alternate stylesheet" type="application/atom+xml" href="/feed">

In HTML, the combination of alternate and stylesheet has special meaning. Yet the type attribute still has the atom MIME type. Does that still represent an autodiscovery link? If so, that should be defined and it is currently an interoperability issue. (Note: This looks like it's also a problem with the HTML5 spec at the moment)
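
A rough Python sketch of how a UA might process the rel value as a case-insensitive, space-separated keyword list, and surface the "alternate stylesheet" case instead of silently picking an answer. The function names, the FEED_TYPES set and the three result labels are hypothetical, chosen only for this example; neither spec currently defines what the "ambiguous" case should resolve to.

```python
# Assumption for this sketch: only these two MIME types count as feeds.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


def rel_keywords(rel_value):
    """Split a rel attribute value into a set of lowercase keywords."""
    # str.split() with no argument splits on any run of whitespace.
    return set(rel_value.lower().split())


def classify_link(rel_value, type_value):
    """Return 'feed', 'ambiguous' or 'not-feed' for one link element."""
    keywords = rel_keywords(rel_value)
    mime = type_value.strip().lower()
    if "alternate" not in keywords or mime not in FEED_TYPES:
        return "not-feed"
    if "stylesheet" in keywords:
        # "alternate stylesheet" has its own meaning in HTML, yet the type
        # says feed. Left undecided here because the specs do not say.
        return "ambiguous"
    return "feed"
```

For the link element shown above, classify_link("alternate stylesheet", "application/atom+xml") returns "ambiguous", which is exactly the case the spec needs to settle one way or the other.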

I believe the feed value should also be specified in this section, as it is in the WHATWG spec. Primarily because a syndication feed isn't necessarily an alternative representation of the page, as clearly demonstrated in the mozilla.org example shown in an earlier post today.

*The type attribute*

The definition of this should also include the value "application/rss+xml". This is because of the above reasons about it not just being for Atom and because UAs already have to support it as well.

*The href attribute*

The definition of this is very good. It contains both document and user agent conformance requirements. No further comments.

*Multiple autodiscovery elements*

| * Each autodiscovery element SHOULD point to a different Atom feed.

What must a UA do if multiple links point to the same feed?

| * Each autodiscovery element SHOULD include a title attribute that
|   gives a human-readable label for the feed that the element points
|   to.  Clients MAY use these titles to present a list of available
|   Atom feeds to the end user.

That "MAY" in the last sentence should be at least a "SHOULD", but probably not a "MUST" in case a UA has a good reason not to show it.

| * The order of the autodiscovery elements is significant.  The first
|   element SHOULD point to the publisher's preferred feed for the
|   document.
| * Clients who present a list of autodiscovered feeds to the end user
|   SHOULD present them in the same order as the autodiscovery
|   elements appear in the document.
| * Clients who wish to choose exactly one feed without user input
|   SHOULD choose the one pointed to by the first autodiscovery
|   element.

This section is quite good, but there are a few issues. What if the first one is an unsupported format? e.g. the first is RSS, the second is Atom, and the UA only supports Atom. Since it's only a SHOULD, UAs can technically skip the first element and choose the second, but this stuff should be explicitly defined.

What if the link elements have hreflang attributes, indicating alternate languages for the feeds? Say the first is "en", the second "fr", and the user has configured "fr" as their preferred language. UAs should choose the first one provided in the user's preferred language.
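
The selection behaviour I'm suggesting could be sketched in Python roughly as follows. This is only one possible reading, not something the draft defines: the FeedLink type, the choose_feed function and the simple prefix match on language tags (so "fr" also matches "fr-CA") are all assumptions made for this example.

```python
from typing import NamedTuple, Optional, Sequence, Set


class FeedLink(NamedTuple):
    """One autodiscovery element, in document order (hypothetical type)."""
    href: str
    mime: str
    hreflang: str = ""


def choose_feed(links: Sequence[FeedLink],
                supported: Set[str],
                preferred_lang: str = "") -> Optional[FeedLink]:
    """Pick one feed without user input.

    1. Drop links whose format the UA does not support.
    2. Return the first remaining link in the user's preferred language.
    3. Otherwise fall back to the first remaining link.
    """
    usable = [link for link in links if link.mime.lower() in supported]
    want = preferred_lang.lower()
    if want:
        for link in usable:
            tag = link.hreflang.lower()
            if tag == want or tag.startswith(want + "-"):
                return link
    return usable[0] if usable else None
```

With an RSS link in "en" first and an Atom link in "fr" second, a UA that supports only Atom gets the second link, and so does a UA that supports both formats but whose user prefers "fr".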

*Examples*

I left comments about the examples section before [4]. Those comments still stand.

[1] http://www.ietf.org/internet-drafts/draft-snell-atompub-autodiscovery-00.txt
[2] http://www.rssboard.org/rss-autodiscovery
[3] http://code.google.com/webstats/
[4] http://www.imc.org/atom-syntax/mail-archive/msg19103.html

--
Lachlan Hunt
http://lachy.id.au/
