Re: Atom Rank Extensions

Robert Sayre Wed, 03 May 2006 11:19:29 -0700


On 5/3/06, Andreas Sewe <[EMAIL PROTECTED]> wrote:

But does this sound like an improvement to you, too?


Sort of. I did a deep dive on this, and I think it's really huge. It's
had little community input, it's not at all clear that the approach is
correct, and it's not clear that clients will implement it, as the
space is so new. However, I do understand you want to ship something,
and it's nice that you want to document it openly. So, Informational or
Experimental seems like a good fit here.

I think the document needs *a lot* of work. My comments follow:

draft-index-09:

1.  Introduction

   In the Atom Syndication Format [RFC4287], the order of entries as
   presented in a feed is typically considered to be insignificant.
   This presents a challenge when the set of entries is intended to
   represent an ordered or ranked list.  This document specifies an
   extension that allows feed publishers to establish numeric rankings
   for entries within a feed to be used as a means of organizing and
   sorting those entries.


Even with the alternative intro you've supplied, it's not clear what
problem this document solves. It seems to claim almost unlimited
powers. :) Quite a few sections seem to me to offer very little
return. I think you need to cut, and cut a lot. The value evaporates
very quickly outside of the 'ranking' element.

   The XML Namespaces URI [W3C.REC-xml-names-19990114] for the XML
   elements and attributes described in this specification is:
     http://purl.org/syndication/rank/1.0


Is this an appropriate namespace for an IETF document?


3.  Ranking Domains and Schemes

   A "Ranking Domain" is a uniquely identifiable logical set of entries
   with associated numeric ranking values.


I can parse the sentence, but I don't understand what this means.

   A "Ranking Scheme" identifies specific rules on how to interpret the
   numeric ranking values within one or more "Ranking Domains".


Same here. I suggest avoiding the new terminology. A more
plain-spoken, concrete, operational approach would work better.

3.1.  The 'r:scheme' Element

   Ranking Schemes are defined using the r:scheme element.  A scheme
   includes zero or more r:value and r:range elements that define the
   set of possible values for the Ranking Scheme.

     rankingScheme = element r:scheme {
       atomCommonAttributes,
       attribute name { IRI }?,
       attribute label { text }?,
       attribute significance { 'ascending' | 'descending' }?,
       ( value | range )*
     }

   The "name" attribute provides a universally unique identifier for the
   scheme in the form of an absolute IRI.


This conflicts with the atom:author field. Why not @id, @domain, etc.?

   The "label" attribute specifies a Language-Sensitive, human-readable
   label for the scheme.


OK...

   The "significance" attribute indicates how implementations are to
   interpret the significance of a numeric ranking value.  A value of
   "descending" indicates that the significance of the rank decreases as
   the numeric ranking value increases.  A value of "ascending"
   indicates that the significance of the rank increases as the numeric
   ranking value increases.  If not specified, the significance is
   considered to be "ascending".


@significance is also confusing. Suggest changing to @order or similar.
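
To show why the name trips me up, here is a minimal Python sketch of
how I read the attribute. The function name is made up, and putting
the most significant entry first is my assumption; the draft only
says how significance relates to the number, not how to present it.

```python
def most_significant_first(values, significance="ascending"):
    """Order rank values so the most significant one comes first.

    Hypothetical helper: the draft does not define a presentation order.
    """
    if significance not in ("ascending", "descending"):
        raise ValueError(f"unknown significance: {significance!r}")
    # "ascending": a bigger number is MORE significant, so sort high-to-low.
    # "descending": a bigger number is LESS significant, so sort low-to-high.
    return sorted(values, reverse=(significance == "ascending"))
```

Note that "ascending" ends up producing a numerically descending
list under this reading, which is exactly the confusion.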

   An Atom feed element MAY contain any number of r:scheme elements.  A
   feed MUST NOT contain more than one r:scheme element with the same
   name.


Well, you have all of these child elements that can change the effect
of the element, so why would you want to ban this behavior?

     <feed xmlns="http://www.w3.org/2005/Atom"
           xmlns:r="http://purl.org/syndication/rank/1.0">
       ...
       <r:scheme
         name="tag:example.org,2006:movie_reviews"
         xml:lang="en-us"
         label="Five Star Reviews"
         significance="ascending">
         <r:range scale="1"
                  step="0.5"
                  minimum="0.0"
                  maximum="5.0" />
       </r:scheme>
       ...
     </feed>


This example is confusing because you haven't defined r:range yet. I
also suggest grouping some attributes on the same line, if possible.

3.1.1.  The 'r:value' and 'r:range' Elements

   A Ranking Scheme is defined by a collection of zero or more r:value
   and r:range elements that constrain the set of values considered
   significant by the Scheme.

   The value element defines a discreet decimal value.  The element's
   content value MUST NOT contain and leading or trailing whitespace.


'discreet decimal value' is underspecified. UPDATE: I see you define
it later. I suggest using xsd:decimal in the RNC and calling out XML
Schema Datatypes at this point.

     value = element r:value {
       atomCommonAttributes,
       attribute label { text }?,
       attribute scale { decimal }?,
       ( decimal )
     }

   The value element is useful for defining Ranking Schemes consisting
   of a set of absolute values as in the example below,


This seems really complicated to me. It's not clear to me how an
application would apply this data, though I admit to having a fuzzy
notion.

   The range element defines a range of decimal values that MAY be
   bounded by minimum and maximum values.


Explain the consequences of including or omitting a boundary.
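
For instance, one plausible reading is that an omitted boundary
leaves that side of the range open. A small Python sketch of that
reading follows; the function name is made up, and both the
open-ended behavior and the inclusive bounds are my assumptions, not
something the draft states.

```python
from decimal import Decimal


def in_range(value, minimum=None, maximum=None):
    """Return True if value falls inside the range.

    Assumptions (not from the draft): a missing boundary means that side
    is unbounded, and the boundaries themselves are inclusive.
    """
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True
```

If the draft means something else, say exclusive bounds, that is
precisely what the text should spell out.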


   The "scale" attribute on both the value and range elements specifies
   the total number of decimal digits to the right of the decimal
   indicator in the value of the numeric ranking value.  The scale is


I don't understand this.


   expressed as a non-negative integer.  If not specified, the value is
   considered to be zero.  Ranking Schemes that are based on fractional
   numeric ranking values SHOULD specify a scale.  Numeric ranking
   values that use a larger scale than defined for the scheme MUST be
   rounded to the nearest in-scale value (e.g. with scale=2, the rank
   0.123 is rounded down to 0.12, the rank 0.125 is rounded up to 0.13.)
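
If I read the rounding rule right, it amounts to something like the
Python sketch below. The function name is made up, and round-half-up
is my assumption based on the 0.125 example; the draft does not say
what happens to ties on negative values.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_scale(rank, scale=0):
    """Round a rank to `scale` digits after the decimal point.

    Assumes ties round half up (away from zero for negatives), which
    matches the draft's 0.125 -> 0.13 example but is otherwise a guess.
    """
    if scale < 0:
        raise ValueError("scale must be a non-negative integer")
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives 0.01
    return rank.quantize(quantum, rounding=ROUND_HALF_UP)


print(round_to_scale(Decimal("0.123"), 2))  # 0.12
print(round_to_scale(Decimal("0.125"), 2))  # 0.13
```

Using Decimal rather than float matters here: binary floats cannot
represent 0.125-style ties reliably in general.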

...

   The "step" attribute specifies the minimum significant increment for

...

   The "origin" attribute specifies the base from which steps in a range

...

   Ranges and values defined within a Scheme MUST NOT overlap one
   another.


This section is really, really dense. I have a hard time understanding
any of it. Perhaps some Excel-like ASCII table illustrations would
help.

    rankingValue = element r:rank {
       atomCommonAttributes,
       attribute domain { IRI }?,
       attribute scheme { IRI }?,
       attribute label { text }?,
       { decimal }
     }


Given this definition, it's not clear to me why the feed-level stuff
is necessary. Is the idea that clients have to know that to edit?

   Ranking Domains group entries with attached numeric ranking values to
   logical sets.  Ranking Domains are uniquely identified by IRIs.


"Ranking Domains" come back to haunt us. I'm lost here.

   Domain Scopes SHOULD be considered open sets consistings of entries
   from any number of feeds.


More strange terminology.

   Processing a Ranking Domain to produce an ordered set involves the
   following steps:
   o  Select the Ranking Scheme.
   o  Identify the Ranking Domain
   o  Identify the available set of entries containing numeric ranking
      values within the identified Ranking Domain using the selected
      Ranking Scheme.
   o  Remove from the set all entries whose rankings fall outside the
      minimum and maximum values set by the selected Ranking Scheme.
   o  Sort the remaining set of ranked entries according to the
      significance and step of the numeric ranking as defined by the
      Ranking Scheme.


Ah, some enlightenment here. You definitely need to get this text up
front, expand it, and intersperse this text with illustrations. This
is the text that tells implementers what they're supposed to do.
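
As an example of the kind of illustration I mean, the steps above
could be sketched in Python roughly as follows. The dict layout and
the function name are made up for illustration, inclusive bounds are
my assumption, and I have left "step" out because I can't tell from
the text how it affects sorting.

```python
from decimal import Decimal


def order_entries(entries, domain, scheme):
    """Return the entries of one domain, most significant first.

    entries: dicts with 'id', 'rank' (Decimal), 'domain', 'scheme' keys.
    scheme:  dict with 'name', optional 'significance', 'minimum', 'maximum'.
    Both layouts are hypothetical, not defined by the draft.
    """
    # Steps 1-3: keep entries ranked in this domain under this scheme.
    ranked = [
        e for e in entries
        if e.get("domain") == domain and e.get("scheme") == scheme["name"]
    ]
    # Step 4: drop ranks outside the scheme's bounds (assumed inclusive).
    lo, hi = scheme.get("minimum"), scheme.get("maximum")
    kept = [
        e for e in ranked
        if (lo is None or e["rank"] >= lo) and (hi is None or e["rank"] <= hi)
    ]
    # Step 5: sort by significance; "ascending" is the stated default.
    reverse = scheme.get("significance", "ascending") == "ascending"
    return sorted(kept, key=lambda e: e["rank"], reverse=reverse)
```

With the five-star scheme from the earlier example, an entry ranked
7.0 would be dropped and the rest listed from highest to lowest.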

   Feeds MAY contain ranked entries that have no specified scheme.  In
   such cases the Default Ranking Scheme should be applied.


This seems like overkill.

   The Default Ranking Scheme assumes ascending significance and a
   single range with no minimum or maximum value, no significant step,
   unspecified scale, and an origin of 0.


Oh wait, is this the introduction of this term? It doesn't seem useful
to me. I suggest text along the lines of "If there is no scheme
specified, [explain default processing]." No terminology necessary.

8.  Well-Known Ranking Schemes

   Feeds MAY contain ranked entries whose ranking scheme cannot be
   resolved (i.e., no r:scheme with a "name" attribute matching the
   rankings "scheme" attribute can be found).  In such cases software
   implementations MAY attempt to match such rankings to well-known
   schemes.  For instance, an online search engine may choose to define
   a ranking scheme that is reflective of the relevance of a given
   result to a search query; rather than require that a r:scheme element
   be included in every feed where the Ranking Scheme may be used, the
   search engine may separately publish its Ranking Scheme and
   associated Ranking Domain.  (The format of such a publication is
   beyond the scope of this specification.)

   A hypothetical search engine ranking using a well-known scheme

     <r:rank scheme="http://search.example.org/relevance">5</r:rank>

   If a Ranking Scheme cannot be resolved this way (e.g., no r:scheme
   with a matching "name" attribute can be found and the scheme is not
   well-known), the Default Ranking Scheme should be applied.

   Further, it is possible that a processor may resolve multiple Ranking
   Schemes for a given Ranking.  For instance, a feed may contain an
   "r:scheme" that redefines a scheme well-known to the processor.  In
   such cases, processors should issue a warning to the user.


This capability seems like total overkill.
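
To show how much machinery this implies, here is a rough Python
sketch of the lookup as I understand it. All names are made up, the
scheme dicts are a hypothetical layout, and letting the feed's own
r:scheme win over a well-known one is my assumption; the draft only
says to warn.

```python
import warnings

# Default per the draft: ascending, unbounded, no step, origin of 0.
DEFAULT_SCHEME = {
    "name": None,
    "significance": "ascending",
    "minimum": None,
    "maximum": None,
    "origin": 0,
}


def resolve_scheme(scheme_iri, feed_schemes, well_known_schemes):
    """Pick the scheme that applies to a rank's 'scheme' attribute.

    feed_schemes / well_known_schemes: dicts keyed by scheme IRI
    (hypothetical layout). Feed-local winning is an assumption.
    """
    local = feed_schemes.get(scheme_iri)
    known = well_known_schemes.get(scheme_iri)
    if local is not None and known is not None:
        # The draft asks processors to warn when more than one scheme resolves.
        warnings.warn(f"feed redefines well-known scheme {scheme_iri!r}")
    if local is not None:
        return local
    if known is not None:
        return known
    return DEFAULT_SCHEME
```

That is three lookup paths plus a warning path for a single
attribute, which is what I mean by overkill.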


   Because this specification defines an extension to the Atom
   Syndication Format [RFC4287], it is subject to the same security
   consideration as defined in section 8 of that specification.

Appendix A.  Acknowledgements

   The authors gratefully acknowledge the feedback from the Atom
   Publishing working group during the development of this
   specification.


You should also acknowledge verbatim copies of text from RFC4287.



--

Robert Sayre

"I would have written a shorter letter, but I did not have the time."
