Andreas Sewe wrote:
> 
> Lisa Dusseault wrote:
>> For related work, you could look at the email spam filtering stuff 
>> from SIEVE: <http://www.ietf.org/internet-drafts/draft-ietf-sieve-
>> spamtestbis-02.txt>
> 
> Thanks for the pointer. I was aware of SIEVE's Filtering Extension but
> didn't file it mentally as related to the Atom Rank Extension.
> 

Yes, I had reviewed SIEVE's mechanism a while back.  In fact, one of the
use cases I had originally considered for feed rank is that the r:rank
element could be used by syndication intermediaries to express spam
rankings for feed entries...

e.g.

  <scheme name="tag:example.org,2006:spamtest"
          significance="ascending">
    <range minimum="0" maximum="100" step="1" />
  </scheme>

  <rank scheme="tag:example.org,2006:spamtest"
        label="Not Spam">0</rank>
  <rank scheme="tag:example.org,2006:spamtest"
        label="Might be spam">25</rank>
  <rank scheme="tag:example.org,2006:spamtest"
        label="Likely spam">50</rank>
  <rank scheme="tag:example.org,2006:spamtest"
        label="Spam">100</rank>


>> The similar problem is that many spam libraries already produce some 
>> kind of linear or similar scale of severity/likelihood rating for 
>> emails, much like existing services already provide their own scale 
>> of post rankings.  The approach SIEVE took, very roughly, was simply 
>> to tell implementors to find some way to map their implementation's 
>> ranking scheme to a canonical range of numerical values.  A SIEVE 
>> implementation might have one algorithm for converting SpamAssassin 
>> rankings to the canonical scale, and a different algorithm for some 
>> different library.
> 
> Unfortunately this approach does not suit ranking entries very well; the
> only "canonical range" I can think of being a superset of every
> conceivable Ranking Scheme is xsd:decimal. Smaller sets simply won't do.
> (While loss of information might be acceptable when classifying spam,
> other ranking use cases, e.g., grades, are less tolerant to it.)
> 

Right.  One of the goals for this work is to allow for as broad a
range of ranking scenarios as possible. No single canonicalization
scheme is capable of covering such a range.

>[snip]
> But even when dropping r:scheme's capability to define a set of allowed
> values having an r:scheme-like element around would still be useful:
> 
> - It provides a single place to specify a scheme's @significance
> (ascending or descending). Otherwise each and every r:rank element needs
> to carry its own @significance -- which is especially pointless since
> @significance has no significance (no pun intended) for a single
> ranking value; it only ever makes sense for a set of values.
> 

One point that I really do need to make clear in the draft is that ranks
can be processed relative to ranking schemes other than the one
identified in the rank element's scheme attribute.

For instance, going back to the spam example, it is entirely possible
that I, as a client, could define a variant of the spamtest ranking
scheme that is much more restrictive in range than the generic scheme
shown above.

  <scheme name="tag:example.com,2006:myspamtest"
          significance="ascending">
    <range minimum="0" maximum="50" step="1" />
  </scheme>

The set of entries produced by processing the spamtest rankings
against this scheme would automatically exclude all entries with a
spamtest rank greater than 50.
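
As a rough sketch of that client-side processing (not anything the draft
specifies), the filtering could look something like the Python below. The
Range and Entry types, the filter_entries function, and the sample titles
are all hypothetical names made up for illustration; the step attribute
is ignored here, and only the minimum/maximum bounds of the myspamtest
scheme are applied.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Range:
    """Hypothetical stand-in for a scheme's <range> element (step omitted)."""
    minimum: Decimal
    maximum: Decimal


@dataclass(frozen=True)
class Entry:
    """Hypothetical feed entry carrying a single spamtest rank value."""
    title: str
    spamtest_rank: Decimal


def filter_entries(entries, rng):
    """Keep only entries whose rank lies inside the range, bounds inclusive."""
    return [e for e in entries if rng.minimum <= e.spamtest_rank <= rng.maximum]


# The client's narrower scheme: minimum="0" maximum="50"
MYSPAMTEST = Range(Decimal("0"), Decimal("50"))

# Entries ranked by an intermediary under the generic 0..100 spamtest scheme
entries = [
    Entry("not spam", Decimal("0")),
    Entry("might be spam", Decimal("25")),
    Entry("likely spam", Decimal("50")),
    Entry("spam", Decimal("100")),
]

kept = filter_entries(entries, MYSPAMTEST)
kept_titles = [e.title for e in kept]
```

The ranks themselves are untouched; only the scheme they are evaluated
against changes, which is the point I want the draft to make.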

>[snip]
> To summarize, dropping r:scheme's datatyping capabilities might be
> acceptable, but dropping the concepts of Ranking Scheme and Domain is
> not. I would, however, like to see datatyping being included as well --
> iff we can come up with a solution which can describe all the common
> use cases out there!
> 

+1

- James
