date:20050504

Re: Autodiscovery

2005-05-04 Thread fantasai

Phil Ringnalda wrote:

 Arve Bersvendsen wrote:

  On Tue, 03 May 2005 18:52:59 +0200, Tim Bray [EMAIL PROTECTED] wrote:

   http://diveintomark.org/rfc/draft-ietf-atompub-autodiscovery-01.txt

  1) Change the attribute value for the rel from alternate to feed,

 Don't forget, since you would be doing that primarily for people who
 think too much, that you'll also need to include a profile [1] URI and
 make a guess at what dereferencing that URI ought to return, and
 probably take a stab at explaining how to deal with multiple profiles,
 since the HTML spec punted on that.

This would not be necessary if 'feed' were added to the HTML standard
directly.

 Popularizing feed would have one benefit outside Atom's scope, though:
 there's currently no useful way for an RSS 1.0 feed to do autodiscovery
 with type=application/rdf+xml since it could be any alternate RDF, not
 just RSS: if Atom breaks the ice with feed then RSS 1.0 wins.

'feed' is not really defining a /relation/, it's defining a sort of
meta-content-type... But I would much prefer that to forcing 'alternate'
on non-'alternate' links.

~fantasai
(Copying to WHATWG mailing list: http://www.whatwg.org/ )

Re: Autodiscovery

2005-05-04 Thread Henri Sivonen

On May 4, 2005, at 02:56, David Nesting wrote:

 Plus, feed is kind of application-specific.  What about "related"?

It's a spec for discovering *feeds*. It is proper to have an
app-specific rel value to avoid feed-specific apps downloading non-feed
"related" documents.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: Last Call: 'The Atom Syndication Format' to Proposed Standard

2005-05-04 Thread Henri Sivonen

On Apr 29, 2005, at 12:17, Martin Duerst wrote:

 Making this more precise is definitely desirable. But there is also
 an i18n issue: This works fine for languages that use spaces between
 words. It doesn't work for languages that don't have spaces between
 words (Chinese, Japanese, Thai,...). If Text elements are only used
 for short things such as names or titles, that's not a big issue,
 the text in question can just be put on a single line. However,
 when the texts in question are long, it's a serious issue, and
 should be fixed.

You seem to be assuming that the length of a line is restricted in
XML source. Why? As far as I can tell, it should be permissible to
produce Atom documents that contain no LF or CR characters.

Can't languages without spaces use long source lines and apply soft 
wrapping in a source view if necessary? Why is this a wire format 
problem?

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: Autodiscovery

2005-05-04 Thread Eric Scheid


On 4/5/05 3:52 PM, fantasai [EMAIL PROTECTED] wrote:

 'feed' is not really defining a /relation/, it's defining a sort of
 meta-content-type... But I would much prefer that to forcing 'alternate'
 on non-'alternate' links.

instead of "feed", consider "updates", which gets closer to the gist of the
sense

e.

Re: AutoDiscovery

2005-05-04 Thread Anne van Kesteren

Randy Charles Morin wrote:

 +1 to adding lang as an attribute to link
 thanks Robert
 <link lang='en' ...>

The HTML and XHTML specifications already define that.

--
 Anne van Kesteren
 http://annevankesteren.nl/

Re: Atom feed refresh rates

2005-05-04 Thread Julian Reschke

Brett Lindsley wrote:

 Andy, I recall bringing up the same issue with respect to portable
 devices. My angle was that firing up the transmitter, making a network
 connection and connecting to the server is still an expensive operation
 in time and power (for a portable device) - even if the server returns
 nothing. There is no reason to check feeds that are not being updated,
 but then, there currently is no way to know this.

 I recall there was a proposal on cache control. That seemed like a good
 direction, but I don't recall it being discussed. As you indicated, if
 the feed had some element that indicated it won't be updated (for
 example) for another day (e.g. a daily news summary), then the end
 client would need to only check once a day.

 Brett Lindsley, Motorola Labs

Isn't this what the HTTP Expires header is for
(http://greenbytes.de/tech/webdav/rfc2616.html#header.expires)?

Best regards, Julian
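
A minimal sketch of what Julian's suggestion could look like on the client side, assuming an aggregator that keeps the response headers of its last fetch in a plain dict. The function name `next_poll_time` and the one-hour fallback are illustrative assumptions, not something specified in the thread or in RFC 2616.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def next_poll_time(headers, now, default_interval=timedelta(hours=1)):
    """Return the earliest time a client should re-fetch a feed.

    `headers` is a dict of HTTP response headers from the last fetch.
    If Expires is missing, unparseable, or already in the past, fall back
    to `now + default_interval` (an assumed client-side default).
    """
    raw = headers.get("Expires")
    if raw:
        try:
            expires = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            expires = None  # e.g. the common invalid value "0"
        if expires is not None:
            if expires.tzinfo is None:
                # Treat a date without a usable zone as UTC.
                expires = expires.replace(tzinfo=timezone.utc)
            if expires > now:
                return expires
    return now + default_interval
```

As Andy notes in his reply, this only helps when the publisher actually knows when the next update will happen and sends a meaningful Expires value.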

Re: Atom feed refresh rates

2005-05-04 Thread Brett Lindsley

In reviewing the protocol spec (and the basic protocol spec), there is
no mention of recommended HTTP headers. There are examples in the basic
protocol spec that show ETag and Last-Modified but not Expires. Maybe
there should be a section in the protocol spec showing recommended
headers (a SHOULD) for use with Atom feeds. This would encourage the use
of these three headers.

Brett Lindsley, Motorola Labs.

Julian Reschke wrote:

 Brett Lindsley wrote:

  Andy, I recall bringing up the same issue with respect to portable
  devices. My angle was that firing up the transmitter, making a network
  connection and connecting to the server is still an expensive operation
  in time and power (for a portable device) - even if the server returns
  nothing. There is no reason to check feeds that are not being updated,
  but then, there currently is no way to know this.

  I recall there was a proposal on cache control. That seemed like a good
  direction, but I don't recall it being discussed. As you indicated, if
  the feed had some element that indicated it won't be updated (for
  example) for another day (e.g. a daily news summary), then the end
  client would need to only check once a day.

  Brett Lindsley, Motorola Labs

 Isn't this what the HTTP Expires header is for
 (http://greenbytes.de/tech/webdav/rfc2616.html#header.expires)?

 Best regards, Julian

RE: Atom feed refresh rates

2005-05-04 Thread Andy Henderson


 Isn't this what the HTTP Expires header is for
 (http://greenbytes.de/tech/webdav/rfc2616.html#header.expires)?

I don't think this helps a lot with my original issue because in many cases
a feed's updater will either not know when they will next update the feed,
or will be updating the feed frequently throughout the day.

Andy

Re: Atom feed refresh rates

2005-05-04 Thread Graham

On 4 May 2005, at 9:10 am, Andy Henderson wrote:

 I am adding Atom support to my Agg.  For RSS feeds, I have used the
 ttl and sy:updatePeriod / sy:updateFrequency elements to allow feed
 providers to limit refresh rates.

Why?

 I have, in any case, imposed a minimum refresh rate of one hour -
 because that seemed the decent thing to do.

This is a myth perpetuated by cheapskate bloggers. There's no
technical reason for it beyond "I bought a lousy hosting package."

 However, I'm coming under pressure to reduce that minimum limit for
 feeds that are clearly designed for shorter refresh periods - such as
 the Gmail Atom feeds.  I'm reluctant to implement a free-for-all so I'm
 looking for guidance on how I should tackle this issue.

Keep the global setting for all feeds limited to 60 (or 30) minutes,
but allow the setting for individual feeds to be set lower.

Graham
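
Graham's suggestion can be sketched roughly as follows. This is an illustration only, not code from any aggregator mentioned in the thread: the names `GLOBAL_FLOOR_MINUTES`, `HARD_FLOOR_MINUTES` and `effective_interval` are hypothetical, and the five-minute hard floor is an added assumption so that a per-feed override does not turn into a free-for-all.

```python
GLOBAL_FLOOR_MINUTES = 60  # floor for the "all feeds" setting (60 or 30 in the thread)
HARD_FLOOR_MINUTES = 5     # assumption: lowest value a per-feed override may take


def effective_interval(global_setting, per_feed_override=None):
    """Return the polling interval, in minutes, to use for one feed."""
    if per_feed_override is not None:
        # An explicit per-feed setting may go below the global floor,
        # but never below the hard floor.
        return max(per_feed_override, HARD_FLOOR_MINUTES)
    # Otherwise the global setting applies, clamped to the global floor.
    return max(global_setting, GLOBAL_FLOOR_MINUTES)
```

With these assumed values, a user who sets the global interval to 30 minutes still polls hourly, while a Gmail feed explicitly set to 10 minutes is polled every 10 minutes.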

Re: Atom feed refresh rates

2005-05-04 Thread Julian Reschke

Andy Henderson wrote:

  Isn't this what the HTTP Expires header is for
  (http://greenbytes.de/tech/webdav/rfc2616.html#header.expires)?

 I don't think this helps a lot with my original issue because in many cases
 a feed's updater will either not know when they will next update the feed,
 or will be updating the feed frequently throughout the day.

If they don't know that, how can the previous response you got help you
in determining when to poll next?

Best regards,
Julian

Re: Atom feed refresh rates

2005-05-04 Thread Graham

On 4 May 2005, at 11:44 am, Brett Lindsley wrote:

 There is no reason to check feeds that are not being updated, but
 then, there currently is no way to know this.

plug plug: http://www.fondantfancies.com/apps/shrook/distfaq.php

 As you indicated, if the feed had some element that indicated it won't
 be updated (for example) for another day (e.g. a daily news summary),
 then the end client would need to only check once a day.

Please don't confuse bandwidth (number of posts per day) with latency
(checking rate). They're largely unrelated. You could only check once
per day if the daily summary appeared at an exact, known time, was
never late, and was never updated later.

Graham

Re: Autodiscovery

2005-05-04 Thread Antone Roundy

On Tuesday, May 3, 2005, at 11:41  PM, fantasai wrote:
David Nesting wrote:
I expect that many of my implementations will utilize content 
negotiation
(using the same URL as an HTML representation, where needed), so I 
expect
that I'll have some links like:
  <link rel="alternate" href="/" type="application/atom+xml">
  <link rel="alternate" href="/" type="application/rss+xml">
Or even
  <link rel="alternate" href="" type="application/atom+xml">
  <link rel="alternate" href="" type="application/rss+xml">
That won't work, because content negotiation will continue to return
the same thing it returned just now. You must somehow tell the server
to return a specific other version of the current document, and you
do that typically by sending a GET request with a different URL --
one that specifies a particular version of the resource.
See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14
GET /path-to-the-feed HTTP/1.1
Accept: application/atom+xml
...
You don't have to change the URL--just list only the format you want in 
the Accept header.  If the autodiscovery link was lying/mistaken and 
that format really isn't available at that URL, you should get a 406 
(not acceptable) response.
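
The request Antone describes could be issued along these lines. This is a sketch using Python's standard `urllib`; the helper names `build_feed_request` and `fetch_feed` are hypothetical, and it assumes the server honours the Accept header and answers 406 when the requested format is not available at that URL.

```python
import urllib.error
import urllib.request


def build_feed_request(url, media_type="application/atom+xml"):
    """Build a GET request that lists only the wanted format in Accept."""
    return urllib.request.Request(url, headers={"Accept": media_type})


def fetch_feed(url, media_type="application/atom+xml"):
    """Return the response body, or None if the server answers 406."""
    request = build_feed_request(url, media_type)
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 406:  # Not Acceptable: format unavailable at this URL
            return None
        raise
```

A server that ignores Accept will return its default representation instead of a 406, so a careful client would also check the Content-Type of what comes back.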

Re: Autodiscovery

2005-05-04 Thread Robert Sayre


On 5/4/05, fantasai [EMAIL PROTECTED] wrote:
 
  Who's to say we can't overload it a little for this case?
 
 You are not writing the HTML 4.01 spec, you're writing an autodiscovery
 spec that takes advantage of the syntax *and semantics* given in HTML 4.
 Your specification should be consistent with HTML 4, not contradictory
 to it.

The autodiscovery spec is a reasonable interpretation of the *one
line* definition of the 'alternate' relation. It is not contradictory.

Robert Sayre

Re: Atom feed refresh rates

2005-05-04 Thread Walter Underwood


PaceCaching uses the HTTP model for Atom, whether Atom is used over HTTP
or some other protocol.

PaceCaching was rejected by the editors because it was too late (two months
ago) and non-core. I think that: a) it is never too late to get it right,
and b) scalability is core.

The PACE describes why refresh rates do not solve the problem adequately.

wunder

--On May 4, 2005 5:44:18 AM -0500 Brett Lindsley [EMAIL PROTECTED] wrote:

 
 
 Andy, I recall bringing up the same issue with respect to portable devices.
 My angle was that firing up the transmitter, making a network connection and
 connecting to the server is still an expensive operation in time and power
 (for a portable device) - even if the server returns nothing. There is no
 reason to check feeds that are not being updated, but then, there currently
 is no way to know this.

 I recall there was a proposal on cache control. That seemed like a good
 direction, but I don't recall it being discussed. As you indicated, if the
 feed had some element that indicated it won't be updated (for example) for
 another day (e.g. a daily news summary), then the end client would need to
 only check once a day.
 
 Brett Lindsley, Motorola Labs
 
 Andy Henderson wrote:
 
 If I'm asking this in the wrong place, sorry; please redirect me if you can.
 
 I am the author of an Aggregator and I'm looking for advice on refresh
 rates.  There was some discussion in this group back in June about a
 possible 'Refresh rate' element.  That seems to have been dismissed in
 favour of bandwidth throttling techniques, notably etag, last-modified and
 compression.  I already support all these plus some additional ones.  I am
 uncomfortable, though, with the implication that refresh rates don't matter
 and should be left to the end-user to decide.
 
 I am adding Atom support to my Agg.  For RSS feeds, I have used the ttl and
 sy:updatePeriod / sy:updateFrequency elements to  allow feed providers to
 limit refresh rates.  I have, in any case, imposed a minimum refresh rate of
 one hour - because that seemed the decent thing to do.  However, I'm coming
 under pressure to reduce that minimum limit for feeds that are clearly
 designed for shorter refresh periods - such as the Gmail Atom feeds.  I'm
 reluctant to implement a free-for-all so I'm looking for guidance on how I
 should tackle this issue.
 
 Andy Henderson
 Constructive IT Advice
 
  
 
 
 
 



--
Walter Underwood
Principal Architect, Verity

Re: Atom feed refresh rates

2005-05-04 Thread Robert Sayre


On 5/4/05, Walter Underwood [EMAIL PROTECTED] wrote:
 
 PaceCaching uses the HTTP model for Atom, whether Atom is used over HTTP
 or some other protocol.
 
 PaceCaching was rejected by the editors because it was too late (two months
 ago) and non-core.

In this WG, the editors don't reject proposals or schedule issues.
Those tasks fall to the chairs and secretary, respectively.

Robert Sayre

Re: Atom feed refresh rates

2005-05-04 Thread Eric Scheid


On 5/5/05 12:44 AM, Graham [EMAIL PROTECTED] wrote:

 uses 3GB a day, or about $1.20 at current prices.

only in some parts of the world.

over here I'm paying 13.2 cents per K and reading from a recent bill
2,982.61 Kbytes cost me $393.79 AUD.

e.

Re: Autodiscovery

2005-05-04 Thread Joe Gregorio


On 5/4/05, Robert Sayre [EMAIL PROTECTED] wrote:
 
 On 5/4/05, fantasai [EMAIL PROTECTED] wrote:
 
   Who's to say we can't overload it a little for this case?
 
  You are not writing the HTML 4.01 spec, you're writing an autodiscovery
  spec that takes advantage of the syntax *and semantics* given in HTML 4.
  Your specification should be consistent with HTML 4, not contradictory
  to it.
 
 The autodiscovery spec is a reasonable interpretation of the *one
 line* definition of the 'alternate' relation. It is not contradictory.

+1

-- 
Joe Gregorio        http://bitworking.org

Re: Atom feed refresh rates

2005-05-04 Thread Tim Bray

On May 4, 2005, at 7:44 AM, Graham wrote:

 A quick look at that site turned up only one other site actually
 complaining, MSDN, and they changed their minds:

Actually, as I recall, last time this came up (proposed by Walter
Underwood), someone pointed out accurately that RSS2 has had this
functionality for a long time and that nobody ever really implemented
it; thus there was a strong vote from experience against such a
feature. -Tim

Re: Atom feed refresh rates

2005-05-04 Thread Roger B.


 This is a myth perpetuated by cheapskate bloggers. There's no
 technical reason for it beyond "I bought a lousy hosting package."

Graham: I disagree. In a time where referrer and trackback spam agents
are hammering servers everywhere, it's quite reasonable for aggregator
developers to exhibit restraint and not add to the burden that the
blogosphere has unintentionally created.

That's not to say that there's something necessarily wrong with an
aggregator that allows users to pull feeds every five minutes. If
you're building something for people who are going to be subscribing
to Gmail feeds or referrer logs (I'm subscribed to both in
Newzcrawler), then you have to cater to those needs. The most anyone
can ask is that you provide reasonable defaults and leave it at that.

But I've got my own code set to limit refreshes to an hour or more,
and don't foresee changing it. It's the right thing for *me* to do.

--
Roger Benningfield

Re: Autodiscovery

2005-05-04 Thread fantasai

Arve Bersvendsen wrote:

 On Wed, 04 May 2005 09:43:38 +0200, Eric Scheid
 [EMAIL PROTECTED] wrote:

  instead of feed, consider updates, which gets closer to the gist
  of the sense

 No. To me 'Updates' signifies that something is 'updated'. Even posting
 new content falls outside of that definition.

That would signify updates to my document. If I'm linking to the
CNN news feed, or my-favorite-blog, that wouldn't be appropriate.
For this purpose, the syntax needs to signify that this is a feed,
that it needs to be handled as such.. and that there is no other
significant relationship between the document and the feed it links
to (unless otherwise specified).

~fantasai

Re: Autodiscovery

2005-05-04 Thread fantasai

Robert Sayre wrote:

 On 5/4/05, fantasai [EMAIL PROTECTED] wrote:

   Who's to say we can't overload it a little for this case?

  You are not writing the HTML 4.01 spec, you're writing an autodiscovery
  spec that takes advantage of the syntax *and semantics* given in HTML 4.
  Your specification should be consistent with HTML 4, not contradictory
  to it.

 The autodiscovery spec is a reasonable interpretation of the *one
 line* definition of the 'alternate' relation. It is not contradictory.

The definition of 'alternate' is not one line long on my screen, but
here's the first sentence of it:

 # Alternate
 #   Designates substitute versions for the document in which the link occurs.

 -- http://www.w3.org/TR/REC-html40/types.html#h-6.12

How is a link from the top of my homepage to my friend's weblog feed
designating a substitute version for the document in which the link
occurs?

Note that we are not arguing the semantics of the link element in
an Atom document, but the semantics of the link element in an HTML
document.

~fantasai

Re: Autodiscovery

2005-05-04 Thread Robert Sayre


On 5/4/05, fantasai [EMAIL PROTECTED] wrote:
 
 The definition of 'alternate' is not one line long on my screen, but
 here's the first sentence of it:
 
   # Alternate
   #   Designates substitute versions for the document in which the link 
 occurs.
 
   -- http://www.w3.org/TR/REC-html40/types.html#h-6.12
 
 How is a link from the top of my homepage to my friend's weblog feed
 designating a substitute version for the document in which the link
 occurs?

I don't know, but I'm not sure why you think that's what the
autodiscovery spec endorses. Is there some part of the spec that
endorses that? The autodiscovery spec is for use by UAs like Mozilla
and Safari that present little icons alerting the user to a feed
version of a page. Often, I never visit the page again, once I've
subscribed to the feed. The feed is a substitute.

 Note that we are not arguing the semantics of the link element in
 an Atom document, but the semantics of the link element in an HTML
 document.

Yes, I caught that.

Robert Sayre

Re: Autodiscovery

2005-05-04 Thread Eric Scheid


On 4/5/05 11:11 PM, Robert Sayre [EMAIL PROTECTED] wrote:

 The autodiscovery spec is a reasonable interpretation of the *one
 line* definition of the 'alternate' relation.

how is a feed of recent entries a substitute version for the document in
which the link occurs when that document is some blog post long since
dropped out of the feed?

 Alternate
 Designates substitute versions for the document in which the link occurs. When
 used together with the lang attribute, it implies a translated version of the
 document. When used together with the media attribute, it implies a version
 designed for a different medium (or media).

Re: Autodiscovery

2005-05-04 Thread Thomas Broyer

Robert Sayre wrote:

 The autodiscovery spec is a reasonable interpretation of the *one
 line* definition of the 'alternate' relation. It is not contradictory.

But a feed is not a substitute version of an archive page as most
archived entries are not in the feed anymore.

That said, I'm totally in favor of using rel="alternate" to link to a
feed from the _alternate_ HTML version.

From an archive page, you should rather use rel="start".

Actually, here's my view of those things:

In the latest news page (generally the homepage for a weblog):

  <link rel="alternate" type="application/atom+xml" href="feed.atom">

In a category page:

  <link rel="start" type="text/html" href="../index.html">
  <link rel="start" type="application/atom+xml" href="../feed.atom">
  <link rel="alternate" type="application/atom+xml" href="category.atom">
  <link rel="section" type="text/html" href="category/index.html">
  <link rel="section" type="application/atom+xml"
        href="category/category.atom">

In a single-entry archive page:

  <link rel="start" type="text/html" href="../../index.html">
  <link rel="start" type="application/atom+xml" href="../../feed.atom">
  <link rel="section" type="text/html" href="../index.html">
  <link rel="section" type="application/atom+xml" href="../category.atom">
  <!-- no alternate -->

However, is this enough for the autodiscovery purpose?
--
Thomas Broyer
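
To make Thomas's closing question concrete, here is a rough sketch of a client that discovers feeds only through rel="alternate", which is the behaviour the thread attributes to the current draft. The names `FeedLinkFinder`, `find_feeds` and `FEED_TYPES` are hypothetical, and treating application/rss+xml as a feed type alongside Atom is an assumption.

```python
from html.parser import HTMLParser

# Media types treated as feeds (assumption: Atom plus RSS 2.0's common type).
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (type, href) for <link rel="alternate"> elements naming a feed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        # rel is a space-separated list of link types, compared case-insensitively.
        rels = attributes.get("rel", "").lower().split()
        media_type = attributes.get("type", "").strip().lower()
        if "alternate" in rels and media_type in FEED_TYPES:
            self.links.append((media_type, attributes.get("href", "")))


def find_feeds(html_text):
    """Return the feed links advertised in an HTML document."""
    finder = FeedLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.links
```

Run against the single-entry archive page above, such a client finds nothing, because the start and section links are ignored; that is the gap the question points at.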

Re: Autodiscovery

2005-05-04 Thread Roger B.


 how is a feed of recent entries a substitute version for the document in
 which the link occurs when that document is some blog post long since
 dropped out of the feed?

Eric: A devil's advocacy moment... if I change the published date for
the document to today's date, it will suddenly spring forward into my
feed of recent entries. And at some point in the past, it was already
in that feed.

--
Roger Benningfield

Re: Atom feed refresh rates

2005-05-04 Thread Chris DeSalvo

On May 4, 2005, at 3:44 AM, Brett Lindsley wrote:

 Andy, I recall bringing up the same issue with respect to portable
 devices. My angle was that firing up the transmitter, making a network
 connection and connecting to the server is still an expensive operation
 in time and power (for a portable device) - even if the server returns
 nothing. There is no reason to check feeds that are not being updated,
 but then, there currently is no way to know this.

As the author of an aggregator app for a portable wireless device I can
tell you that this is a serious problem for this class of products.  In
my app I've implemented every trick in the book to try and reduce the
amount of data that I have to pull through the radio and parse.  I use
If-None-Match and If-Modified-Since headers in my requests, I support
compression, I respect caching hints from the servers.  It doesn't help
in all cases.  I have 112 feeds loaded up in my aggregator and only 74 of
the servers hosting those feeds ever return a 304.  The rest give me a 200
and gladly hand me everything regardless of whether it has changed or
not.  17 of the servers don't bother supplying an ETag header.

My feed list amounts to about 20 MB of data per day when polling once
per hour.  That is a lot of air time for a small radio, and a lot of time
spent grinding in an XML parser for a small CPU.  This is especially
upsetting because by my measurements only about 2 MB of data is fresh
for any given day.  The main hit is in battery life: the above stats
can trivially knock HOURS off of the life of a small battery.

I've written extensively about this problem here:
http://www.desalvo.org/blog/?p=230
with a real-world example studied here:
http://www.desalvo.org/blog/?p=232

So, I guess I'd like to see an optional update-frequency hint element.

Thanks,
Chris
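
For readers unfamiliar with the conditional requests Chris mentions, a minimal sketch follows. It uses Python's standard `urllib`; the function names `conditional_headers` and `poll` are hypothetical, and the caller is assumed to store the ETag and Last-Modified values from the previous successful fetch.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from the validators saved on the last fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def poll(url, etag=None, last_modified=None):
    """Fetch a feed; return (body, etag, last_modified), body None on 304."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep the stored copy and validators
            return None, etag, last_modified
        raise
```

This saves the transfer and the parse only when the server cooperates; as the figures above show, a server that always answers 200 defeats it regardless of what the client sends.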

Re: Autodiscovery

2005-05-04 Thread Dan Brickley


* Eric Scheid [EMAIL PROTECTED] [2005-05-05 02:35+1000]
 
 On 4/5/05 11:11 PM, Robert Sayre [EMAIL PROTECTED] wrote:
 
  The autodiscovery spec is a reasonable interpretation of the *one
  line* definition of the 'alternate' relation.
 
 how is a feed of recent entries a substitute version for the document in
 which the link occurs when that document is some blog post long since
 dropped out of the feed?

Because the HTML definition is close to meaningless. I can substitute
any document for another, and the 2nd is a substitution not through any 
intrinsic characteristics, but because it was substituted. Many of the 
HTML link type definitions don't bear up under detailed scrutiny...

Dan

Re: Atom feed refresh rates

2005-05-04 Thread Graham

On 4 May 2005, at 7:11 pm, Chris DeSalvo wrote:

 My feed list amounts to about 20 MB of data per day when polling
 once per hour.  That is a lot of air time for a small radio, and a
 lot of time spent grinding in an XML parser for a small CPU.  This is
 especially upsetting because by my measurements only about 2 MB of
 data is fresh for any given day.  The main hit is in battery life:
 the above stats can trivially knock HOURS off of the life of a
 small battery.

So you're saying the first smartphone aggregator that uses a gateway
server to move the heavy lifting off of the device is going to clean
up the market. What's this got to do with Atom?

 So, I guess I'd like to see an optional update-frequency hint element.

Why?

Graham

Re: Autodiscovery

2005-05-04 Thread fantasai

Dan Brickley wrote:

 * Eric Scheid [EMAIL PROTECTED] [2005-05-05 02:35+1000]

  On 4/5/05 11:11 PM, Robert Sayre [EMAIL PROTECTED] wrote:

   The autodiscovery spec is a reasonable interpretation of the *one
   line* definition of the 'alternate' relation.

  how is a feed of recent entries a substitute version for the document in
  which the link occurs when that document is some blog post long since
  dropped out of the feed?

 Because the HTML definition is close to meaningless. I can substitute
 any document for another, and the 2nd is a substitution not through any
 intrinsic characteristics, but because it was substituted. Many of the
 HTML link type definitions don't bear up under detailed scrutiny...

I think you're taking your anarchic interpretation a little too far there.
Especially there: if you read the *spec*, you might notice that the definition
of 'alternate' continues:

  # When used together with the media attribute, it implies a version
  # designed for a different medium (or media).

From section 12.2.4, we also have

  #  The rel attribute specifies the relationship of the linked document
  # with the current document.

So, according to HTML 4.01 -- which is the definitive spec as far as HTML
is concerned -- the following link

   <link rel="alternate" type="application/atom+xml" href="feed.atom">

designates a link to a version of the linking document that is
application/atom+xml.

Again, my friend's blog feed is not an Atom version of /my/ web page;
linking to it as alternate would be wrong.

~fantasai

Re: Atom feed refresh rates

2005-05-04 Thread Chris DeSalvo

On May 4, 2005, at 11:35 AM, Graham wrote:

 On 4 May 2005, at 7:11 pm, Chris DeSalvo wrote:

  My feed list amounts to about 20 MB of data per day when polling once
  per hour.  That is a lot of air time for a small radio, and a lot of
  time spent grinding in an XML parser for a small CPU.  This is
  especially upsetting because by my measurements only about 2 MB of
  data is fresh for any given day.  The main hit is in battery life:
  the above stats can trivially knock HOURS off of the life of a small
  battery.

 So you're saying the first smartphone aggregator that uses a gateway
 server to move the heavy lifting off of the device is going to clean
 up the market. What's this got to do with Atom?

  So, I guess I'd like to see an optional update-frequency hint element.

 Why?

If the feed provided a hint for a reasonable polling frequency, it
would be a plus for limited-resource devices.  I hate to suggest that
the format be changed as a prophylactic measure against bad-citizen
servers, but that is the problem that I have to solve for my platform
and applications.  In case anyone cares, this is for the T-Mobile
Sidekick.  I work at Danger, Inc, the developer of the OS and hardware.
I work on the OS and applications.

-chris

p.s.  And yes, someone providing a good gateway, with a snazzy push
protocol would make my life a lot easier.

Re: Autodiscovery

2005-05-04 Thread Tim Bray

On May 4, 2005, at 11:02 AM, Robert Sayre wrote:

 I think it would be a mistake to see this as an opportunity to invent
 a supremely capable and expressive autodiscovery spec. I've seen
 mozilla, safari, NNW do autodiscovery. I'm sure bots from PubSub,
 Technorati, Yahoo, etc do it as well. We should document what they do,
 and settle on one arbitrary choice where they differ for no good
 reason.

 Mark's draft does an excellent job of documenting that reality.

+1.
It's Good Enough.  -Tim

Re: Autodiscovery

2005-05-04 Thread fantasai

Antone Roundy wrote:
On Tuesday, May 3, 2005, at 11:41  PM, fantasai wrote:
David Nesting wrote:
I expect that many of my implementations will utilize content 
negotiation
(using the same URL as an HTML representation, where needed), so I 
expect
that I'll have some links like:
  <link rel="alternate" href="/" type="application/atom+xml">
  <link rel="alternate" href="/" type="application/rss+xml">
Or even
  <link rel="alternate" href="" type="application/atom+xml">
  <link rel="alternate" href="" type="application/rss+xml">
That won't work, because content negotiation will continue to return
the same thing it returned just now. You must somehow tell the server
to return a specific other version of the current document, and you
do that typically by sending a GET request with a different URL --
one that specifies a particular version of the resource.
See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14
GET /path-to-the-feed HTTP/1.1
Accept: application/atom+xml
...
You don't have to change the URL--just list only the format you want in 
the Accept header.  If the autodiscovery link was lying/mistaken and 
that format really isn't available at that URL, you should get a 406 
(not acceptable) response.
Where does it say that including a 'type' attribute on a link forces
the UA to send a restricted Accept header?
~fantasai

Re: Autodiscovery

2005-05-04 Thread Julian Reschke

Antone Roundy wrote:

 On Wednesday, May 4, 2005, at 12:59  PM, fantasai wrote:

  Again, my friend's blog feed is not an Atom version of /my/ web page;
  linking to it as alternate would be wrong.

 To me, this raises a red flag, suggesting that using an autodiscovery
 link from your web page to your friend's feed is not what autodiscovery
 is intended for.

+1

Julian

Re: Autodiscovery

2005-05-04 Thread fantasai

Antone Roundy wrote:

 On Wednesday, May 4, 2005, at 12:59  PM, fantasai wrote:

  Again, my friend's blog feed is not an Atom version of /my/ web page;
  linking to it as alternate would be wrong.

 To me, this raises a red flag, suggesting that using an autodiscovery
 link from your web page to your friend's feed is not what autodiscovery
 is intended for.

Probably not. But the same argument applies if I have an autodiscovery
link from a single entry in my blog to the main blog feed (which is a
valid alternate version of my weblog's front page, but not of that single
entry).

~fantasai

Re: Autodiscovery

2005-05-04 Thread Joe Gregorio


On 5/4/05, Robert Sayre [EMAIL PROTECTED] wrote: 
 Mark's draft does an excellent job of documenting that reality. 

+1

   -joe

-- 
Joe Gregorio        http://bitworking.org

Re: Atom feed refresh rates

2005-05-04 Thread Chris DeSalvo

I do not disagree.  I just wanted to get my $0.02 in for completeness.  
I'm happy as a clam with atom as it is now.

-chris
On May 4, 2005, at 12:52 PM, Robert Sayre wrote:

 No one is denying the existence of the problem you're describing.
 However, this WG has consistently decided that an optional XML
 element of the kind you're describing wouldn't solve the problem.
 Essentially, we'd be trading one evangelism problem for another.

Re: Autodiscovery

2005-05-04 Thread Nikolas 'Atrus' Coukouma


fantasai wrote:


 Arve Bersvendsen wrote:


 On Wed, 04 May 2005 09:43:38 +0200, Eric Scheid 
 [EMAIL PROTECTED] wrote:

 instead of feed, consider updates, which gets closer to the gist
 of  the sense


 No. To me 'Updates' signifies that something is 'updated'. Even
 posting  new content falls outside of that definition.


 That would signify updates to my document. If I'm linking to the
 CNN news feed, or my-favorite-blog, that wouldn't be appropriate.
 For this purpose, the syntax needs to signify that this is a feed,
 that it needs to be handled as such.. and that there is no other
 significant relationship between the document and the feed it links
 to (unless otherwise specified).

 ~fantasai

These are both valid interpretations of updates. From Princeton's WordNet:
update - n - news that updates your information
- v - 1: modernize or bring up to date; We updated the kitchen in the
old house
  2: bring up to date; supply with recent information
  3: bring to the latest state of technology

As this definition suggests, most people think of updates as
modifications of items that already exist first and completely new
items second. In the land of feeds, the frequency is reversed (most
updates in feeds are new items, not modifications to existing ones).

-Nikolas 'Atrus' Coukouma

Re: Autodiscovery

2005-05-04 Thread Nikolas 'Atrus' Coukouma


Eric Scheid wrote:

On 4/5/05 11:11 PM, Robert Sayre [EMAIL PROTECTED] wrote:

  

The autodiscovery spec is a reasonable interpretation of the *one
line* definition of the 'alternate' relation.



how is a feed of recent entries a substitute version for the document in
which the link occurs when that document is some blog post long since
dropped out of the feed?

I'd suggest placing the link element only on the front page of your blog
if this is a concern. The feed usually is a substitute version for the
document in which the link occurs in that case, at least. There's nothing
in the spec that even suggests that you place the autodiscovery
information in archive pages. In practice, people probably will, but I'm
not sure it's worth worrying about.

Do you have some example that's more generally applicable?

-Nikolas 'Atrus' Coukouma

Re: Autodiscovery

2005-05-04 Thread Antone Roundy

On Wednesday, May 4, 2005, at 04:49  PM, Nikolas 'Atrus' Coukouma wrote:
 Eric Scheid wrote:
 On 4/5/05 11:11 PM, Robert Sayre [EMAIL PROTECTED] wrote:
 The autodiscovery spec is a reasonable interpretation of the *one
 line* definition of the 'alternate' relation.

 how is a feed of recent entries a substitute version for the document in
 which the link occurs when that document is some blog post long since
 dropped out of the feed?

 I'd suggest placing the link element only on the front page of your blog
 if this is a concern. The feed usually is a substitute version for the
 document in which the link occurs in that case, at least. There's nothing
 in the spec that even suggests that you place the autodiscovery
 information in archive pages. In practice, people probably will, but I'm
 not sure it's worth worrying about.

There is a good reason for putting the link in the individual entry 
pages: if people get to your blog via some location other than your 
blog homepage, you don't want them to have to go to your homepage to 
subscribe to your blog's feed.  In such a case, sure, alternate 
wouldn't be descriptive of the feed's relationship to the isolated 
page, but the way that such links will be processed by browsers will 
match the intent for publishing the link - if you find this entry 
interesting enough to want to subscribe to my feed, here's where to do 
it.

I personally don't care whether it's alternative or something like 
feed.  Alternative is a more generally applicable term, but yeah, 
it doesn't sound quite right on individual entry pages.

Re: Autodiscovery

2005-05-04 Thread Eric Scheid


On 5/5/05 4:02 AM, Thomas Broyer [EMAIL PROTECTED] wrote:

 Robert Sayre wrote:
 The autodiscovery spec is a reasonable interpretation of the *one
 line* definition of the 'alternate' relation. It is not contradictory.
 
 But a feed is not a substitute version of an archive page as most
 archived entries are not in the feed anymore.
 
 That said, I'm totally in favor of using rel=alternate to link to a
 feed from the _alternate_ HTML version.
 
 From an archive page, you should rather use rel=start.

The problem is, an automaton wouldn't know which to use as it wouldn't know
if the page it is looking at is an entry archive page or a recent entries
page, which rather defeats the purpose of auto-discovery.

Also, it would be entirely reasonable to use @rel='alternate' to point to an
@type='application/atom+xml' Atom Entry Document from an archive page.

Furthermore, from a recent entries page it would also be entirely reasonable
to use @rel='start' to point to the first archive entry page.

Thus, the meanings of 'alternate' and 'start' would be *reversed* depending
on what kind of page you were looking at. This is not conducive to
hands-free auto discovery.

Using @rel='feed' from both kinds of pages fixes that problem.

e.

Re: Atom feed refresh rates

2005-05-04 Thread Robert Sayre


On 5/4/05, Chris DeSalvo [EMAIL PROTECTED] wrote:

 
 If the feed provided a hint for a reasonable polling frequency, it
 would be a plus for limited-resource devices.  I hate to suggest that
 the format be changed as a prophylactic measure against bad-citizen
 servers, but that is the problem that I have to solve for my platform
 and applications. 

No one is denying the existence of the problem you're describing.
However, this WG has consistently decided that an optional XML
element of the kind you're describing wouldn't solve the problem.
Essentially, we'd be trading one evangelism problem for another.

Robert Sayre

Re: Atom feed refresh rates

2005-05-04 Thread Lance Lavandowska


On 5/4/05, Roger B. [EMAIL PROTECTED] wrote:
 
 That's not to say that there's something necessarily wrong with an
 aggregator that allows users to pull feeds every five minutes. If

In the toy aggregator I wrote I played with a scheduler that tried to
throttle itself based on the feed's response.  That is to say, it
started polling every ten minutes.  If the feed returned a 304 (or the
corresponding ETag "I haven't changed" signal) then it extended that to
every 20.  Then 30...  The problem I had was deciding what the maximum
should be (1 hour? 2? 24?).  Upon getting a 'fresh' feed it reset the
interval to 10 minutes and started over again.

I'm certain I got this idea from someone else, but don't recall who
originated the idea.
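
A minimal sketch of the self-throttling scheduler described above, treating 304 Not Modified as the "unchanged" signal. The names (PollState, next_interval) are hypothetical, and the two-hour cap is only an assumption, since the message leaves the maximum open.

```python
from dataclasses import dataclass

BASE_MINUTES = 10    # starting interval mentioned in the message
STEP_MINUTES = 10    # 10 -> 20 -> 30 ...
MAX_MINUTES = 120    # assumed cap; the message does not settle on one


@dataclass
class PollState:
    """Per-feed polling state (hypothetical helper type)."""
    interval: int = BASE_MINUTES


def next_interval(state: PollState, status: int) -> int:
    """Update and return the polling interval (minutes) after one fetch."""
    if status == 304:
        # Nothing new: back off by one step, up to the cap.
        state.interval = min(state.interval + STEP_MINUTES, MAX_MINUTES)
    elif status == 200:
        # Fresh content: start over at the base interval.
        state.interval = BASE_MINUTES
    # Any other status leaves the interval unchanged in this sketch.
    return state.interval
```

A linear step keeps the behaviour close to the description; an exponential step would back off faster on quiet feeds.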

Lance Lavandowska

Re: Autodiscovery

2005-05-04 Thread Eric Scheid


On 5/5/05 4:17 AM, Dan Brickley [EMAIL PROTECTED] wrote:

 The autodiscovery spec is a reasonable interpretation of the *one
 line* definition of the 'alternate' relation.
 
 how is a feed of recent entries a substitute version for the document in
 which the link occurs when that document is some blog post long since
 dropped out of the feed?
 
 Because the HTML definition is close to meaningless. I can substitute
 any document for another, and the 2nd is a substitution not through any
 intrinsic characteristics, but because it was substituted. Many of the
 HTML link type definitions don't bear up under detailed scrutiny...

Only if you take the broadest sense of the word 'substitute'.

This is like saying that not only is olive oil a substitute for butter in
cooking, but so is engine oil, concrete, a 400 pound gorilla, and the square
root of the gross national product of madagascar.

No, I suspect they used the word 'substitute' in its narrower sense, and
they used the word 'substitute' because they didn't want to write
"alternate: Designates an alternate version for the document in which the
link occurs", which would be circular.

e.

Re: Autodiscovery

2005-05-04 Thread Eric Scheid


On 5/5/05 5:20 AM, Antone Roundy [EMAIL PROTECTED] wrote:

 On Wednesday, May 4, 2005, at 12:59  PM, fantasai wrote:
 Again, my friend's blog feed is not an Atom version of /my/ web page;
 linking to it as alternate would be wrong.
 
 To me, this raises a red flag, suggesting that using an autodiscovery
 link from your web page to your friend's feed is not what autodiscovery
 is intended for.

I agree.

However, using a link from an archive page is common practice (very!), but
is one that would confound the use of Atom Entry Documents as
@rel='alternate'.

e.

Re: Autodiscovery

2005-05-04 Thread Nikolas 'Atrus' Coukouma


Eric Scheid wrote:

On 5/5/05 4:38 AM, Nikolas 'Atrus' Coukouma [EMAIL PROTECTED] wrote:
  

Do you have some example that's more generally applicable?



in practice, people will put a link to the feed from which this page, and
others like it, are likely to be found, into entry only pages.

otherwise auto-discovery doesn't work unless you first navigate to the front
page of someone's blog.

people want to be able to say here's a link to my feed from entry pages.

e.

As I said I'm not sure it's worth worrying about. My current opinion
is that it's just not worth making this change at this point, if this is
in fact the only concern. It applies to a large number of pages with a
small number of views and those are done for usability. Even if pages
only had the link on the main page, I think it would allow 95% of users
to find it. Maybe this is more of a concern for blogs where the archives
are a major entry point. Perhaps this is even the usual case and I'm
just used to people coming in the front door. What are the chances of
someone subscribing to your feed if they never even look at the front page?

-Nikolas 'Atrus' Coukouma

Re: rel profiles [was Autodiscovery]

2005-05-04 Thread Kevin Marks

We have published profiles for both license and nofollow:
http://developers.technorati.com/wiki/RelLicense
http://developers.technorati.com/wiki/RelNoFollow
feel free to use them...

On May 3, 2005, at 11:16 PM, Mark Pilgrim wrote:
 On 5/4/05, Henri Sivonen [EMAIL PROTECTED] wrote:
 No you don't. rel='license' and rel='nofollow' have been deployable
 without a profile. You just release running code that hard-codes
 rel='feed' and, boom, no profile needed.

 Then I'm confused as to why you can't just release running code that
 hard-codes rel=alternate.  You know, like people have already done.
 --
 Cheers,
 -Mark

Re: Autodiscovery and alternate

2005-05-04 Thread Kevin Marks

How about alternate be recommended for only true substitutes; a feed 
for comments or pictures should not be labelled alternate, as it is 
not a substitute. feed is appealing, but does fly in the face of 
practice.

There are existing rel values that could apply to qualify other kinds 
of feeds, or we could suggest new ones.

E.g., if it is a titles-only feed, rel=contents would apply.
If you had both full-content and summary feeds available, this could be 
indicated in a machine readable way (I appreciate that Atom handles 
this properly within the format, unlike RSS, but offering both versions 
is something I see many sites doing).

I am amazed that there was no rel=summary defined by the w3c; this 
would be a useful extension to consider.

http://www.w3.org/TR/1999/REC-html401-19991224/types.html#type-links
On May 3, 2005, at 10:29 PM, fantasai wrote:
Arve Bersvendsen wrote:
On Tue, 03 May 2005 18:52:59 +0200, Tim Bray [EMAIL PROTECTED] wrote:
http://diveintomark.org/rfc/draft-ietf-atompub-autodiscovery-01.txt
1) Change the attribute value for the rel from alternate to feed, 
or  some similar wording.  A feed is not always an alternate of the 
HTML  document in which it occurs.
As I mentioned last November [1] I agree with not requiring the
'alternate' rel value for the reasons stated in
  http://fantasai.inkedblade.net/weblog/2004/linking-feeds/
Briefly, it is an abuse of its semantics because many feed links
are not links to alternate representations of the current page.
[1] http://www.imc.org/atom-syntax/mail-archive/msg11705.html
~fantasai

PaceOriginalAttribute (was: PaceDuplicateIDWithSource2)

2005-05-04 Thread Robert Sayre


http://www.intertwingly.net/wiki/pie/PaceOriginalAttribute

On 5/3/05, Martin Duerst [EMAIL PROTECTED] wrote:
 
 I'm not really happy with this.

I found Martin's comments (copied in full below) to be accurate. So, I
thought I would try another approach. Comments, suggestions, and
alterations are welcome.

Robert Sayre


== Abstract ==

Preserve the original ID elsewhere, and require republishers to mint
new IDs for *their entries*.

== Status ==

Open

== Rationale ==

Duplicate entry ids in feeds are too easy to create unintentionally,
and the legitimate uses can't be verified as updates unless they come
from the originating feed.

== Proposal ==

Add an 'original' attribute to atom:source and reword as follows:

{{{

   If an atom:entry is copied from one feed into another feed, then the
   source atom:feed's metadata (all child elements of atom:feed other
   than the atom:entry elements) MAY be preserved within the copied
   entry by adding an atom:source child element, if it is not already
   present in the entry, and including some or all of the source feed's
   metadata elements as the atom:source element's children.  Such
   metadata SHOULD be preserved if the source atom:feed contains any of
   the child elements atom:author, atom:contributor, atom:copyright, or
   atom:category and those child elements are not present in the source
   atom:entry.

 4.2.11.1 The 'original' Attribute
 
   Atom entries can be republished and altered by intermediaries, but 
   Atom feeds MUST NOT contain duplicate atom:id values. The 'original' 
   attribute contains the entry's initial atom:id value. atom:source 
   elements MUST have an 'original' attribute.

}}}
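
As a rough illustration of the proposal above, the sketch below shows what a republisher might do: mint a new atom:id for its own copy and keep the initial id in an 'original' attribute on atom:source. The 'original' attribute exists only in this Pace. The namespace URI is an assumption (the one later published for Atom 1.0, not necessarily the draft's), and the function name republish is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Assumption: the Atom 1.0 namespace; the draft under discussion may differ.
ATOM_NS = "http://www.w3.org/2005/Atom"


def q(tag: str) -> str:
    """Qualify a local name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def republish(entry: ET.Element, new_id: str) -> ET.Element:
    """Return a copy of `entry` with a new atom:id and the initial id
    preserved in atom:source/@original (attribute proposed by the Pace)."""
    copied = copy.deepcopy(entry)
    id_el = copied.find(q("id"))
    if id_el is None:
        raise ValueError("entry has no atom:id")
    source = copied.find(q("source"))
    if source is None:
        source = ET.SubElement(copied, q("source"))
    # setdefault keeps an existing value, so the *initial* id survives
    # several republishing hops.
    source.attrib.setdefault("original", (id_el.text or "").strip())
    id_el.text = new_id
    return copied
```

The input entry is left untouched; copying the source feed's metadata elements into atom:source is omitted here for brevity.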


== Impacts ==


== Notes ==



CategoryProposals


On 5/3/05, Martin Duerst [EMAIL PROTECTED] wrote:
 
 I'm not really happy with this. Conceptually, it seems to replace
 an ID for an entry with a pair (ID,feed). As IDs are URIs/IRIs
 (remember what they expand to), that doesn't make sense.
 What guarantee do we have that two feeds will be different?
 (yes, these days, we get feeds over http, but there are other
 ways to get feeds, where things may not be that easy).
 
 If we don't have a solution for the malicious case, and we
 think we need one, we should work on some security stuff.
 
 If we think that accidental ID duplication is a problem,
 then let's look at how we can improve the explanation.
 After that, there may still be an occasional accident,
 but the spec should be worded to catch that, not to
 provide a loophole.
 
 If we have to allow duplicate IDs, I'd rather prefer we
 do it without all this feed/source/... stuff: I.e. if
 you are an aggregator and can't manage to do duplicate
 elimination, you can just delegate the problem to the
 next place in the feeding chain.
 
 Regards,Martin.

Atom on portable wireless device (was: RE: Atom feed refresh rates)

2005-05-04 Thread Bob Wyman


Chris DeSalvo wrote:
 As the author of an aggregator app for a portable wireless device I 
 can tell you that this is a serious problem for this class of products.
You didn't list support for RFC3229+feed[1,2] as one of the things
you are doing. This would help you drastically reduce the bandwidth needed
when you find a feed that actually has new content. If you use RFC3229+feed
to pull a feed, then you will only get the new entries in the feed -- not
ones that you've copied over before. It's one step beyond If-None-Match,
etc.
But, the real problem with your approach is that you have apparently
coded the device so that it goes out and polls large numbers of feeds. This
doesn't make sense. For a portable wireless device with limited bandwidth
and limited connectivity, you should be accessing feeds via an intermediary
proxy that gathers up all your updates into a *single* feed. That feed
should be served using RFC3229+feed to ensure that you only copy from it the
updated entries since you last pulled from it. Of course, it would also make
sense to support compression on the results. There is no more efficient
mechanism for polling for feeds from the kind of device you describe.
You say that you're reading about 20MB per day but you're only able
to harvest 2MB of fresh data from it? This 1/10 harvesting yield is
actually pretty normal when polling RSS/Atom feeds served without
RFC3229+feed. If you used RFC3229+feed, you would find that your yield would
start to approach 100% rather than the 10% you are at now. Additionally,
given the efficiencies here, you would be able to increase your polling
frequency almost arbitrarily without significantly increasing the bandwidth
consumption of your system. Thus, you could cut latency below the average of
30 minutes which is implied by a polling frequency of 1 hour.
You've written on your blog that you want to see more 304
responses. Well, I would suggest that what you *really* should want is more
226 responses -- 226 is the success code for an RFC3229+feed GET
operation.
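
A small sketch of the client side of the exchange described above, assuming the RFC3229+feed convention of sending "A-IM: feed" together with the usual If-None-Match validator. The helper names (delta_headers, classify, harvest_yield) are hypothetical, and nothing is sent over the network here.

```python
from typing import Dict, Optional


def delta_headers(etag: Optional[str]) -> Dict[str, str]:
    """Request headers for a conditional, delta-capable feed fetch."""
    headers = {"A-IM": "feed"}          # ask for only the new entries
    if etag:
        headers["If-None-Match"] = etag  # validator from the last fetch
    return headers


def classify(status: int) -> str:
    """Interpret the response status of such a fetch."""
    if status == 226:
        return "delta"       # IM Used: body holds only new entries
    if status == 304:
        return "unchanged"   # nothing new since the validator
    if status == 200:
        return "full"        # server ignored A-IM; whole feed returned
    return "error"


def harvest_yield(fresh_mb: float, transferred_mb: float) -> float:
    """Fraction of transferred data that was actually new."""
    return fresh_mb / transferred_mb
```

With the figures quoted in this thread, harvest_yield(2, 20) gives the 10% yield mentioned above; a server that answers with 226 pushes that ratio toward 1.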

bob wyman

[1] http://bobwyman.pubsub.com/main/2004/10/massive_bandwid.html
[2] http://bobwyman.pubsub.com/main/2004/09/using_rfc3229_w.html

== Original Message ==
In my app I've implemented every trick in the book to try and reduce the 
amount of data that I have to pull through the radio and parse.  I use 
If-None-Match and If-Modified-Since headers in my requests, I support 
compression, I respect caching hints from the servers.  It doesn't help 
in all cases.  I have 112 feeds loaded up in my aggregator and only 74 of 
the servers hosting those feeds ever return a 304.  The rest give me a 200 
and gladly hand me everything regardless of whether it has changed or 
not.  17 of the servers don't bother supplying an ETag header.

My feed list amounts to about 20 MB of data per day when polling once 
per hour.  That is a lot of air time for a small radio, and a lot of time 
spent grinding in an XML parser for a small CPU.  This is especially 
upsetting because by my measurements only about 2 MB of data is fresh 
for any given day.  The main hit is in battery life - the above stats 
can trivially knock HOURS off of the life of a small battery.

I've written extensively about this problem here:
http://www.desalvo.org/blog/?p=230
with a real-world example studied here:
http://www.desalvo.org/blog/?p=232

So, I guess I'd like to see an optional update-frequency hint element.

Thanks,
Chris

http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs

2005-05-04 Thread Bob Wyman


+1 with a comment:

If this Pace is accepted (and I hope it will be) the issue of Duplicate IDs
should probably be dealt with in Mark's Implementation Guide.[1]

Atom supports the publishing of newer versions of an entry which use the
same atom:id as earlier versions of the same entry. It is not required that
atom:updated be modified when a newer version is written.

If the PaceAllowDuplicateIDs is accepted, it will be permitted to have
multiple entries with the same atom:id in a single feed. However, the Pace
language says processors SHOULD regard as feed generation errors any entries
which duplicate both the atom:id and atom:updated of another entry in the
same feed. Thus, feed authors who wish to publish feeds with duplicate
atom:ids should ensure that any entry which duplicates an entry already in
the feed has a different value for atom:updated. This constraint is not a
requirement of the language, but it is a clear derivative of it.

Basically, you don't have to update atom:updated unless you think it makes
sense OR you are publishing to a feed that already has an entry with the same
atom:id as the atom:id of the entry you are currently publishing.

        bob wyman

[1] http://diveintomark.org/rfc/draft-ietf-atompub-impl-guide-00.html

Re: http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs

2005-05-04 Thread Tim Bray


On May 4, 2005, at 6:20 PM, Bob Wyman wrote:
+1 with a comment:

 If this Pace is accepted (and I hope it will be) the issue of 
 Duplicate IDs should probably be dealt with in Mark's Implementation 
 Guide.[1]

Er, I had planned to refine this a bit and then announce it to the 
group with some explanations and some other background research I did; 
so how about I promise to do that later this evening; and please 
consider waiting for that before you all pile in, pro or contra. -Tim

Re: Autodiscovery

2005-05-04 Thread Eric Scheid


On 5/5/05 5:36 AM, fantasai [EMAIL PROTECTED] wrote:

  - specify that UAs MAY also recognize the rel=alternate and
type=application/atom+xml combination as an autodiscoverable Atom
feed even if 'feed' is not among the rel values,

 and that UA should check that the representation returned when
 requesting that resource is an Atom Feed Document, and not an
 Atom Entry Document.

e.
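
A sketch of how a user agent might apply the rule discussed above: accept a link whose rel values include 'feed' (a proposed value, not an adopted one), fall back to rel=alternate with type=application/atom+xml, and then check that the fetched document is an Atom Feed Document rather than an Entry Document. The class and function names are hypothetical, and the root-element check is a simplification that ignores the namespace.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from typing import List

ATOM_TYPE = "application/atom+xml"


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link> elements that look like feed links."""

    def __init__(self) -> None:
        super().__init__()
        self.candidates: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()  # rel is space-separated
        mime = a.get("type", "").strip().lower()
        href = a.get("href", "")
        if not href:
            return
        if "feed" in rels or ("alternate" in rels and mime == ATOM_TYPE):
            self.candidates.append(href)


def find_feed_links(html: str) -> List[str]:
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.candidates


def is_feed_document(xml_text: str) -> bool:
    """True if the root element's local name is 'feed' (not 'entry')."""
    root = ET.fromstring(xml_text)
    return root.tag.rsplit("}", 1)[-1] == "feed"
```

Checking the root element after fetching is what distinguishes a feed from an Entry Document, since both are served as application/atom+xml.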

PaceAllowDuplicateIDs

2005-05-04 Thread Tim Bray

<co-chair-hat status=OFF>
http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
This Pace was motivated by a talk I had with Bob Wyman today about the 
problems the synthofeed-generator community has.

Summary:
1. There are multiple plausible use-cases for feeds with duplicate IDs
2. Pro and Contra
3. Alternate Paces
4. Details about this Pace

1. Use-Cases
Here's a stream of stock-market quotes.

<feed><title>My Portfolio</title>
 
 <entry><title>MSFT</title>
  <updated>2005-05-03T10:00:00-05:00</updated>
  <content>Bid: 25.20 Ask: 25.50 Last: 25.20</content>
  </entry>
 <entry><title>MSFT</title>
  <updated>2005-05-03T11:00:00-05:00</updated>
  <content>Bid: 25.15 Ask: 25.25 Last: 25.20</content>
  </entry>
 <entry><title>MSFT</title>
  <updated>2005-05-03T12:00:00-05:00</updated>
  <content>Bid: 25.10 Ask: 25.15 Last: 25.10</content>
  </entry>
</feed>

You could also imagine a stream of weather readings.  Bob's actual 
here-and-now today use-case from PubSub is earthquakes: an entry 
describes an earthquake and they keep re-issuing it as new info about 
strength/location comes in.

Some people only care about the most recent version of the entry, 
others might want to see all of them.  Basically, each atom:entry 
element describes the same Entry, only at a different point in time.

You could argue that in some cases, these are representations of the 
Web resources identified by the atom:id URI, but I don't think we need 
to say that explicitly.

Yes, you could think of alternate ways of representing stock quotes or 
any of the other use-cases but this is simple and direct and idiomatic.

2. Pro and Contra
Given that I issued the consensus call rejecting the last attempt to do 
this, which was PaceRepeatIdInDocument, I felt nervous about 
revisiting the issue.  So I went and reviewed the discussion around 
that one, which I extracted and placed at 
http://www.tbray.org/tmp/RepeatID.txt for the WG's convenience.

Reviewing that discussion, I'm actually not impressed.  There were a 
few -1's but very few actual technical arguments about why this 
shouldn't be done.  The most common was "Software will screw this up."  
On reflection, I don't believe that.  You have a bunch of Entries, some 
of them have the same ID and are distinguished by datestamp.  Some 
software will show the latest, some will show all of them, the good 
software will allow switching back and forth.  Doesn't seem like rocket 
science to me.

So here's how I see it: there are plausible use cases for doing this, 
and one of the leading really large-scale implementors in the space 
(PubSub) wants to do this right now.  Bob's been making strong claims 
about not being able to use Atom if this restriction remains in place.

I believe strongly that if there's something that implementors want to 
do, standards shouldn't get in the way unless there's real 
interoperability damage.  I'm certainly prepared to believe that this 
could cause interoperability damage, but to date I haven't seen any 
convincing arguments that it will.  I think that if we nonetheless 
forbid it, people who want to do this will (a) use RSS instead of Atom, 
(b) cook up horrible kludges, or (c) ignore us and just do it.

So my best estimate is that the cost of allowing dupes is probably much 
lower than the cost of forbidding them.

Finally, our charter does say that we're also supposed to specify how 
you'd go about archiving feeds, and AllowDuplicateIDs makes this 
trivial.  I looked around and failed to find how we claimed we were 
going to do that while still forbidding duplicates, but it's possible I 
missed that.

3. Alternate Paces
I didn't want to just revive PaceRepeatIdInDocument, because it used 
the word "version" in what I thought was kind of a sloppy way, and 
because it wasn't current against format-08.  I don't like either 
PaceDuplicateIDWithSource or ...WithSource2, they are complicated and 
don't really meet PubSub's needs anyhow.  So I'm strongly -1 on both of 
those.  Yes, that means that if this Pace fails, we'll allow no 
duplicates at all.  I prefer either "dupes OK" or "no dupes" to "dupes 
OK in the following circumstances"; cleaner.

4. Details
Section 4.1.2 of format-08 says that atom:entry "represents an 
individual entry."  The Pace says that if you have dupes, they 
represent the same entry, which I think is consistent with both the 
letter and spirit of 4.1.2.

The Pace discourages duplicate timestamps without resorting to MUST 
language, because accidents can happen; this allows software to throw 
such entries on the floor while positively encouraging noisy 
complaining.  On the other hand, if the WG wanted either to insist on a 
MUST here or remove the discouragement altogether I could live with 
that.
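
One way a consumer could handle such a feed, sketched under stated assumptions: entries are flattened to (atom:id, atom:updated, content) tuples, exact id-plus-timestamp duplicates are set aside as probable generation errors, and the most recent version per id is kept. The tuple shape and the name collapse are hypothetical; the timestamps are parsed with datetime.fromisoformat, which accepts the offset form used in the stock-quote example.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

# Hypothetical flattened entry: (atom:id, atom:updated, content)
Entry = Tuple[str, str, str]


def collapse(entries: Iterable[Entry]) -> Tuple[List[Entry], List[Entry]]:
    """Return (latest entry per id, entries rejected as exact duplicates).

    An entry is rejected when another entry with the same atom:id and the
    same atom:updated instant has already been seen.
    """
    seen = set()
    latest: Dict[str, Entry] = {}
    rejected: List[Entry] = []
    for entry in entries:
        entry_id, updated, _content = entry
        stamp = datetime.fromisoformat(updated)
        if (entry_id, stamp) in seen:
            rejected.append(entry)  # candidate for "noisy complaining"
            continue
        seen.add((entry_id, stamp))
        current = latest.get(entry_id)
        if current is None or stamp > datetime.fromisoformat(current[1]):
            latest[entry_id] = entry
    return list(latest.values()), rejected
```

Software that wants to show every version would skip the "latest" step and simply sort each id's entries by timestamp.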

Finally, it makes it clear that if there are entries with duplicate 
atom:id, software is free to display all or a subset, and calls out the 
likely common case where you discard all but the most recent.  If I 
were Brent Simmons or equivalent, I'd be coding up a button where you
