Re: Expires extension draft (was Re: Feed History -02)

2005-08-10 Thread Henry Story


There is an interesting problem of how this interacts with the  
history tag.


If you set an <expires> on a feed

<feed>
  <expires>...</expires>
  <entry>...</entry>
  <entry>...</entry>
</feed>

then what are you setting it on? Well, not the document, clearly, as you have
pointed out, since HTTP headers deal with that. So it must be on the feed. And
of course a feed is identified by its id.

Now we have to make sure we avoid contradictions where different feed documents
describing the same feed state that the feed has different expiry dates.

Eg:

---the main feed---
<feed>
  <expires>August 25 2006</expires>
  <id>tag:example.com,2000/feed1</id>
  <entry>...</entry>
  <entry>...</entry>
  <history:previous>http://example.com/archive1</history:previous>
</feed>
---

The above is a partial description of the feed tag:example.com,2000/feed1.

The history:previous link points to another document that gives
us more information about that feed.

---the archive feed---
<feed>
  <expires>August 15 2006</expires>
  <id>tag:example.com,2000/feed1</id>
  <entry>...</entry>
  <entry>...</entry>
  <history:previous>http://example.com/archive2</history:previous>
</feed>
---

Now of course we have a feed id with two expiry dates. Which one is correct?

In graph terms we end up with something like this:

tag:example.com,2000/feed1
   |--expires--> August 25 2006
   |--expires--> August 15 2006
   |--entry...
   |--entry...
   |--entry...
   |--entry...

One has the feeling that the expires relation should be functional, ie have
only one value.
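The functional-property intuition can be made concrete: collect the expires assertions from each feed document, keyed by feed id, and flag any id with more than one value. A hedged Python sketch (data shapes and function names are illustrative, not from any spec):

```python
from datetime import datetime

def merge_expires(statements):
    """Collect the <expires> values asserted for each feed id across
    several feed documents; a functional property admits only one value."""
    merged = {}
    for feed_id, expires in statements:
        merged.setdefault(feed_id, set()).add(expires)
    return merged

# The main feed and the archive feed each assert an expiry for feed1:
stmts = [
    ("tag:example.com,2000/feed1", datetime(2006, 8, 25)),
    ("tag:example.com,2000/feed1", datetime(2006, 8, 15)),
]
conflicts = {fid for fid, vals in merge_expires(stmts).items() if len(vals) > 1}
```

With the two documents above, `conflicts` contains the feed id, which is exactly the contradiction being described.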

This makes me think again that what I was looking for (that the document
in history:previous not change, so that one can work out when to stop fetching
documents) can in fact be entirely taken care of by the HTTP expiry dates and
cache control mechanism. Of course if this is so, I think it should be noted
clearly in the history spec. ((btw. Is it easy to set expiry dates for
documents served by Apache?))
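On the Apache aside: yes, the stock mod_expires module does this. A minimal sketch (the directory path is hypothetical; the directives are standard mod_expires ones):

```apache
# Requires mod_expires to be loaded/enabled.
<Directory "/var/www/archives">
    ExpiresActive On
    # Archive documents never change, so serve far-future Expires headers.
    # (HTTP/1.1 advises at most one year in the future.)
    ExpiresDefault "access plus 1 year"
</Directory>
```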


Henry Story


On 10 Aug 2005, at 04:46, James M Snell wrote:


This is fairly quick and off-the-cuff, but here's an initial draft  
to get the ball rolling..


 http://www.snellspace.com/public/draft-snell-atompub-feed-expires.txt

- James

Henry Story wrote:



To answer my own question

[[
Interesting... but why have a limit of one year? For archives, I would like a limit of forever.
]]

 I found the following in the HTTP spec

[[
   To mark a response as "never expires", an origin server sends an
   Expires date approximately one year from the time the response is
   sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one
   year in the future.
]]

(though that still does not explain why.)

Now I am wondering if the http mechanism is perhaps all that is needed
for what I want with the unchanging archives. If it is then perhaps this
could be explained in the Feed History RFC. Or are there other reasons to
add an expires tag to the document itself?

Henry Story

On 9 Aug 2005, at 19:09, James M Snell wrote:







rules as atom:author elements.

Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching




The expires and max-age elements look fine. I hesitate at  
bringing  in a caching discussion.  I'm much more comfortable  
leaving the  definition of caching rules to the protocol level  
(HTTP) rather  than the format extension level.  Namely, I don't  
want to have to  go into defining rules for how HTTP headers that  
affect caching  interact with the expires and max-age elements...  
IMHO, there is  simply no value in that.
The expires and max-age extension elements affect the feed /  
entry  on the application level not the document level.  HTTP  
caching  works on the document level.





Adding max-age also means defining IntegerConstruct and disallowing
white space around it. Formerly, it was OK as a text construct, but
the white space issues change that.





This is easy enough.



Also, we should decide whether cache information is part of the   
signature.

I can see arguments either way.





-1.  let's let caching be handled by the transport layer.











Re: Expires extension draft (was Re: Feed History -02)

2005-08-10 Thread Walter Underwood

--On August 10, 2005 1:56:05 PM +1000 Eric Scheid [EMAIL PROTECTED] wrote:

 Aside: a perfect example of what sense of 'expires' is in the I-D itself...
 
 Network Working Group
 Internet-Draft
 Expires: January 2, 2006

Especially perfect because the HTTP header does not reflect the expiration.

Honestly, another reason to put expiration inside the feed is that
HTTP caching is just not used. Well, except to force reloads and show
you new ads. But it is extremely rare to see per-document cache
information.

wunder
--
Walter Underwood
Principal Architect, Verity



Re: Feed History -02

2005-08-10 Thread Mark Nottingham


So, you're really looking for entry-level, time-based invalidation, no?

I guess the simplest way to do this would be to dereference the link  
and see if you get a 404/410; if you do, you know it's no longer good.


That's not terribly efficient, but OTOH managing metadata in multiple  
places is tricky, and predicting the future doubly so :) Most people  
get expiration times really wrong. And clock sync becomes an issue as  
well.
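Mark's dereference check is easy to sketch. The helpers below are mine (not from any draft): issue a HEAD request and treat 404 (Not Found) or 410 (Gone) as "no longer good":

```python
from urllib import request
from urllib.error import HTTPError

GONE = {404, 410}

def status_means_expired(status: int) -> bool:
    """404 and 410 signal that the entry is no longer good."""
    return status in GONE

def entry_expired(url: str) -> bool:
    """Dereference the entry's link; a 404/410 response means it expired."""
    try:
        with request.urlopen(request.Request(url, method="HEAD")) as resp:
            return status_means_expired(resp.status)
    except HTTPError as e:
        return status_means_expired(e.code)
```

As noted, this trades efficiency (one round trip per entry) for not having to predict the future or worry about clock sync.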


I'd think that if you have reasonable control over the polling of the  
feed, and a solid enough state model (which might include an explicit  
deletion mechanism), you could have a similar effect by just removing  
the items from the feed when they expire, with the expectation that  
when they disappear from the feed, they disappear from the client.  
Would that work for your use case?



On 09/08/2005, at 9:07 PM, James M Snell wrote:



First off, let me stress that I am NOT talking about caching
scenarios here...  (my use of the terms "application layer" and
"transport layer" was an unfortunate mistake on my part that only
served to confuse my point)


Let's get away from the multiprotocol question for a bit (it never  
leads anywhere constructive anyway)... Let's consider an aggregator  
scenario. Take an entry from a feed that is supposed to expire  
after 10 days.  The feed document is served up to the aggregator  
with the proper HTTP headers for expiration.  The entry is  
extracted from the original feed and dumped into an aggregated  
feed.  Suppose each of the entries in the aggregated feed are  
supposed to have their own distinct expirations.  How should the  
aggregator communicate the appropriate expirations to the  
subscriber?  Specifying expirations on the HTTP level does not  
allow me to specify expirations for individual entries within a  
feed.  Use case: an online retailer wishes to produce a "special
offers" feed.  Each offer in the feed is a distinct entity with
its own terms and its own expiration:  e.g. some offers are valid for
a week, other offers are valid for two weeks, etc.  The expiration  
of the offer (a business level construct) is independent of whether  
or not the feed is being cached or not (a protocol level  
construct); publishing a new version of the feed (e.g. by adding a  
new offer to the feed) should have no impact on the expiration of  
prior offers published to the feed.


Again, I am NOT attempting to reinvent an abstract or transport- 
neutral caching mechanism in the same sense that the atom:updated  
element is not attempting to reinvent Last-Modified or that the via  
link relation is not attempting to reinvent the Via header, etc.   
They serve completely different purposes. The expires and max-age  
extensions I am proposing should NOT be used for cache control of  
the Atom documents in which they appear.


I think we can declare victory here by simply a) using whatever
caching mechanism is available, and b) designating a "won't change" flag.
Speaking *strictly* about cache control of Atom documents, +1.  No  
document level mechanisms for cache control are necessary.


- James


Mark Nottingham wrote:


HTTP isn't a transport protocol, it's a transfer protocol; i.e.,  
the  caching information (and other entity metadata) are *part of*  
the  entity, not something that's conceptually separate.


The problem with having an abstract or transport-neutral  
concept  of caching is that it leaves you with an awkward choice;  
you can  either a) exactly replicate the HTTP caching model, which  
is  difficult to do in other protocols, b) dumb down HTTP  
caching to a  subset that's neutral, or c) introduce a  
contradictory caching  model and suffer the clashes between HTTP  
caching and it.


This is the same road that Web services sometimes tries to go  
down,  and it's a painful one; coming up with the grand, protocol- 
neutral  abstraction that enables all of the protocol-specific  
features is  hard, and IMO not necessary. Ask yourself: are there  
any situations  where you *have* to be able to seamlessly switch  
between protocols,  or is it just a convenience?


I think we can declare victory here by simply a) using whatever
caching mechanism is available, and b) designating a "won't change" flag.







On 09/08/2005, at 11:53 AM, James M Snell wrote:



Henry Story wrote:


Now I am wondering if the http mechanism is perhaps all that is needed
for what I want with the unchanging archives. If it is then perhaps this
could be explained in the Feed History RFC. Or are there other reasons to
add an expires tag to the document itself?




On the application level, a feed or entry may expire or age
independently of whatever caching mechanisms may be applied at the
transport level.  For example, imagine a source that publishes
special offers in the form of Atom entries that expire at a given
point in time.  Now suppose that those entries are being
distributed via XMPP and HTTP.  It 

Re: nested feeds (was: Feed History -02)

2005-08-09 Thread Henry Story


Sorry for taking so long to reply. I have been off on a 700km cycle trip
http://blogs.sun.com/roller/page/bblfish/20050807

I don't really want to spend too much time on the top-X discussion, as
I am a lot more interested in the feed history itself, but here are some
thoughts anyway...


On 29 Jul 2005, at 17:01, Eric Scheid wrote:

On 29/7/05 11:39 PM, Henry Story [EMAIL PROTECTED] wrote:
Below I think I have worked out how one can in fact have a top20 feed,
and I show how this can be combined without trouble with the
history:next ... link...

On 29 Jul 2005, at 13:12, Eric Scheid wrote:

On 29/7/05 7:57 PM, Henry Story [EMAIL PROTECTED] wrote:

1- The top 20 list: here one wants to move to the previous top   
20  list and
think of them as one thing. The link to the next feed is not  
meant  to be
additive. Each feed is to be seen as a whole. I have a little   
trouble still

thinking of  these as feeds, but ...


What happens if the publisher realises they have a typo and need  
to  emit an
update to an entry? Would the set of 20 entries (with one entry   
updated) be

seen as a complete replacement set?

Well if it is a typo and this is considered to be an insignificant change
then they can change the typo in the feed document and not need to change
any updated time stamps.



Misspelling the name of the artist for the top 20 songs list is not
insignificant. Even worse fubars are possible too -- such as  
attributing the

wrong artist/author to the #1 song/book (and even worse: leaving off a
co-author).


Yes, I see this now. This is a problem for my suggestion. The
atom:updated field cannot be used to indicate the date at which an entry
has a certain position in a chart, for the reason you mention. We could
then no longer update that entry for spelling mistakes or other more
serious issues. One would have to add an "about date" or something, and
then things get a little more complicated than I care to think about
right now.

The way I see it, maybe a better way would be to have a sliding window
feed where each entry points to another Atom Feed Document with its own
URI, and it is that second Feed Document which contains the individual
items (the top 20 list).

This is certainly closer to my intuitions too. A top 20 something is
*not* a feed. Feed entries are not ordered, and are not meant to be
thought of as a closed collection. At least this is my initial
intuition. BUT



Not all Atom Feed Documents are feeds, some are static collections of
entries. We keep tripping over this :-(

I can think of a solution like the following: Let us imagine a top 20
feed where the resources being described by the entries are the
positions in the top list. So we have entries with ids such as

http://TopOfThePops.co.uk/top20/Number1
http://TopOfThePops.co.uk/top20/Number2
http://TopOfThePops.co.uk/top20/Number3 ...



will contain a new entry such as

  <entry>
    <title>Top of the pops entry number 1</title>
    <link href="http://TopOfThePops.co.uk/top20/Number1/"/>
    <id>http://TopOfThePops.co.uk/top20/Number1</id>
    <updated>2005-07-05T18:30:00Z</updated>
    <summary>Top of the pops winner for the week starting 5 July
2005</summary>
  </entry>



The problem here is that this doesn't describe the referent, it only
describes the reference. I want to see top 20 feeds where each entry
links to the referent in question. For example, the Amazon Top 10
Selling Books feed would link to the book-specific page at Amazon, not
to some page saying "the #3 selling book is at the other end of this
link."


Oh, I don't really want to defend this position too much, but there would
be a way around this criticism by simply having the link point to the
album, like this:

  <entry>
    <title>Top of the pops entry number 1</title>
    <link href="http://www.amazon.fr/exec/obidos/ASIN/B4ULZV/"/>
    <id>http://TopOfThePops.co.uk/top20/Number1</id>
    <updated>2005-07-05T18:30:00Z</updated>
    <summary>Top of the pops winner for the week starting 5 July
2005</summary>
  </entry>

So here the id would be the same for each position from week to week,
but the link it points to would change.

We would still need to solve the issue of the date at which it had that
position, though...

And so yes, a feed where the entry is a feed seems easier to work with
in this case.

The feed would be something like this I suppose:

<feed>
   <title>top 20 French songs</title>
   ...
   <entry>
      <title>week of August 1 2005</title>
      <id>...?...</id>
      <updated>2005-08-01T18:30:00Z</updated>
      <content src="http://TopOfThePops.fr/top20/2005_08_01/"
               type="application/atom+xml"/>
   </entry>
   <entry>
      <title>week of August 1 2005</title>
      <id>...?...</id>
      <!-- There was an important update -->
      <content src="http://TopOfThePops.fr/top20/2005_08_01/"
               type="application/atom+xml"/>
      <updated>2005-08-02T18:30:00Z</updated>
   </entry>
   <entry>
      <title>week of August 8 2005</title>
      <id>...?...</id>
      <!-- There was an 

Re: Feed History -02

2005-08-09 Thread Henry Story



On 4 Aug 2005, at 06:27, Mark Nottingham wrote:
So, if I read you correctly, it sounds like you have a method  
whereby a 'top20' feed wouldn't need history:prev to give the kind  
of history that you're thinking of, right?


If that's the case, I'm tempted to just tweak the draft so that
history:stateful is optional if history:prev is present. I was
considering dropping stateful altogether, but I think something is
necessary to explicitly say "don't try to keep a history of my
feed". My latest use case for this is the RSS feed that Netflix
provides to let you keep an eye on your queue (sort of like top20,
but more specialised).


Sound good?


Sounds good to me.

But I would really like some way to specify that the next feed
document is an archive (ie. won't change). This would make it easy
for clients to know when to stop following the links, ie, when
they have caught up with the changes since they last looked at the feed.

Perhaps something like this:

<history:prev archive="yes">http://liftoff.msfc.nasa.gov/2003/04/feed.rss</history:prev>
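The client behaviour being asked for can be sketched: walk the chain of prev links, stopping at the first document already in the local store, on the assumption that archives never change. `fetch`, the document shape, and the `seen` store are all illustrative, not from the draft:

```python
def collect_new_documents(start_url, fetch, seen):
    """Return the not-yet-seen feed documents, newest first.

    `fetch(url)` returns a dict like {"prev": url_or_None, "entries": [...]};
    `seen` is the set of archive URLs the client has already stored.
    """
    new_docs, url = [], start_url
    while url is not None and url not in seen:
        doc = fetch(url)
        new_docs.append(doc)
        url = doc.get("prev")   # follow the history:prev link, if any
    return new_docs
```

The loop stops exactly when the client has caught up, which is why an unchanging-archive guarantee (whether signalled in the feed or via HTTP expiry headers) matters.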


Henry Story



Re: Feed History -02

2005-08-09 Thread Walter Underwood

--On August 9, 2005 1:07:29 PM +0200 Henry Story [EMAIL PROTECTED] wrote:

 But I would really like some way to specify that the next feed document is an
 archive (ie. won't change). This would make it easy for clients to know when
 to stop following the links, ie, when they have caught up with the changes
 since they last looked at the feed.

I made some proposals for cache control info (expires and max-age).
That might work for this.

wunder
--
Walter Underwood
Principal Architect, Verity



Re: Feed History -02

2005-08-09 Thread James M Snell


Walter Underwood wrote:


--On August 9, 2005 1:07:29 PM +0200 Henry Story [EMAIL PROTECTED] wrote:
 


But I would really like some way to specify that the next feed document is an
archive (ie. won't change). This would make it easy for clients to know when
to stop following the links, ie, when they have caught up with the changes
since they last looked at the feed.



I made some proposals for cache control info (expires and max-age).
That might work for this.

 

I missed these proposals.  I've been giving some thought to an <expires/> 
and <max-age/> extension myself and was getting ready to write up a 
draft. Expires is a simple date construct specifying the exact moment 
(inclusive) that the entry/feed expires.  Max-age is a non-negative 
integer specifying the number of milliseconds (inclusive) from the moment 
specified by atom:updated when the entry/feed expires.  The two cannot 
appear together within a single entry/feed and follow the same basic 
rules as atom:author elements.
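The proposed semantics can be sketched directly. The element names and the mutual-exclusion rule come from the paragraph above; the function shape is mine:

```python
from datetime import datetime, timedelta
from typing import Optional

def expiry_moment(updated: datetime,
                  expires: Optional[datetime] = None,
                  max_age_ms: Optional[int] = None) -> Optional[datetime]:
    """Return the moment the entry/feed expires, or None if it never does.

    expires gives an absolute moment; max-age counts milliseconds from
    atom:updated. The two cannot appear together.
    """
    if expires is not None and max_age_ms is not None:
        raise ValueError("expires and max-age cannot appear together")
    if expires is not None:
        return expires
    if max_age_ms is not None:
        return updated + timedelta(milliseconds=max_age_ms)
    return None
```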


- James



Re: Feed History -02

2005-08-09 Thread Henry Story



On 9 Aug 2005, at 18:32, Walter Underwood wrote:
--On August 9, 2005 9:28:52 AM -0700 James M Snell  
[EMAIL PROTECTED] wrote:

I made some proposals for cache control info (expires and max-age).
That might work for this.


I missed these proposals.  I've been giving some thought to an
<expires/> and <max-age/> extension myself and was getting ready
to write up a draft. Expires is a simple date construct specifying
the exact moment (inclusive) that the entry/feed expires.  Max-age
is a non-negative integer specifying the number of milliseconds
(inclusive) from the moment specified by atom:updated when the
entry/feed expires.  The two cannot appear together within a
single entry/feed and follow the same basic



rules as atom:author elements.

Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching


Interesting... but why have a limit of one year? For archives, I would
like a limit of forever.

But otherwise I suppose this would do. Instead of putting the
information in the history link of the linking feed, you would put it in
the archive feed. Which sounds good. I suppose we end up with some
duplication of information here with the http headers again.


Adding max-age also means defining IntegerConstruct and disallowing
white space around it. Formerly, it was OK as a text construct, but
the white space issues change that.

Also, we should decide whether cache information is part of the  
signature.

I can see arguments either way.

wunder
--
Walter Underwood
Principal Architect, Verity





Re: Feed History -02

2005-08-09 Thread James M Snell


Walter Underwood wrote:


--On August 9, 2005 9:28:52 AM -0700 James M Snell [EMAIL PROTECTED] wrote:

 


I made some proposals for cache control info (expires and max-age).
That might work for this.

 


I missed these proposals.  I've been giving some thought to an <expires/> and 
<max-age/> extension myself and was getting ready to write up a draft. Expires is a 
simple date construct specifying the exact moment (inclusive) that the entry/feed expires.  
Max-age is a non-negative integer specifying the number of milliseconds (inclusive) from the 
moment specified by atom:updated when the entry/feed expires.  The two cannot appear 
together within a single entry/feed and follow the same basic
   


rules as atom:author elements.

Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching

 

The expires and max-age elements look fine. I hesitate at bringing in a 
caching discussion.  I'm much more comfortable leaving the definition of 
caching rules to the protocol level (HTTP) rather than the format 
extension level.  Namely, I don't want to have to go into defining rules 
for how HTTP headers that affect caching interact with the expires and 
max-age elements... IMHO, there is simply no value in that. 

The expires and max-age extension elements affect the feed / entry on 
the application level not the document level.  HTTP caching works on the 
document level.



Adding max-age also means defining IntegerConstruct and disallowing
white space around it. Formerly, it was OK as a text construct, but
the white space issues change that.

 


This is easy enough.


Also, we should decide whether cache information is part of the signature.
I can see arguments either way.

 


-1.  let's let caching be handled by the transport layer.

- James



Re: Feed History -02

2005-08-09 Thread Henry Story


To answer my own question

[[
Interesting... but why have a limit of one year? For archives, I would like a limit of forever.
]]

 I found the following in the HTTP spec

[[
   To mark a response as "never expires", an origin server sends an
   Expires date approximately one year from the time the response is
   sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one
   year in the future.
]]

(though that still does not explain why.)

Now I am wondering if the http mechanism is perhaps all that is needed
for what I want with the unchanging archives. If it is then perhaps this
could be explained in the Feed History RFC. Or are there other reasons to
add an expires tag to the document itself?

Henry Story

On 9 Aug 2005, at 19:09, James M Snell wrote:




rules as atom:author elements.

Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching



The expires and max-age elements look fine. I hesitate at bringing  
in a caching discussion.  I'm much more comfortable leaving the  
definition of caching rules to the protocol level (HTTP) rather  
than the format extension level.  Namely, I don't want to have to  
go into defining rules for how HTTP headers that affect caching  
interact with the expires and max-age elements... IMHO, there is  
simply no value in that.
The expires and max-age extension elements affect the feed / entry  
on the application level not the document level.  HTTP caching  
works on the document level.




Adding max-age also means defining IntegerConstruct and disallowing
white space around it. Formerly, it was OK as a text construct, but
the white space issues change that.




This is easy enough.


Also, we should decide whether cache information is part of the  
signature.

I can see arguments either way.




-1.  let's let caching be handled by the transport layer.





Re: Feed History -02

2005-08-09 Thread James M Snell


Henry Story wrote:



Now I am wondering if the http mechanism is perhaps all that is needed
for what I want with the unchanging archives. If it is then perhaps this
could be explained in the Feed History RFC. Or are there other reasons to
add an expires tag to the document itself?


On the application level, a feed or entry may expire or age independently 
of whatever caching mechanisms may be applied at the transport level.  
For example, imagine a source that publishes special offers in the form 
of Atom entries that expire at a given point in time.  Now suppose that 
those entries are being distributed via XMPP and HTTP.  It is helpful to 
have a transport-independent expiration/max-age mechanism whose 
semantics operate on the application layer rather than the transport layer.


- James



Re: Feed History -02

2005-08-09 Thread Mark Nottingham



On 09/08/2005, at 4:07 AM, Henry Story wrote:


But I would really like some way to specify that the next feed
document is an archive (ie. won't change). This would make it easy
for clients to know when to stop following the links, ie, when they
have caught up with the changes since they last looked at the feed.


Perhaps something like this:

<history:prev archive="yes">http://liftoff.msfc.nasa.gov/2003/04/feed.rss</history:prev>


I'd think that would be more appropriate as an extension to the  
archive itself, wouldn't it? That way, the metadata (the fact that  
it's an archive) is part of the data (the archive feed).


E.g.,

<atom:feed>
  ...
  <archive:yes_im_an_archive/>
</atom:feed>

By (current) definition, anything that history:prev points to is an  
archive.


Cheers,


--
Mark Nottingham http://www.mnot.net/



Re: Feed History -02

2005-08-09 Thread Mark Nottingham


HTTP isn't a transport protocol, it's a transfer protocol; i.e., the  
caching information (and other entity metadata) are *part of* the  
entity, not something that's conceptually separate.


The problem with having an abstract or transport-neutral concept  
of caching is that it leaves you with an awkward choice; you can  
either a) exactly replicate the HTTP caching model, which is  
difficult to do in other protocols, b) dumb down HTTP caching to a  
subset that's neutral, or c) introduce a contradictory caching  
model and suffer the clashes between HTTP caching and it.


This is the same road that Web services sometimes tries to go down,  
and it's a painful one; coming up with the grand, protocol-neutral  
abstraction that enables all of the protocol-specific features is  
hard, and IMO not necessary. Ask yourself: are there any situations  
where you *have* to be able to seamlessly switch between protocols,  
or is it just a convenience?


I think we can declare victory here by simply a) using whatever
caching mechanism is available, and b) designating a "won't change"
flag.







On 09/08/2005, at 11:53 AM, James M Snell wrote:


Henry Story wrote:
Now I am wondering if the http mechanism is perhaps all that is needed
for what I want with the unchanging archives. If it is then perhaps this
could be explained in the Feed History RFC. Or are there other reasons to
add an expires tag to the document itself?


On the application level, a feed or entry may expire or age
independently of whatever caching mechanisms may be applied at the
transport level.  For example, imagine a source that publishes
special offers in the form of Atom entries that expire at a given
point in time.  Now suppose that those entries are being
distributed via XMPP and HTTP.  It is helpful to have a
transport-independent expiration/max-age mechanism whose semantics
operate on the application layer rather than the transport layer.


- James






--
Mark Nottingham   Principal Technologist
Office of the CTO   BEA Systems



Re: Expires extension draft (was Re: Feed History -02)

2005-08-09 Thread Eric Scheid

On 10/8/05 12:46 PM, James M Snell [EMAIL PROTECTED] wrote:

 This is fairly quick and off-the-cuff, but here's an initial draft to
 get the ball rolling..
 
 http://www.snellspace.com/public/draft-snell-atompub-feed-expires.txt
 

Looks good, I think it does need a little bit of prose explaining that this
has nothing to do with caching, and should not be used in scheduling when to
revisit/refresh/expire local copies of the resource.

Similarly, if I understand correctly, when you write
   The 'max-age' extension element is used to indicate
the maximum age of a feed or entry.
you are referring to the max-age until the informational content of the
respective feed or entry expires. And similarly with age:expires. Yes?

Aside: a perfect example of what sense of 'expires' is in the I-D itself...

Network Working Group
Internet-Draft
Expires: January 2, 2006

:-)

e.



Re: Feed History -02

2005-08-09 Thread James M Snell


First off, let me stress that I am NOT talking about caching scenarios 
here...  (my use of the terms "application layer" and "transport layer" 
was an unfortunate mistake on my part that only served to confuse my point)


Let's get away from the multiprotocol question for a bit (it never leads 
anywhere constructive anyway)... Let's consider an aggregator scenario. 
Take an entry from a feed that is supposed to expire after 10 days.  The 
feed document is served up to the aggregator with the proper HTTP 
headers for expiration.  The entry is extracted from the original feed 
and dumped into an aggregated feed.  Suppose each of the entries in the 
aggregated feed are supposed to have their own distinct expirations.  
How should the aggregator communicate the appropriate expirations to the 
subscriber?  Specifying expirations on the HTTP level does not allow me 
to specify expirations for individual entries within a feed.  Use case: 
an online retailer wishes to produce a "special offers" feed.  Each 
offer in the feed is a distinct entity with its own terms and own 
expiration:  e.g. some offers are valid for a week, other offers are 
valid for two weeks, etc.  The expiration of the offer (a business level 
construct) is independent of whether or not the feed is being cached or 
not (a protocol level construct); publishing a new version of the feed 
(e.g. by adding a new offer to the feed) should have no impact on the 
expiration of prior offers published to the feed.
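The aggregator scenario boils down to per-entry expiry checks that survive re-serving and re-caching of the feed document. A hedged sketch (entry shape, ids, and dates are all illustrative):

```python
from datetime import datetime

def live_entries(entries, now):
    """Drop offers whose own expiration moment (inclusive) has passed.

    Each entry carries its own expiry, independent of any HTTP caching
    of the feed document it happens to appear in.
    """
    return [e for e in entries
            if e.get("expires") is None or now <= e["expires"]]

offers = [
    {"id": "offer-1", "expires": datetime(2005, 8, 17)},  # one-week offer
    {"id": "offer-2", "expires": datetime(2005, 8, 24)},  # two-week offer
    {"id": "offer-3", "expires": None},                   # no expiration
]
current = live_entries(offers, now=datetime(2005, 8, 20))
```

Republishing the feed with a new offer changes nothing here: each prior offer keeps its own expiration.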


Again, I am NOT attempting to reinvent an abstract or transport-neutral 
caching mechanism in the same sense that the atom:updated element is not 
attempting to reinvent Last-Modified or that the via link relation is 
not attempting to reinvent the Via header, etc.  They serve completely 
different purposes. The expires and max-age extensions I am proposing 
should NOT be used for cache control of the Atom documents in which they 
appear.


I think we can declare victory here by simply a) using whatever  
caching mechanism is available, and b) designating a "won't change" flag.
Speaking *strictly* about cache control of Atom documents, +1.  No 
document level mechanisms for cache control are necessary.


- James


Mark Nottingham wrote:

HTTP isn't a transport protocol, it's a transfer protocol; i.e., the  
caching information (and other entity metadata) are *part of* the  
entity, not something that's conceptually separate.


The problem with having an abstract or transport-neutral concept  
of caching is that it leaves you with an awkward choice; you can  
either a) exactly replicate the HTTP caching model, which is  
difficult to do in other protocols, b) dumb down HTTP caching to a  
subset that's neutral, or c) introduce a contradictory caching  
model and suffer the clashes between HTTP caching and it.


This is the same road that Web services sometimes tries to go down,  
and it's a painful one; coming up with the grand, protocol-neutral  
abstraction that enables all of the protocol-specific features is  
hard, and IMO not necessary. Ask yourself: are there any situations  
where you *have* to be able to seamlessly switch between protocols,  
or is it just a convenience?


I think we can declare victory here by simply a) using whatever  
caching mechanism is available, and b) designating a "won't change"  
flag.







On 09/08/2005, at 11:53 AM, James M Snell wrote:


Henry Story wrote:


Now I am wondering if the http mechanism is perhaps all that is needed
for what I want with the unchanging archives. If it is then perhaps this
could be explained in the Feed History RFC. Or are there other reasons to
add an expires tag to the document itself?



On the application level, a feed or entry may expire or age
independently of whatever caching mechanisms may be applied at the
transport level.  For example, imagine a source that publishes
special offers in the form of Atom entries that expire at a given
point in time.  Now suppose that those entries are being distributed
via XMPP and HTTP.  It is helpful to have a transport-independent
expiration/max-age mechanism whose semantics operate on the
application layer rather than the transport layer.


- James






--
Mark Nottingham   Principal Technologist
Office of the CTO   BEA Systems






Re: Feed History -02

2005-08-03 Thread Mark Nottingham


So, if I read you correctly, it sounds like you have a method whereby  
a 'top20' feed wouldn't need history:prev to give the kind of history  
that you're thinking of, right?


If that's the case, I'm tempted to just tweak the draft so that  
history:stateful is optional if history:prev is present. I was  
considering dropping stateful altogether, but I think something is  
necessary to explicitly say "don't try to keep a history of my feed."  
My latest use case for this is the RSS feed that Netflix provides to  
let you keep an eye on your queue (sort of like top20, but more  
specialised).


Sound good?


On 29/07/2005, at 6:39 AM, Henry Story wrote:



Below I think I have worked out how one can in fact have a top20
feed, and I show how this can be combined without trouble with the
history:next ... link...

[long quoted message trimmed; it appears in full below]

Re: Feed History -02

2005-07-29 Thread Henry Story


I get the feeling that we should perhaps first list the main types of
history archives, and then deal with each one separately. I can see 3:

1- The top 20 list: here one wants to move to the previous top 20 list
   and think of them as one thing. The link to the next feed is not meant
   to be additive. Each feed is to be seen as a whole. I have a little
   trouble still thinking of these as feeds, but ...
2- The history of changes link: the idea here is that a feed is a list
   of state changes to web resources. The link to the history just
   increases the length of the feed by pointing to the next feed page.
   These two pages could have been merged to create one feed page with
   the same end result. By moving back through the feed history one gets
   to see the complete history of changes to the resources described by
   the feed.
3- A listing of the current states of the resources: here the end user
   is searching for all the blog entries as they currently are. No entry
   id is ever duplicated across the chain.


I think each of these has its uses.
 1. this use case is clear and well understood.
 2. is very useful for 'static' blog feeds written from a remote server,
as once a feed has been archived it no longer needs to be rewritten. It
is also very useful in helping a client synchronize with the changes to
the feed. If a client is offline for a few months, or if there are a lot
of changes in a feed, then the client just needs to follow this type of
link to catch up with all the changes. If the entries are describing
state changes to resources such as stock market valuations, then this
would allow one to get all the valuations for a company going back in time.
 3. It is easier for this to be generated dynamically upon request. It is
not really a history of changes, but rather a list of resources. In a
feed concerning a blog these two can of course easily be confused, as we
think of blogs as being created sequentially. We then think of the feed
history as the history of the blog entry publication dates. If an old
blog entry gets changed, then the page on which that blog entry
originally appeared in the feed would get changed, just as the permalink
to the blog post itself gets changed.
This type of list is of course not very useful in keeping someone updated
as to the changes in a feed, as all the linked feeds would need to be
stepped through to see if anything has changed.
I think it is with this version that the proposal of having a separate
document that points to all the entries makes most sense. And this case
also seems to be being worked on in the API, I think.
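To make type 2 concrete, here is a minimal sketch (the data shapes are illustrative; nothing here is from the draft) of how a client might rebuild the current state of every tracked resource from such a chain. Walking the pages newest-first and keeping only the first version seen of each entry id gives each resource's latest state:

```python
def reconstruct_state(pages):
    """Rebuild the current state of each resource from a history chain.

    `pages` is the feed plus its archive documents, ordered newest-first,
    each a list of (entry_id, content) pairs newest-first.  Because a
    type-2 history repeats an entry id whenever the resource changes,
    the first occurrence of an id is its current state.
    """
    state = {}
    for page in pages:
        for entry_id, content in page:
            # setdefault keeps the first (i.e. newest) version seen
            state.setdefault(entry_id, content)
    return state
```

A type-3 chain needs no such merge: each id appears once, so concatenating the pages already yields the current state.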

more below...

On 23 Jul 2005, at 18:14, Mark Nottingham wrote:



On 19/07/2005, at 2:04 AM, Henry Story wrote:


Clearly the archive feed will work best if archive documents, once  
completed (containing a
given number of entries) never change. Readers of the archive will  
have a simple way to know when
to stop reading: there should never be a need to re-read an  
archive page - they just never change.


The archive provides a history of the feed's evolution. Earlier
changes to the resources described by the feed will be found in older
archive documents and newer changes in the later ones. One should
expect some entries to be referenced in multiple archive feed
documents. These will be entries that have been changed over time.

Archives *should not* change. I think any librarian will agree  
with that.




I very much agree that this is the ideal that should be striven for.

However, there are some practical problems with doing it in this  
proposal.


First of all, I'm very keen to make it possible to implement  
history with currently-deployed feed-generating software; e.g.,  
Movable Type. MT isn't capable of generating a feed where changed  
entries are repeated at the top out of the box, AFAIK.


So is it that MT can only generate feed links such as described in 3  
above?


Even if it (and other software) were, it would be very annoying to  
people whose feed software doesn't understand this extension; their  
"show me the latest entries in the blog" feed would become "show me  
the latest changed entries in the blog", and every time an entry  
was modified or spell-checked, it would show up at the top.


Well in many cases this is exactly what I want. If someone makes a  
big change to a resource that
was created one year ago (say your first blog post) then I think this  
is well worth knowing about.
Especially if I have written a comment about it. Of course if the  
change is not *significant* then
the change to the old feed archive would not be significant either.  
So perhaps I should have said

that archives should not change in any *significant* way.

So, it's a matter of enabling graceful deployment. Most of the  
reason I have the fh:stateful flag in there is to allow 

Re: Feed History -02

2005-07-29 Thread Henry Story


Below I think I have worked out how one can in fact have a top20 feed,
and I show how this can be combined without trouble with the
history:next ... link...


On 29 Jul 2005, at 13:12, Eric Scheid wrote:


On 29/7/05 7:57 PM, Henry Story [EMAIL PROTECTED] wrote:


1- The top 20 list: here one wants to move to the previous top 20 list
and think of them as one thing. The link to the next feed is not meant
to be additive. Each feed is to be seen as a whole. I have a little
trouble still thinking of these as feeds, but ...



What happens if the publisher realises they have a typo and need to
emit an update to an entry? Would the set of 20 entries (with one entry
updated) be seen as a complete replacement set?


Well if it is a typo and this is considered to be an insignificant
change, then they can change the typo in the feed document and not need
to change any updated time stamps.

The way I see it, maybe a better way would be to have a sliding window
feed where each entry points to another Atom Feed Document with its own
URI, and it is that second Feed Document which contains the individual
items (the top 20 list).


This is certainly closer to my intuitions too. A top 20 something is
*not* a feed. Feed entries are not ordered, and are not meant to be
thought of as a closed collection. At least this is my initial
intuition. BUT

I can think of a solution like the following: Let us imagine a top 20
feed where the resources being described by the entries are the
positions in the top list. So we have entries with ids such as

http://TopOfThePops.co.uk/top20/Number1
http://TopOfThePops.co.uk/top20/Number2
http://TopOfThePops.co.uk/top20/Number3
...

Each of these resources describes the song that is at a certain rank
in the Top of the Pops chart. Each week the song in that rank may
change. When a change occurs in the song at a certain rank, the top 20
feed with id

http://TopOfThePops.co.uk/top20/Feed

will contain a new entry such as

  <entry>
    <title>Top of the pops entry number 1</title>
    <link href="http://TopOfThePops.co.uk/top20/Number1/"/>
    <id>http://TopOfThePops.co.uk/top20/Number1</id>
    <updated>2005-07-05T18:30:00Z</updated>
    <summary>Top of the pops winner for the week starting 5 July 2005</summary>
  </entry>

A client that would subscribe to such a feed would automatically get
updates every week for each of the top 20 resources. But the feed could
be structured exactly like I suggest in 2.

So for a top 2 feed (20 is a bit too long for me)

<feed>
   <title type="text">My top 2 Software Books</title>
   <id>http://bblfish.net/blog/top2</id>
   <history:prev>http://bblfish.net/blog/top2/Archive-2005-07-18.atom</history:prev>
   ...
   <entry>
     <title>My Top 1 book</title>
     <link href="http://bblfish.net/blog/top2/Number1/"/>
     <id>http://bblfish.net/blog/top2/Number1</id>
     <updated>2005-07-25T18:30:00Z</updated>
     <summary>My top 1 book is Service Oriented Computing by Wiley</summary>
   </entry>
   <entry>
     <title>My Top 2 book</title>
     <link href="http://bblfish.net/blog/top2/Number2/"/>
     <id>http://bblfish.net/blog/top2/Number2</id>
     <updated>2005-07-25T18:30:00Z</updated>
     <summary>My second top book is xml in a Nutshell</summary>
   </entry>
</feed>

The above representation of the http://bblfish.net/blog/top2 feed
points to the archive feed http://bblfish.net/blog/top2/Archive-2005-07-18.atom

<feed>
   <title type="text">My top 2 Software Books</title>
   <id>http://bblfish.net/blog/top2</id>
   ...
   <entry>
     <title>My Top 1 book</title>
     <link href="http://bblfish.net/blog/top2/Number1/"/>
     <id>http://bblfish.net/blog/top2/Number1</id>
     <updated>2005-07-18T18:30:00Z</updated>
     <summary>My top 1 book is Java 2D Graphics</summary>
   </entry>
   <entry>
     <title>My Top 2 book</title>
     <link href="http://bblfish.net/blog/top2/Number2/"/>
     <id>http://bblfish.net/blog/top2/Number2</id>
     <updated>2005-07-18T18:30:00Z</updated>
     <summary>My second top book is xml in a Nutshell</summary>
   </entry>
</feed>

As you can see, only the Number1 entry has changed since the first
week, not the second, and yet both feeds contain an entry for both. A
non-change can sometimes be an important event. But it is really up to
the feed creator to choose how to present his feeds. He could easily
have had only the entry for http://bblfish.net/blog/top2/Number1 in the
first feed.

Looking at it this way, there really seems to be no incompatibility
between a top 20 feed and the history:next ... link. My talk about
archives not changing should be more precisely about archives not
changing in any significant way. And this advice could be moved to an
implementors section and be encoded in HTTP by simply giving archive
pages an infinitely long expiry date.
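One caveat on "infinitely long": HTTP/1.1 (RFC 2616, section 14.21) advises servers not to send an Expires date more than a year out, so in practice an archive page would get roughly a year of freshness, renewed on each fetch. A sketch of what a server might emit for an archive document (header values are illustrative):

```python
from datetime import datetime, timedelta, timezone

def archive_cache_headers(now=None):
    """Headers for an archive document that should effectively never expire.

    RFC 2616 (14.21) advises against Expires dates more than one year
    in the future, so we pair a one-year Expires with Cache-Control
    max-age (31536000 seconds = 365 days).
    """
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(days=365)
    return {
        "Cache-Control": "public, max-age=31536000",
        "Expires": expires.strftime("%a, %d %b %Y %H:%M:%S GMT"),
    }
```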

Someone could subscribe to that second feed and poll for updates, and
all they'll ever see are updates to the 20 items there, not the 20
items from the next week/whatever.

The idea of feeds linked to feeds has lots of utility -- 

Re: nested feeds (was: Feed History -02)

2005-07-29 Thread Eric Scheid

On 29/7/05 11:39 PM, Henry Story [EMAIL PROTECTED] wrote:

 Below I think I have worked out how one can in fact have a top20  feed, and I
 show how this can be combined without trouble with the  history:next ...
 link...
 
 
 On 29 Jul 2005, at 13:12, Eric Scheid wrote:
 
 On 29/7/05 7:57 PM, Henry Story [EMAIL PROTECTED] wrote:
 
 
 1- The top 20 list: here one wants to move to the previous top  20  list and
 think of them as one thing. The link to the next feed is not meant  to be
 additive. Each feed is to be seen as a whole. I have a little  trouble still
 thinking of  these as feeds, but ...
 
 
 What happens if the publisher realises they have a typo and need to  emit an
 update to an entry? Would the set of 20 entries (with one entry  updated) be
 seen as a complete replacement set?
 
 Well if it is a typo and this is considered to be an insignificant  change
 then they can change the typo in the feed document and not need to change  any
 updated time stamps.

Misspelling the name of the artist for the top 20 songs list is not
insignificant. Even worse fubars are possible too -- such as attributing the
wrong artist/author to the #1 song/book (and even worse: leaving off a
co-author).

 The way I see it, maybe a better way would be to have a sliding  window feed
 where each entry points to another Atom Feed Document with it's own  URI, and
 it is that second Feed Document which contains the individual items  (the top
 20 list).
 
 This is certainly closer to my intuitions too.  A top 20 something is  *not* a
 feed. Feed entries are not ordered, and are not meant to be thought of as a
 closed collection. At least this is my initial intuition. BUT

Not all Atom Feed Documents are feeds, some are static collections of
entries. We keep tripping over this :-(

 I can think of a solution like the following: Let us imagine a top 20  feed
 where the resources being described by the entries are the position in the
 top list. So we have entries with ids such as
 
 http://TopOfThePops.co.uk/top20/Number1
 http://TopOfThePops.co.uk/top20/Number2
 http://TopOfThePops.co.uk/top20/Number3 ...

 will contain a new entry such as
 
   <entry>
     <title>Top of the pops entry number 1</title>
     <link href="http://TopOfThePops.co.uk/top20/Number1/"/>
     <id>http://TopOfThePops.co.uk/top20/Number1</id>
     <updated>2005-07-05T18:30:00Z</updated>
     <summary>Top of the pops winner for the week starting 5 July 2005</summary>
   </entry>

The problem here is that this doesn't describe the referent, it only
describes the reference. I want to see top 20 feeds where each entry
links to the referent in question. For example, the Amazon Top 10
Selling Books feed would link to the book-specific page at Amazon, not
to some page saying "the #3 selling book is at the other end of this
link".

 The idea of feeds linked to feeds has lots of utility -- feeds of  comments
 for one, and even a feed of feeds available on the site.
 
 I completely agree. And remember for any two things there is at least one way
 they are related. And there are many different ways feeds can be related  to
 each other. A feed may be an archival continuation of one - which is what the
 history:next ... link in my opinion addresses, but there are many other ways
 one can relate feeds.

I did a grep of an archive of atom-syntax messages ... lots of interesting
possibilities in there. James has set a nice example to copy from, so expect
a few I-Ds from me.

 Of the above, the mechanism of a single URI which redirects to the  current
 issue is a situation which would still need a flag indicating that the
 appropriate thing to do is to not persist older entries.
 
 I am starting to wonder whether this is really needed now that I have  looked
 at the top20 example I gave above.

Consider the Nature case study. They have separate feed documents for each
issue, but just one public URI published. The things which are entries are
not Top N things but entries in a Table of Contents, and it's useful to be
able to aggregate that list of articles over time.

 The other structure of feeds linking to feeds would require the  aggregator
 be able to do something useful with such links, but this can be  generalised
 and thus be useful for many purposes. As it is, right now with NNW  I can do
 something useful with such a feed: drag  drop the item headline  link to my
 subscriptions pane to subscribe to that feed and view the entries  therein.
 
 I myself have no problem with feeds being entries, feeds pointing to  other
 feeds, or anything like that. A feed is a resource. It can change. A  feed is
 simply a set of state changes to resources. It is that general.

What's needed is a sea change in the concept model aggregators have
regarding feed URIs.

Right now it's exactly as if you have to Bookmark a web page before you can
view it -- whereas I'd like to see feed browser where I can clickity click
to wherever without having to first clutter my subscriptions list, but if I
happen to like the 

Re: Feed History -02

2005-07-24 Thread Dave Pawson

On Sat, 2005-07-23 at 09:14 -0700, Mark Nottingham wrote:

  Archives *should not* change. I think any librarian will agree with  
  that.
 
 I very much agree that this is the ideal that should be striven for.


 The underlying problem, I think, is that different feeds have  
 different semantics.

I think that's the bottom line Mark. No matter what, people
are going to do different things with 'history'.

Trying to pin down how it should all work seems to increase
the complexity by an order, never a good thing IMHO.

How about: "You want my history, you make of it what you will. I only
guarantee it's valid Atom"?
At least that way, processing older feed material can
develop based on a sound (and clearly understood) foundation.

regards DaveP




Re: Feed History -02

2005-07-23 Thread Mark Nottingham



On 19/07/2005, at 2:04 AM, Henry Story wrote:

Clearly the archive feed will work best if archive documents, once  
completed (containing a
given number of entries) never change. Readers of the archive will  
have a simple way to know when
to stop reading: there should never be a need to re-read an archive  
page - they just never change.


The archive provides a history of the feed's evolution. Earlier
changes to the resources described by the feed will be found in older
archive documents and newer changes in the later ones. One should
expect some entries to be referenced in multiple archive feed
documents. These will be entries that have been changed over time.

Archives *should not* change. I think any librarian will agree with  
that.


I very much agree that this is the ideal that should be striven for.

However, there are some practical problems with doing it in this  
proposal.


First of all, I'm very keen to make it possible to implement history  
with currently-deployed feed-generating software; e.g., Movable  
Type. MT isn't capable of generating a feed where changed entries are  
repeated at the top out of the box, AFAIK.


Even if it (and other software) were, it would be very annoying to  
people whose feed software doesn't understand this extension; their  
"show me the latest entries in the blog" feed would become "show me  
the latest changed entries in the blog", and every time an entry was  
modified or spell-checked, it would show up at the top.


So, it's a matter of enabling graceful deployment. Most of the reason  
I have the fh:stateful flag in there is to allow people to explicitly  
say "I don't want you to do history on this feed", because so many  
aggregators are already doing history in their own way.


The underlying problem, I think, is that different feeds have  
different semantics. Some will want every change to be included,  
others won't; for example, a blog probably doesn't need every single  
spelling correction propagated. There are some fundamental questions  
about the nature of a feed that need to be answered (and, more  
importantly, agreed upon) before we get there; for example, we now  
say that the ordering isn't significant by default; while that's  
nice, most software is going to infer something from it, so we need  
an extension to say 'sort by this', *and* have that extension widely  
deployed.


I tried to approach these problems when I wrote the original proposal  
for this in Pace form; I got strong pushback on defining a single  
model for a feed's state. Given that, as well as the deployment  
issues, I intentionally de-coupled the state reconstruction (this  
proposal) from the state model (e.g., ordering, deletion, exact  
semantics of an archive feed, etc.), so that they could be separately  
defined.


Cheers,

--
Mark Nottingham http://www.mnot.net/



Re: Feed History -02

2005-07-22 Thread Stefan Eissing



Am 21.07.2005 um 16:13 schrieb Mark Nottingham:



On 19/07/2005, at 1:48 AM, Stefan Eissing wrote:
[...]


I have the feeling that clients will need to protect themselves from 
servers with almost infinite histories. So a client will probably 
offer a "XX days into the past, max NN entries" setting in its UI. Maybe 
that is all that's needed.


Good question. I was thinking roughly along these lines in the 
Security Considerations, but didn't want to lead people too much.




How about:
"In case feeds are served via HTTP, server implementations SHOULD 
offer ETag and Last-Modified headers on history documents (see RFC 
2616 xxx). Clients SHOULD persist ETag and Last-Modified information 
and use If-* headers to ease server load on history synchronization."


Hmm. Maybe not SHOULDs, but some prose might lead people in the right 
direction, perhaps an 'Implementer Notes' section?


Ok, a SHOULD is not really required. Let common sense prevail.

The remaining question for me is whether the fh:prev links point to 
history records or are just a way to split a large feed into chunks.


The more I think about it, the more I lean towards leaving the spec as 
is. Since clients do not trust servers very much, a clever client will 
most likely implement a sync strategy which frequently checks whether 
the prev URIs change and, once a day or after a week's absence, 
traverses the prev chain and retrieves all documents.


//Stefan



Re: Feed History -02

2005-07-19 Thread Stefan Eissing



Am 18.07.2005 um 23:21 schrieb Mark Nottingham:

On 18/07/2005, at 2:17 PM, Stefan Eissing wrote:


On a more semantic issue:

The described sync algorithm will work. In most scenarios the abort 
condition (e.g. all items on a historical feed are known) will also 
do the job. However this still means that clients need to check the 
first fh:prev document if they know all entries there - if my 
understanding is correct.


This is one of the unanswered questions that I left out of scope. The 
consumer can examine the previous archive's URI and decide as to 
whether it's seen it or not before, and therefore avoid fetching it if 
it already has seen it. However, in this approach, it won't see 
changes that are made in the archive (e.g., if a revision -- even a 
spelling correction -- is made to an old entry); to do that it either 
has to walk back the *entire* archive each time, or the feed has to 
publish all changes -- even to old entries -- at the head of the feed.


I left it out because it has more to do with questions about entry 
deleting and ordering than with recovering state. It's an arbitrary 
decision (I had language about this in the original Pace I made), but 
it seemed like a good trade-off between complexity and capability.


It is a valid starting point. I am just wondering what consequences it 
has for client implementations. Let's say CNN goes stateful: how would a 
client handle a history which soon consists of thousands of entries? 
How would a server best offer such a history to avoid clients 
retrieving it over and over again? Probably nobody has a good idea on 
that one, or?


I have the feeling that clients will need to protect themselves from 
servers with almost infinite histories. So a client will probably offer 
a "XX days into the past, max NN entries" setting in its UI. Maybe that 
is all that's needed.


How about:
"In case feeds are served via HTTP, server implementations SHOULD offer 
ETag and Last-Modified headers on history documents (see RFC 2616 xxx). 
Clients SHOULD persist ETag and Last-Modified information and use If-* 
headers to ease server load on history synchronization."
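A sketch of the client side of that suggestion (the storage shape is illustrative): persist the validators from the last response for each history document and replay them as If-* headers, so an unchanged document costs only a 304 Not Modified:

```python
def conditional_headers(cached):
    """Build If-* request headers from persisted cache validators.

    `cached` holds the ETag and Last-Modified values seen on the last
    fetch of a history document (either may be absent).  The server can
    then answer 304 instead of resending the whole document.
    """
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers
```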


//Stefan



Re: Feed History -02

2005-07-19 Thread Henry Story



On 18 Jul 2005, at 23:21, Mark Nottingham wrote:

On 18/07/2005, at 2:17 PM, Stefan Eissing wrote:


On a more semantic issue:

The described sync algorithm will work. In most scenarios the  
abort condition (e.g. all items on a historical feed are known)  
will also do the job. However this still means that clients need  
to check the first fh:prev document if they know all entries there  
- if my understanding is correct.




This is one of the unanswered questions that I left out of scope.  
The consumer can examine the previous archive's URI and decide as  
to whether it's seen it or not before, and therefore avoid fetching  
it if it already has seen it. However, in this approach, it won't  
see changes that are made in the archive (e.g., if a revision --  
even a spelling correction -- is made to an old entry); to do that  
it either has to walk back the *entire* archive each time, or the  
feed has to publish all changes -- even to old entries -- at the  
head of the feed.


Clearly the archive feed will work best if archive documents, once  
completed (containing a
given number of entries) never change. Readers of the archive will  
have a simple way to know when
to stop reading: there should never be a need to re-read an archive  
page - they just never change.


The archive provides a history of the feed's evolution. Earlier  
changes to the resources described by the feed will be found in older  
archive documents and newer changes in the later ones. One should  
expect some entries to be referenced in multiple archive feed  
documents. These will be entries that have been changed over time.

Archives *should not* change. I think any librarian will agree with  
that.



I left it out because it has more to do with questions about entry  
deleting and ordering than with recovering state. It's an arbitrary  
decision (I had language about this in the original Pace I made),  
but it seemed like a good trade-off between complexity and capability.


Does that make sense, or am I way off-base?

Is it worthwhile to think of something to spare clients and servers  
this lookup? Are the HTTP caching and If-* header mechanisms good  
enough to save network bandwidth?


An alternate strategy would be to require that fh:prev documents  
never change once created. Then a client can terminate the sync  
once it sees a URI it already knows. And most clients would not do  
more lookups than they are doing now...




I think this would be the correct strategy.
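Under that assumption (archive documents are immutable once created), the termination rule is simple to state in code. This is only a sketch; `fetch` and the data shapes are illustrative, not from the draft:

```python
def sync_new_documents(head_uri, fetch, known):
    """Walk the fh:prev chain, stopping at the first already-seen URI.

    `fetch(uri)` returns (entries, prev_uri_or_None); `known` is the set
    of archive URIs already synced.  Because archives never change, any
    URI we have already seen implies everything behind it is unchanged.
    """
    new_pages, uri = [], head_uri
    while uri is not None and uri not in known:
        entries, prev = fetch(uri)
        new_pages.append((uri, entries))
        uri = prev
    return new_pages
```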


Henry Story



Re: Feed History -02

2005-07-19 Thread Henry Story



On 19 Jul 2005, at 01:52, A. Pagaltzis wrote:


* Mark Nottingham [EMAIL PROTECTED] [2005-07-18 23:30]:


This is one of the unanswered questions that I left out of
scope. The  consumer can examine the previous archive's URI and
decide as to  whether it's seen it or not before, and therefore
avoid fetching it  if it already has seen it. However, in this
approach, it won't see  changes that are made in the archive
(e.g., if a revision -- even a  spelling correction -- is made
to an old entry); to do that it either  has to walk back the
*entire* archive each time, or the feed has to  publish all
changes -- even to old entries -- at the head of the feed.



These are the kinds of things my “hub archive feed” situation was
supposed to address. Because the links are all in one place, the
consumer only has to suck down one document in order to be
informed of all archive feeds and be able to decide which ones
he wants to re-/get.


I wonder if what you are trying to describe here is not a different
concept altogether from an archive feed. I guess that both are
completely orthogonal concepts.

Feeds tend to specialize in a number of resources they track. What would
also be useful would be a document that described the resources tracked
by a feed. This would be closer to a directory listing. It would help
point to the current state of the resources tracked by the feed.

So when one subscribed to a feed one could then quickly get a list of
all the resources that the feed had responsibility for. As this could be
quite large, some form of navigation may be necessary. Perhaps this is
the type of thing that the protocol group is working on.

Henry Story



Regards,
--
Aristotle Pagaltzis // http://plasmasturm.org/







Re: Feed History -02

2005-07-19 Thread Antone Roundy


On Monday, July 18, 2005, at 01:59  AM, Stefan Eissing wrote:
Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. 
As an alternative one could drop fh:stateful and define that an empty 
fh:prev (referring to itself) is the last document in a stateful feed. 
That would eliminate the cases of wrong mixes of fh:stateful and 
fh:prev.


The problem is that an empty @href in fh:prev is subject to xml:base
processing, and who knows what the current xml:base is going to be when
you get to it. Is there a way to explicitly make xml:base undefined?
If I'm not mistaken, xml:base="" doesn't do it--it just adds nothing to
the existing xml:base. If there is a way, you could say <link
rel="fh:prev" href="" xml:base="[whatever value sets it to
undefined]" />, but otherwise, using an empty @href is probably
overloading the wrong attribute. A different @rel value like
fh:noprev (with an empty link, since it doesn't matter what it
actually points to) might be a step up, but using any kind of link to
indicate the lack of a link is a little odd.
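Antone's concern can be checked directly against RFC 3986 reference resolution, which is what xml:base processing ultimately delegates to. A minimal sketch (the base URI below is purely illustrative): an empty reference resolves to the in-scope base URI itself, so an empty @href can never mean "no link".

```python
from urllib.parse import urljoin

# Hypothetical in-scope xml:base for an archive document.
base = "http://example.org/archives/feed.atom"

# Per RFC 3986 section 5.4, an empty reference resolves to the base
# URI itself (minus any fragment) -- so href="" points back at the
# containing document, not at "nothing".
print(urljoin(base, ""))  # -> http://example.org/archives/feed.atom
```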




Re: Feed History -02

2005-07-19 Thread Antone Roundy


On Tuesday, July 19, 2005, at 12:29  PM, Antone Roundy wrote:

On Monday, July 18, 2005, at 01:59  AM, Stefan Eissing wrote:
Ch 3. fh:stateful seems to be only needed for a newborn stateful
feed. As an alternative one could drop fh:stateful and define that an
empty fh:prev (referring to itself) is the last document in a stateful
feed. That would eliminate the cases of wrong mixes of fh:stateful
and fh:prev.


The problem is that an empty @href in fh:prev is subject to xml:base
processing, and who knows what the current xml:base is going to be
when you get to it. Is there a way to explicitly make xml:base
undefined? If I'm not mistaken, xml:base="" doesn't do it--it just
adds nothing to the existing xml:base. If there is a way, you could
say <link rel="fh:prev" href="" xml:base="[whatever value sets it to
undefined]" />, but otherwise, using an empty @href is probably
overloading the wrong attribute. A different @rel value like
fh:noprev (with an empty link, since it doesn't matter what it
actually points to) might be a step up, but using any kind of link to
indicate the lack of a link is a little odd.


Yikes, I should have caught up on the xml:base thread first!  Looks 
like the jury's out, or at least hung, on this issue.




Re: Feed History -02

2005-07-18 Thread Henry Story


Sorry I did not participate in the previous discussion for format 00. I only
just realized this was going on. What is clear is that this is really needed!

I agree with Stefan Eissing's random thought that it may not be a good idea
to use Atom for a top 10 feed. Atom entries are not ordered in a feed, for
one. Also, as I understand it, an entry in a feed is best thought of as a
state of an external resource at a time. Making a feed of the top x entries
is to use the feed as a closed collection, whereas I think it is correctly
interpreted as an open one.

If that is right, and so fh:stateful is not needed, then would it not be
simpler to extend the link element in the following really simple way:

<link rel="http://purl.org/syndication/history/1.0" type="application/atom+xml"
      href="http://example.org/2003/11/index.atom" />

Just a thought.

In any case I really look forward to having this functionality. Thanks a lot
for the huge effort you have put into presenting this idea so clearly and
with such patience.


Henry Story


On 18 Jul 2005, at 09:59, Stefan Eissing wrote:


Am 16.07.2005 um 17:57 schrieb Mark Nottingham:


The Feed History draft has been updated to -02;

  http://ftp.ietf.org/internet-drafts/draft-nottingham-atompub-feed-history-02.txt


The most noticeable change in this version is the inclusion of a  
namespace URI, to allow implementation.


I don't intend to update it for a while, so as to gather  
implementation feedback.




Just a couple of thoughts on reading the document:

Ch 3. fh:stateful seems to be only needed for a newborn stateful
feed. As an alternative one could drop fh:stateful and define that
an empty fh:prev (referring to itself) is the last document in a
stateful feed. That would eliminate the cases of wrong mixes of
fh:stateful and fh:prev.


Ch 5. inserting pseudo-entries into an incomplete feed: would it  
make sense to have a general way to indicate such pseudo entries? A  
feed entry can also get lost at the publisher and the publisher  
might want to indicate that there once was a feed entry, but that  
he no longer has the (complete) document.


//Stefan

Random thoughts:
The example of a top 10 feed (Ch 1) needs some thinking: there
are quite a few people interested in the history of the top 10 when it
comes to music charts. One _could_ make this an Atom feed and use
the feed history to go back in time. But the underlying model is
different from the one Atom has, so maybe it's not such a good idea
after all. (Is there any ordering in a feed, btw? I know a client
can sort by date, but does someone rely on document order of XML
elements?)






Re: Feed History -02

2005-07-18 Thread James M Snell


Henry Story wrote:



Sorry I did not participate in the previous discussion for format 00. I only
just realized this was going on. What is clear is that this is really needed!

I agree with Stefan Eissing's random thought that it may not be a good idea
to use Atom for a top 10 feed. Atom entries are not ordered in a feed, for
one. Also, as I understand it, an entry in a feed is best thought of as a
state of an external resource at a time. Making a feed of the top x entries
is to use the feed as a closed collection, whereas I think it is correctly
interpreted as an open one.

I disagree. Atom could be used very easily for a top 10 feed. What is
needed is a simple extension that provides rank orderings for entries.
Something as simple as the following would work...


<feed ...>
  ...
  <entry>
    ...
    <r:index>1</r:index>
  </entry>
  <entry>
    <r:index>2</r:index>
  </entry>
  <entry>
    <r:index>3</r:index>
  </entry>
</feed>
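James's rank-ordering idea boils down to "sort by r:index instead of document order". A minimal consumer-side sketch, assuming an invented namespace URI for the r: prefix (the message does not define one):

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
R_NS = "http://example.org/ns/rank"  # hypothetical namespace for r:index

feed_xml = f"""
<feed xmlns="{ATOM_NS}" xmlns:r="{R_NS}">
  <entry><title>bravo</title><r:index>2</r:index></entry>
  <entry><title>alpha</title><r:index>1</r:index></entry>
  <entry><title>charlie</title><r:index>3</r:index></entry>
</feed>"""

root = ET.fromstring(feed_xml)
entries = root.findall(f"{{{ATOM_NS}}}entry")
# Order entries by their r:index rank rather than by document order.
entries.sort(key=lambda e: int(e.findtext(f"{{{R_NS}}}index")))
titles = [e.findtext(f"{{{ATOM_NS}}}title") for e in entries]
print(titles)  # -> ['alpha', 'bravo', 'charlie']
```

This also sidesteps Stefan's ordering question: the client never relies on XML document order at all.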

If that is right, and so fh:stateful is not needed, then would it not
be simpler to extend the link element in the following really simple way:

<link rel="http://purl.org/syndication/history/1.0" type="application/atom+xml"
      href="http://example.org/2003/11/index.atom" />

I actually can't recall what my opinion on this used to be :-( but right now
I'm thinking that a custom link relation is the right approach.


- James



Re: Feed History -02

2005-07-18 Thread Mark Nottingham



On 18/07/2005, at 11:10 AM, James M Snell wrote:
Ch 3. fh:stateful seems to be only needed for a newborn stateful
feed. As an alternative one could drop fh:stateful and define
that an empty fh:prev (referring to itself) is the last document
in a stateful feed. That would eliminate the cases of wrong mixes
of fh:stateful and fh:prev.



+1. After going through this, fh:stateful really doesn't seem to be
necessary. The presence of fh:prev would be sufficient to indicate
that the feed has a history, and a blank fh:prev would work fine to
indicate the end of the history.


I thought about the comments on the plane yesterday, and I agree.  
However, I'm wary of special URI values; also, I want to preserve  
stateful=false.


So, what about saying that you can omit fh:stateful *if* fh:prev is  
in the feed?



--
Mark Nottingham http://www.mnot.net/



Re: Feed History -02

2005-07-18 Thread James M Snell


Mark Nottingham wrote:



On 18/07/2005, at 11:10 AM, James M Snell wrote:

Ch 3. fh:stateful seems to be only needed for a newborn stateful
feed. As an alternative one could drop fh:stateful and define that
an empty fh:prev (referring to itself) is the last document in a
stateful feed. That would eliminate the cases of wrong mixes of
fh:stateful and fh:prev.



+1. After going through this, fh:stateful really doesn't seem to be
necessary. The presence of fh:prev would be sufficient to indicate
that the feed has a history, and a blank fh:prev would work fine to
indicate the end of the history.



I thought about the comments on the plane yesterday, and I agree.  
However, I'm wary of special URI values; also, I want to preserve  
stateful=false.


So, what about saying that you can omit fh:stateful *if* fh:prev is  
in the feed?



--
Mark Nottingham http://www.mnot.net/


I would say that if fh:prev is present, stateful=true is assumed. If
fh:prev is not present, stateful=false is assumed. Omit fh:prev in the
final feed in the chain and you know you've reached the end.
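James's defaulting rule is small enough to state as code. A sketch only: the function names and signatures are illustrative, not from any draft, and an explicit fh:stateful (which Mark wants to preserve) is modeled as an optional override.

```python
from typing import Optional

def assumed_stateful(prev: Optional[str], stateful: Optional[bool]) -> bool:
    # An explicit fh:stateful value wins (Mark's stateful="false" case);
    # otherwise the presence of fh:prev implies stateful=true and its
    # absence implies stateful=false, per James's proposal.
    if stateful is not None:
        return stateful
    return prev is not None

def end_of_chain(prev: Optional[str]) -> bool:
    # An archive document with no fh:prev is the final one in the chain.
    return prev is None

print(assumed_stateful("http://example.org/a1.atom", None))  # -> True
print(assumed_stateful(None, None))                          # -> False
```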


- James



Re: Feed History -02

2005-07-18 Thread James M Snell


There is precedent for using atom:link in RSS feeds; see
http://feeds.feedburner.com/ITConversations-EverythingMP3. I really
don't think it's a problem.


Mark Nottingham wrote:



That's what I originally did, but I have a rather strong preference  
to make a single syntax work in RSS and Atom. atom:link is  
(naturally) specific to Atom, and people will balk at using the atom  
namespace in RSS feeds.


That's not to say that every Atom extension should be usable in RSS,  
but I think this one is simple -- and valuable -- enough to do it;  
and, I don't see any technical benefit to using the link relation.


Cheers,

On 18/07/2005, at 12:32 PM, Henry Story wrote:

If that is right, and so fh:stateful is not needed, then would it
not be simpler to extend the link element in the following really simple way:

<link rel="http://purl.org/syndication/history/1.0" type="application/atom+xml"
      href="http://example.org/2003/11/index.atom" />

Just a thought.

Just a thought.




--
Mark Nottingham http://www.mnot.net/






Re: Feed History -02

2005-07-18 Thread Mark Nottingham



On 18/07/2005, at 2:17 PM, Stefan Eissing wrote:


On a more semantic issue:

The described sync algorithm will work. In most scenarios the abort
condition (e.g. all items in a historical feed are known) will also
do the job. However, this still means that clients need to check the
first fh:prev document even if they know all entries there - if my
understanding is correct.


This is one of the unanswered questions that I left out of scope. The
consumer can examine the previous archive's URI and decide whether it
has seen it before, and thereby avoid fetching it if it has. However,
in this approach, it won't see changes that are made in the archive
(e.g., if a revision -- even a spelling correction -- is made to an
old entry); to do that it either has to walk back the *entire*
archive each time, or the feed has to publish all changes -- even to
old entries -- at the head of the feed.


I left it out because it has more to do with questions about entry
deletion and ordering than with recovering state. It's an arbitrary
decision (I had language about this in the original Pace I made), but
it seemed like a good trade-off between complexity and capability.


Does that make sense, or am I way off-base?


Is it worthwhile to think of something to spare clients and servers
this lookup? Are the HTTP caching and If-* header mechanisms good
enough to save network bandwidth?


An alternate strategy would be to require that fh:prev documents
never change once created. Then a client can terminate the sync
once it sees a URI it already knows. And most clients would not do
more lookups than they are doing now...




--
Mark Nottingham http://www.mnot.net/



Re: Feed History -02

2005-07-18 Thread Stefan Eissing


On a more semantic issue:

The described sync algorithm will work. In most scenarios the abort
condition (e.g. all items in a historical feed are known) will also do
the job. However, this still means that clients need to check the first
fh:prev document even if they know all entries there - if my understanding
is correct.


Is it worthwhile to think of something to spare clients and servers this
lookup? Are the HTTP caching and If-* header mechanisms good enough to
save network bandwidth?
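What Stefan is gesturing at with the If-* headers can be modeled in a few lines. This toy validator is only an illustration of HTTP conditional-GET semantics (RFC 2616 era If-None-Match/ETag), not code from any draft; the ETag values are invented:

```python
def revalidate(request_headers: dict, etag: str, body: bytes):
    # A client that already holds an archive revalidates it with
    # If-None-Match; an unchanged archive then costs only a 304 with
    # an empty body instead of a full transfer.
    if request_headers.get("If-None-Match") == etag:
        return 304, b""
    return 200, body

status, payload = revalidate({"If-None-Match": '"v1"'}, '"v1"', b"<feed/>")
print(status)  # -> 304
```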


An alternate strategy would be to require that fh:prev documents never
change once created. Then a client can terminate the sync once it sees
a URI it already knows. And most clients would not do more lookups than
they are doing now...
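Under that proposed immutability rule, the termination condition is easy to sketch. The dict-based document model and the URIs below are stand-ins for real HTTP fetches and fh:prev links, chosen purely for illustration:

```python
def sync(head_url, fetch, seen_archives):
    """Walk the fh:prev chain, stopping at the first archive URI we
    already hold -- sound only if archive documents never change
    once created, as Stefan proposes."""
    # The head document always changes, so always refetch it.
    docs = [fetch(head_url)]
    url = docs[-1].get("prev")
    while url is not None and url not in seen_archives:
        docs.append(fetch(url))
        seen_archives.add(url)
        url = docs[-1].get("prev")
    return docs

# A toy feed store standing in for HTTP; "prev" holds the fh:prev URI.
store = {
    "head": {"prev": "a2"},
    "a2": {"prev": "a1"},
    "a1": {"prev": None},
}
seen = set()
print(len(sync("head", store.__getitem__, seen)))  # -> 3 (full walk)
print(len(sync("head", store.__getitem__, seen)))  # -> 1 (head only)
```

The second pass shows the saving: once the archives are known, only the head document is re-downloaded.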


//Stefan 



Re: Feed History -02

2005-07-18 Thread Mark Nottingham



On 18/07/2005, at 1:29 PM, Stefan Eissing wrote:


I agree that special URIs are not that great either. Another idea
might be nested elements:

stateful feed: <fh:history><fh:prev>http://example.org/thingie1.1</fh:prev></fh:history>
stateful initial feed: <fh:history/>
stateless feed: <fh:history><fh:none/></fh:history>


Hmm. My thinking was that allowing stateful to be omitted would be
concise and unambiguous; to compare,

stateful feed: <fh:prev>http://example.org/thingie1.1</fh:prev>
stateful initial feed: <fh:stateful>true</fh:stateful>
stateless feed: <fh:stateful>false</fh:stateful>


--
Mark Nottingham http://www.mnot.net/



Re: Feed History -02

2005-07-18 Thread Stefan Eissing



Am 18.07.2005 um 19:33 schrieb Mark Nottingham:

On 18/07/2005, at 1:29 PM, Stefan Eissing wrote:


I agree that special URIs are not that great either. Another idea
might be nested elements:

stateful feed: <fh:history><fh:prev>http://example.org/thingie1.1</fh:prev></fh:history>
stateful initial feed: <fh:history/>
stateless feed: <fh:history><fh:none/></fh:history>


Hmm. My thinking was that allowing stateful to be omitted would be
concise and unambiguous; to compare,

stateful feed: <fh:prev>http://example.org/thingie1.1</fh:prev>
stateful initial feed: <fh:stateful>true</fh:stateful>
stateless feed: <fh:stateful>false</fh:stateful>


Fine with me. As I said, the discussion has reached the syntactic
sugar level, and your proposal has the same semantics and no nested
elements. To be clear, I would advise clients that fh:prev takes
precedence over any fh:stateful information; then any ambiguity is
resolved.


//Stefan



Re: Feed History -02

2005-07-18 Thread James M Snell


Heh... the same questions could be asked about a lot of stuff embedded
in RSS, but that's not the issue ;-) ... fh:prev works fine. There
really isn't a strong argument in favor of link. I have my own personal
preferences but those are actually quite irrelevant :-) I'll still
maintain that fh:stateful is quite unnecessary but doesn't hurt anything
if it's used -- you'll just need to be explicit about such things as
what happens if fh:prev is used without fh:stateful being present.


Mark Nottingham wrote:

Not a problem, as such, but I don't see any benefit to reuse. It also  
begs the question of what an atom element in a non-Atom document  
means, how it's processed, etc.



On 18/07/2005, at 1:03 PM, James M Snell wrote:

There is precedent for using atom:link in RSS feeds; see
http://feeds.feedburner.com/ITConversations-EverythingMP3. I really don't
think it's a problem.




--
Mark Nottingham http://www.mnot.net/






Re: Feed History -02

2005-07-18 Thread Stefan Eissing



Am 18.07.2005 um 18:59 schrieb James M Snell:


Mark Nottingham wrote:



On 18/07/2005, at 11:10 AM, James M Snell wrote:

Ch 3. fh:stateful seems to be only needed for a newborn stateful
feed. As an alternative one could drop fh:stateful and define
that an empty fh:prev (referring to itself) is the last document
in a stateful feed. That would eliminate the cases of wrong mixes
of fh:stateful and fh:prev.



+1. After going through this, fh:stateful really doesn't seem to be
necessary. The presence of fh:prev would be sufficient to indicate
that the feed has a history, and a blank fh:prev would work fine to
indicate the end of the history.



I thought about the comments on the plane yesterday, and I agree.   
However, I'm wary of special URI values; also, I want to preserve   
stateful=false.


So, what about saying that you can omit fh:stateful *if* fh:prev is   
in the feed?



--  
Mark Nottingham http://www.mnot.net/



I would say that if fh:prev is present, stateful=true is assumed. If
fh:prev is not present, stateful=false is assumed. Omit fh:prev in
the final feed in the chain and you know you've reached the end.


stateful gives a hint to a client about caching entries and maybe their
representation in a user interface. It may be desirable to see that a new
feed is stateful (or not) even if it has no history yet. That is why I
came up with the empty prev link as a suggestion.


I agree that special URIs are not that great either. Another idea  
might be nested elements:


stateful feed: <fh:history><fh:prev>http://example.org/thingie1.1</fh:prev></fh:history>
stateful initial feed: <fh:history/>
stateless feed: <fh:history><fh:none/></fh:history>

So much for the syntactic sugar...

//Stefan



Feed History -02

2005-07-16 Thread Mark Nottingham


The Feed History draft has been updated to -02;

  http://ftp.ietf.org/internet-drafts/draft-nottingham-atompub-feed-history-02.txt


The most noticeable change in this version is the inclusion of a  
namespace URI, to allow implementation.


I don't intend to update it for a while, so as to gather  
implementation feedback.


Cheers,

--
Mark Nottingham http://www.mnot.net/