Re: Current and permalink link rel values

2007-02-23 Thread Antone Roundy


On Feb 23, 2007, at 7:16 AM, Elliotte Harold wrote:
I'd like to add multiple links to my feed for both the current  
version of the story and the permalink. E.g.

...

<link href="http://www.cafeconleche.org/#February_22_2007_30633"/>
<link rel="permalink" href="http://www.cafeconleche.org/oldnews/news2007February22.html#February_22_2007_30633"/>


Both of those would probably be best described as alternate links.   
The second one in particular is what alternate was intended to be  
used for.  However, RFC 4287 contains the following:


   o  atom:entry elements MUST NOT contain more than one atom:link
  element with a rel attribute value of alternate that has the
  same combination of type and hreflang attribute values.

So you couldn't keep both as alternate links.  In my opinion, you  
should use the second one (the longer lasting one) only, and omit the  
first (which is going to become invalid as soon as the entry falls  
off the page anyway -- anyone who used it to get to your page and  
bookmarked it, and anyone who follows it from a cached copy of your  
feed isn't going to be able to find the entry without a lot of  
needless digging through your archives). You should have a link to  
http://www.cafeconleche.org/ at the feed level.  While that won't  
link directly to that entry, it'll get people to it as long as it's  
on that page.
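
The RFC 4287 rule quoted above can be checked mechanically. What follows is only a sketch in Python, assuming the entry is available as a standalone XML string; the function name and the sample entry are illustrative and not from the original message. It relies on the RFC 4287 rule that a link with no rel attribute is interpreted as rel="alternate".

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"

def duplicate_alternates(entry_xml):
    """Return (type, hreflang) pairs used by more than one alternate link."""
    entry = ET.fromstring(entry_xml)
    seen = Counter()
    for link in entry.findall(ATOM + "link"):
        # RFC 4287: a missing rel attribute means rel="alternate".
        if link.get("rel", "alternate") == "alternate":
            seen[(link.get("type"), link.get("hreflang"))] += 1
    return [combo for combo, n in seen.items() if n > 1]

# Hypothetical entry: both links count as alternate links and share the
# same (absent) type and hreflang, so the pair is reported.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link href="http://www.cafeconleche.org/#February_22_2007_30633"/>
  <link rel="alternate" href="http://www.cafeconleche.org/oldnews/news2007February22.html#February_22_2007_30633"/>
</entry>"""

print(duplicate_alternates(SAMPLE))
```

An empty list means the entry satisfies this particular constraint; it says nothing about the rest of the entry's validity.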




Re: Query re: support of Media RSS extensions inside Atom feeds

2007-02-10 Thread Antone Roundy


On Feb 9, 2007, at 9:23 PM, John Panzer wrote:
Does anyone know of any issues with placing Yahoo! Media RSS  
extensions (which seem to fit the requirements for Atom extensions  
to me) inside an Atom feed?  Secondarily, do feed readers in  
general recognize MRSS inside either Atom or RSS?  Looking for  
field experience/implementor intentions here.


CaRP partially supports Media RSS in RSS (it doesn't directly support  
Atom at all, and Grouper, the companion script that converts Atom to  
RSS for it doesn't yet have Media RSS support, though I may add it in  
the next update).  It only looks at elements pointing to images  
(@type=image/*) and their types, heights and widths.  I added this  
in response to user requests--primarily, I believe, for use with  
Flickr feeds.


Antone



Re: AD Evaluation of draft-ietf-atompub-protocol-11

2006-12-16 Thread Antone Roundy


I'm not subscribed to the APP mailing list, so hopefully this isn't  
all redundant:


On 12/15/06, Lisa Dusseault wrote:
A model where servers aren't required to keep such information  
won't, in

practice, allow that kind of extension. If clients can't rely on their
markup getting stored, then clients can't extend Atom unilaterally  
using XML

markup.


There are two different issues here, which I think have been
mentioned, but which might bear being clearly stated:


1) Do servers have to keep all extension data?

2) Can a server accept an entry while discarding some or all
extension data, or does it have to reject the entry and return an
error code?


I think the answer to the first question is clearly no--servers  
shouldn't be required to store all arbitrary data that is sent to  
them.  So the questions are:


1) Which hurts more--data loss or rejected entries?

2) Is there any way to reduce that pain?

The pain of data loss is obvious--the data is lost.  The pain of  
rejected entries is having to fix and repost them or decide not to  
try again.


In either case, it might be useful to be able to query the server  
somehow to find out what it will and won't preserve.  If data is  
discarded, you can figure that out after the fact by loading the  
resulting entry and checking whether the data is all there, but one  
might prefer to know ahead of time if something is going to be lost  
in order to be able to decide whether to post it or not.  If the  
entry is just going to be rejected, unless there's a way for the  
server to communicate exactly which data it had issues with, fixing  
the data to make it acceptable could be extremely difficult (Hmm,  
I'll leave this data out and try again...nope, still rejected. I'll  
put that back in and leave this out...nope. I'll take both  
out...nope. I'll put both back in and take yet another piece of data  
out...).


So, how might a client query a server to see what it will preserve?   
A few possibilities:


1) Have some way to request some sort of description of what will and
won't be preserved and what might be altered.  I don't know how one
would go about responding to such an inquiry except to basically send
back a list of what will be preserved, including some way to say
"I'll preserve unknown attributes here", "I'll preserve unknown child
elements (and their children) here", "I'll store up to 32767 bytes
here", etc.  If there is any known extension markup that a server
wants to explicitly state that it won't preserve, there may need to
be a way to do that too.


2) Have a way to do a test post, where one posts the data one is
considering posting (or something structurally identical), but says
"don't store this--just tell me what you WOULD store".  The response
could include what would be returned if one were to load the data
after it was stored, or it could be some sort of list of anything
that would be discarded or altered.


3) (I get the impression this could be done without requiring
changes--is this the sort of process that has already been
selected?)  Post the data as a draft, reload it to see if it's all
still there.  If so, or if what has been preserved is acceptable,
change its status to "published" or whatever it's called.  If not,
delete it and give up or take whatever other action is appropriate.
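
The "reload it to see if it's all still there" step of option 3 could be approximated by comparing the entry that was sent with the entry the server returns. The following is a rough Python sketch under simplifying assumptions: it only compares element names and counts (not attributes or text), and the extension namespace and sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def dropped_elements(sent_xml, stored_xml):
    """Return names of elements that occur fewer times in the stored entry."""
    sent = Counter(el.tag for el in ET.fromstring(sent_xml).iter())
    stored = Counter(el.tag for el in ET.fromstring(stored_xml).iter())
    # Counter subtraction keeps only elements with a positive shortfall.
    return sorted((sent - stored).keys())

# Hypothetical entry with one extension element, and what a server that
# discards unknown markup might hand back.
SENT = """<entry xmlns="http://www.w3.org/2005/Atom"
                 xmlns:ext="http://example.org/ext">
  <title>Hello</title>
  <ext:mood>cheerful</ext:mood>
</entry>"""
STORED = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Hello</title>
</entry>"""

print(dropped_elements(SENT, STORED))
```

A client could use a non-empty result to decide whether to publish the draft or delete it.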



My impression is that data loss would be less painful and more easily  
dealt with than rejection of entries that won't be completely preserved.


...but I haven't followed the discussion, so what do I know.



Re: PaceEntryMediatype

2006-12-06 Thread Antone Roundy


On Dec 6, 2006, at 12:14 PM, Jan Algermissen wrote:
Following a link is not the same thing as subscribing to something.  
The act of subscribing is a local activity performed by the user  
agent. What you do when you follow the link to a feed is a GET.  
Your agent then decides if subscribing to that resource is a good  
idea. To make that decision, the agent has to look at the  
representation, and then it is insignificant overhead to see if the  
thing returns a feed or an entry.


...

Maybe I want to monitor a single media resource; an Atom media  
entry would be an ideal thing to do so (I'd rather look at the meta  
data than at the media resource upon each poll).


 I'd say: stick with the one media type that is currently there -  
there is no problem, just misconception about what it means to  
subscribe.


A user agent might want to be able to tell the difference between a
link to a feed and a link to an entry beforehand in order to:


1) be able to ignore the link to the entry (ie. not present it to the
user) if the user agent doesn't handle entry documents (rather than
presenting it as a "subscribe" link, only to have to say "sorry, it's
not a feed" after the user tries to subscribe).


2) be able to say "subscribe" to links to feeds, and "monitor" links
to entries (the user may not be interested in monitoring a single
entry for changes--if they can't tell that that's what the link is
for, they may end up needlessly doing so but think that they've added
another feed to their subscription list).





Re: PaceEntryMediatype

2006-12-06 Thread Antone Roundy


On Dec 6, 2006, at 4:26 PM, Jan Algermissen wrote:
Most feed readers know how to handle feeds, but have no idea how  
to handle entries.


So they should be fixed, should they not?


If the purpose of a feed reader is to subscribe to feeds and bring  
new and updated entries to the user's attention, then if they don't  
also handle the monitoring of single entry documents (interesting to  
some people in some cases, but I doubt interesting to most people),  
that's not necessarily something that needs fixing.



They seem to only have implemented half a media type.


...or they've implemented all of what should be covered by one media  
type.




Re: PaceEntryMediatype

2006-12-01 Thread Antone Roundy


On 12/1/06, Mark Baker wrote:

On 11/30/06, Thomas Broyer wrote:
All a media type tells you (non-authoritatively too) is the spec you
need to interpret the document at the other end of the link.  That has
very little to do with the reasons that you might want to follow the
link, subscribe to it, etc..  Which is why you need a mechanism
independent from the media type.  Like link types.


Now that this has sunk in, it makes a lot of sense--the @rel value
says "you can subscribe to that", "that is an alternative
representation of this", "that is where you'd go to edit this", and
so on.  The media type helps the user agent figure out whether it has
the capability to do those things.  For example, a feed reader that
only handles RSS could ignore "subscription" links to resources of type
application/atom+xml (ie. not present the subscription option to
the user).  The "subscribe to hAtom feed" case where @type is
text/html might be a little difficult to make a decision on, because
there's no indication of what microformat is being used by the
"feed" (or even if there's a microformat in use at all--maybe it
really is just an HTML page, and "subscribing" to it just means
"watch for changes to the entire document").  But in the case of bare
syndication formats, things should be clear enough.
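
The division of labour described above (@rel says what the link is for, @type says whether the user agent can handle it) might look roughly like this in a feed reader. This is an illustrative Python sketch only: "subscribe" is the hypothetical rel value floated in this thread, not a registered link relation, and the function name is made up.

```python
def offer_subscribe(rel, media_type, supported_types):
    """Decide whether to show a subscribe option for a link."""
    # @rel may hold several space-separated values, e.g. "subscribe alternate".
    rel_values = rel.lower().split()
    wants_subscription = "subscribe" in rel_values  # hypothetical rel value
    can_handle = media_type.lower() in supported_types
    return wants_subscription and can_handle

# A reader that only understands RSS.
RSS_ONLY = {"application/rss+xml"}

print(offer_subscribe("subscribe alternate", "application/atom+xml", RSS_ONLY))
print(offer_subscribe("subscribe", "application/rss+xml", RSS_ONLY))
print(offer_subscribe("alternate", "application/rss+xml", RSS_ONLY))
```

Only the second call yields True: the first link is a subscription link in a format the reader cannot parse, and the third is in a supported format but is not marked as something to subscribe to.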


So if it really is possible to do option 5 (new media type for entry
documents, and @rel values to solve the rest of the issues), and do
it cleanly, then that'd be my first choice.  If that's doomed (due to
a need to be backwards compatible with existing practice) to be a
mess of ambiguities and counter-intuitivities (eg. "alternate" means
"subscribe" when combined with a syndication type, except when it
might really mean "alternate" because it points to a feed archive
document, but anything with "feed" in it always means "subscribe"...)
then oh my.


One problem that I hadn't really thought clearly about till right now
is that understanding the nature of the thing linked TO may require
some understanding of the nature of the thing linked FROM.  For
example, an "alternate" link from a typical blog homepage to its feed
really does point to the same thing in an alternative format.  Both
are live documents in which new data gets added to the top, and old
data drops off the bottom.  But if you don't know that the webpage is
a live document, you wouldn't know whether the link pointed to a
static or live document.  "alternate" is perfectly accurate, but it's
not helpful enough.  "subscribe" would be much more explicit.


Which raises the question of how to point to a static alternative
representation of the data currently found in the document.
"alternate" WOULD be a good word to use for that except that it's
already being used to point to live feeds.  An option that would
almost surely cause confusion would be to use "alternative" for
static alternative representations.  The meaning of "static" wouldn't
exactly be intuitively clear.  Maybe something more long-winded like
(oh no! hyphenation!) "static-alternate" would do.  Or would "static
alternate" (and "alternate static" and "static foo alternate", etc.,
or perhaps "archive alternate", etc.) be better?  For backwards
compatibility (at least with UAs that don't expect only one value in
@rel), "subscribe alternate" (and "alternate subscribe", etc.) could
be used rather than simply "subscribe".


BTW, am I remembering correctly that "feed" is being promoted for use
the way I'm considering "subscribe" above?  If it's not already in
use, I'd think "subscribe" would be much better than "feed", because
"feed" could as easily mean "archive feed" as "subscription feed"--
it's just not explicit enough.


But perhaps this discussion all belongs in a different venue anyway...


But before I end, what about the question of a different media type  
for entry documents?  For the APP accept element issue, it sounds  
like maybe they do.  But for autodiscovery, maybe they don't.   
Perhaps neither @type nor @rel is the place to distinguish, for  
example, between the edit links for entries, their parent feeds,  
their per-entry comment feeds, monolithic comment feeds, etc.  (A  
media type for entry documents would only help with one of those).
Perhaps that is the domain of @title (title="Edit this entry", etc.)
Do UAs really need to know the difference, or do only the users need
to know?  Would making that information machine readable be worth the
pain involved (rel="edit monolithic parent comments"???)



Okay, that's all I can take for now.



Re: PaceEntryMediatype

2006-11-30 Thread Antone Roundy


On Nov 30, 2006, at 2:13 AM, Jan Algermissen wrote:

On Nov 29, 2006, at 7:22 PM, James M Snell wrote:

One such problem occurs in atom:link and atom:content elements.
Specifically:

  <atom:link type="application/atom+xml" href="a.xml" />
  <atom:content type="application/atom+xml" src="b.xml" />

Given no other information I have no way of knowing whether these are
references to Feed or Entry documents.


And what is the problem with that?


Here's one problem: in this and the autodiscovery case, the UA can't  
tell without fetching the remote resource whether it's appropriate to  
display a subscribe link.  In fact, even if the remote resource is  
a feed, it may not be appropriate to subscribe to, because it may be  
an archive document rather than the live end of a feed.


Of the options presented, I'd favor adding a type parameter to
application/atom+xml.  In addition to "feed" and "entry", we may want
"archive".
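
To make the type-parameter idea concrete, here is a small Python sketch of how a consumer might branch on such a parameter. The parameter values "feed", "entry" and "archive" are the ones proposed in this message; the action names and function names are invented for illustration, and the parsing is deliberately simplistic (it does not handle quoted strings containing semicolons).

```python
def parse_media_type(value):
    """Split a media type string into (type, params) with lowercase keys."""
    parts = [p.strip() for p in value.split(";")]
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"').lower()
    return parts[0].lower(), params

def link_action(type_attr):
    """Map a link's type attribute to a (hypothetical) user-agent action."""
    media_type, params = parse_media_type(type_attr)
    if media_type != "application/atom+xml":
        return "ignore"
    actions = {"feed": "subscribe", "entry": "monitor", "archive": "fetch-once"}
    # No parameter at all leaves the consumer where it is today: guessing.
    return actions.get(params.get("type"), "unknown")

for t in ("application/atom+xml;type=entry",
          "application/atom+xml; type=archive",
          "application/atom+xml",
          "text/html"):
    print(t, "->", link_action(t))
```

The "unknown" branch is the status quo the thread is complaining about: without the parameter, the consumer has to fetch the resource to find out what it is.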




Re: PaceEntryMediatype

2006-11-30 Thread Antone Roundy


Summary of thoughts and questions:

*** Problems with the status quo ***

A) Consumers don't have enough information (without retrieving the  
remote resource) to determine whether to treat a link to an Atom  
document as a link to a live feed, a feed archive, or an entry.  (Is  
it appropriate to poll the link repeatedly?  How should information  
about the link be presented to the user?)


B) APP servers can't communicate whether they will accept feed  
documents or only entry documents.



*** Possible solutions ***

1) Add a type parameter to the existing media type:

+ With the exception of a few details, the documents are all exactly  
the same format (does it contain a feed element, or does it start at  
the entry element, is it a live feed document or an archive, etc.),  
so a single media type makes the most sense (definitely for live  
feeds vs. archives, less certainly for feeds vs. entries).


- Some existing applications will ignore the parameter and may handle  
links to non-live-feeds inappropriately


- Some existing applications may not recognize
application/atom+xml;type=feed as something appropriate to handle the
same way they handle application/atom+xml now.


? I haven't been following development of the APP, so forgive my  
ignorance, but can parameters be included in the accept element?



2) Create (a) new media type(s) (whether like
application/atomentry+xml or application/atom.entry+xml):


+ Applications that currently treat all cases of application/atom+xml  
the same would ignore non-feed links until they were updated to do  
something appropriate with the new media type.


- Differentiating between live feeds and archives by media type seems  
really wrong since their formats are identical.  This isn't as big a  
negative for entry documents, but it still seems suboptimal to me.


- If a media type were created for archive documents, would APP  
accept including application/atom+xml imply acceptance of archive  
documents too?  Neither yes nor no feels like a satisfying answer.



3) Use @rel values to differentiate:

- That territory is already a bit of a mess, what with "feed" vs.
"alternate" vs. "alternate feed" vs. "feed alternate" -- why make it
worse?


+ That territory is already a bit of a mess, what with "feed" vs.
"alternate" vs. "alternate feed" vs. "feed alternate" -- why not work
on all these messy problems in the same place?


- That wouldn't help with the APP accept issue.


4) Create a new media type for entry documents, and add a parameter  
to application/atom+xml to differentiate between live and archive  
feeds (and for any other documents that have the identical format,  
but should be handled differently in significant cases).


- Doesn't prevent existing apps that ignore the parameter from  
polling archive documents.


+ Does solve the rest of the problems without the negatives of #2 above.


5) Create a new media type for entry documents, and use @rel values
to solve the issues that a new media type doesn't solve:


+/- Messy territory


If we were starting from scratch, I'd probably vote for #1.  Since  
we're not, I'd vote for #4 first, and perhaps #5 second, but I'd have  
to think about #5 more first.


Antone



Re: atom license extension (Re: [cc-tab] *important* heads up)

2006-09-06 Thread Antone Roundy


On Sep 6, 2006, at 7:51 AM, James M Snell wrote:
The problem with specifying a per-feed default license is that  
there is
currently no way of explicitly indicating that an entry does not  
have a

license or that any particular entry should not inherit the default
feed-level license.


With respect to atom:rights (from RFC 4287 section 4.2.10):

   If an atom:entry element does not contain an atom:rights element,
   then the atom:rights element of the containing atom:feed element, if
   present, is considered to apply to the entry.

Thus, at the entry level, <atom:rights /> would (certainly ought to!)
detach a feed level atom:rights element from the entry without
replacing it with anything.  With <link rel="license"...>, I'm not
sure how you'd do the same thing.  Is it possible to specify a null
URI?  <link rel="license" href="" /> points to the in-scope xml:base
URI, right?  Perhaps the specification could define a "null license"
URI.
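
The inheritance rule quoted from RFC 4287 section 4.2.10, together with the reading suggested here that an empty entry-level atom:rights detaches the feed-level one, can be sketched in Python as follows. The function name and sample feed are illustrative; treating an empty element as "no rights text applies" is this message's interpretation rather than something the RFC spells out.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def effective_rights(feed, entry):
    """Return the rights text that applies to entry, or None if none exists."""
    own = entry.find(ATOM + "rights")
    if own is not None:
        # An entry-level element, even an empty one, takes precedence.
        return own.text or ""
    inherited = feed.find(ATOM + "rights")  # direct child of the feed only
    return inherited.text if inherited is not None else None

# Hypothetical feed: entry "a" opts out with an empty element, "b" inherits.
FEED = ET.fromstring("""<feed xmlns="http://www.w3.org/2005/Atom">
  <rights>Copyright 2006 Example</rights>
  <entry><id>a</id><rights/></entry>
  <entry><id>b</id></entry>
</feed>""")

for entry in FEED.findall(ATOM + "entry"):
    print(entry.findtext(ATOM + "id"), repr(effective_rights(FEED, entry)))
```

Nothing comparable falls out naturally for link rel="license", since an empty href still resolves to a URI; that is the gap a "null license" URI would fill.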


With respect to the issue of aggregate feeds, I had thought that the  
existence of an atom:source element at the entry level blocked any  
inheritance of the feed metadata, but looking at RFC 4287, I don't  
see that explicitly stated.  Certainly if atom:source contains  
atom:rights, then that element overrides the feed-level atom:rights  
of the aggregate feed, but if neither atom:source nor atom:entry  
contains an atom:rights element, what then?  Perhaps in that case,
the aggregator should add <atom:rights /> as a child of atom:source
(I'd think that preferable to adding it as a child of atom:entry).


On Sep 6, 2006, at 4:38 AM, Thomas Roessler wrote:

So, here's the proposal:

- Use <link rel="license"/> for entry licenses -- either on the feed
  level, setting a default analogous to what atom:rights does, or on
  the element level.

- Introduce <link rel="collection-license"/> (or whatever else you
  find suitable) for licenses about the collection, to be used only
  on the feed level.


If there's a @rel="license" at the feed level, but no
rel="collection-license", does the @rel="license" also become a
"collection-license"?  (People who don't read the spec would probably
think so).  If there is no license for the collection, but one wishes
to specify a default license for the entries, a "null license" would
once again be useful.


Antone



Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests

2006-06-28 Thread Antone Roundy


On Jun 28, 2006, at 12:06 PM, A. Pagaltzis wrote:

* James M Snell [2006-06-28 20:00]:

A. Pagaltzis wrote:

* James M Snell [2006-06-28 14:35]:

Hiding the div completely from users of Abdera would mean
potentially losing important data (e.g. the div may contain
an xml:lang or xml:base) or forcing me to perform additional
processing (pushing the in-scope xml:lang/xml:base down to
child elements of the div).


How is that any different from having to find ways to pass
any in-scope xml:lang/xml:base down to API clients when the
content is type=html or type=text? I hope you didn’t punt
on those?


Our Content interface has methods for getting to that
information.


Then stripping the `div` is not an issue, is it?


Consider this:

<entry xml:lang="en" xml:base="http://example.com/foo/">
...
<content type="xhtml">
    <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
</content>
</entry>

Whether there's a problem depends on whether one requests the  
xml:base, xml:lang, or whatever for the atom:content element itself  
or for the CONTENT OF the atom:content element, in which case the  
library could return the values it got from the xhtml:div.  Except in  
unusual cases like this, the result would be identical.


Certainly a distinction could be made between how an XML library  
would handle this vs. how an Atom library would handle it.  An Atom  
processing library might be expected to be able to do things like:


* give me the raw contents of the atom:content element
* give me the contents of the atom:content element converted to
well-formed XHTML (whether it started as text, escaped tag soup, or
inline xhtml)


In the former case, keeping the div feels like the right thing to do-- 
the consuming app would have to know to remove it.  In the latter  
case, removing the div from xhtml content feels like the right thing  
to do.  But unless the library gives me the xml:base, for example,  
which applies to the content of the atom:content element (as  
converted to well-formed xhtml or whatever), as opposed to the  
xml:base which applied to the atom:content element itself, there's  
potential for trouble.


...now that I think about it, if the library always returns the  
xml:base which applies to the content of the element, that could  
cause trouble too:


<entry xml:lang="en" xml:base="http://example.com/">
...
<content type="xhtml">
    <xhtml:div xml:lang="fr" xml:base="feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
</content>
</entry>

Here, if I get xml:base for the content of content, it will be
"http://example.com/feu/".  Then, if I get the raw content of the
element, strip the div, and apply xml:base myself, I'll erroneously
use "http://example.com/feu/feu/" as the base URI unless I know to
ignore the xml:base attribute on the div.
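
The double-application mistake described above is easy to reproduce with ordinary relative-reference resolution. A minimal Python sketch using the standard library's urljoin; the variable names are illustrative:

```python
from urllib.parse import urljoin

entry_base = "http://example.com/"
div_attr = "feu/"  # the relative xml:base found on the xhtml:div

# Correct: the div's xml:base is resolved once against the entry's base.
div_base = urljoin(entry_base, div_attr)
correct = urljoin(div_base, "axe.html")

# Mistake: the library already reported div_base as the base of the content,
# and the consumer then applies the div's relative xml:base a second time.
doubled_base = urljoin(div_base, div_attr)
wrong = urljoin(doubled_base, "axe.html")

print(correct)  # http://example.com/feu/axe.html
print(wrong)    # http://example.com/feu/feu/axe.html
```

With an absolute xml:base on the div, as in the earlier example, applying it twice is harmless, which is why the problem only shows up in the relative case.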




Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests

2006-06-28 Thread Antone Roundy


On Jun 28, 2006, at 3:10 PM, Robert Sayre wrote:

The content in the entries below should be handled the same way:

<entry xml:lang="en" xml:base="http://example.com/foo/">
  ...
  <content type="xhtml">
  <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
  </content>
</entry>

<entry xml:lang="en" xml:base="http://example.com/foo/">
  ...
  <content type="xhtml" xml:lang="fr" xml:base="http://example.com/feu/">
  <xhtml:div><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
  </content>
</entry>


Of course the end result of both should be identical.  Is that what
you mean by "should be handled the same way"?  The question is, if
the xhtml:div is stripped by the library before handing it off to the  
app, how is the app going to get the attributes that were on the  
div?  Is the library going to push those values down into the content  
or act as if they were on the atom:content element (or something  
similar to that)?


BTW, it just occurred to me that pushing them down into the content  
won't work.  Here's an example where that would fail:


<entry xml:lang="en">
  ...
  <content type="xhtml">
  <xhtml:div xml:lang="fr">Oui!</xhtml:div>
  </content>
</entry>

Notice that there are no elements inside the xhtml:div for xml:lang  
to be attached to (and even if there were any, any text appearing  
outside of them would not have the correct xml:lang attached to it).


So it looks like the options (both of which a single library could
support, of course) are:


* Strip the div, but provide a way to get the attributes that were on it
or
* Leave the div
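
The first option (strip the div, but keep its attributes reachable) might be sketched like this in Python. This is not how Abdera or any other library actually does it; the function name and return shape are invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_DIV = "{http://www.w3.org/1999/xhtml}div"
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

def unwrap_div(content):
    """Return (wrapper div attributes, leading text, child elements)."""
    div = content.find(XHTML_DIV)
    if div is None:
        raise ValueError("xhtml content must have an xhtml:div child")
    # The attributes travel alongside the content instead of being pushed
    # down into child elements, so bare text keeps its xml:lang.
    return dict(div.attrib), div.text or "", list(div)

CONTENT = ET.fromstring(
    '<content xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:xhtml="http://www.w3.org/1999/xhtml" type="xhtml">'
    '<xhtml:div xml:lang="fr">Oui!</xhtml:div></content>'
)

attrs, text, children = unwrap_div(CONTENT)
print(attrs.get(XML_NS + "lang"), text, children)
```

For the "Oui!" example there are no child elements at all, so returning the div's attributes separately is the only way the language survives the stripping.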



Re: Feed Thread in Last Call

2006-05-18 Thread Antone Roundy


On May 18, 2006, at 8:10 AM, Brendan Taylor wrote:

Do you have any suggestions about how this metadata could be included
without changing the content of the feed? AFAICT the only solution  
is to

not use the attributes (which aren't required, of course).


If it's in the feed document and it gets updated other than when the  
entry itself is updated (...and it wouldn't be of much use if it were  
only updated when the entry was updated), it's going to result in  
data getting re-fetched when nothing but the comment count and  
timestamp change.  I don't see any way around that.  So if you really  
want a way to publish comment counts and timestamps without causing
lots of unchanged data to get refetched, you're going to have
to separate that data out of the feed. Here's pseudo-XML for a  
possible approach:


<feed ...>
...
<link rel="comment-tracking" href="..." />
...
<entry>
<id>foo</id>
...
</entry>
<entry>
<id>bar</id>
...
</entry>
...
</feed>

and in another document:

<ct:comment-tracking xmlns:ct="..." xmlns:atom="..." ...>
<atom:link rel="related" href="URL of the feed" ... />
<ct:entry ref="foo">
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="5" ct:when="..." />
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="3" ct:when="..." />
</ct:entry>
<ct:entry ref="bar">
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="0" ct:when="..." />
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="1" ct:when="..." />
</ct:entry>
...
</ct:comment-tracking>

Of course the comment tracking document would only be
authoritative for feeds that pointed to it with a comment-tracking link.


This would require an extra subscription to track the comments, as  
well as understanding an additional format (as opposed to just an  
additional extension--either approach requires SOME additional work),  
but it would prevent unnecessary downloads by clients that aren't  
aware of it, and would reduce the bandwidth used by those that are.


This approach could be generalized to enable offloading of other  
metadata that's more volatile than the entries themselves.


Antone



Re: Feed Thread in Last Call

2006-05-18 Thread Antone Roundy


On May 18, 2006, at 12:31 PM, A. Pagaltzis wrote:

Actually, you don’t really need another format. There’s no reason
why you couldn’t use atom:feed in place of your hypothetical
ct:comment-tracking. :-) Your ct:entry elements could almost be
atom:entry ones instead, too, except that assigning them titles
and IDs feels like overkill.

The point of the whole exercise is to create a lightweight document  
for volatile metadata. If it's an atom:feed, you have to include a  
lot of stuff that's not needed here--atom:title, atom:updated,  
atom:author, and atom:summary or atom:content.  Also, you'd need to  
have an atom:id for each entry in addition to the @ref pointing to  
the entry that it talks about.



The real cost is not the cost of an extra format, but that
implementations then need to understand the FTE in order to know
to poll an extra document to retrieve the out-of-band metadata.

Sure, but if they don't understand FTE, they wouldn't know what to do  
with the extra metadata anyway even if it were in the main feed.   
They MIGHT be able to do some generic processing of the comments  
link, but the reliability of any generic processing algorithm for  
unknown link types is questionable since we left atom:link open to  
all sorts of uses.  And you COULD keep the comments links in the main  
feed but just leave off @count and @when for the benefit of apps that  
don't process the sibling document.


On May 18, 2006, at 11:48 AM, Antone Roundy wrote:
This approach could be generalized to enable offloading of other  
metadata that's more volatile than the entries themselves.

I don't know yet what other metadata might be handled this way, but  
here's slightly revised pseudo-XML that makes it more general and  
adds a few useful things:


<feed ...>
<id>foobar</id>
...
<link rel="volatile" href="..." />
...
<entry>
<id>foo</id>
...
</entry>
<entry>
<id>bar</id>
...
</entry>
...
</feed>

<v:volatile ref="foobar" xmlns:v="..." xmlns="http://www.w3.org/2005/Atom" xmlns:thr="..."><!-- @ref could be omitted if using with RSS -->
<link rel="related" href="URL of the feed" ... /><!-- don't really need something different from "related", right? -->
<updated>...</updated>
<v:entry ref="foo"><!-- @ref could be a guid if using with an RSS 2.0 feed, though we all know that RSS 2.0 guids are misused in ways that might make the connection unreliable -->
    <updated>...</updated>
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="5" thr:when="..." />
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="3" thr:when="..." />
</v:entry>
<v:entry ref="bar">
    <updated>...</updated>
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="0" thr:when="..." />
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="1" thr:when="..." />
</v:entry>
...
</v:volatile>




Re: Does xml:base apply to type=html content?

2006-03-31 Thread Antone Roundy


On Mar 31, 2006, at 7:01 AM, A. Pagaltzis wrote:

* M. David Peterson [2006-03-31 07:55]:

I'm speaking in terms of mashups... If a feed comes from one
source, then I would agree...  but mashups from both a
syndication as well as an application standpoint are becoming the
primary focus of EVERY major vendor. It's in this scenario that
I see the problem of assuming the xml:base in current context
has any value whatsoever.


No. That is only a problem if you just mash markup together
without taking care to preserve base URIs by adding xml:base
at the junction points as necessary.

Copying an atom:entry from one feed to another correctly requires
that you query the base URI which is in effect in the scope of
the atom:entry in the source feed, and add an xml:base attribute
to that effect to the copied atom:entry in the destination feed.
If you do this, any xml:base attributes within the copy of the
atom:entry will continue to resolve correctly.

It’s much easier to get right than copying markup without
violating namespace-wellformedness, even.


Exactly.  When creating a mashup feed, there are any number of things  
that the ... masher(?) has to be careful of--for example:


* Getting namespace prefixes right
* Creating an atom:source element and putting the right data into it
* Ensuring that all entries use the same character encoding
* Ensuring that the xml:lang in context is correct
* Ensuring that the xml:base in context is correct
* If any of the source data isn't Atom, ensuring that all the  
required elements exist (...even if the source data IS Atom--you  
never know when you're going to aggregate from an invalid Atom feed-- 
then you have to decide whether to fix the entry or drop it to make  
your output correct)


If we start assuming that mashers can't do those correctly, then we  
may as well not be using Atom, or even XML.  If we did a proper job  
of specifying Atom, then we should be able to hold publishers' feet  
to the fire and make them get their feeds right.  In Atom, xml:base  
is the mechanism used to determine base URIs.
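
The xml:base item in the list above, which is the step A. Pagaltzis describes for copying an entry between feeds, could be sketched in Python as follows. It assumes the entry is a direct child of the feed element and that the only xml:base attributes above it are on the feed and the entry themselves; the function name and sample URLs are illustrative.

```python
import copy
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def copy_entry_with_base(feed, entry, feed_url):
    """Copy a direct child entry of feed, pinning its in-scope base URI."""
    # Resolve outermost first: retrieval URI, then feed, then entry.
    base = urljoin(feed_url, feed.get(XML_BASE, ""))
    base = urljoin(base, entry.get(XML_BASE, ""))
    clone = copy.deepcopy(entry)
    clone.set(XML_BASE, base)  # now absolute, so safe in any destination feed
    return clone

# Hypothetical source feed fetched from a mirror.
FEED = ET.fromstring("""<feed xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://example.org/blog/">
  <entry><id>a</id><link href="2006/03/foo.html"/></entry>
</feed>""")
ENTRY = FEED.find(ATOM + "entry")

clone = copy_entry_with_base(FEED, ENTRY, "http://mirror.example.net/feed.atom")
print(clone.get(XML_BASE))
```

Any relative xml:base attributes deeper inside the copied entry keep resolving correctly, because they are now relative to the absolute base set on the copy.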




Re: Does xml:base apply to type=html content?

2006-03-31 Thread Antone Roundy


On Mar 30, 2006, at 10:30 PM, James M Snell wrote:

Antone Roundy wrote:

[snip]
2) If you're consuming Atom and you encounter a relative URI, how
should you choose the appropriate base URI with which to resolve it?

I think there are only three remotely possible answers to #2:
xml:base (including the URI from which the feed was retrieved if
xml:base isn't explicitly defined), the URI of the self link, and the
URI of the alternate link.  Given that Atom explicitly supports
xml:base, if it's explicitly defined, it's difficult to justify
ignoring it in favor of anything else.


There is no basis in any of the specs for using the URI of the self or
alternate link as a base uri for resolving relative references in the
content.  The process for resolving relative references is very
clearly defined.


Right--my point is:

1) If the original publisher made the mistake of using relative  
references without explicitly setting xml:base (figuring that  
consumers could resolve the references relative to the location of  
the feed), and then the feed got moved or mirrored, one would  
certainly fail at finding the things the publisher intended to point  
to if the URI from which the feed was retrieved was used as the base  
URI, but might succeed by using the self link as the base URI.  (I do  
not advocate doing this as default behavior, as stated below).


2) If the original publisher made the mistake of not even thinking  
about relative references in the content and therefore didn't set  
xml:base, the relative references may very well be relative to the  
location pointed to by the alternate link.  For example, the person  
generating the content may have been thinking "my blog entry will
appear at http://example.org/blog/2006/03/foo.html, so I can use the
relative URL ../../../img/button.gif to point to the image at
http://example.org/img/button.gif".  If the alternate link points to
http://example.org/blog/2006/03/foo.html, then the consumer that  
wants to find the image will only succeed by using the alternate link  
as the base URI.  (I do not advocate doing this as default behavior,  
as stated below).
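
To make the difference concrete, here is a small sketch of resolving the same relative reference against two candidate base URIs. The alternate link and the relative reference come from the example above; the feed location is a made-up mirror URL, used only for illustration.

```python
from urllib.parse import urljoin

relative = "../../../img/button.gif"

# The page the author had in mind when writing the relative reference.
alternate_link = "http://example.org/blog/2006/03/foo.html"

# Hypothetical URI from which a mirrored copy of the feed was retrieved.
feed_location = "http://feeds.example.net/example-blog/atom.xml"

via_alternate = urljoin(alternate_link, relative)
via_feed_location = urljoin(feed_location, relative)

print(via_alternate)
print(via_feed_location)
```

Resolving against the alternate link yields the image URL the author intended; resolving against the mirror's location points at the mirror's host, where the image does not live. Setting xml:base explicitly removes the guesswork.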


Moral of this story: failing to explicitly set xml:base is bad  
because it tempts consumers to ignore the spec in order to get what  
they want.  I do not advocate ignoring the spec as default behavior.   
But honestly, I might give the user of a consuming application the  
option of overriding the default behavior on specific feeds if they  
know that the publisher makes the mistake of publishing links  
relative to the self or alternate link without setting xml:base.  I'd  
LIKE to be able to hold the publisher's feet to the fire and make  
them fix the feed, but sometimes my users hold MY feet to the fire  
and make me give them usable workarounds.


Antone



xml:base in your Atom feed

2006-03-31 Thread Antone Roundy


Sam,

Funny that this should come up today given the recent discussion on  
the mailing list--NetNewsWire isn't getting the links in your Atom  
feed right.  I looked at the source, and it's clearly a NetNewsWire  
bug since it's not even trying to resolve relative to the URI from  
which it retrieves the feed.  In fact it appears to be resolving  
relative to the alternate link (<link href="/blog/"/>), and not doing
such a good job of it--for example, instead of
http://www.intertwingly.net/blog/2006/03/31/Rogers-Switches, it's
pointing to http:/blog/blog/2006/03/31/Rogers-Switches--but I wonder
whether it would get it right if you set xml:base explicitly.


Antone



Re: xml:base in your Atom feed

2006-03-31 Thread Antone Roundy


On Mar 31, 2006, at 4:12 PM, Sam Ruby wrote:

Antone Roundy wrote:

Sam,

Funny that this should come up today given the recent discussion on
the mailing list--NetNewsWire isn't getting the links in your Atom
feed right.


There is an off chance that I have been following the list.  ;-)


I certainly didn't mean to imply that you weren't--I just wanted to  
point out what I'm seeing in case you didn't know that this  
particular feed reader is having this particular problem today.  And  
I thought it might be of interest to the WG to know what NNW is doing  
given that it's doing something that has been argued against within  
the last 24 hours.


I don't remember which version of your feed I was subscribed to  
before--perhaps I wasn't subscribed to the Atom feed and NNW updated  
my subscription when you redirected to it. So I don't know whether  
you purposely removed xml:base to see what chaos would ensue, or  
whether it hasn't been there all along and I just haven't seen the  
problem since I was subscribed to a different version.




Re: atom:name ... text or html?

2006-03-23 Thread Antone Roundy


On Mar 23, 2006, at 9:48 AM, James Holderness wrote:
Hahaha! It's RSS all over again. In the words of Mark Pilgrim:
"Here's something that might be HTML. Or maybe not. I can't tell
you, and you can't guess." :-)


Seriously though, the atom:name element is described as a
"human-readable name", so unless your name really is "Betrand
Caf&eacute;" that can't be right. If RFC4287 had intended to allow
markup in the element it would have used atomTextConstruct.


I agree with James here--if we had intended for the name to be able  
to include markup, we should have used the construct we created to  
allow that.  This from RFC 4287 (section 3.2):


   element atom:name { text }

would have been this:

   element atom:name { atomTextConstruct }

if we had intended for it to be able to contain anything but literal  
text after XML un-escaping, right?


On Mar 23, 2006, at 9:57 AM, Eric Scheid wrote:
It's true that XML has only a half dozen or so entities defined,
meaning most interesting entities from html can't exist in XML ...
unless maybe they are wrapped in a CDATA block like above?

If they're wrapped in a CDATA block, then they don't trigger an XML
parsing error, but wrapping something in CDATA isn't a license to
enter data in a format other than what the RFC allows.


I'm getting the data by scraping an html page, so I'm expecting it
to be acceptable html code, including html entities.

You, the producer, are getting the data from an HTML page, so you
should certainly be prepared to handle HTML entities in it. But you,
the Atom publisher, are responsible for making sure that you've made
any changes to the data that are necessary for it to be proper Atom  
before you publish it. The consumer of the Atom feed doesn't know  
where you got the data, and thus can't be expected to decide how to  
process it based on where you got it.
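
As a rough sketch of the producer-side cleanup described above: decode the HTML entities in the scraped text, then escape only what XML itself requires before writing it into atom:name. The function name is made up for the example, and it assumes the scraped string contains entities but no HTML tags.

```python
import html
from xml.sax.saxutils import escape


def atom_name_from_html(scraped):
    """Turn scraped HTML text into XML-safe character data for atom:name.

    Hypothetical helper; assumes the input has no markup tags in it.
    """
    # Decode HTML character references such as &eacute; or &amp; into
    # the characters they stand for.
    plain = html.unescape(scraped)
    # Re-escape only &, < and >, which is all that XML character data needs.
    return escape(plain)
```

The result is literal text after XML un-escaping, which is what element atom:name { text } calls for.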




Re: Feed paging and atom:feed/atom:id

2006-03-10 Thread Antone Roundy


On 10 Mar 2006, at 18:44, James M Snell wrote:

If the feeds have the same atom:id, I would submit that they form a
single logical feed.  Meaning that all of the feed documents in an
incremental feed (using Mark's Feed History terminology) SHOULD use
the same atom:id value.  This is the way I have implemented paging in
our APP implementation.  If the linked feeds have different atom:id
values, they should represent different logical feeds.


Agreed.  From 4.2.6:

   Put another way, an atom:id element
   pertains to all instantiations of a particular Atom entry or feed;
   revisions retain the same content in their atom:id elements.

All the Atom Feed Documents representing one incremental feed (or  
parts of one incremental feed) are instantiations of a particular  
Atom ... feed, are they not?  So they should have the same value in  
atom:id.  If they don't, then they can't be considered instantiations  
of the same Atom feed.
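
A consumer could apply that rule with a check along these lines. This is only a sketch: the function name is invented for the example, and it takes each Atom Feed Document as a string of XML whose root element is atom:feed.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def same_logical_feed(*feed_documents):
    """True if every Atom Feed Document carries the same non-empty atom:id."""
    ids = set()
    for document in feed_documents:
        # findtext on the root only looks at direct children, so entry-level
        # atom:id elements are not picked up by mistake.
        feed_id = ET.fromstring(document).findtext(ATOM + "id")
        ids.add((feed_id or "").strip())
    return len(ids) == 1 and "" not in ids
```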




Re: IE7 Feed Rendering Issue

2006-03-09 Thread Antone Roundy


On Mar 9, 2006, at 12:07 PM, James M Snell wrote:

As an alternative, Feed Readers can provide publishers with a way of
specifying optionally applied styling for feeds and entries.. e.g.,

<feed>
  ...
  <link rel="stylesheet" type="..." href="..." />
  ...
  <entry>
    ...
    <link rel="stylesheet" type="..." href="..." />
    ...
  </entry>
</feed>


Given my opinion on the use of the link element, I suppose I should  
propose an alternative:


<ext:style type="text/css">
...
</ext:style>

or

<ext:style src="http://..." />

Either method permitted, like how we do atom:content.
'type="text/css"' optional, or is it needed?  Warning to those daring
to try the second that some feed readers won't bother downloading the
external file.  Warning to publishers that if they specify styles for
body, for example, some readers may say "there's no body element in
the content, so I'll ignore this rule" (so put the content in a
container with an ID or class and set the style for that instead),
and others may say "how dare you try to take over the styling of the
body when the body element isn't allowed in the content, I'll ignore
this rule", and others may just ignore all or some of it for whatever
reason they wish.  Can be at feed or entry level and be intended for
application to its siblings and their children (those with textual
content only--and of course, some clients may not apply it to all
siblings and children even if they are textual).  If we really want
to get fancy (big if), we could add @apply-to="content", but then you
get into the qnames in attributes problem...  Or we could specify
that it only applies to atom:content and perhaps atom:summary (and
any extension element that explicitly specifies that it applies).


Well, that's enough off the top of my head.

Antone



Re: Link rel attribute stylesheet

2006-02-27 Thread Antone Roundy


On Feb 26, 2006, at 9:10 PM, James Yenne wrote:
My feeds contain a generic xml-stylesheet, which formats the feed  
for display along with a feed-specific css.  Since xsl processors  
do not have a standard way to pass parameters to xsl stylesheets, I  
provide this feed-specific css to the xsl processor in the feed as  
a link with rel="stylesheet".  Generating xhtml with this xsl/css
solution works for rendering both in IE6 and FF1.5.  (Why does IE7  
rip out xml-stylesheet directives?)


A link rel="stylesheet" seems to be the most efficient solution;
however, a fully qualified URI relation does the job too.  I would
like to request a "stylesheet" link relation be added to the IANA
List of Relations and supported in the validators.  Thoughts?


One problem with this is that there's no machine readable way without  
an extension attribute to indicate what format the stylesheet is  
going to transform the data to.  If you're going to add an extension  
attribute, I'd suggest just making the whole thing an extension  
element instead.


Of course, my opinion is partly based on my preference (which the
group rejected) for limiting the link element to links intended
for traversal, so maybe that doesn't matter.  But certainly the
possibility should be considered that this is stretching the use of  
the link element beyond what it was designed for.


Antone



Re: Link rel attribute stylesheet

2006-02-27 Thread Antone Roundy


On Feb 27, 2006, at 8:29 AM, M. David Peterson wrote:
When you say "what it was designed for" can you be specific as to
what that definition is?

Well, we failed to gain consensus on that.  Some of us wanted it to
be used only for links intended to be traversed by the user (like the
<a> element in HTML with an href attribute--the link is there so that
the user can click it and get to the linked resource).  Others didn't
want this limitation, but wanted the link to be resolvable (e.g., no
tag: URIs).  Others wanted to be able to stick any URI in it.  So
there is no tightly defined "what it was designed for".


I'm just saying that if an extra attribute is required to  
disambiguate what's being pointed to in a case like the following  
(without requiring the link target to be loaded and inspected), then  
maybe you're trying to make this one element do too much:


<link rel="stylesheet" href="http://example.org/atom-2-rss-2.0.xsl" />
<link rel="stylesheet" href="http://example.org/atom-2-rss-1.0.xsl" />
<link rel="stylesheet" href="http://example.org/atom-2-fooml.xsl" />
etc.

If one were to encounter such a list of links at the top of an Atom  
document, which should one use?  Should one download all of them and  
then pick one?  Or are you going to add an attribute something like  
this:


<link rel="stylesheet" href="http://example.org/atom-2-rss-2.0.xsl"
      ext:targettype="application/xml+rss" />


Sorry, new to the conversation, but I have particular interest in  
this topic as it is my belief that the URI/IRI can be used to imply  
a lot of information that is otherwise hidden from view, or uses  
more complex mechanisms to achieve the same result.  If there is  
real concern as to this approach, it would be great to gain a  
greater understanding as what they are such that I can apply this  
to the work I am doing in this area.


For a particular example of what I mean, please see this post
http://www.xsltblog.com/archives/2006/02/what_rest_gets_1.html

Hmm.  If I'm reading that right, I wouldn't want to organize my
websites that way.  And unless the specification for the stylesheet
link relation were to mandate that URIs be constructed in a way that
enables readers to tell from the local path what type the stylesheet
is going to transform the feed to, you wouldn't have any way to know  
whether you could apply such an interpretation in any given case.  I  
don't really see the benefit of putting the information into the URI  
versus creating an attribute whose sole purpose is to specify the  
type.  The number of bits it would save is trivial, and it would  
require the extra step of parsing the URI's local path to pull  
information out of it that could be taken more easily from a  
dedicated attribute.


Antone



Re: partial xml in atom:content ?

2006-01-17 Thread Antone Roundy


On Jan 17, 2006, at 11:04 AM, James Holderness wrote:
but I think I've shown some pretty compelling reasons why a
producer (if they really absolutely have to use
application/xhtml+xml), would be wiser to use an xhtml document
fragment than a complete xhtml document.


I'm all for consuming applications that want to be really smart
checking whether the content of <content
type="application/xhtml+xml"> is a fragment or a complete document
and handling either, but if your content is an xhtml document
fragment, is there any reason at all to publish
type="application/xhtml+xml" rather than type="xhtml"?  The only
justification that comes to mind is if you
want to make a political protest statement against the required  
wrapper div.  But unless you prominently warn your users that your  
app is doing this, you're doing them a grave disservice by making  
their feed content less likely to be seen.




Re: partial xml in atom:content ?

2006-01-16 Thread Antone Roundy


On Jan 15, 2006, at 8:09 PM, James Holderness wrote:
Thus, can atom be used to ship around parcels of xml snippets? I
suppose it could, but only so long as both ends knew what was going
on, and knew naïve atom processors might barf on the incomplete xml,
right?


The one time I'd think it might be safe is with XHTML (as I  
mentioned in a previous message) since Atom processors are already  
required to handle XHTML fragments in the content element. Anything  
else would be highly risky unless it was a proprietary feed  
communicating between two known applications.


Processing type="xhtml" and type="application/xhtml+xml" are very
different beasts.  Say your application converts Atom feeds to HTML
to display in webpages.  With type="xhtml", the data could just be
dumped into the webpage (after appropriate stripping of nasty tags
and CSS and such).  With type="application/xhtml+xml", you'd have to
figure out what to do with everything outside of the body element.
If there's CSS involved for example, simply throwing it away could
lead to some very messed up display.  But assuming your application
is being called from within the webpage, it's not going to have the
opportunity to add a style section to the document's head.  So to
avoid losing the styling, for example, it would have to replace all
id="foo" and class="bar" attributes with style="all of the styling
for the id and class and parent classes, etc., with all cascading
applied".  In other words, it's not going to happen.  Given the
tremendously increased complexity involved, some apps are likely to  
refuse to process anything that's not one of Atom's three special types.




Re: partial xml in atom:content ?

2006-01-16 Thread Antone Roundy


On Jan 16, 2006, at 4:21 PM, James Holderness wrote:
For example, below are the results of some tests I've run on 15
aggregators. The tests included the use of a <div> tag as the root
element, a <p> tag as the root element, and an <html> tag as the
root element (i.e. a complete xhtml document).


The following applications worked with all three tests:
BlogBridge 2.7
Bloglines
BottomFeeder 4.1
Google Reader
Snarfer 0.1.2

The following applications worked with the <div> tag and the <p>
tag, but failed to handle a full document (the <html> tag):

FeedDemon 1.5
GreatNews 1.0.0.354
Newz Crawler 1.8.0
RSS Bandit 1.3.0.38
SharpReader 0.9.6.0


Out of curiosity, what constitutes success in the <html> case?  I'm
mostly curious about the browser based readers.  If they displayed
the content within a webpage, but failed to strip out the <html>,
</html>, <body> and </body> tags and head section (assuming the test
feed contained one), would that be a success or failure?  What did
the apps that failed do in the <html> case?




Re: Sponsored Links and other link extensions

2005-10-25 Thread Antone Roundy


On Oct 25, 2005, at 12:59 AM, A. Pagaltzis wrote:

I am asking if is there a generic way for an application to
implement alternate-link processing that gives sensible behaviour
for any type of main link. If an implementor has to support
alternative links explicitly for each type of main link, where’s
the difference to having specific relationships for alternative
links depending on the main link type?


Here are a few examples of generic processing algorithms an  
application might use:


Mirrors:
1) Randomly select a mirror to download from, thus helping to
spread the bandwidth usage among them.
2) Try the main link, and if the DNS lookup fails, or a connection  
can't be made or something, automatically try the next one.
3) Ping each of the servers in the background, and if the user clicks  
the link, use the fastest one.
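
Ideas 1 and 2 for mirrors can be combined in a few lines. The sketch below is only an illustration under stated assumptions: fetch is a caller-supplied function (for example a wrapper around urllib.request.urlopen) that raises OSError when the DNS lookup or connection fails, and the function name is made up.

```python
import random


def fetch_with_failover(main_url, mirror_urls, fetch):
    """Try the main link first, then the mirrors in random order.

    Returns (url_used, result). `fetch` is a hypothetical callable that
    raises OSError when a host cannot be reached.
    """
    # Shuffle a copy of the mirrors so load is spread among them.
    candidates = [main_url] + random.sample(mirror_urls, len(mirror_urls))
    last_error = None
    for url in candidates:
        try:
            return url, fetch(url)
        except OSError as error:
            # DNS failures and refused connections are OSError subclasses;
            # remember the error and move on to the next candidate.
            last_error = error
    raise last_error
```

Nothing here depends on the link relation; the same loop works for an enclosure, an alternate, or any other link that has mirrors.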


Alternates:
1) Have a prioritized list of formats, and choose the link that  
points to the highest priority format.
2) Of all the formats the app supports, choose the one with the  
smallest @length, if present.
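
Both selection rules for alternates fit in a short sketch. Links are represented here as plain dicts of attribute values, and the function names are invented for the example; neither is taken from any proposal in this thread.

```python
def pick_by_priority(links, preferred_types):
    """Return the link whose type ranks highest in preferred_types, or None."""
    supported = [link for link in links if link.get("type") in preferred_types]
    return min(
        supported,
        key=lambda link: preferred_types.index(link["type"]),
        default=None,
    )


def pick_smallest(links, supported_types):
    """Among supported links that declare a length, return the smallest, or None."""
    sized = [
        link for link in links
        if link.get("type") in supported_types and "length" in link
    ]
    return min(sized, key=lambda link: int(link["length"]), default=None)
```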


Either one:
1) Show some sort of UI for selecting which link to follow (perhaps  
have the main link selected by default, but allow the user to select  
an alternate from the popup).


None of those ideas is necessarily tied to any particular link  
relation.  They might be more important for enclosures than any of  
the other relations that have been defined so far, and an application  
may or may not do some for enclosures that it doesn't do for some  
other specific link relations.  But again, it comes back to the yet  
unanswered question, are there any disadvantages to keeping it  
generic?  I haven't heard anyone suggest any downside yet--only that  
some people can't imagine why anyone would want to use alternative  
links for anything but enclosures.




Re: Sponsored Links and other link extensions

2005-10-25 Thread Antone Roundy


On Oct 25, 2005, at 11:04 AM, James M Snell wrote:

All-in-one example

The x:group attribute links the two alternates into a single  
grouping; the x:mirror specifies the mirrors for each link.   
nf:follow="no" is my Atom Link No Follow extension that tells
clients not to automatically download the enclosure.  Dumb clients  
will see what amounts to the current status quo, two different  
enclosures of different types.  Smart clients will see the mirrors,  
the grouping and the no-follow instruction.


<link rel="enclosure" href="http://example.com/softwarepackage.zip"
      type="application/zip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.zip"
            title="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.zip"
            title="European Server" />
</link>
<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz"
      type="application/x-gzip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.tar.gz"
            title="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz"
            title="European Server" />
</link>

Thoughts?


The only thing I would change is the name of x:mirror/@title to make  
it clear that it isn't intended(?) to replace the parent link's  
@title.  My current favorite name is label.




Re: Sponsored Links and other link extensions

2005-10-25 Thread Antone Roundy


On Oct 25, 2005, at 1:16 PM, James M Snell wrote:
Also, assuming the title on the main link is supposed to describe  
the download file itself, there appears to be no way to inform the  
user of the mirror location of the main URI. Without a location  
name of some sort, the user can't make an informed decision about  
which mirror would be best to use. Perhaps something along the  
line of Antone's label suggestion might help here.




I could just do this:

<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz"
      type="application/x-gzip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example.com/softwarepackage.tar.gz"
            label="Main Server" />
  <x:mirror href="http://example2.com/softwarepackage.tar.gz"
            label="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz"
            label="European Server" />
</link>


or this:

<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz"
      type="application/x-gzip" x:group="software-package"
      x:label="Main Server" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.tar.gz"
            x:label="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz"
            x:label="European Server" />
</link>



Re: New Link Relations -- Ready to go?

2005-10-24 Thread Antone Roundy


On Oct 24, 2005, at 8:13 AM, James Holderness wrote:
With what we have so far we can do incremental feed archives; we can
do at least some form of searching; we can do non-incremental feeds
(of the "Top 10" variety) with history. I think that's a good start.


But we also want paged non-incremental feeds (OpenSearch result
feeds), while non-incremental feeds with history have not yet proven
to be needed.

I still don't see why OpenSearch result feeds can't be implemented
as incremental feeds.

Perhaps they can, but that wouldn't always be desirable. Consider
this scenario: Somebody writes a program that searches Google,  
scrapes the HTML results, and publishes them as an Atom feed.  My  
purpose in subscribing to the feed is not to be alerted when a new  
webpage is added to page 20 of Google's results, it's to be alerted  
whenever a new webpage makes it onto page 1.  So I don't want new  
pages added to the live end of the feed--I just want whatever is  
currently in the top 10 results, and my feed reader will tell me when  
one of them is one it hasn't seen before.
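
The reader-side behavior in that scenario amounts to comparing each snapshot against the set of entry ids already seen. A minimal sketch, with an invented function name and entry ids passed in as plain strings:

```python
def new_arrivals(current_entry_ids, seen_ids):
    """Return ids in the current snapshot not seen before, and record them.

    `seen_ids` is a set owned by the caller and updated in place.
    """
    fresh = [entry_id for entry_id in current_entry_ids if entry_id not in seen_ids]
    seen_ids.update(fresh)
    return fresh
```

Note that with this approach an entry that drops out of the top 10 and later returns is not reported a second time; whether that is desirable is up to the reader.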


Either they're being used as a one-off search and you can't  
subscribe to them (in which case there is no difference between  
incremental and non-incremental), or they're being updated with new  
results over time (like a filtered aggregate feed) in which case I  
would think they have to be incremental.

Given the above scenario, why wouldn't you be able to subscribe to them?

I'm proposing previous/next linking from chunk to chunk inside the
same snapshot and adding a new link relation (or set of link
relations) for linking from snapshot to snapshot.

Do you now see what I'm talking about?


I understand what you're talking about, but I just don't see the  
need. I would have expected a non-incremental feed to be a single  
Atom document.

In the case of something like a top 10 feed, I'd imagine it would
be.  But a search results feed like what's described above may not be.


My reason for wanting paging is so that a user doesn't need to  
fetch data that he already has - this can never be a problem with a  
non-incremental feed because it doesn't grow.

I'm not sure I understand--it's not as if a non-incremental feed were
simply a static document.  They're resources whose contents are  
replaced wholesale (with the things that were in the old set possibly  
still being in the new set) rather than having their old contents  
augmented when new things are added.




Re: Profile links

2005-10-24 Thread Antone Roundy


On Oct 23, 2005, at 6:45 PM, James Holderness wrote:

James M Snell wrote:
1. Can a profile element appear in an atom:feed/atom:source?  If  
so, what does it mean? I think it should with the caveat that the  
profile attribute should only impact the feed and should not  
reflect on the individual entries within that feed.


I can't see any particular use for atom:source myself, but I would  
definitely want profile support at the feed level. As an aggregator
I want to be able to display a custom view for a particular feed  
based on what it contains (e.g. slideshow view if it's a flickr  
feed - all images). It would be difficult to do something like that  
with only entry level profiles.


I don't think it's possible to allow something at the feed level, but  
disallow it in atom:source (the Atom format spec could have done  
that, but I don't think an extension can add such restrictions).


What does it mean in atom:source?  That the feed that the entry came  
from conformed to the profile.


What will consuming applications do with profile elements in  
atom:source?  That's entirely up to the application developer.  Maybe  
nothing--maybe they'll ignore profiles that don't apply to the entire  
feed.  Or maybe they'll come up with something useful.




Re: Sponsored Links and other link extensions

2005-10-24 Thread Antone Roundy


On Oct 24, 2005, at 5:18 AM, James Holderness wrote:

Eric Scheid wrote:
The challenge with using alternate to point to files of different
types is that why would someone do (a) when they can already do (b)
without the help of a new extension

(a)
<link rel="enclosure" type="audio/mpeg"
      href="http://example.com/file.mp3">
  <x:alternate type="application/ogg"
               href="http://example2.com/file.ogg" />
</link>

(b)
<link rel="enclosure" type="audio/mpeg"
      href="http://example.com/file.mp3" />
<link rel="enclosure" type="application/ogg"
      href="http://example2.com/file.ogg" />



With (a), we know the .mp3 and the .ogg are simply different formats
of the same thing. With (b) we don't know either way.


I like (a) in concept because, as you say, it enables you to tell  
when two links are the same so if you're auto-downloading you don't  
need them both. However, I do think James is right in thinking that  
many people will just use (b) because it's already there.


I don't see the harm in allowing (a) though. If a feed producer  
uses (a) and an end-user has auto-downloading enabled for that  
feed, they both benefit from less wasted bandwidth. The only  
downside would be that aggregators that aren't aware of this  
extension would fail to see the alternate enclosures. Is that so  
bad though? It's a trade-off the feed producer has to make - I'm  
not sure we should be making that decision for them.


Here's the middle path:

(c)
<link rel="enclosure" type="audio/mpeg"
      href="http://example.com/file.mp3" x:link-set="a" />
<link rel="enclosure" type="application/ogg"
      href="http://example2.com/file.ogg" x:link-set="a" />


This won't save you from bandwidth waste by aggregators that don't  
support the extension, but it also won't prevent users of those  
aggregators from getting the data in a format they can use.  That  
said, this is not my preferred method.  I'd rather protect bandwidth  
and the user's hard drive space--all the more important because  
enclosures are often quite large.
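
For what it's worth, an aggregator that does understand option (c) would need something like the following to avoid downloading the same content twice. This is a sketch only: the namespace URI bound to the x: prefix is hypothetical (none was given in the thread), as is the function name.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
X_NS = "http://example.org/ns/x"        # hypothetical namespace for the x: prefix
LINK_SET = "{%s}link-set" % X_NS


def enclosures_to_download(entry, supported_types):
    """Pick at most one supported enclosure per x:link-set; keep ungrouped ones."""
    chosen = []
    sets_done = set()
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") != "enclosure":
            continue
        if link.get("type") not in supported_types:
            continue
        group = link.get(LINK_SET)
        if group is not None:
            if group in sets_done:
                continue  # already have one format from this set
            sets_done.add(group)
        chosen.append(link.get("href"))
    return chosen
```

An aggregator unaware of the extension simply sees two ordinary enclosures, which is the trade-off described above.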


Here's a final option--is it legal?  Is it better or worse than (a)  
in any ways?


(d)
<link rel="enclosure" type="audio/mpeg"
      href="http://example.com/file.mp3">
  <link rel="alternate" type="application/ogg"
        href="http://example2.com/file.ogg" />
</link>

Better: Doesn't require processing of a new namespace or element-- 
just a new way of using the data that one gets out of an existing  
element.


I prefer d, a, c and then b.



Re: New Link Relations -- Ready to go?

2005-10-24 Thread Antone Roundy


On Oct 24, 2005, at 11:16 AM, James Holderness wrote:
A more sensible approach would be a single feed document containing  
the top N results (where N is manageable in size). You could  
subscribe to that as a non-incremental feed and you would know at  
any point in time which were the top 10 results. There is no real  
need for paging other than as a form of snapshot history (i.e. what  
were the top 10 results last week).

That is certainly a good approach--allowing the number of results to
be determined dynamically by something in the URL, for example.   
However, it could be useful to limit the chunk size and allow paging  
for people who want more.  For example, you might allow a maximum of  
50 results per chunk, and then support ETags.  That way, if somebody  
wants to monitor the top 250, they can send 5 requests, and if most  
of the time there are no changes, they'll get a lot of 304s, but if  
occasionally something changes in the last chunk of 50 for example,  
they're only downloading 50 results each time something changes.   
There are of course other approaches, like support for just sending  
the diffs.  But that would probably be more difficult for most people to
implement, and may be less likely to be supported by a wide variety  
of clients.
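
The chunk-plus-ETag approach could look roughly like this on the client side. It is a sketch under assumptions: the start and count query parameters are invented for the example (a real service would define its own), and the request is only built here, not sent.

```python
import urllib.request


def chunk_urls(base_url, total, chunk_size=50):
    """URLs for each chunk of results; start/count parameters are hypothetical."""
    return [
        f"{base_url}?start={start}&count={chunk_size}"
        for start in range(0, total, chunk_size)
    ]


def conditional_request(chunk_url, etag=None):
    """Build a GET for one chunk that lets the server answer 304 Not Modified."""
    request = urllib.request.Request(chunk_url)
    if etag:
        # Send back the ETag stored from the previous response for this chunk.
        request.add_header("If-None-Match", etag)
    return request
```

Monitoring the top 250 then means five conditional requests, and only the chunks whose contents actually changed come back with a body.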


Another reason for wanting to limit the number of results per query  
(and support paging for those who want more) is to avoid bandwidth  
waste if someone accidentally adds an extra digit to the desired
number of results; or tries to waste your system resources by  
requesting huge result sets (but dropping the connection before using  
up their own bandwidth actually receiving the whole result set); or  
has a client that doesn't support paging or diffs or ETags or  
anything, and wants a huge result set (and you don't want to  
accommodate them since it would use so much bandwidth), etc.


Once again, I have to ask the same question I asked Thomas: do you  
have a problem with Mark's next/prev proposal as it stands, or are  
you just arguing with me because you think I'm wrong? If the  
latter, feel free to just ignore me. We can agree to disagree.  
Unless we're discussing a particular proposal I don't see the point.

I have a problem with not having link relations specific to paging
through a feed's current state.  I'm fine with having general chain
navigation link relations, but hope that we'll get something specific
to paging and that people will use it instead of the general link
relations.  I've said my piece on that and have given up swimming
against the tide, but am still willing to discuss specific related  
issues.




Re: Sponsored Links and other link extensions

2005-10-24 Thread Antone Roundy


On Oct 24, 2005, at 1:48 PM, A. Pagaltzis wrote:

I have a completely different proposition.

(e)
<link
    rel="enclosure" type="audio/mpeg"
    href="http://example.com/file.mp3"
    encl:mirrors="http://www2.example.com/file.mp3
                  http://www3.example.com/file.mp3"
    xml:id="x-file"
/>
<link
    rel="alternative-enclosure" type="application/ogg"
    href="http://example2.com/file.ogg"
    encl:alternative-to="x-file"
/>

Since bit-for-bit identical files all have the exact same
attributes, there is absolutely no reason to have an entire tag
dedicated to each. In addition, making mirror URLs second-class
citizens in this way provides an intuitive hint at the
bit-for-bit identity semantics.

Interesting.  Filling an attribute with a list of URIs doesn't really
appeal to me though.  How about this:


<link rel="enclosure" type="audio/mpeg"
      href="http://example.com/file.mp3" xml:id="x-file">
  <altlink:mirror href="http://www2.example.com/file.mp3" />
  <altlink:mirror href="http://www3.example.com/file.mp3" />
</link>


Specifying alternative formats with a distinct link relationship
prevents bandwidth and diskspace drain from oblivious clients.
Sounds good, but you may have noticed above that I used a prefix not  
specific to enclosures--there's no reason to tie this all to one  
particular type of link (nor to make it look as if it were tied to  
one specific link type).  So the other link might, for example, be:


<link rel="alternative-link" type="application/ogg"
href="http://example2.com/file.ogg" altlink:primary="x-file" />


Although "alternative-link" doesn't tell you what kind of link this  
is, since you're going to have to tie it back to the primary link to  
decide what to do with it anyway, it really shouldn't matter.  Note  
that I changed "alternative-to" to "primary" just because it's  
shorter and one word.
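
A minimal sketch of how a client might read the nested-mirror form and tie an alternative-format link back to its primary, using Python's standard xml.etree.ElementTree. The altlink namespace URI below is made up for illustration (no namespace was assigned in this thread), and the element and attribute names simply follow the examples above.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ALTLINK_NS = "http://example.com/ns/altlink"  # hypothetical namespace URI
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

ENTRY = f"""
<entry xmlns="{ATOM_NS}" xmlns:altlink="{ALTLINK_NS}">
  <link rel="enclosure" type="audio/mpeg"
        href="http://example.com/file.mp3" xml:id="x-file">
    <altlink:mirror href="http://www2.example.com/file.mp3" />
    <altlink:mirror href="http://www3.example.com/file.mp3" />
  </link>
  <link rel="alternative-link" type="application/ogg"
        href="http://example2.com/file.ogg" altlink:primary="x-file" />
</entry>
"""


def collect_sources(entry_xml):
    """Map each primary link's xml:id to its mirrors and alternative formats."""
    entry = ET.fromstring(entry_xml)
    links = entry.findall(f"{{{ATOM_NS}}}link")
    sources = {}
    # First pass: primary links (those carrying an xml:id) and their mirrors.
    for link in links:
        link_id = link.get(XML_ID)
        if link_id is None:
            continue
        mirrors = [m.get("href") for m in link.findall(f"{{{ALTLINK_NS}}}mirror")]
        sources[link_id] = {
            "href": link.get("href"),
            "mirrors": mirrors,
            "alternatives": [],
        }
    # Second pass: attach alternative-format links to the primary they name.
    for link in links:
        primary = link.get(f"{{{ALTLINK_NS}}}primary")
        if primary in sources:
            sources[primary]["alternatives"].append(
                (link.get("type"), link.get("href"))
            )
    return sources


SOURCES = collect_sources(ENTRY)
```

Nothing in the lookup depends on the primary link being an enclosure, which is the point of using a prefix that isn't specific to enclosures.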




Re: Sponsored Links and other link extensions

2005-10-24 Thread Antone Roundy


On Oct 24, 2005, at 2:59 PM, A. Pagaltzis wrote:

* Antone Roundy [EMAIL PROTECTED] [2005-10-24 22:35]:

Interesting. Filling an attribute with a list of URIs doesn't
really appeal to me though. How about this:

<link rel="enclosure" type="audio/mpeg"
href="http://example.com/file.mp3" xml:id="x-file">
<altlink:mirror href="http://www2.example.com/file.mp3" />
<altlink:mirror href="http://www3.example.com/file.mp3" />
</link>


It’s a lot more verbose and you have to fiddle with nesting.

What do you get in return? “It looks more XMLish”?
1) Easier parsing, as James said, since your XML parsing library is  
going to give you the data with the URIs already split apart.


2) You can break lines between elements, but you can't inside an  
attribute, so it's better for display for humans.


I think XMLishness leans this direction for good reason.


Sounds good, but you may have noticed above that I used a
prefix not specific to enclosures--there's no reason to tie
this all to one particular type of link (nor to make it look
as if it were tied to one specific link type). So the other
link might, for example, be:


I don’t know if striving for generality in this fashion without
a practical need is worthwhile. It smells of architecture
astronautics for a reason I can’t particularly pinpoint. So maybe
my instinct is wrong.
The way I see it, striving for specificity without a practical need  
isn't worthwhile.  Unless generalizing risks leading to some sort of  
problem, why tie the mechanism to enclosures?  I see no potential problems.


What if someday somebody does come up with a non-enclosure use for  
this (which hardly seems far-fetched to me--enclosures aren't the  
only things that get mirrored or exist in multiple formats)?  They'll  
have to define a new mechanism for it which is either going to be  
identical except for element names, or they're going to invent  
another way to do the same thing.  Either way, the pain of supporting  
both is completely unnecessary unless there's potential for  
generality causing problems.




Re: Sponsored Links and other link extensions

2005-10-24 Thread Antone Roundy


On Oct 24, 2005, at 9:59 PM, A. Pagaltzis wrote:

* Antone Roundy [EMAIL PROTECTED] [2005-10-25 00:35]:

2) You can break lines between elements, but you can't inside
an attribute, so it's better for display for humans.

That’s not what the XML spec says.


Doh!  Who knows where I got that idea.  I still prefer to have each  
piece of data in its own place.



What if someday somebody does come up with a non-enclosure use
for this (which hardly seems far-fetched to me--enclosures
aren't the only things that get mirrored or exist in multiple
formats)? They'll have to define a new mechanism for it which
is either going to be identical except for element names, or
they're going to invent another way to do the same thing.
Either way, the pain of supporting both is completely
unnecessary unless there's potential for generality causing
problems.


If it isn’t obvious from the start what it means that there’s
an alternative-link for a via link or a previous or next link,
then clients will have to support each of these use cases
separately. So on the implementor’s end, there’s no discernible
difference between the pain of supporting either approach.


I'm not sure I understand what you're saying.  Are you saying that  
one might do this if they want an alternate of a "next" link?


<link rel="next" xml:id="foo" ... />
<link rel="alternate-enclosure" x:alternate-of="foo" ... />

If that's what you mean, then sure, the code for that would be the  
same as for:


<link rel="next" xml:id="foo" ... />
<link rel="alternate-link" x:alternate-of="foo" ... />

...but it would sure look odd.  I see no advantage to naming these  
things in terms of enclosures.




Re: What is this entry about?

2005-10-21 Thread Antone Roundy


On Oct 21, 2005, at 5:47 PM, James M Snell wrote:

Err, are you forgetting atom:category? Doesn’t that satisfy all
your wants *and* more? It has a URI, a term and a human-readable
label.

Regards,


I dunno, that's why I was asking ;-)

atom:category works well for categorizing entries, but does it  
really tell us what the entry is about?  For instance, suppose that  
I want to indicate that an entry is about http://www.ibm.com and  
file that in a category called technology?  The categorization of  
the entry is different than the subject of the entry.. tho both are  
definitely related.


Why don't we define link/@rel="about" for pointing to a specific  
internet resource that an entry is about (a little more specific than  
the general case of rel="related")?  I know we discussed this before  
and in the chaos of trying to hammer the spec out, didn't do it, but  
I still think it's a good idea.




Re: New Link Relations -- Ready to go?

2005-10-21 Thread Antone Roundy


On Oct 21, 2005, at 7:19 PM, James Holderness wrote:
What's the difference between a search feed and a non-incremental  
feed? Aren't search feeds one facet of non-incremental feeds?


Not necessarily, no. A search feed could quite easily be  
implemented as an incremental feed. This is the most sensible  
approach since it would allow the feed to be viewed in all existing  
aggregators without requiring special knowledge of  
non-incremental feeds.
If your goal is to work as well as possible with today's client  
software, then bending your data to fit their model is the most  
sensible approach, but that's not always the goal.


The initial feed document consists of all known results at the time  
the search is initiated. As new results are discovered over time,  
the feed can be updated by adding new entries to the top of the  
feed in much the same way that new entries would be added to the  
top of a blogging feed. In fact, if you do a search with something  
like feedster, this is exactly the sort of feed you will get back.
If creation time is relevant to the data being searched, then this  
makes sense.  But what if I want to subscribe to the top 10 Google  
results for some keywords I'm trying to optimize my site for  
(ignoring the fact that Google doesn't return search results in any  
feed format right now)?  Or what about alternative sort orders which  
are available on sites like Feedster, Google News, etc.? (You can  
sort by relevance rather than date--the date still has some weight,  
but the results aren't strictly in date order). How about Amazon.com  
affiliates who want to use an RSS parser to display affiliate links  
to best-seller search results?  There are a lot of search use  
cases that don't fit the incremental model.


All that said, search results are often a bit different than top 10  
lists and the like.  With search results, you often don't want to  
view the contents of the feed in order all at once--the first time  
you do, but after that, you may just want to see new things as they  
make it up into the top positions.  Today's clients can handle that  
just fine, unless you want to monitor more than just the first page  
of results.




Re: General/Specific [was: Feed History / Protocol overlap]

2005-10-19 Thread Antone Roundy


On Oct 19, 2005, at 11:12 AM, Mark Nottingham wrote:


next
next-chunk
next-page
next-archive
next-entries
are all workable for me.


...


Perhaps people could +1/-1 the following options:

* Reconstructing a feed should use:
   a) a specific relation, e.g., prev-archive
   b) a generic relation, e.g., previous


I'd prefer "prev-page".  "prev-archive" doesn't sound right for  
paging through search results.  Also, "prev-archive" or  
"next-archive" (whichever ends up going forward in time) doesn't quite  
work if the final step forward points to the subscription feed URI  
(which isn't an archive).  That's a small matter since it's only that  
last step, but in search results type cases, "archive" would definitely  
be odd.


Just a little follow up on what I wrote last night about generic vs.  
specific link relations: "related" is a generic term that is likely  
to be a bit of a catch-all for links that don't have a specific  
relation defined for them.  "alternate" is a specific relation  
created for one of the major historical use cases for rss/link.  The  
proposed but not accepted "about" would have been the specific  
relation for the other major use case that rss/link was commonly used  
for.


"related" could conceivably handle the hypothetical use case of  
traversing a chain of different feeds--you'd just have to remember  
which "related" link to a feed document you had already traversed to  
know which one to follow next to continue down the chain.  It  
wouldn't be quite as nice for such an application as having a "next"  
and "prev" for that use, but I'd rather see it done that way till  
it's clear that such a thing is even needed than see intrafeed paging  
links used for interfeed navigation.




Re: Feed History / Protocol overlap

2005-10-18 Thread Antone Roundy


On Oct 18, 2005, at 6:10 PM, Robert Sayre wrote:

On 10/18/05, Antone Roundy [EMAIL PROTECTED] wrote:

-3 to being that generic.


That's a very large negative number. Can you explain how your version
will let me write software I otherwise couldn't?


Anything larger than -2 is bogomips--the point I was trying to make  
is that I think the idea of using the same link relation for paging  
within a feed and for navigating between feeds is absolutely absurd-- 
completely lacking in foresight--almost looks like an attempt to  
create future problems.  People were complaining that trying to  
avoid problems with the hypothetical top 100 DVDs scenario (not  
trying to solve it--just trying to avoid problems if it comes about)  
was wandering too far off into hypotheticals, but now people want to  
make sure they can use the next relation for the arguably even more  
hypothetical idea of building a chain of otherwise independent  
feeds?  This boggles my mind.


Here's what my version will let you do that you won't be able to do  
if the definitions of these links allows them to be used for  
interfeed navigation--it will enable you to do paging within a feed  
that is also part of a chain of feeds (because anyone wanting to  
create a chain of feeds will have to come up with a non-conflicting  
link relation to do it).  It will also enable you to know that  
(unless somebody's breaking the spec) you are navigating through a  
single feed when you follow next and prev links around--that you are  
not jumping from feed to feed.  Your software will be able to follow  
those links with a much greater degree of confidence that it won't  
result in your users complaining "what the hell are you doing showing  
me entries from a feed I didn't subscribe to?"  It will enable your  
application to take more actions automatically without having to ask  
for confirmation from the user every time you follow another next or  
prev link to avoid such complaints.
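
To make that concrete, here is a rough sketch of the defensive checks a client ends up wanting when it can't be sure that "next" stays inside one feed: loop detection, a page limit, and refusing to cross to a different authority without asking. The feed documents, URIs, and the make_page/walk helpers are invented for illustration; only the Atom namespace and the standard-library calls are real.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

ATOM = "{http://www.w3.org/2005/Atom}"


def make_page(entry_ids, next_uri=None):
    """Build a tiny Atom feed document, optionally with a rel="next" link."""
    link = f'<link rel="next" href="{next_uri}"/>' if next_uri else ""
    entries = "".join(f"<entry><id>{i}</id></entry>" for i in entry_ids)
    return f'<feed xmlns="http://www.w3.org/2005/Atom">{link}{entries}</feed>'


# Hypothetical in-memory stand-in for the network: URI -> feed document.
PAGES = {
    "http://example.org/feed": make_page(["e5", "e4"], "http://example.org/feed?page=2"),
    "http://example.org/feed?page=2": make_page(["e3", "e2"], "http://example.org/feed?page=3"),
    # The last page points at somebody else's feed.
    "http://example.org/feed?page=3": make_page(["e1"], "http://other.example.net/feed"),
    "http://other.example.net/feed": make_page(["x1"]),
}


def walk(start_uri, fetch=PAGES.__getitem__, max_pages=10):
    """Follow rel="next" links, staying within the starting authority."""
    authority = urlsplit(start_uri).netloc
    seen, entry_ids, uri = set(), [], start_uri
    while uri and uri not in seen and len(seen) < max_pages:
        if urlsplit(uri).netloc != authority:
            break  # leaving the subscribed feed's authority: stop (or ask the user)
        seen.add(uri)
        feed = ET.fromstring(fetch(uri))
        entry_ids += [e.findtext(f"{ATOM}id") for e in feed.findall(f"{ATOM}entry")]
        nxt = feed.find(f"{ATOM}link[@rel='next']")
        uri = nxt.get("href") if nxt is not None else None
    return entry_ids


WALKED = walk("http://example.org/feed")
```

With a relation defined strictly for paging within a feed, the authority check becomes a sanity check against spec violations rather than something the client has to rely on for normal operation.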




Re: Feed History / Protocol overlap

2005-10-18 Thread Antone Roundy


Here's what this discussion makes me think of--RSS has a link  
element.  That link was very generic, and has been variously used to  
link to what Atom calls link/@rel="alternate" and  
link/@rel="related", and perhaps even other things.  Once we'd gained a  
little experience and discovered that the imprecision of the meaning  
of the element was limiting uses we wanted to make of feeds, we  
created more specific types of links.  Hopefully, we were specific  
enough this time that we won't run into significant use cases that  
we've rendered impossible, but who knows.


Now we're defining a method of navigating through a chain of linked  
documents.  We know of two specific use cases that we're sure we want  
to be able to do: paging through things like search results, and  
catching up on incremental feeds (or reconstructing the entire state  
of the feed, which is an extension of catching up).  It would appear  
that the same link relation can be used to do both of those things  
without the fear of conflict, because they operate within feeds that  
have a basic difference in nature, so they're unlikely to both be  
needed within one feed.  Also, from a certain point of view, they are  
really the same thing--a way to navigate through the current state of  
the feed.  The fact that incremental feeds don't have old states that  
have been discarded and replaced the way non-incremental feeds do  
(their former state gets augmented rather than being replaced)  
doesn't make a difference with respect to the issue of navigating  
through their current state.


So why don't we create a mechanism to do those two things (that are  
really one thing), and NOT make it generic enough to encompass other  
things that we might want to do someday, which might lead to the same  
sort of limitation that RSS has by only having one generic link  
element?  Sure, we COULD do all of our interdocument navigation using  
next and prev until someday when we decide that we need something  
more specific for some of the navigation use cases.  But then we'll  
be doing some of the same things multiple ways--some people sticking  
with next and prev, and some using whatever new methods or link  
relations are invented, and nobody quite sure what next and prev  
mean in any particular feed.  Why not wait till we've really figured  
out what other ways we might want to navigate between documents, and  
then devise a new method for doing it?


If we're going to create some generic link relations for people to  
experiment with, let's create something that's explicitly for doing  
experimental things with so that the link relations we want to do  
more specific things with aren't rendered less useful by the  
experimentation.  Register "x-next" and "x-prev" or something for  
that, or register "next-page" and "prev-page" for the things we know  
we want to do.  Or don't register any such thing--just don't promote  
use of the link relations we define for (reasonably) well  
understood use cases to do experimental things.


Well, I've spoken my mind plenty on this issue, so unless somebody  
brings up an issue that my opinion on couldn't be understood from  
what I've written already, I think I'll leave it at that.  If we go  
with a highly-generic definition and it causes trouble down the road,  
I'll have some big ASCII art letters ready to say "I told you so."  
If not, then oops, I guess I was wrong.




Re: Feed History -04

2005-10-17 Thread Antone Roundy


On Oct 17, 2005, at 2:20 AM, Eric Scheid wrote:

On 17/10/05 5:09 PM, James Holderness [EMAIL PROTECTED] wrote:
1. Which relationship,  next or prev, is used to specify a link  
backwards in
time to an older archive. Mark Nottingham's Feed History proposal  
used prev.

Mark Pilgrim's XML.com article used next.
I'd prefer that our use of 'prev' and 'next' be consistent with  
other uses
elsewhere, where 'next' traverses from the current position to the  
one that

*follows*, whether in time or logical order. Consider the use of
'first/next/prev/last' with chapters or sections rendered in HTML.
...so do you follow forward through time or backward?  Is the  
starting current position now or the beginning of time?  
Especially if we're talking about history, following backward makes  
as much sense as following forward.


I prefer next to go back in time (if temporally ordered--from the  
most current chunk to the next most current chunk) or to less  
significant pages (in things like search results).  But I'll probably  
have to stop and think what next means in temporally ordered feeds  
from time to time since it'd be the reverse of temporal order.


2. Are next and prev both needed in the spec if we only require  
one of them

to reconstruct the full history?
Knowing that the most recently published archive won't likely  
remain the
most recently published archive, there will be use cases where it's  
better
to reconstruct the full history by starting at the one end which is  
fixed.

Not much sense starting at the other end which is constantly shifting.
Is this only going to be used to reconstruct full history?  What  
about just reconstructing the last 3 months (in which case you'd want  
a link from closer to the live end to close to the fixed end), or  
reading from the beginning before deciding whether to continue  
reading what comes later (in which case you'd want a link from closer  
to the fixed end to closer to the live end).



3. Are the first/last relationships needed?
See (2) above for 'first'. Meanwhile 'last' could be followed by a  
user to
jump ahead to the end of the set of archives to see if the butler  
did it.

Who said 'first/next/prev/last' would only be used by machines?
As mentioned above, there may be cases where you'd prefer to start at  
either the fixed or live end, so as long as complete feed  
reconstruction isn't the only goal, I'd say yes.


But what's first?  It'd be the top results in a search feed, but  
would it be the start of time or the start from the present (before  
possibly traveling backward through time) in a temporally ordered  
feed?  Making it the start of time would prevent it from matching up  
well with how significance ordered feeds match up (ie. does start  
point to the thing you'd most likely want to see if you subscribed to  
the feed?)  If we're not careful, we'll be traversing out of first  
through prev and last through next!



4. Is the order of the entries in a feed relevant to this proposal?

not to this proposal.
If you mean not just the order within each chunk of the feed, but the  
order of the chunks, then not central, but certainly related.  Two  
cases come to mind:


1) A chain of temporally ordered chunks in the history of a feed  
where new entries are tacked onto the end.
2) Search results, where the order of everything all along the entire  
chain shifts around all the time.


If you're not going to reconstruct the whole thing, then your  
decision function for when to stop may have to be different depending  
on how things are ordered.
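
A small sketch of that idea: the traversal loop stays the same and only the stop function changes with the ordering. The page dictionaries, the "next" key, and the helper names are all made up for illustration, and the temporal case assumes each chunk lists newer entries first, which nothing in the format guarantees.

```python
from datetime import date


def collect(pages, start, should_stop, link="next"):
    """Follow `link` from `start`, gathering entries until should_stop fires."""
    gathered, uri, visited = [], start, set()
    while uri is not None and uri not in visited:
        visited.add(uri)
        doc = pages[uri]
        for entry in doc["entries"]:
            # should_stop sees the entry and the 1-based number of the current page.
            if should_stop(entry, len(visited)):
                return gathered
            gathered.append(entry)
        uri = doc.get(link)
    return gathered


def older_than(cutoff):
    """Stop rule for temporally ordered chains: stop at the first old entry."""
    return lambda entry, page_no: entry["updated"] < cutoff


def beyond_page(limit):
    """Stop rule for significance-ordered chains: stop after `limit` pages."""
    return lambda entry, page_no: page_no > limit


# Case 1: temporally ordered history (hypothetical data).
TEMPORAL = {
    "p1": {"entries": [{"id": "e4", "updated": date(2005, 10, 17)},
                       {"id": "e3", "updated": date(2005, 9, 1)}], "next": "p2"},
    "p2": {"entries": [{"id": "e2", "updated": date(2005, 6, 1)},
                       {"id": "e1", "updated": date(2005, 1, 1)}]},
}

# Case 2: search results, where dates say nothing about position (hypothetical data).
SEARCH = {
    "s1": {"entries": [{"id": "r1"}, {"id": "r2"}], "next": "s2"},
    "s2": {"entries": [{"id": "r3"}], "next": "s3"},
    "s3": {"entries": [{"id": "r4"}]},
}

RECENT = [e["id"] for e in collect(TEMPORAL, "p1", older_than(date(2005, 7, 17)))]
TOP = [e["id"] for e in collect(SEARCH, "s1", beyond_page(2))]
```

A date cutoff would be meaningless for the search chain, and a page limit is a blunt instrument for the temporal one, so the client has to know which kind of ordering it is dealing with.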


BTW, case 2 destroys the idea of a fixed end and a live end.

Having a means to indicate what the ordering is might make it easier  
to make the distinction between next and prev more intuitive.   
I'm not sure how else we're going to reconcile terminology for  
significance and temporally ordered feeds.


5. Is the issue of whether a feed is incremental or not (the  
fh:incremental

element) relevant to this proposal?

non-incremental feeds wouldn't be paged, by definition, would they?
This week's top ten on the first page, last week's ten on the second  
page...


Since this proposal is defining a paging mechanism, I think that what  
each page represents is relevant.  Is it an earlier part of the feed  
or an earlier state of the feed?


6. What to name the link relation that points to the active feed  
document?

subscribe, subscription, self, something else?

'subscribe'
I just noticed something about the definition of self in the format  
spec.  In one place it says:


   o  atom:feed elements SHOULD contain one atom:link element with a  
rel

  attribute value of self.  This is the preferred URI for
  retrieving Atom Feed Documents representing this Atom feed.

Does that mean that it's the preferred <option>subscription</option>  
URI, or the preferred place to retrieve <option>this chunk</option>  
of the feed history?  The format spec didn't define paging, so it  
didn't 

Re: Feed History -04

2005-10-17 Thread Antone Roundy


On Oct 17, 2005, at 10:04 AM, Antone Roundy wrote:

4. Is the order of the entries in a feed relevant to this proposal?

...
1) A chain of temporally ordered chunks in the history of a feed  
where new entries are tacked onto the end.
2) Search results, where the order of everything all along the  
entire chain shifts around all the time.


If you're not going to reconstruct the whole thing, then your  
decision function for when to stop may have to be different  
depending on how things are ordered.


BTW, case 2 destroys the idea of a fixed end and a live end.

Having a means to indicate what the ordering is might make it  
easier to make the distinction between next and prev more  
intuitive.  I'm not sure how else we're going to reconcile  
terminology for significance and temporally ordered feeds.


Okay, I've got another idea--switch to totally generic terminology, a  
la:


end-a: the URI of most significant, most current,  
prerequisite[1], etc. end of a sequence of documents, or a randomly  
selected end if there is no order.
end-b: the URI of the least significant, least current,  
or ...uh, postrequisite? end of a sequence of documents or  
otherwise the opposite end from end-a.
a-ward: the URI of the document next closest to end-a in the  
sequence.
b-ward: the URI of the document next closest to end-b in the  
sequence.


If you have neither end-a nor end-b, then you should use b-ward  
to traverse out of the subscription document (ie. the subscription  
document in that case is assumed to be end-a).


[1] if the sequence should be read first to last, for example, if  
it's a novel broken down into entries, end-a points to the place  
from which one should start.  Which end is end-a and which is  
end-b is somewhat subjective. For example, in a temporally ordered feed,  
is it most important to read what's most current, or to understand  
the origins of the present first before reading what's most current?



One more thing occurs to me--if this extension is going to be used to  
handle things like paging in search results, then it's not really  
feed history, it's paging.




Re: Feed History -04 -- is it history or paging or both?

2005-10-17 Thread Antone Roundy


If we're going to separate the concepts of history and paging,  
then the term history doesn't really apply to incremental feeds.   
In an incremental feed, all of the entries are part of the current  
state of the feed.  We don't go back through history to find the  
present--we go to different pages of the present.  In a  
non-incremental feed also, we may have multiple pages of current  
entries (eg. the top 100 DVDs in chunks of 10), or we may have just  
one.  We also may preserve historical data (eg. the top 10 songs last  
week, the week before, etc.)


So what we end up with might look like this:

Any feed, whether incremental or not, MAY contain something like this  
(names chosen somewhat arbitrarily, with an eye toward avoiding  
excess conceptual baggage):


page-a - the URI of one end of a chain of documents representing one  
state of a feed resource (eg. the current state of an incremental  
feed)--it doesn't really matter which end it is

page-b - the other end of the chain of documents
page++ - the next farther page from page-a
page-- - the next closer page to page-a

Neither page-a nor page-b is necessarily fixed--the entire  
contents of the chain may shuffle around, be added to, be deleted  
from, etc., in the case of something like search results.


A non-incremental feed MAY also contain something like this (history  
is temporal, so we can use temporally loaded terminology):


history-1 - a document containing a representation of one of the ends  
of or the entire temporally first historical state of the feed resource
history-n - a document containing a representation of one of the ends  
of or the entire temporally last (perhaps current and still changing)  
historical state of the feed resource
history++ - one of the ends or ... of the next more recent  
historical state... (moves toward history-n)
history-- - one of the ends ... of the next less recent historical  
state... (moves toward history-1)


If you want to catch up on an incremental feed to which you're  
subscribed, or want to get the last month of an incremental feed to  
which you are newly subscribed, you look for page++ or page-- and  
follow whichever one the subscription document (which can only have  
one, since it's one of the ends) contains till you've got everything  
you want.


If you start in the middle, you don't know which direction you're  
going...but since the ordering of the chain isn't defined, it's like  
the Cheshire cat says--it doesn't matter which direction you go if  
you don't know where you want to end up...or something like that.   
Perhaps convention could dictate that page-a be where the publisher  
subjectively thinks that a newcomer to the feed would be most likely  
to want to start reading.  It wouldn't always be correct, but so what?




Re: Are Generic Link Relations Always a Good Idea? [was: Feed History -04]

2005-10-17 Thread Antone Roundy


On Oct 17, 2005, at 5:17 PM, Mark Nottingham wrote:
They seem similar. But, what if you want to have more than one  
paging semantic applied to a single feed, and those uses of paging  
don't align? I.e., there's contention for prev/next?


If no one shares my concern, I'll drop it... as long as I get to  
say I told you so if/when this problem pops up :)

I share your concern.


On 17/10/2005, at 3:21 PM, Thomas Broyer wrote:

I don't think there are different concepts of paging.

Paging is navigation through subsets (chunks) of a complete set of  
entries.
Yeah, but what if you need what amounts to a multi-dimensional  
array.  The method of addressing each dimension has to be  
distinguishable from the others.


If the complete set represents all the entries ever published  
through an ever-changing feed document (what a feed currently is,  
you subscribe with an URI and the document you get when  
dereferencing the URI changes as a sliding-window upon a set of  
entries), then paging allows for feed state reconstruction.
In other terms, feed state reconstruction is a facet of paging, an  
application to non-incremental feeds.
Let's say you're doing a feed for the Billboard top 100 songs.  Each  
week, the entire contents of the feed are swapped out and replaced by  
a new top 100 (ie. it is a non-incremental feed).  And let's say you  
don't want to put all 100 in the same document, but you want to break  
it up into 4 documents with 25 entries each.  You now have two  
potential axes that people might want to traverse--from songs 1-25 to  
26-50 to 51-75 to 76-100, or from this week's 1-25 to last week's 1-25  
to two weeks ago's 1-25, etc.  You can't link in both directions with  
the same "next".


There are clearly two distinct concepts here--navigating through the  
chunks that make up the current state of the feed resource, and in a  
non-incremental feed, navigating through the historical states of the  
feed resource.
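
The two axes can be sketched as a grid of documents keyed by (week, chunk), with one relation per axis. The relation names "next-page" and "prev-state" are placeholders of my own, not anything registered or proposed; the point is only that each document needs two distinguishable outgoing links.

```python
def build_chart(weeks, chunks):
    """Return {(week, chunk): {relation: target}} for a non-incremental chart.

    Week 0 is the current state; higher week numbers are older states.
    """
    docs = {}
    for week in range(weeks):
        for chunk in range(chunks):
            links = {}
            if chunk + 1 < chunks:
                # Paging axis: the next chunk of the same state.
                links["next-page"] = (week, chunk + 1)  # hypothetical relation
            if week + 1 < weeks:
                # History axis: the same chunk of the previous state.
                links["prev-state"] = (week + 1, chunk)  # hypothetical relation
            docs[(week, chunk)] = links
    return docs


def follow(docs, start, rel):
    """Follow one relation from `start` until it runs out."""
    path, pos = [start], start
    while rel in docs[pos]:
        pos = docs[pos][rel]
        path.append(pos)
    return path


CHART = build_chart(weeks=3, chunks=4)
THIS_WEEK = follow(CHART, (0, 0), "next-page")
TOP_25_HISTORY = follow(CHART, (0, 0), "prev-state")
```

If both axes were collapsed into a single "next", the document at (0, 0) would have two candidates for it and a client would have no way to tell which traversal it was on.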




Re: New Link Relations? [was: Feed History -04]

2005-10-17 Thread Antone Roundy


On Oct 17, 2005, at 3:44 PM, Mark Nottingham wrote:

On 17/10/2005, at 12:31 PM, James M Snell wrote:
Debating how the entries are organized is fruitless.  The Atom  
spec already states that the order of elements in the feed has no  
significance; trying to get an extension to retrofit  
order-significance into the feed is going to fail... just as I  
discovered with my Feed Index extension proposal.
Here's what the spec says: "This specification assigns no  
significance to the order of atom:entry elements within the  
feed."  ...but there may be some.  ...but there's no action you can  
take based on it unless something else tells you what the  
significance is.  ...which, yes, is very difficult to specify.


For the purposes of this discussion, it doesn't matter what the order  
of atom:entry elements within a feed document is.  But the order of  
chunks of atom:entry elements within a linked series of feed  
documents may have significance, and in fact, unless you just want to  
reconstruct the complete feed state, working with a series of feed  
documents with no specific order would be fairly unwieldy.  Imagine  
paging through a feed of search results with no idea of whether you'd  
just jumped from the most to the least significant results, or to the  
second most significant results.  Imagine trying to catch up on a  
fast-moving incremental feed without having any idea whether a link  
would take you to the first entries ever added to a feed or the ones  
you just missed.


I do believe that a "last" link relation would be helpful for  
completeness
...and "last" certainly seems to imply SOME sort of ordering of  
chunks, even if we know nothing about the order of the entries in  
each chunk.


To each of the following, perhaps we could add something to indicate  
that these link relations are all used to page through the current  
state of a feed, and not to navigate among various states of a feed.   
The fact that most people wouldn't have a clue what that means  
without some discussion of incremental and non-incremental feeds may  
be an argument for having a spec document to provide more explanation  
(rather than embedding an identical explanation in each  
Description).  Example:


At any point in time, a feed may be represented by a series of Feed  
documents, each containing some of the entries that exist in the feed  
at that point in time.  In other words, a feed may contain more  
entries than exist in the Feed document that one retrieves when  
dereferencing the subscription URI, and there may be other documents  
containing representations of those additional entries.  The link  
relations defined in this specification are used to navigate between  
Feed documents containing pages or chunks of those entries which  
exist simultaneously within a feed.


Note that this specification does not address navigation between the  
current and previous states of a type of feed which does not  
simultaneously contain its current and past entries.  For example, a  
Top 100 Songs feed might at any point in time only contain entries  
for the top 100 songs for a single week, which entries may or may not  
be divided among a number of Feed documents.  The entries for the top  
100 songs from the previous week are not only no longer part of the  
Feed document or Feed documents representing the current state of the  
feed--they are no longer part of the feed at all.  Another  
specification may describe a method of navigating between the current  
and previous states of such a feed.  The link relations defined in  
this specification are only used to navigate between the various Feed  
documents representing any single state of such a feed.



 -  Attribute Value: prev
 -  Description: A stable URI that, when dereferenced, returns a  
feed document containing entries that sequentially precede those in  
the current document. Note that the exact nature of the ordering  
between the entries and documents containing them is not defined by  
this relation; i.e., this relation is only relative.

 -  Expected display characteristics: Undefined.
 -  Security considerations: Because automated agents may follow  
this link relation to construct a 'virtual' feed, care should be  
taken when it crosses administrative domains (e.g., the URI has a  
different authority than the current document).


 -  Attribute Value: next
 -  Description: A stable URI that, when dereferenced, returns a  
feed document containing entries that sequentially follow those in  
the current document. Note that the exact nature of the ordering  
between the entries and documents containing them is not defined by  
this relation; i.e., this relation is only relative.

 -  Expected display characteristics: Undefined.
 -  Security considerations: Because automated agents may follow  
this link relation to construct a 'virtual' feed, care should be  
taken when it crosses administrative domains (e.g., the URI has a  
different authority 

Re: New Link Relations? [was: Feed History -04]

2005-10-17 Thread Antone Roundy


On Oct 17, 2005, at 10:17 PM, James M Snell wrote:
When I think of next/prev I'm not thinking about any form of  
temporal semantic.  I'm thinking about nothing more than a linked  
list of feed documents.  If you want to add a temporal semantic  
into the picture, use a mechanism such as the Feed History  
incremental=true element.
I don't think I expressed the point I wanted to make quite clearly  
enough, so let me try again.


Chains of Feed documents are going to be ordered in some way, whether  
it's specified or not, whether they explicitly indicate it or not.   
For example, the chain of Feed documents representing an incremental  
feed is going to naturally be in temporal order.  You're not going to  
be tacking on new entries willy nilly to whichever of the documents  
in the chain fits your fancy at the moment.  You're going to create a  
new document when the one you were most recently adding entries to  
gets full, and then you're going to add entries there till that one  
is full, and so on.  There may be exceptions, but by and large,  
whether the temporal order is explicit or not, that's what's going to  
happen.


Chains of pages of search results feeds are going to naturally be  
ordered with the best matches on top.


The point I was trying to make was that you're not going to create  
all the documents without links between them and then randomly assign  
links between them in no specific order.  You're going to link  
between them in an order that makes sense within the context of how  
the feed was created.


I don't know how client applications are going to adapt to deal with  
the difference between incremental feeds and, for example, search  
results feeds, but I can't imagine that client software isn't going  
to rely on there being some sort of sense to the order of the Feed  
documents.


What I was trying to say further down with the example spec text I  
wrote was, let's state explicitly that this link relation does not  
have a temporal semantic, and if somebody wants a link relation with  
a temporal semantic, they should create another link/@rel value for it.


In other words...

In other words,

this does not imply a feed history thing...
...let's have this be a link for navigating among the pages of the  
current state of the feed (whether it be incremental or not--noting  
that some non-incremental feeds will only have one page, and won't  
need it).  The entries in the current state of the feed are not in  
any specific order (though we know that naturally they will be in  
some sort of order):

 <feed>
   ...
   <link rel="next" href="..." />
 </feed>


How does the following have anything to do with history?  In an  
incremental feed, all of the entries, whether part of the Feed  
document at the subscription end or not, are part of the present  
state of the feed--they don't just exist back in history.  History is  
for non-incremental feeds.  I'm saying let's not work on navigation  
through history right now, but let's recognize that unless we say not  
to, people might try to use the mechanism designed for paging through  
the current state of a feed to navigate through the history of a feed  
too, so let's say not to.  I understand (or at least suppose) that  
you don't think we need to say not to, because you don't see the harm  
in making the link relation more generic.  I disagree.  I think we're  
going to end up with a mess if we don't make it specifically for  
navigating the current state.

this does...
 <feed>
   ...
   <fh:incremental>true</fh:incremental>
   <link rel="next" href="..." />
 </feed>
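
To make the "paging through the current state" idea concrete, here is a minimal Python sketch of a client following rel="next" from the subscription document. It is only an illustration under assumptions: walk_pages, its fetch callable and the max_pages cap are hypothetical names, and the Atom namespace URI is taken from the final format rather than from this thread. It attaches no temporal meaning to the chain; it just collects entries page by page and flags links that cross authorities.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlsplit

ATOM = "{http://www.w3.org/2005/Atom}"


def walk_pages(start_url, fetch, max_pages=50):
    """Follow rel="next" links and return (entry ids, warnings).

    fetch is any callable that maps a URL to the text of a Feed document
    (hypothetical; a real client would do an HTTP GET here).
    """
    entry_ids, warnings, seen = [], [], set()
    url = start_url
    # Stop on a missing link, a loop in the chain, or the page cap.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        feed = ET.fromstring(fetch(url))
        for entry in feed.findall(ATOM + "entry"):
            entry_ids.append(entry.findtext(ATOM + "id"))
        next_url = None
        for link in feed.findall(ATOM + "link"):
            if link.get("rel") == "next" and link.get("href"):
                next_url = urljoin(url, link.get("href"))
                break
        # Note links whose authority differs from the current document's.
        if next_url and urlsplit(next_url).netloc != urlsplit(url).netloc:
            warnings.append(f"{url} -> {next_url} crosses authorities")
        url = next_url
    return entry_ids, warnings
```

Whether a client should merely warn or stop outright at a cross-authority link is a policy choice that this sketch leaves open.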




Re: Spec wording bug?

2005-10-14 Thread Antone Roundy


On Oct 14, 2005, at 5:43 AM, Danny Ayers wrote:

I believe "the language of the resource" for hreflang makes no sense -
it will be the *representations* that are associated with languages,
and "the" implies a single language - there may be more than one.

If content negotiation might be used to select from among different  
languages (i.e., if multiple representations are available from the  
same URI), then perhaps the hreflang attribute should be omitted.   
Were we to have allowed multiple languages to be specified in the  
same hreflang attribute to cover such cases, the wording would be  
incorrect, but since we didn't, I think it's correct as it is.




Re: Feed History -04

2005-10-14 Thread Antone Roundy


On Oct 14, 2005, at 11:13 AM, Mark Nottingham wrote:

On 14/10/2005, at 9:22 AM, Lindsley Brett-ABL001 wrote:
I have a suggestion that may work. The issue of defining what is  
prev and next with respect to a time ordered sequence seems to  
be a problem. How about defining the link relationships in terms  
of time - such as newer and older or something like that. That  
way, the collection returned should be either newer (more recent  
updated time) or older (earlier updated time) with respect to the  
current collection doc.


A feed isn't necessarily a time-ordered sequence. Even a feed  
reconstructed using fh:prev (or a similar mechanism) could have its  
constituent parts generated on the fly, e.g., in response to a  
search query.


The OpenSearch case mentioned by Thomas is what convinced me that  
terms related to temporal ordering aren't appropriate (what a pity,  
since newer and older are the perfect terms for time ordered  
sequences of feed documents!)


Previous and next suffer from the fact that they could easily be  
interpreted differently in different use cases. For example, for  
OpenSearch results pages, clearly prev points to the search  
results that come up on top and next to the lower results. But in  
a conventional syndication feed, next could easily be taken to mean  
either the next batch of entries as you track back towards the  
beginning of time from where you started (which is usually going to  
be the growing end of the feed), or a batch of entries containing  
the entries that were published next after the ones in this batch.   
I'd have to look at the document to remind myself of which next  
means, because either makes just as much sense to me.


Which brings me back to top, bottom, up and down.  In the  
OpenSearch case, it's clear which end the top results are going to  
be found.  In the syndication feed case, the convention is to put the  
most recent entries at the top.  If you think of a feed as a stack,  
new entries are stacked on top.  The fact that these terms are less  
generic and flexible than previous and next is both an advantage  
and a disadvantage.  I think the question is whether it's an  
advantage in a significant majority of cases or not.  What orderings  
would those terms not work well for?


Antone



Re: Feed History -04

2005-10-14 Thread Antone Roundy


On Oct 14, 2005, at 11:28 AM, Thomas Broyer wrote:

Mark Nottingham wrote:

How about:

<atom:link rel="subscription" href="..."/>

?

I always thought this was the role of @rel=self to give the URI  
you should subscribe to, though re-reading the -11 it deals with a  
resource equivalent to the containing element.
That's what some of us wanted it to be and thought it was intended to  
be.  The language that made it into the spec certainly falls short of  
expressing what was in PaceFeedLink, which is the proposal that added  
@rel=self [1].


1. Isn't a resource equivalent to the containing element the same  
as an alternate version of the resource described by the  
containing element?
That's how I would read that language knowing nothing of the history  
of that part of the spec.  I think some people intended equivalent  
to mean it may not be a different copy of the same bits, but  
whatever it is, it contains the same bits (or at least the same code  
points, if it happens to be transcoded).


2. If the answer to 1. is no, then what does a resource equivalent  
… mean? Is it really different than the URI you should subscribe  
to (at least if @type=application/atom+xml)?
I think what some people want that to mean is here's a place you  
could get the feed, but I'm not making any assertions regarding  
whether it's preferable to get it from there or somewhere else.


[1] http://www.imc.org/atom-syntax/mail-archive/msg15062.html



Re: more than one content element?

2005-10-13 Thread Antone Roundy


On Oct 13, 2005, at 12:06 PM, A. Pagaltzis wrote:

* John Panzer [EMAIL PROTECTED] [2005-10-13 19:40]:

Well, you can pass them around by reference with [EMAIL PROTECTED]
I think.

By the letter of the spec, but not by the spirit.


I just ran through the discussion of this very question on the  
mailing list[1], and though it looks like allowing composite types in  
remote content had pretty good support, that doesn't appear to have  
been translated into a Pace, and obviously, no language specifically  
allowing it got into the spec document.  Thus, it looks like the  
prohibition from section 4.1.3.1 stands, and that you're right that  
the only way you could do it without breaking the rules outright  
would be by ignoring the SHOULD (have content/@type when using  
content/@src), which would certainly be contrary to the spirit of  
the spec as it stands.


[1] http://www.imc.org/atom-syntax/mail-archive/msg15949.html



Re: Feed History -04

2005-10-13 Thread Antone Roundy


On Oct 13, 2005, at 7:58 PM, Eric Scheid wrote:

On 14/10/05 9:18 AM, James M Snell [EMAIL PROTECTED] wrote:



Excellent.  If this works out, there is an opportunity to merge the
paging behavior of Feed History, OpenSearch and APP collections into a
single set of paging link relations (next/previous/first/last).



'first' or 'start'?

Do we need to define what 'first' means though?  I recall a dissenting
opinion on the wiki that the 'first' entry could be at either end of the
list, which could surprise some.


Yeah, that's a good question.  Maybe calling them top and bottom  
would work better.  Considering that the convention is to put the  
newest entry at the top of a feed document, top might be more  
intuitively understandable as being the new end.  You might also  
rename next and previous (or is it previous and next?) to  
down and up.  There's SOME chance of that getting confused with  
hierarchical levels, but I could live with that.




Re: Straw Poll: age:expires vs. ...... plus a gazillion words

2005-10-10 Thread Antone Roundy


Gh!  Sorry about the mile long subject. Gotta be careful with  
that copy and paste!




Re: Straw Poll: age:expires vs. dcterms:valid (was Re: Unofficial last call on draft-snell-atompub-feed-expires-04.txt) On Oct 8, 2005, at 8:37 AM, James M Snell wrote: I wanted to indicate that a gi

2005-10-10 Thread Antone Roundy


Oops, sent this from the wrong address on Saturday. No wonder it  
didn't get through.


On Oct 8, 2005, at 8:37 AM, James M Snell wrote:

I wanted to indicate that a given entry must expire at Midnight on  
Dec 12, 2005 (GMT).

using age:expires:


[snip]

using dcterms:valid (http://web.resource.org/rss/1.0/modules/dcterms/#valid)


 <entry>
   <dcterms:valid>end:2005-12-12T00:00:00Z</dcterms:valid>
 </entry>

 Advantage:
   * Existing namespace, known element
 Disadvantage:
   * Value can be many different things. I've even seen cases in  
which the content of dcterms:valid is an XML structure.
 My chief problem with dcterms:valid (and with dublin core in  
general) is that the elements are very loosely defined.  The  
content can literally be anything folks want it to be and still be  
considered valid.  Unless we constrain the value space for this  
element when used in Atom, it *could* lead to a bunch of extra work  
for consumers to parse and process those dates. I prefer very  
crisply defined elements.  Then again, reusing an existing  
namespace is Goodness.




I think it would be going too far to say when using dcterms:valid in  
Atom, you must follow this profile, because we don't own dcterms,  
and doing so might limit people from doing valid things with it that  
don't follow that profile.  But I do think it would be reasonable to  
say when using dcterms:valid in Atom, it is recommended that you  
follow this profile--otherwise your data may be technically valid,  
but not widely understood, thus giving developers an excuse for not  
supporting data not formatted according to that profile.  If a use  
case that requires a different format becomes common, then developers  
can start supporting more formats at that point.


That said, my vote is for doing what I just said--advocate the use  
of dcterms:valid for this purpose, with the date formatted to match  
Atom's date construct profile.


BTW, you might choose language that leaves room for having both start  
and end dates for validity--for example, to enable Atom delivery of a  
coupon that's valid for a particular span of dates.
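
As a rough illustration of what such a profile could mean for a consumer, here is a small Python sketch. It assumes a value made of "start:" and/or "end:" parts separated by semicolons, each carrying a date in Atom's date construct form. Only the "end:" form appears in the example quoted above; the "start:" key, the semicolon separator and the function names are assumptions made for illustration, not anything defined by dcterms or by Atom.

```python
from datetime import datetime, timezone


def parse_atom_date(text):
    """Parse an RFC 3339 date-time of the kind the Atom date construct uses."""
    text = text.strip()
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError("a time zone offset is required")
    return parsed


def parse_valid(value):
    """Split an assumed 'start:...;end:...' value into a dict of datetimes."""
    period = {}
    for part in value.split(";"):
        if not part.strip():
            continue
        key, _, date_text = part.strip().partition(":")
        if key not in ("start", "end"):
            raise ValueError(f"unsupported key: {key!r}")
        period[key] = parse_atom_date(date_text)
    return period


def is_valid_at(value, moment):
    """True if moment falls inside the period (the end is treated as exclusive)."""
    period = parse_valid(value)
    if "start" in period and moment < period["start"]:
        return False
    if "end" in period and moment >= period["end"]:
        return False
    return True
```

A value that does not follow the assumed profile raises ValueError, which is the "technically valid, but not widely understood" case: the consumer can simply ignore it.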




Re: ACE - Atom Common Extensions Namespace

2005-10-03 Thread Antone Roundy


On Oct 2, 2005, at 11:15 PM, Mark Nottingham wrote:
I think this is a well-intentioned effort, but at the wrong end of  
the process. The market (i.e., users and implementors) should have  
a go at sorting out at what's common/prevalent enough to merit this  
sort of thing; having a co-ordinated namespace will lead to the  
problem of what to lump into it, how to version individual  
extensions within it, etc.


I have to agree with Mark.  Consider this scenario: an extension gets  
added to ACE. Someone makes an extension that does the same thing  
differently. The market prefers the non-ACE method and adopts it more  
widely than the ACE solution. Now not only do you have multiple  
namespaces to declare, but one of them has a bunch of elements that  
don't get used, yet implementors feel compelled to implement them  
because they're part of this special namespace.


Here's another scenario: an extension gets added to ACE, and another  
extension gets created that does the same thing better. Because the  
first has the ACE stamp of approval, the inferior method gets wide  
support, and the superior method dies.


Both scenarios suggest that the market should be given time to choose  
best practices rather than some group deciding which practices are  
going to get special status in advance. If a feed is going to carry  
elements from a bunch of different extensions, it's going to be a  
relatively heavy feed anyway. The overhead of including multiple  
namespace declarations isn't going to be that great.




Re: FYI: Updated Index draft

2005-09-22 Thread Antone Roundy


On Wednesday, September 21, 2005, at 11:43  PM, James M Snell wrote:

<feed xmlns:i="urn:ranking">
<i:domain>{domain}</i:domain>
I was thinking yesterday of suggesting that feed/id be used the way 
you're using i:domain. Which is better is probably a matter of whether 
ranking domains that span multiple feeds will be useful or not. In the 
movie ratings use case presented below, perhaps rather than a 
fivestar scheme and netflix and amazon domains, it might make more 
sense to do this:


<feed>
  <id>urn:my_reviews</id>
  <i:order scheme="urn:netflix.com/reviews" label="Netflix rating">descending</i:order>
  <i:order scheme="urn:amazon.com/reviews" label="Amazon rating">descending</i:order>

  <entry>
    <id>Movie A</id>
    <i:rank scheme="urn:netflix.com/reviews">3</i:rank>
    <i:rank scheme="urn:amazon.com/reviews">4</i:rank>
  </entry>
  <entry>
    <id>Movie B</id>
    <i:rank scheme="urn:netflix.com/reviews">2</i:rank>
    <i:rank scheme="urn:amazon.com/reviews">1</i:rank>
  </entry>
</feed>

Notes:
* The i:order element tells the user agent whether higher or lower 
numbers are considered better, higher priority, first, or 
whatever. In these cases, higher numbers are better, so would 
typically be shown first, so they're considered descending schemes.
* i:order/@label indicates a human readable label for the scheme, and 
could be optional.
* Since the urn:(netflix|amazon).com/reviews schemes are feed 
independent, it is not necessary to indicate a feed (or domain) in 
this case.
* For a feed-specific scheme, like natural order, the feed ID would be 
included like this (so that if these entries were aggregated, it would 
be clear that the i:order elements were relevant to the source feed, 
not the aggregate feed):


<feed>
  <id>urn:my_feed</id>
  <i:order scheme="urn:index">ascending</i:order>
  <entry>
    <id>urn:my_feed/a</id>
    <i:rank scheme="urn:index" feed="urn:my_feed">1</i:rank>
  </entry>
  <entry>
    <id>urn:my_feed/b</id>
    <i:rank scheme="urn:index" feed="urn:my_feed">2</i:rank>
  </entry>
</feed>

If sticking with i:domain, I'd recommend that you recommend that in 
cases where a ranking domain does not span multiple feeds, the feed/id 
value be used for the value of i:domain, and that in all cases, the 
same care be taken to (attempt to) ensure that i:domain's value is 
unique to what is intended to be a particular domain.
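
To show how a user agent might consume the i:order / i:rank shape suggested in this message, here is a minimal Python sketch. The markup is the movie-ratings example with explicit namespaces added; the "urn:ranking" namespace comes from the quoted draft, while the Atom namespace URI, the sorted_ids helper and the "ascending when no i:order is present" fallback are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom", "i": "urn:ranking"}

FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xmlns:i="urn:ranking">
  <id>urn:my_reviews</id>
  <i:order scheme="urn:netflix.com/reviews" label="Netflix rating">descending</i:order>
  <i:order scheme="urn:amazon.com/reviews" label="Amazon rating">descending</i:order>
  <entry><id>Movie A</id>
    <i:rank scheme="urn:netflix.com/reviews">3</i:rank>
    <i:rank scheme="urn:amazon.com/reviews">4</i:rank></entry>
  <entry><id>Movie B</id>
    <i:rank scheme="urn:netflix.com/reviews">2</i:rank>
    <i:rank scheme="urn:amazon.com/reviews">1</i:rank></entry>
</feed>"""


def sorted_ids(feed_xml, scheme):
    """Return entry ids ordered by their i:rank value for one scheme."""
    feed = ET.fromstring(feed_xml)
    order = "ascending"  # assumed fallback when no i:order names the scheme
    for el in feed.findall("i:order", NS):
        if el.get("scheme") == scheme:
            order = (el.text or "").strip()
    ranked = []
    for entry in feed.findall("atom:entry", NS):
        for rank in entry.findall("i:rank", NS):
            if rank.get("scheme") == scheme:
                entry_id = entry.findtext("atom:id", namespaces=NS)
                ranked.append((float(rank.text), entry_id))
    # "descending" means higher numbers are shown first.
    ranked.sort(reverse=(order == "descending"))
    return [entry_id for _, entry_id in ranked]
```

Entries with no i:rank for the requested scheme are simply left out of the result; a real reader would need to decide where to place them.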




Re: FYI: Updated Index draft

2005-09-22 Thread Antone Roundy


On Thursday, September 22, 2005, at 10:20  AM, James M Snell wrote:

Antone Roundy wrote:
I was thinking yesterday of suggesting that feed/id be used the way 
you're using i:domain. Which is better is probably a matter of 
whether ranking domains that span multiple feeds will be useful or 
not. In the movie ratings use case presented below, perhaps rather 
than a fivestar scheme and netflix and amazon domains, it might 
make more sense to do this:


Using atom:id as the ranking domain would limit the ranking to a 
single feed which is useful, but does not cover the full range of 
cases.

...

Yes, there are two special cases here:

1. Lack of an i:domain
2. i:domain value that is a same document reference


I think a ranking without a domain is pretty much useless--or at least 
likely to lead to problems downstream--so that case doesn't need to be 
covered.  More on that below.



 <xhtml:html>
   ...
   <xhtml:body>
     <atom:feed>
       <atom:id>Feed1</atom:id>
       <i:domain>#</i:domain> <!-- document ranking domain -->
       <atom:entry>
         <atom:id>A</atom:id>
         <i:rank scheme="priority">50</i:rank>
         <i:rank scheme="priority" domain="#">20</i:rank>
       </atom:entry>
       <atom:entry>
         <atom:id>B</atom:id>
         <i:rank scheme="priority">25</i:rank>
         <i:rank scheme="priority" domain="#">40</i:rank>
       </atom:entry>
     </atom:feed>
     <atom:feed>
       <atom:id>Feed2</atom:id>
       <i:domain>#</i:domain> <!-- document ranking domain -->
       <atom:entry>
         <atom:id>C</atom:id>
         <i:rank scheme="priority">50</i:rank>
         <i:rank scheme="priority" domain="#">30</i:rank>
       </atom:entry>
       <atom:entry>
         <atom:id>D</atom:id>
         <i:rank scheme="priority">25</i:rank>
         <i:rank scheme="priority" domain="#">10</i:rank>
       </atom:entry>
     </atom:feed>
   </xhtml:body>
 </xhtml:html>


In this example, the domainless rankings were added when the XHTML 
document was created, right?  So the XHTML document is essentially an 
aggregate feed, just not in Atom format.  Would it not make as much or 
more sense to mint an ID for the document (call it the ID of a virtual 
Atom Feed Document if you don't actually create an aggregate feed) and 
use it to scope those i:rank elements?  If, somehow, someone were to 
pull the atom:feeds out of the XHTML document (if atom:feed getting 
embedded into xhtml:body is going to happen, then is not atom:feed 
getting extracted from xhtml:body also likely?) and aggregate them with 
other feeds with domainless i:rank elements, the scopes of those 
elements would get mixed.


* Since the urn:(netflix|amazon).com/reviews schemes are feed 
independent, it is not necessary to indicate a feed (or domain) in 
this case.
* For a feed-specific scheme, like natural order, the feed ID would 
be included like this (so that if these entries were aggregated, it 
would be clear that the i:order elements were relevant to the source 
feed, not the aggregate feed):


The goal of @scheme is to identify the type of ranking to apply while 
the goal of @domain is to identify the scope of the ranking.  I do not 
believe that it is a good idea to conflate the two.


Okay, I've come to agree with that while writing and editing this 
message.  Note however that fivestar also indicates multiple things:


1) Higher numbers are better
2) The range is 0 to 5 (BTW, if this is limited to integers, how will 
you handle things like 3.5 stars, which are common in that type of 
rating system? Maybe decimal values need to be allowed.)

3) Hint: you might want to display the value as stars

#1 is the only one needed for sorting of entries. #2 would be useful if 
the feed reader wanted to display some sort of graphical element to 
indicate the ranking. #3 might be slightly useful, but except for the 
most popular schemes, would probably be ignored. Perhaps all of these 
should be separated, a la:


<i:ranking-scheme
    label="Amazon rating"
    order="descending"
    min-value="0"
    max-value="5"
    symbol="stars"
    domain="urn:amazon.com/customer-rating"
/>
...
<entry>
<i:rank domain="urn:amazon.com/customer-rating">3</i:rank>

...where @domain is the feed/id of the feed if there's just one feed in 
scope, or a value that won't be duplicated by any feed/id otherwise (if 
one can mint a unique feed id, surely one can also mint a unique id 
that won't be used for a feed).


I'd suggest that i:ranking-scheme/@domain either default to the 
containing feed/id (or the one from atom:source, if it exists) or be 
required, i:rank/@domain be required, @order default to ascending, 
@min-value default to 0, and the rest of the attributes be optional 
with no defaults.
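
The defaulting rules suggested above can be sketched in a few lines of Python. This is only an illustration: RankingScheme and scheme_from_attrs are hypothetical names, and the attributes are passed as a plain dict rather than parsed from XML. Of the two options for @domain mentioned above, the sketch takes the "default to the feed/id, preferring atom:source" one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RankingScheme:
    domain: str
    order: str = "ascending"           # suggested default
    min_value: float = 0.0             # suggested default
    max_value: Optional[float] = None  # optional, no default
    label: Optional[str] = None        # optional, no default
    symbol: Optional[str] = None       # optional, no default


def scheme_from_attrs(attrs, feed_id, source_id=None):
    """Build a RankingScheme from the attributes of an i:ranking-scheme element.

    A missing @domain falls back to the id from atom:source if there is one,
    otherwise to the id of the containing feed.
    """
    domain = attrs.get("domain") or source_id or feed_id
    if not domain:
        raise ValueError("no @domain and no feed id to fall back on")
    max_value = attrs.get("max-value")
    return RankingScheme(
        domain=domain,
        order=attrs.get("order", "ascending"),
        min_value=float(attrs.get("min-value", 0)),
        max_value=float(max_value) if max_value is not None else None,
        label=attrs.get("label"),
        symbol=attrs.get("symbol"),
    )
```

Using float rather than int for the bounds leaves room for values like 3.5 stars.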




Re: Don't Aggregrate Me

2005-08-29 Thread Antone Roundy


On Monday, August 29, 2005, at 10:12  AM, Mark Pilgrim wrote:

On 8/26/05, Graham [EMAIL PROTECTED] wrote:

(And before you say but my aggregator is nothing but a podcast
client, and the feeds are nothing but links to enclosures, so it's
obvious that the publisher wanted me to download them -- WRONG!  The
publisher might want that, or they might not ...


So you're saying browsers should check robots.txt before downloading
images?

...

Normal Web browsers are not robots, because they are operated by a
human, and don't automatically retrieve referenced documents (other
than inline images).


As has been suggested, to inline images, we need to add frame 
documents, stylesheets, Java applets, external JavaScript code, objects 
such as Flash files, etc., etc., etc.  The question is, with respect to 
feed readers, do external feed content (<content src="..." />), 
enclosures, etc. fall into the same exceptions category or not?  If 
not, then what's the best mechanism for telling feed readers whether 
they can download them automatically--robots.txt, another file like 
robots.txt, or something in the XML?  I'd prefer something in the XML.  
A possibility:


<feed>
  <ext:auto-download target="enclosures" default="false" />
  <ext:auto-download target="content" default="true" />
  ...
  <entry>
    <link rel="enclosure" href="..." ext:auto-download="yes" />
    <content src="..." ext:auto-download="0" />
    ...
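
Here is a rough Python sketch of how a feed reader might resolve that hint: a per-element ext:auto-download attribute wins, otherwise the feed-level default for the target applies. Everything about the ext vocabulary is hypothetical, including the "urn:example:ext" namespace, the accepted truth values ("true", "yes", "1") and the choice to refuse automatic download when no rule is present.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EXT = "urn:example:ext"  # hypothetical namespace for the ext: prefix

SAMPLE = f"""<feed xmlns="{ATOM}" xmlns:ext="{EXT}">
  <ext:auto-download target="enclosures" default="false" />
  <ext:auto-download target="content" default="true" />
  <entry>
    <link rel="enclosure" href="http://example.org/a.mp3" ext:auto-download="yes" />
    <content src="http://example.org/a.html" ext:auto-download="0" />
  </entry>
  <entry>
    <link rel="enclosure" href="http://example.org/b.mp3" />
    <content src="http://example.org/b.html" />
  </entry>
</feed>"""


def as_bool(text):
    """Interpret the loosely typed values used in the example."""
    return text.strip().lower() in {"true", "yes", "1"}


def may_auto_download(feed, element, target):
    """Decide whether element may be fetched without asking the user."""
    override = element.get(f"{{{EXT}}}auto-download")
    if override is not None:
        return as_bool(override)
    for rule in feed.findall(f"{{{EXT}}}auto-download"):
        if rule.get("target") == target:
            return as_bool(rule.get("default", "false"))
    return False  # assumed conservative fallback when nothing is stated
```

For the first sample entry the enclosure is allowed and the remote content is not; for the second entry the feed-level defaults give the opposite answers.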



Re: Don't Aggregrate Me

2005-08-29 Thread Antone Roundy


On Monday, August 29, 2005, at 10:39  AM, Antone Roundy wrote:

<ext:auto-download target="enclosures" default="false" />

More robust would be:
<ext:auto-download target="[EMAIL PROTECTED]'enclosure']" default="false" />
...enabling extension elements to be named in @target without requiring 
a list of @target values to be maintained anywhere.




Re: Don't Aggregrate Me

2005-08-26 Thread Antone Roundy


On Friday, August 26, 2005, at 04:39  AM, Eric Scheid wrote:

On 26/8/05 3:55 PM, Bob Wyman [EMAIL PROTECTED] wrote:

Remember, PubSub never does
anything that a desktop client doesn't do.


Periodic re-fetching is a robotic behaviour, common to both desktop
aggregators and server based aggregators. Robots.txt was established to
minimise harm caused by automatic behaviour, whether by excluding
non-idempotent URL, avoiding tarpits of endless dynamic links, and such
forth. While true that each of these scenarios involve crawling new links,
the base principle at stake is to prevent harm caused by automatic or
robotic behaviour. That can include extremely frequent periodic re-fetching,
a scenario which didn't really exist when robots.txt was first put together.


I'm with Bob on this.  If a person publishes a feed without limiting 
access to it, they either don't know what they're doing, or they're 
EXPECTING it to be polled on a regular basis.  As long as PubSub 
doesn't poll too fast, the publisher is getting exactly what they 
should be expecting.  Any feed client, whether a desktop aggregator or 
aggregation service, that polls too fast (extremely frequent 
re-fetching above) is breaking the rules of feed consuming 
etiquette--we don't need robots.txt to tell feed consumers to slow down.




Re: Feed History: stateful - incremental?

2005-08-25 Thread Antone Roundy


On Wednesday, August 24, 2005, at 04:07  PM, Mark Nottingham wrote:
Just bouncing an idea around; it seems that there's a fair amount of 
confusion / fuzziness caused by the term 'stateful'. Would people 
prefer the term 'incremental'?


I.e., instead of a stateful feed, it would be an incremental feed; 
fh:stateful would become fh:incremental.


Worth it?

I think it's worth seeing if a term can be found that has a more 
intuitively understandable meaning.  It might be helpful to explore the 
kinds of names that describe non-stateful feeds too--if a better term 
can be found for that, it could be used instead (and just reverse true 
and false).  Brainstorming a little:


Stateful: sliding window, most recent segment, segment, stream, entry 
stream, appendable, appending, augmentable, augmenting


Non-stateful: uh...stateful? (what you just downloaded represents the 
current state of the entire feed), current state, current, snapshot, 
fixed entry, set entry, replaceable, replacing, entry replacing, 
non-appending, non-augmenting




Re: Don't Aggregrate Me

2005-08-25 Thread Antone Roundy


On Thursday, August 25, 2005, at 12:25  AM, James M Snell wrote:
Up to this point, the vast majority of use cases for Atom feeds is the 
traditional syndicated content case.  A bunch of content updates that 
are designed to be distributed and aggregated within Feed readers or 
online aggregators, etc.  But with Atom providing a much more flexible 
content model that allows for data that may not be suitable for 
display within a feed reader or online aggregator, I'm wondering what 
the best way would be for a publisher to indicate that a feed should 
not be aggregated?


For example, suppose I build an application that depends on an Atom 
feed containing binary content (e.g. a software update feed).  I don't 
really want aggregators pulling and indexing that feed and attempting 
to display it within a traditional feed reader.  What can I do?



In that particular use case, I'd expect entries something like this:

<entry>
	...
	<title>Patch for MySoftware</title>
	<summary>This patch updated MySoftware version 1.0.1 to version 1.0.2</summary>
	<content type="[...whatever goes here]">k3jafidf8adf...</content>
</entry>

Looking at this, my thoughts are:
1) Feed readers that can't handle the content type are just going to 
display the summary or title anyway, so it's not going to hurt anything.
2) People whose feed readers can't handle the patches probably aren't 
going to subscribe to this feed anyway.  Instead they'll subscribe to 
your other feed (?) which gives them a link to use to download the 
patch:

<entry>
	...
	<title>Patch for MySoftware</title>
	<summary>This patch updated MySoftware version 1.0.1 to version 1.0.2</summary>
	<link rel="[???]" type="[???]" href="..." />
</entry>

I don't think we need anything special to tell aggregators to beware 
content that they don't know how to handle in this feed.  That should 
be marked clearly enough by @type.  More in a separate message...
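
Point 1 above can be sketched as a small fallback routine in Python. It is an illustration only: display_text is a hypothetical helper, the set of displayable types is an assumption (the three built-in Atom text types), and the namespace URI is the one from the final format.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
DISPLAYABLE = {"text", "html", "xhtml"}  # assumed: what this reader can render


def display_text(entry, displayable=DISPLAYABLE):
    """Pick what to show for an entry, falling back to summary, then title."""
    content = entry.find(ATOM + "content")
    if (
        content is not None
        and content.get("src") is None
        and content.get("type", "text") in displayable
    ):
        return "".join(content.itertext())
    # Content of a type we can't handle: @type told us so, show the rest.
    return entry.findtext(ATOM + "summary") or entry.findtext(ATOM + "title") or ""
```

For the software-update entry, a reader built this way shows the summary and never touches the binary payload.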




Re: Don't Aggregrate Me

2005-08-25 Thread Antone Roundy


On Thursday, August 25, 2005, at 08:16  AM, James M Snell wrote:
Good points but it's more than just the handling of human-readable  
content.  That's one use case but there are others.  Consider, for  
example, if I was producing a feed that contained javascript and CSS  
styles that would otherwise be unwise for an online aggregator to try  
to display (e.g. the now famous Platypus prank...  
http://diveintomark.org/archives/2003/06/12/how_to_consume_rss_safely).
Typically aggregators and feed readers  
are (rightfully) recommended to strip scripts and styles from the  
content in order to reliably display the information.  But, it is  
foreseeable that applications could be built that rely on these types  
of mechanism within the feed content.  For example, I may want to  
create a feed that provides the human interaction for a workflow  
process -- each entry contains a form that uses javascript for  
validation and perhaps some CSS styles for formatting.


For that, you'd either need to use a less sophisticated feed reader  
that didn't strip anything out (and only use it to subscribe to fully  
trusted feeds, like internal feeds), or a more sophisticated feed  
reader that allowed you to turn off the stripping of potentially  
dangerous stuff, or to configure exactly what was, or better yet,  
wasn't, stripped (perhaps on a feed-by-feed basis).


The stripping-or-not behavior should be controlled from the client  
side, so I don't see any point in providing a mechanism for the  
publisher to provide hints about whether or not to strip things out.   
That would probably only benefit malicious publishers at the expense of  
brain-dead clients:


<entry>
	...
	<ext:keep-potentially-dangerous-stuff>true</ext:keep-potentially-dangerous-stuff>
	<content ...><script ...>TriggerExploitThatErasesDrive('C:');</script></content>
</entry>



Re: Don't Aggregrate Me

2005-08-25 Thread Antone Roundy


I can see reasonable uses for this, like marking a feed of local disk errors
as not of general interest.


This is not published data - http://www.spacekdet.com/pipe/
Security by obscurity^H^H^H^H^H^H^H^H^H saying please -  
http://www-cs-faculty.stanford.edu/~knuth/ (see the second link from 
the bottom)


This certainly wouldn't be useful as a security measure.  But yeah, a 
way to tell the big republishing aggregators that you'd prefer they 
didn't republish the feed could be useful, in case they somehow get 
ahold of the URL of a non-sensitive (and thus non-encrypted and 
authentication-protected), but not-intended-for-public-consumption 
feed.  Ideally though, such feeds should probably be password 
protected, since that wouldn't require aggregator support for an 
extension element.




Re: Don't Aggregrate Me

2005-08-25 Thread Antone Roundy


On Thursday, August 25, 2005, at 03:12  PM, Walter Underwood wrote:

I would call desktop clients clients not robots. The distinction is
how they add feeds to the polling list. Clients add them because of
human decisions. Robots discover them mechanically and add them.

So, clients should act like browsers, and ignore robots.txt.

How could this all be related to aggregators that accept feed URL 
submissions?  I'd imagine the desired behavior is the same as for 
crawlers--should they check for robots.txt at the root of any domain 
where a feed is submitted?  How about cases where the feed is hosted on 
a site other than the website that it's tied to (for example, a service 
like FeedBurner) so some other site's robots.txt controls access to the 
feed (...or at least tries to)?
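
For the submitted-URL case, the check an aggregator would make can be sketched with Python's standard urllib.robotparser. The helper names and the "ExampleAggregator" user agent string are made up for illustration. The sketch also shows the wrinkle raised above: the robots.txt consulted is the one at the root of the host serving the feed, which for a hosted feed is not the publisher's own site.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(feed_url):
    """robots.txt lives at the root of whichever host serves the feed."""
    parts = urlsplit(feed_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def allowed_by_robots(feed_url, robots_txt, user_agent):
    """Check a feed URL against the text of an already fetched robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, feed_url)


rules = "User-agent: *\nDisallow: /private/\n"
print(robots_url_for("http://feeds.example.net/myblog"))
print(allowed_by_robots("http://example.org/private/feed.atom", rules, "ExampleAggregator"))
```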


We've already rejected the idea of trying to build DRM into feeds--is 
there some way to sidestep the legal complexities and problems that 
would arise from trying to do that and at the same time enable machine 
readable statements about what the publisher wants to allow others to 
do with the feed, and things they want to prohibit, into the feed?  If 
we're not qualified to design an extension to do that, is there someone 
else who is qualified, and who cares enough to do it?




Re: If you want Fat Pings just use Atom!

2005-08-23 Thread Antone Roundy


On Monday, August 22, 2005, at 09:54  PM, A. Pagaltzis wrote:

* Martin Duerst [EMAIL PROTECTED] [2005-08-23 05:10]:

Well, modulo character encoding issues, that is. An FF will
look differently in UTF-16 than in ASCII-based encodings.

Depends on whether you specify a single encoding for all entries
at the HTTP level or not. For this application, I would do just
that, in which case, as a bonus, non-UTF-8 streams would get to
avoid resending the XML preamble over and over and over.


Of course, if you do that, you won't be able to keep signatures for 
entries originally published in an encoding other than the one you've 
chosen.


If one were to want to signal an encoding change mid-stream, how might 
that work with what's been proposed thus far?




Re: Comments Draft

2005-08-01 Thread Antone Roundy


On Sunday, July 31, 2005, at 10:24  AM, A. Pagaltzis wrote:

* Antone Roundy [EMAIL PROTECTED] [2005-07-31 01:15]:

I could add more, but instead, here's my suggestion for
replacing that sentence:

If the resource being replied to is an atom:entry, the
value of the href attribute MUST be the atom:id of the
atom:entry. If the value of the type attribute is
application/atom+xml then the href attribute MUST be the
(URL/URI/IRI) of an Atom Entry Document containing the
atom:entry being replied to.


This undermines the purpose of the link.


I'd say that not being able to tell whether @href in 
[EMAIL PROTECTED]in-reply-to] is dereferencable or not is what undermines 
link.


The primary purpose of [EMAIL PROTECTED]in-reply-to] is to identify the 
resource (which may be an atom:entry) being replied to.  If that 
resource is an atom:entry, then the appropriate identifier for it is 
its atom:id.


If "If the resource being replied to is an atom:entry, the value of the 
href attribute MUST be the atom:id of the atom:entry" doesn't sound 
like a good rule, then I'd argue that using atom:link to identify the 
resource being replied to is a bad idea.


As I've said before, I think that stuffing data that happens to be a 
URI but may not be dereferencable into link/@href is a bad idea.  If we 
ARE going to do it, then I think we need a way to at least hint at 
whether it's a dereferencable link or some other data stuffed into a 
link element.


Here's what the spec says @type is for:

   On the link element, the type attribute's value is an advisory
   media type; it is a hint about the type of the representation that is
   expected to be returned when the value of the href attribute is
   dereferenced.

If @href isn't dereferencable, then the existence of @type is deceptive. 
 I suppose it could mean "when I saw it, it was in some kind of Atom 
document", but so what?  What if the feed gets converted to RSS 2.0, 
the atom:id is put into guid, and I find the entry in the RSS feed?



Atom Entry Documents can move around; their IDs are eternal.
True, so you could just omit @type from this link if you're concerned 
that your entry document might move.  Or we could go with something 
like this:


<ext:in-reply-to id="...">
  <atom:link rel="found-in-entry-document" href="..." />
  <atom:link rel="found-in-feed-document" href="..." />
</ext:in-reply-to>

Or we could just stick with what has been proposed, perhaps including 
what I proposed in my last message, and if the entry document moves, 
then oh well, the web has another broken link just as it would in what 
I proposed just above here or in any case where a dereferencable link 
was published, but the atom:id would still be valid.  If after moving 
the entry document, one were to publish the in-reply-to link again, it 
would be appropriate to remove the @type attribute.


...okay, that last sentence suggests that what I propose just above 
here is superior to having possibly-dereferencable atom:links, 
because you could update the found-in-entry-document link if it got out 
of sync with the location of the document.  Otherwise, we'll have to be 
limited to linking to the feed in which the entry is found.




Re: Proposed changes for format-11

2005-08-01 Thread Antone Roundy


On Monday, August 1, 2005, at 09:55  AM, A. Pagaltzis wrote:

* Robert Sayre [EMAIL PROTECTED] [2005-08-01 17:25]:

On 8/1/05, Sam Ruby [EMAIL PROTECTED] wrote:

Perhaps the following could be added to section 6.2:

  The Atom namespace is reserved for future
  forwards-compatable revisions of Atom.


s/compatable/compatible/


Sounds OK to me, but I recall squawking about this.


There wasn’t any squawking about the rule as such, I think. A
minor amount of squawking was about what a consumer should do
when it encounters Atom-namespaced elements in locations it
didn’t expect them.

Per spec: it should simply treat them as unknown foreign markup.
Intent: this allows old consumers to continue working with future
revisions of the spec, so long as changes are not so drastic that
a new namespace is warranted to prevent existing consumers from
doing anything with new documents.

It sounds to me like we might benefit from adding language specifying 
that elements in the Atom namespace can appear as children of elements 
from other namespaces, but may not appear as children of elements in 
the Atom namespace except as specified by the spec (or from wording the 
language to be added so that it says that).


...I am correct about our intent to allow Atom elements to be used as 
children of extension elements, right?  For example, that one should be 
able to do this:


<foo:bar qwerty="asdf">
  <atom:title>My title</atom:title>
  <atom:link rel="foo:my-rel" href="..." />
</foo:bar>

...rather than having to do this:

<foo:bar qwerty="asdf">
  <foo:title>My title</foo:title>
  <foo:link rel="foo:my-rel" href="..." />
</foo:bar>

...right?



Re: Comments Draft

2005-07-30 Thread Antone Roundy


On Saturday, July 30, 2005, at 02:38  PM, A. Pagaltzis wrote:

* James M Snell [EMAIL PROTECTED] [2005-07-30 18:10]:

Yeah, source is likely the most logical choice, but I didn't
want to confuse folks with a link @rel=source that has a
different meaning from atom:source.


An argument by way of which I came around to Antone’s suggested
“start-of-thread,” though I was going to suggest “thread-start.”

I took a look at the draft to verify whether I correctly understood 
what this link points to, and I think it isn't what I originally 
thought based on the old name root.  Does this point to the feed in 
which the immediate parent entry was found, or to the feed in which the 
first entry in a thread of replies was found?  If the former, which the 
draft seems to suggest, and which seems more useful, then 
start-of-thread and thread-start probably aren't such good names 
after all.  With clarity in mind, in-reply-to-feed might be good, 
though it's a bit long.


A problem comes to mind: if you have multiple in-reply-to links, 
how do you relate those to their respective in-reply-to-feed links 
(in case they're different)?  Is it possible?  Dare we do something 
like this?  (Do we wish to, if we dare?)


<link rel="in-reply-to" ...>
  <link rel="in-reply-to-feed" ... />
</link>

Pro:
* Groups the two links together
* Gives us more options for what to call the inside one without 
creating confusion: source-feed, for example.  It would be nice to 
choose a name that's not likely to be the perfect name for some other 
use, or to define this @rel value broadly enough to be applicable to 
other purposes.


Con:
* Puts an atom:link in a location not expected by apps that don't 
understand this extension.





Re: Comments Draft

2005-07-30 Thread Antone Roundy


On Saturday, July 30, 2005, at 04:37  PM, James M Snell wrote:
One challenge is that for anything besides references to Atom entries, 
there is no guarantee that in-reply-to links will be non-traversable.  
For instance, if someone were to go and define a behavior for using 
in-reply-to with RSS, the href of the link may point to the same URL 
that the RSS item's link element points to (given that there is no way 
to uniquely identify an RSS item).
<link rel="in-reply-to" type="text/html" 
href="http://www.example.com/entries/1" />


This is legal in the spec but is left undefined.


The natural choice of values when replying to an RSS 2.x item would be 
the guid, since it's the closest counterpart to atom:id.  But if the 
guid is not a permalink (i.e. not dereferencable), then it won't have a 
MIME type, just as non-dereferencable atom:id's don't have a MIME type. 
 Both of these facts suggest that the following sentence should 
probably be removed from section 3:


   If the type attribute is omitted, it's value is assumed to be 
application/atom+xml.


Instead, I'd suggest stating that if the type attribute is omitted, the 
in-reply-to link cannot be assumed to be dereferencable, and that 
non-dereferencable links MUST NOT have a type attribute.


Editorial notes about this sentence:

   A type attribute
   value of application/atom+xml indicates that the resource being
   responded to is an atom:entry and that the href attribute MUST
   specify the value of the parent entries atom:id element.

1) "parent" probably isn't the best word here since in-reply-to isn't 
being defined in terms of parents and children.


2) "entries" -> "entry's"

I could add more, but instead, here's my suggestion for replacing that 
sentence:


If the resource being replied to is an atom:entry, the value of the 
href attribute MUST be the atom:id of the atom:entry.  If the value of 
the type attribute is application/atom+xml then the href attribute 
MUST be the (URL/URI/IRI) of an Atom Entry Document containing the 
atom:entry being replied to.


Anything else could lead to inconsistencies.  For example, when 
replying to an atom:entry that can be found in an Atom Entry Document, 
but whose atom:id does NOT point to that document, there would be 
multiple choices available for the reply link's href attribute.




Re: Comments Draft

2005-07-29 Thread Antone Roundy


On Friday, July 29, 2005, at 02:41  PM, A. Pagaltzis wrote:

* Antone Roundy [EMAIL PROTECTED] [2005-07-29 02:40]:

On Thursday, July 28, 2005, at 05:58  PM, James M Snell wrote:

root is now called replies-source... which is a horrible
name but I'm not sure what else to call it


How about start-of-thread.


Or maybe “parent-entries?”


How about mother-of-all-entries?  Ha ha.

The problem with parent-entries is that this link may not be pointing 
to the immediate parent, right?




Re: Comments Draft

2005-07-28 Thread Antone Roundy


On Thursday, July 28, 2005, at 05:58  PM, James M Snell wrote:
 * root is now called replies-source... which is a horrible name 
but I'm not sure what else to call it



How about start-of-thread.



Re: I-D ACTION:draft-nottingham-atompub-feed-history-01.txt

2005-07-20 Thread Antone Roundy


On Wednesday, July 20, 2005, at 11:44  AM, Thomas Broyer wrote:

I was actually wondering why non-stateful feeds couldn't have archives:
in the This month's Top 10 records feed, why couldn't I link to Last
month's Top 10 records?
If this kind of links are not dealt within feed-history, then I suggest
splitting the draft into two (three) parts:
  1. fh:stateful: whether a feed is stateful or not
  2. fh:prev: state reconstruction of a stateful feed
  3. (published later) fh:: link to archives of a non-stateful feed
(no, I actually don't want such a split, I'd rather deal with the 3.
in feed-history, no matter how)

If we want to solve this issue using a distinct element (fh:prev if
fh:stateful=true, fh: if fh:stateful=false), is fh:stateful still
needed? The presence of fh:prev would be equivalent to fh:stateful=true,
the presence of fh: would be equivalent to fh:stateful=false, the
absence of both fh:prev and fh: would be equivalent to the absence
of fh:stateful, and the presence of both fh:prev and fh: would be an
error.
This is of course true only if fh:prev must be accompanied by
fh:stateful=true. The question is: is it useful to have fh:stateful if
you have no link to any kind of archive?

I would think that rather than fh:stateful=true | false, it might be 
more useful to have (with a different element name, and perhaps 
different values) fh:what-kind-of-feed-is-this=sliding-window | 
snapshot | ???.  If it's a sliding-window feed, fh:prev points to the 
previous sliding window.  If it's a snapshot feed, then fh:prev points 
to the previous snapshot.  fh:what-kind-of-feed-is-this might have a 
default value of sliding-window.




Re: Notes on the latest draft - xml:base

2005-07-20 Thread Antone Roundy


On Wednesday, July 20, 2005, at 10:22  PM, A. Pagaltzis wrote:

* James Cerra [EMAIL PROTECTED] [2005-07-21 05:00]:

Sjoerd Visscher,

That's because it is not an attempt at abbreviating strings,
but to preserve the meaning of relative URIs, when content is
used outside of its original context.


Same thing.  You are framing the question in a manner that
hides the problem, but it's still there.


No, it frames the question in a manner that addresses the purpose
of having the mechanism.

Right--it frames it in the context created by RFC 3986.  However, since 
this issue is commonly misunderstood, it's likely that xml:base will 
often be used for string abbreviation in the wild--thus, indeed the 
problem is still there.


If anyone doubts that base URIs as defined by RFC 3986 are not intended 
simply for abbreviation, read section 4.4 (Same-Document References). 
 The method outlined there for recognizing same-document references 
would be entirely unreliable if base URIs were used to abbreviate 
arbitrary portions of URIs.  It only works if the base URI is an 
address from which the data containing the relative URI can be 
retrieved.  If base URIs are intended for abbreviation convenience, 
then that section of RFC 3986 is completely broken.  My impression is 
that it isn't broken, but says what was intended.
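A small sketch of the same-document test may make the point clearer. It follows my reading of section 4.4: resolve the reference against the base, drop any fragment, and compare with the base. The function name and the example URIs are mine, and Python's urljoin is used here as a convenient stand-in for the RFC 3986 resolution algorithm.

```python
from urllib.parse import urljoin, urldefrag


def is_same_document(base_uri, reference):
    """True if the reference, once resolved, equals the base apart from its fragment."""
    target = urljoin(base_uri, reference)
    return urldefrag(target).url == urldefrag(base_uri).url


# Base URI is the address the document was actually retrieved from:
retrieval_base = "http://example.org/blog/feed.atom"
print(is_same_document(retrieval_base, "#entry-1"))

# Base URI chosen only to shorten image paths elsewhere in the document:
abbreviation_base = "http://example.org/images/"
print(urljoin(abbreviation_base, "#entry-1"))
```

With the abbreviation-style base, "#entry-1" resolves to a fragment of http://example.org/images/ rather than of the feed document, so the test no longer tells a processor anything about the document it is holding.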


...but now I've forgotten whether anyone has made a concrete suggestion 
about what can be done at this point, and to solve exactly what 
problem. Do I smell another note in the infamous implementers guide?




Re: Feed History -02

2005-07-19 Thread Antone Roundy


On Monday, July 18, 2005, at 01:59  AM, Stefan Eissing wrote:
Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. 
As an alternative one could drop fh:stateful and define that an empty 
fh:prev (refering to itself) is the last document in a stateful feed. 
That would eliminate the cases of wrong mixes of fh:stateful and 
fh:prev.


The problem is that an empty @href in fh:prev is subject to xml:base 
processing, and who knows what the current xml:base is going to be when 
you get to it.  Is there a way to explicitly make xml:base undefined?  
If I'm not mistaken, xml:base="" doesn't do it--it just adds nothing to 
the existing xml:base.  If there is a way, you could say <link 
rel="fhprev" href="" xml:base="[whatever value sets it to 
undefined]" />, but otherwise, using an empty @href is probably 
overloading the wrong attribute.  A different @rel value like 
"fh:noprev" (with an empty link, since it doesn't matter what it 
actually points to) might be a step up, but using any kind of link to 
indicate the lack of a link is a little odd.
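To illustrate why the empty @href is fragile, here is a minimal sketch that resolves the same empty reference against two different bases. The URIs are invented, and urljoin stands in for whatever xml:base processing a real consumer performs.

```python
from urllib.parse import urljoin


def resolve_href(href, base_in_scope):
    """Resolve a link's href against whatever base URI is in scope for it."""
    return urljoin(base_in_scope, href)


# The same empty href means "this document" only if the base in scope
# happens to be this document's own address.
print(resolve_href("", "http://example.com/feed.atom"))
print(resolve_href("", "http://mirror.example.net/copies/feed.atom"))
```

The two calls return two different URIs, so a consumer comparing the resolved value against the feed's own address cannot rely on an empty href to mean "no previous document".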




Re: Feed History -02

2005-07-19 Thread Antone Roundy


On Tuesday, July 19, 2005, at 12:29  PM, Antone Roundy wrote:

On Monday, July 18, 2005, at 01:59  AM, Stefan Eissing wrote:
Ch 3. fh:stateful seems to be only needed for a newborn stateful 
feed. As an alternative one could drop fh:stateful and define that an 
empty fh:prev (refering to itself) is the last document in a stateful 
feed. That would eliminate the cases of wrong mixes of fh:stateful 
and fh:prev.


The problem is that an empty @href in fh:prev is subject to xml:base 
processing, and who knows what the current xml:base is going to be 
when you get to it.  Is there a way to explicitly make xml:base 
undefined?  If I'm not mistaken, xml:base="" doesn't do it--it just 
adds nothing to the existing xml:base.  If there is a way, you could 
say <link rel="fhprev" href="" xml:base="[whatever value sets it to 
undefined]" />, but otherwise, using an empty @href is probably 
overloading the wrong attribute.  A different @rel value like 
"fh:noprev" (with an empty link, since it doesn't matter what it 
actually points to) might be a step up, but using any kind of link to 
indicate the lack of a link is a little odd.


Yikes, I should have caught up on the xml:base thread first!  Looks 
like the jury's out, or at least hung, on this issue.




Re: I-D ACTION:draft-ietf-atompub-format-10.txt

2005-07-15 Thread Antone Roundy


A misspelling...in case the opportunity to fix it arises: "Text 
Contruct" -- missing an "s" in 6.3.  (I found it because I misspelled 
it the same way when searching for it!)




Re: The Atomic age

2005-07-15 Thread Antone Roundy


On Friday, July 15, 2005, at 09:56  AM, Walter Underwood wrote:
--On July 14, 2005 11:37:05 PM -0700 Tim Bray [EMAIL PROTECTED] 
wrote:


So, implementors... to  work.


Do we have a list of who is implementing it? That could be used in
the Deployment section of http://www.tbray.org/atom/RSS-and-Atom.

I've updated Grouper (http://www.geckotribe.com/rss/grouper/) to support 
conversion of Atom 1.0 to RSS 2.0.  A future version will support going 
the other way...when I get time to complete the major overhaul that 
will be required to do that.


Antone



Re: More while we're waiting discussion

2005-07-12 Thread Antone Roundy


On Tuesday, July 12, 2005, at 12:42  PM, A. Pagaltzis wrote:

* James M Snell [EMAIL PROTECTED] [2005-07-12 02:00]:

The second extension is a comments link type that allows an
entry to be associated with a separate feed containing
comments. […]

  <feed>
    <entry>
      <link rel="comments"
            href="http://example.com/commentsfeed.xml" />
    </entry>
  </feed>


What I don’t like about this idea is that if a thread-aware
aggregator wants to keep up with *all* discussion on a weblog, it
will have to poll any number of comments-for-entry-X feeds per
single main newsfeed in the general case – in the case of a
typical weblog encountered in practice, that would be several
hundred. Clearly, this is untenable.

If you're already creating an extension link type, why not throw in an 
additional attribute too to help with that:


<feed xmlns:comments="http://example.org/commentfeed">
  <entry>
    <link rel="comments" comments:updated="2005-07-12T12:53:15Z"
          href="http://example.com/commentsfeed.xml" />
  </entry>
</feed>

Then you'd only need to poll the main feed unless it indicated an 
update in the comment feed.  Of course, if comments were threaded, you'd 
have to cascade comments:updated values up through all the feeds in a 
thread, and aggregators would have to follow updates back the other 
way, potentially down multiple branches, to find all the updated leaves.
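Here is roughly what the client side of that could look like for the unthreaded case. This is only a sketch: comments:updated and its namespace are the made-up extension from the example above, the Atom namespace URI is assumed, and the last_seen bookkeeping is a hypothetical structure the aggregator would keep for itself.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "http://www.w3.org/2005/Atom"          # assumed Atom namespace
COMMENTS = "http://example.org/commentfeed"   # placeholder extension namespace

FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:comments="http://example.org/commentfeed">
  <entry>
    <link rel="comments" comments:updated="2005-07-12T12:53:15Z"
          href="http://example.com/commentsfeed.xml" />
  </entry>
</feed>"""


def parse_timestamp(text):
    """Parse the UTC 'Z' form used in the example above."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def comment_feeds_to_poll(feed_xml, last_seen):
    """Return hrefs of comment feeds whose advertised update is newer than last_seen."""
    stale = []
    for link in ET.fromstring(feed_xml).iter(f"{{{ATOM}}}link"):
        if link.get("rel") != "comments":
            continue
        href = link.get("href")
        updated = link.get(f"{{{COMMENTS}}}updated")
        if href is None or updated is None:
            continue
        seen = last_seen.get(href)
        if seen is None or parse_timestamp(updated) > seen:
            stale.append(href)
    return stale


print(comment_feeds_to_poll(FEED, {}))
```

Once the aggregator records the advertised timestamp for a comment feed, later polls of the main feed skip that comment feed until the timestamp moves forward.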


...which raises the question of whether an application like this might 
call for a minimal feed for comments that simply pointed to an Entry 
Document for each comment. Entries in such a feed would really only 
require an atom:id, atom:updated, atom:link pointing to the entry 
document, and atom:link pointing to the parent comment or entry. 
atom:title could conceivably be considered undesirable bloat for such a 
feed. Is Atom the right format for this need? An alternative might be 
to define a format for this need that used Atom elements but had 
minimal cardinality requirements.


Well, enough stream of thought blabbering for now.



Re: More while we're waiting discussion

2005-07-12 Thread Antone Roundy


On Tuesday, July 12, 2005, at 06:21  PM, A. Pagaltzis wrote:

* Thomas Broyer [EMAIL PROTECTED] [2005-07-13 00:00]:

As an atom:id is an identifier that might (should?) not be
dereferenceable, atom:link is not a good choice.

There is nothing in the spec that forbids atom:link

That should be atom:id, right?

 from being
dereferencable, nor anything that advises against it being so.
See 4.2.6 and 4.2.6.1 in -09.

...

The spec just says is that the URI MUST NOT be assumed to be
dereferencable,

...

Whether atom:link is a bad choice for carrying a non-
dereferencable URI around is a better argument. The spec says,
verbatim:

The atom:link element defines a reference from an entry or
feed to a Web resource.

That would seem to imply dereferencability, but is open to
interpretation.

...

Personally, I would prefer to interpret the spec liberally, if
that is within the intended spirit,
It's definitely not within the spirit that I, for one, intended.  But 
the spirit that I intended (atom:link being limited to links intended 
to be traversed in response to explicit user interaction) was not 
accepted by the WG, so perhaps that has little bearing.


If atom:link is intended to be dereferencable, then clearly, any 
solution that takes a value from atom:id and puts it into 
atom:link/@href has a strike against it since any feed that uses 
non-dereferencable atom:ids would either have to violate the spirit of 
atom:link to participate in the feature, or would have to invent a 
competing solution.


Also, if a feed that uses dereferencable atom:ids is relocated, clients 
would be much more likely to attempt to dereference the atom:links that 
carried those previously dereferencable values than an extension 
element that was explicitly defined as not necessarily dereferencable.




Re: Roll-up of proposed changes to atompub-format section 5

2005-07-05 Thread Antone Roundy


On Tuesday, July 5, 2005, at 10:11  AM, Tim Bray wrote:

On Jul 5, 2005, at 8:58 AM, Bob Wyman wrote:

We can debate what it means to have an interoperability issue,
however, my personal feeling is that if systems are forced to break and
discard signatures in order to perform usual and customary processing on
entries that falls very close to the realm of interoperability if not
within it. Deferring this issue until the implementer's guide is written
is likely to defer it beyond the point at which common practice is
established. The result is likely to be that intermediaries and
aggregators end up discarding most signatures that appear in source
feeds.


Huh?!  Pardon my ignorance, could you please provide an explanation 
for the simple-minded as to how the absence of a source element in a 
signed entry will lead to signatures being discarded?  Also, it would 
be helpful to sketch in some of the surrounding scenario... -Tim


If a signed entry doesn't have a source element and an aggregator 
inserts one, the signature will be broken--thus the aggregator will 
either discard the signature or republish the entry with a broken 
signature.


Perhaps language like this would work without being too much of a 
change at this late date:


When signing individual entries that do not contain an atom:source 
element, be aware that aggregators inserting an atom:source element 
will be unable to retain the signature. For this reason, publishers 
might consider including an atom:source element in all individually 
signed entries.
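A toy demonstration of the breakage, for anyone who wants to see it. To be clear about the simplification: this uses a plain SHA-256 digest over canonicalized XML as a stand-in for a real XML Signature, and the entry, ids, and Atom namespace URI are made up or assumed. The point is only that any signature covering the entry's content stops matching once an element is inserted.

```python
import hashlib
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"  # assumed Atom namespace

ENTRY = (
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    "<id>tag:example.org,2005:entry-1</id>"
    "<title>Hello</title>"
    "</entry>"
)


def entry_digest(entry_xml):
    """Digest of the canonical form; a stand-in for the signed content."""
    canonical = ET.canonicalize(entry_xml)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def add_source(entry_xml, feed_id):
    """What an aggregator does when it inserts atom:source into an entry."""
    ET.register_namespace("", ATOM)
    entry = ET.fromstring(entry_xml)
    source = ET.SubElement(entry, f"{{{ATOM}}}source")
    ET.SubElement(source, f"{{{ATOM}}}id").text = feed_id
    return ET.tostring(entry, encoding="unicode")


republished = add_source(ENTRY, "tag:example.org,2005:feed")
print(entry_digest(ENTRY) == entry_digest(republished))
```

If the publisher had included atom:source before signing, the aggregator would have nothing to insert and the digest would be unchanged.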




Re: Roll-up of proposed changes to atompub-format section 5

2005-07-05 Thread Antone Roundy


On Tuesday, July 5, 2005, at 01:09  PM, A. Pagaltzis wrote:

* Bob Wyman [EMAIL PROTECTED] [2005-07-05 19:30]:

Antone Roundy wrote:

When signing individual entries that do not contain an
atom:source element, be aware that aggregators inserting an
atom:source element will be unable to retain the signature. For this
reason, publishers might consider including an atom:source element in
all individually signed entries.

+1

+1 as well. It is one of those obvious-in-hindsight things that
the spec would do well to point out to implementors in advance.

If putting this into the spec would require a delay, then I
suppose we’ll have to end up living with a spec that could have
been more explicit. This clarification is not worth slowing
things down for.


Agreed.  If we can get it in without delaying things, I'm all for it.  
But if not, then I can live without it.  It doesn't actually change 
anything--just reduces the probability of the issue being overlooked.




Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt

2005-06-29 Thread Antone Roundy


If it's for identification rather than retrieval, maybe it could be an 
Identity Construct...except Identity Constructs got nuked in 
format-06...not necessarily dereferencable.  Another option would be to 
identify whether you need to continue by checking whether you've seen 
the "prev" link before.  Wouldn't that be as reliable as checking the 
"this" link?
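A sketch of what I mean, under the assumption that a client remembers every "prev" URI it has already followed. The get_prev callback and the in-memory chain are hypothetical stand-ins for fetching a document and reading its "prev" link.

```python
def collect_new_documents(start_uri, get_prev, seen_prev):
    """Walk "prev" links from start_uri, stopping at the first one seen before.

    get_prev(uri) returns the document's "prev" URI or None.
    seen_prev is a set the client keeps between polls; it is updated in place.
    """
    to_process = []
    uri = start_uri
    while uri is not None:
        to_process.append(uri)
        prev = get_prev(uri)
        if prev is None or prev in seen_prev:
            break
        seen_prev.add(prev)
        uri = prev
    return to_process


# First poll: nothing seen yet, so the whole chain is walked.
chain = {"feed": "archive-2", "archive-2": "archive-1", "archive-1": None}
seen = set()
print(collect_new_documents("feed", chain.get, seen))

# Later poll: one new archive has appeared in front of the known ones.
chain = {"feed": "archive-3", "archive-3": "archive-2",
         "archive-2": "archive-1", "archive-1": None}
print(collect_new_documents("feed", chain.get, seen))
```

The second poll stops after archive-3 because its "prev" link was already followed during the first poll.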


On Wednesday, June 29, 2005, at 12:10  AM, Mark Nottingham wrote:

You need to be able to figure out which documents you've seen before 
and which ones you haven't, so you don't recurse down the entire 
stack. Although you can come up with some heuristics to determine when 
you've seen a document before, most (if not all) of them can be fooled 
by particular sequences of entries. Remembering which ones you've seen 
(using their 'this' URI) allows you to easily figure this out.



On 28/06/2005, at 8:48 PM, Antone Roundy wrote:


Thinking a little more about this, I'm not sure what the "this" link 
would be used for.  The "prev" link seems to be doing all the work, 
and especially assuming a "batches of 15" sort of model, the "this" 
link seems likely to end up pointing to a document that's going to 
disappear soon 14 times out of 15.




--
Mark Nottingham http://www.mnot.net/





Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt

2005-06-29 Thread Antone Roundy


On Wednesday, June 29, 2005, at 07:27  AM, Dave Pawson wrote:

I guess the answer is:
http://example.com/latest is your feed, e.g. containing the latest 10 
entries

http://example.com/archive-1 through n are your archive feeds.


Which would mean that the instance at /latest keeps changing?
I need to keep swapping old ones out, new ones in, i.e. rebuilding
each time?

  I guess that's another reason it feels like a kludge.

Replace http://example.com/latest with http://example.com/atom.xml.  Of 
course the latest document keeps changing and has to be rebuilt and 
replaced each time.  It's the feed document just like what we see 
today.  At least that's how I read what was written 
above--"http://example.com/latest" was intended as the URI to which 
you'd subscribe.




Annotating signed entries (was Re: More on Atom XML signatures and encryption)

2005-06-29 Thread Antone Roundy


On Wednesday, June 29, 2005, at 01:47  PM, James M Snell wrote:
8. Aggregators and Intermediaries MUST NOT alter/augment the content 
of digitally signed entry elements.



Just mulling over things...

Obviously, we don't have any way to annotate signed entries without 
breaking the signature.  I hesitate to introduce new complexity, so I 
don't know whether I LIKE the idea I'm about to write about, but here 
it is.  If you want to annotate a signed entry, or even annotate an 
unsigned one but keep your annotations separate, you might do something 
like this:


<feed ...>
  [feed metadata]
  <ex:annotation entry-id="foo">
    <ex:entry-signature>the entry's signature goes here</ex:entry-signature>
    [this annotation could be signed here]
    <ex:some-annotation-element>...</ex:some-annotation-element>
    ...
  </ex:annotation>
  ...
  <entry>
    <id>foo</id>
    [entry's signature here if signed]
    ...
  </entry>
</feed>

Notes:
1) ex:entry-signature is optional, but recommended if the entry is 
signed and the annotation is signed.

2) Multiple annotations could point to the same entry
3) It could be requested that aggregators forward annotations along 
with their entries...but of course, that's optional, and they could 
certainly be dropped at the request of the end user if they only want 
to see the originals.
4) It might be recommended or required that ex:annotation elements 
appear before the entries they annotate (whether above all entries or 
interspersed with them) to make life easier for processors that 
finalize their processing of entries as soon as they hit </entry> 
rather than doing it after they've parsed the whole document.
5) Aggregators COULD attach annotations from various sources when 
outputting entries, even if those annotations never appeared together 
within a feed before.

6) I don't see any way to choose between conflicting annotations.
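For what it's worth, here is a sketch of how a processor might pair annotations with entries in a single pass, relying on note 4 (annotations appear before the entries they annotate). The ex namespace URI, the ex:note element, and the Atom namespace URI are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "http://www.w3.org/2005/Atom"      # assumed Atom namespace
EX = "http://example.org/annotation"      # hypothetical extension namespace

FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ex="http://example.org/annotation">
  <ex:annotation entry-id="foo"><ex:note>first</ex:note></ex:annotation>
  <ex:annotation entry-id="foo"><ex:note>second</ex:note></ex:annotation>
  <entry><id>foo</id></entry>
  <entry><id>bar</id></entry>
</feed>"""


def annotations_by_entry(feed_xml):
    """Map each entry id to the annotation elements that preceded it."""
    pending = defaultdict(list)
    result = {}
    for child in ET.fromstring(feed_xml):
        if child.tag == f"{{{EX}}}annotation":
            pending[child.get("entry-id")].append(child)
        elif child.tag == f"{{{ATOM}}}entry":
            entry_id = child.findtext(f"{{{ATOM}}}id")
            # The entry can be finalized here; later annotations are not waited for.
            result[entry_id] = pending.pop(entry_id, [])
    return result


for entry_id, notes in annotations_by_entry(FEED).items():
    print(entry_id, len(notes))
```

An annotation that arrives after its entry would be left in the pending map, which is the cost of finalizing at the end of each entry.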



Dealing with namespace prefixes when syndicating signed entries

2005-06-29 Thread Antone Roundy


Mulling more...

Let's say an aggregator is putting these two entries into the same 
aggregate feed:


<feed ... xmlns:a="foo" xmlns:b="bar">
  ...
  <entry>
    [signature]
    <a:foo ... />
    <b:bar ... />
    ...
  </entry>
</feed>

<feed ... xmlns:b="foo" xmlns:a="bar">
  ...
  <entry>
    [signature]
    <b:foo ... />
    <a:bar ... />
    ...
  </entry>
</feed>

Perhaps a reasonable way to deal with the namespace prefix conflict 
would be for the signature to be applied after a transform that yielded 
this (putting full namespace names in where the prefixes were):


<[atom's namespace]:entry>
  [signature]
  <foo:foo ... />
  <bar:bar ... />
  ...
</[atom's namespace]:entry>

Unprefixed attributes would naturally remain unprefixed, but elements 
in the default namespace would need to have their namespace names 
prepended.
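Namespace-aware parsers already do something close to this internally. As a sketch, Python's ElementTree replaces each prefix with the full namespace name, so the two entries above compare equal once parsed. The entries below are cut-down versions of the example, with "foo" and "bar" as the namespace names and an assumed Atom namespace URI; this shows only the name expansion, not a signature transform.

```python
import xml.etree.ElementTree as ET

ENTRY_A = (
    '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:a="foo" xmlns:b="bar">'
    "<a:foo/><b:bar/></entry>"
)
ENTRY_B = (
    '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:b="foo" xmlns:a="bar">'
    "<b:foo/><a:bar/></entry>"
)


def expanded_names(entry_xml):
    """Element names with prefixes replaced by their full namespace names."""
    return [element.tag for element in ET.fromstring(entry_xml).iter()]


print(expanded_names(ENTRY_A))
print(expanded_names(ENTRY_A) == expanded_names(ENTRY_B))
```

The prefixes a and b mean opposite things in the two entries, but the expanded names are identical, which is the property a prefix-independent signature would need.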




Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt

2005-06-29 Thread Antone Roundy


On Wednesday, June 29, 2005, at 06:50  PM, A. Pagaltzis wrote:

My first thought upon reading the draft was what I assume is
what Stefan Eissing said: I would rather have a single,
entry-less “archive hub” feed which contains “prev” links to
*all* previous instances
For an active feed, that document could easily grow till it was larger 
than many feed instances.  I prefer the chain of instances method.



, leading to a setup like

http://example.com/latest
└─ http://example.com/archive/feed/
├─ http://example.com/archive/feed/2005/05/
├─ http://example.com/archive/feed/2005/04/
├─ http://example.com/archive/feed/2005/03/
├─ http://example.com/archive/feed/2005/02/
└─ http://example.com/archive/feed/2005/01/
I don't quite get what the hub feed would look like.  Could you show 
us some XML?



I don’t see anything in the draft that would preclude this use,
and as far as I can tell, aggregators which support the draft
should have no trouble handling this scenario correctly.
The draft doesn't explicitly say that a feed can only contain one 
"prev" link, but I find it hard to read "a" to mean "one or more" in 
'and optionally a Link element with the relation "prev"'.



Again, I don’t see anything in the draft that would preclude
this use, and as far as I can tell, aggregators which support
the draft should have no trouble handling this scenario
correctly.

...unless they expected only to find one prev link per document.


Note how the archive directory feed being static makes this
painlessly possible, while it would be a pain to achive
something similar using the paginated approach with local
“prev” links (you’d need to go back and change the previously
newest old version every time a new one was added).
I don't see why this would be any more difficult.  The paginated 
approach could easily use static documents that never need to be 
updated, as I described earlier.  I'll re-explain at the end of this 
email.



It would in fact require a “prev” link to what is actually the “next”
page.

Funnily enough, I don’t see anything in the draft that would
preclude this counterintuitive use of the “prev” link to point
to the “next” version

Could you explain what you mean by that?


I’d much rather have a single archive feed containing all
entries, and use RFC3229+feed to return partial versions of it;
That might be good for those who can support it, but many people won't 
be able to.  On the other hand, if that single feed grows to where it's 
hundreds of MB, it could cause real problems if someone requests the 
whole thing or a large portion of it.



Getting back to how to use static documents for a chain of instances, 
that could easily be done as follows. The following assumes that the 
current feed document and the archive documents will each contain 15 
entries.


The first 15 instances of the feed document do not contain a prev 
link (assuming one entry is added each time).


When the 16th entry is added, a static document is created containing 
the first 15 entries, and a prev link pointing to it is added to the 
current feed document. This link remains unchanged until the 31st entry 
is added.


When the 31st entry is added, another static document is created 
containing the 16th through 30th entries. It has a prev link pointing 
to the first static document. The current feed document's prev link is 
updated to point to the second static document, and it continues to 
point to the second static document until the 46th entry is added.


When the 46th entry is added, a third static document is created 
containing the 31st through 45th entries, etc.


If you want to reduce the number of requests required to get the entire 
history (which I don't imagine would happen often enough that it would 
necessarily be worth bothering), you could put more entries into each 
static document.  If you didn't correspondingly increase the number of 
entries in the current feed document, you'd have to update the most 
recent static document a number of times rather than only outputting it 
once as described above, but even that would only require multiple 
updates to the most recent static document at any time.
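
The batching scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Publisher` class, its dictionary-shaped documents, and the use of list indexes in place of archive URIs are all hypothetical, and entries are plain values rather than Atom elements.

```python
BATCH = 15  # entries per document, the figure assumed in the description above


class Publisher:
    """Hypothetical publisher that freezes one static document per full batch."""

    def __init__(self, batch=BATCH):
        self.batch = batch
        self.entries = []   # every entry ever published, oldest first
        self.archives = []  # static documents: written once, never updated

    def _latest_archive(self):
        # Index of the newest static document, standing in for its URI.
        return len(self.archives) - 1 if self.archives else None

    def add_entry(self, entry):
        self.entries.append(entry)
        n = len(self.entries)
        # When entry 16, 31, 46, ... arrives, freeze the batch before it.
        if n > self.batch and (n - 1) % self.batch == 0:
            start = n - 1 - self.batch
            self.archives.append({
                "entries": self.entries[start:n - 1],
                "prev": self._latest_archive(),  # older static document, if any
            })

    def current_feed(self):
        # The live document: the newest entries plus a prev link that only
        # changes when a new static document is created.
        return {
            "entries": self.entries[-self.batch:],
            "prev": self._latest_archive(),
        }
```

Nothing already written to `archives` is touched again, which is the property that makes the static-document approach cheap to serve.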




Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt

2005-06-28 Thread Antone Roundy


Let's say we are planning to keep the latest 15 entries in our stateful 
feed.  We publish the first entry, and have a feed with 1 entry in it.  
It has a this link, but no prev link.


Then we add an entry.  The old this link can't be used to point to 
the new instance of the feed, right?  Because that would violate this 
requirement:


   The value of the this link relation's href attribute MUST be a URI
   indicating a permanent location that is unique to that Feed Document
   instance; i.e., the content obtained by dereferencing that URI SHOULD
   NOT change over time.

So the new feed instance has a new this link and perhaps a prev 
link pointing to the first instance.  But maybe the prev link could 
be omitted at this point, because the this link will point to a feed 
with all the information in the original feed and then some.


Now let's say someone tries to fetch the original this feed.  The 
draft says:


   Note that publishers are not required to make all previous Feed
   Documents available.

This seems like a likely circumstance where the publisher might not 
want to bother to continue making the original instance available.  If 
that's what they decide, then what?  Do they return a 410 (gone)?  
Presumably, some will return a 404 (not found), even though 410 would 
be better.  What should a client do if it receives a 404 or 410?  Is 
there a way for them to find the new instance?  Should there be?  
(Presumably they're subscribed to the feed from a URI different than 
the one in the this link, so in this case, it's probably not such a 
big deal, but read on, and you'll see where it could become an issue).


Now let's look further down the road--we have 15 entries in the feed, 
and the latest instance has its this and maybe a prev or maybe 
not.  We add another entry.  One reasonable thing to do would be to 
continue to provide the instance with the first 15 entries and point to 
it as the prev.  Another reasonable thing to do would be to point to 
the original single-entry instance as the prev--ie. the most recent 
instance which doesn't share any entries with this one.


As time goes by, the publisher could end up providing every old 
instance, or just one old instance for each 15 entries.  The latter 
would provide for much more efficient catching up on the feed state.  
But if the in between instances are dropped, clients could easily end 
up running into dead ends (410 or 404) often when trying to catch up, 
even though there is older data available at a different URI.


Perhaps the best solution would be to have no prev for the first 15 
instances, then point to the instance with the first fifteen entries 
from each of the next 15 instances, then point to the instance with 
entries 16-30 from the next fifteen instances, etc., so that one is 
never pointing to an instance that won't continue to be provided 
(unless, for example, you only continue to provide the most recent 10 
(for example) groups of 15 entries).


If this is to be allowed, then one word ought to be changed in the 
draft (and I'd think that fleshing out some of these details would be 
very useful, though of course it wouldn't be normative):


   The value of the prev link relation's href attribute MUST be a URI
   indicating the location of the previous representation of the feed;
   i.e., the last Feed Document's this URI.

THE previous representation = A previous representation or 
something along the lines of THE previous representation in the chain 
of representations.


I'm noticing now that  i.e., the last Feed Document's this URI. 
sounds like it's disallowing the batches-of-15 method outlined above.  
If we don't wish to disallow that, that should be changed to something 
like  i.e., a previous Feed Document's this URI.


Also, I just noticed that in some places, the word representation is 
used, and in some places instance is used, apparently to mean the 
same thing.  In my opinion, instance is better.


Antone



Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt

2005-06-28 Thread Antone Roundy


Thinking a little more about this, I'm not sure what the this link 
would be used for.  The prev link seems to be doing all the work, and 
especially assuming a batches of 15 sort of model, the this link 
seems likely to end up pointing to a document that's going to disappear 
soon 14 times out of 15.


On Tuesday, June 28, 2005, at 07:05  PM, Mark Nottingham wrote:
Now let's say someone tries to fetch the original this feed.  The 
draft says:


   Note that publishers are not required to make all previous Feed
   Documents available.

This seems like a likely circumstance where the publisher might not 
want to bother to continue making the original instance available.  If 
that's what they decide, then what?  Do they return a 410 (gone)?  
Presumably, some will return a 404 (not found), even though 410 would 
be better.  What should a client do if it receives a 404 or 410?  Is 
there a way for them to find the new instance?  Should there be?  
(Presumably they're subscribed to the feed from a URI different than 
the one in the this link, so in this case, it's probably not such a 
big deal, but read on, and you'll see where it could become an  issue).


I'm not sure what you're looking for; the semantics of 404 and 410 are 
clearly defined by HTTP. If the server says it can't find it, or it's 
gone, the client is unable to reconstruct the full state of the feed, 
and SHOULD warn the user.


What I'm saying is that if instance 16 of the feed points back to 
instance 15, instance 17 to instance 16, instance 18 to instance 17, 
etc., but at some point you drop all but instance 15, instance 30, 
etc., then the links to all the instances in between are going to end 
up returning 404s or 410s.  So I'd suggest that there be no 
per-instance this link, and that the prev link be updated only when 
a new batch-of-n document is created.  Doing it that way, n-1 times out 
of n, there would be overlap between the current feed and that most 
recent batch-of-n document (but that wouldn't be a big deal), but no 
overlap between any previous batch-of-n documents, and no intermediate 
documents would disappear.
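
A client-side sketch of the same idea, for illustration only: it follows prev links, skips entries it has already collected (the overlap mentioned above), and reports a dead end instead of failing. The `fetch` callable and the dictionary-shaped documents are assumptions made for this example, not anything defined by the draft.

```python
def reconstruct(start_uri, fetch):
    """Walk prev links from start_uri and collect entries, newest first.

    fetch(uri) is a hypothetical helper returning (http_status, document),
    where document is a dict with "entries" and an optional "prev" URI.
    Returns (entries, complete); complete is False if the chain hit a
    dead end, in which case the caller should warn the user.
    """
    entries, seen_ids, visited = [], set(), set()
    complete = True
    uri = start_uri
    while uri is not None and uri not in visited:
        visited.add(uri)  # guard against prev links that loop
        status, doc = fetch(uri)
        if status != 200:
            # 404 (not found), 410 (gone), or any other failure:
            # the full state of the feed cannot be reconstructed.
            complete = False
            break
        for entry in doc["entries"]:
            if entry["id"] not in seen_ids:  # ignore overlap between documents
                seen_ids.add(entry["id"])
                entries.append(entry)
        uri = doc.get("prev")
    return entries, complete
```

With batch-of-n documents that are never dropped, `complete` stays true; with per-instance links to documents that later vanish, the walk ends early.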




Re: More on Atom XML signatures and encryption

2005-06-21 Thread Antone Roundy


On Monday, June 20, 2005, at 11:33  PM, James M Snell wrote:
OK, so given the arguments I previously posted in my response to Dan + 
the assertion that digitally signing individual entries will be 
necessary, the only real possible solution would be to come up with a 
canonicalization scheme for digitally signed Atom entries.
...or as Bob said, always including a source element in signed entries, 
even if they're in the origin feed.


The following is all academic at this point, but here's a pseudofeed of 
what I'd like to have seen...part of it only in retrospect:


<feed>
    <head><!--it's bck!-->
        [feed metadata]
        <Signature xmlns="..." /><!--the feed head is signed--the entire feed could be too, but this is for aggregation-->
    </head>

    <entry>
        [entry metadata and content]
        <feedsig><!--a copy of the feed's head's signature, so that the entry can be verifiably linked to the signed feed metadata--></feedsig>
        <Signature xmlns="..." /><!--the entry is signed, including the local copy of the feed head signature-->
    </entry>

    <entry>
        [entry metadata and content]
        <feedsig>...</feedsig>
        <Signature xmlns="..." />
    </entry>

    [etc.]
</feed>

Of course, aggregating this while preserving the signatures' validity 
would require a different aggregation model than what we've 
chosen--like what I proposed for aggregation documents. (Indentation 
added for readability--in practice, that would break the signature, 
right?):


<aggregation>
    [aggregation metadata]

    <feed>
        <head>
            [feed metadata]
            <Signature xmlns="..." />
        </head>
        <entry>
            [entry metadata and content]
            <feedsig>...</feedsig>
            <Signature xmlns="..." />
        </entry>
    </feed>

    <feed>
        [etc.]
    </feed>

    [etc.]
</aggregation>
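
On the parenthetical question about indentation: a signature covers a digest of the serialized (canonicalized) bytes, and to my understanding Canonical XML keeps whitespace that appears inside element content, so re-indenting a signed element would change what is digested. The sketch below is only a stand-in that hashes raw bytes with SHA-256; it is not XML Signature processing, and the sample markup is made up.

```python
import hashlib

# The same made-up entry serialized compactly and with added indentation.
compact = b"<entry><title>Example</title></entry>"
indented = b"<entry>\n  <title>Example</title>\n</entry>"


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of the exact bytes given."""
    return hashlib.sha256(data).hexdigest()


# The added whitespace changes the bytes, so the digests no longer match.
same_digest = digest(compact) == digest(indented)
```

A signature computed over the compact form would therefore not verify against the indented form unless a transform removed the added whitespace first.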



Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-18 Thread Antone Roundy


On Saturday, June 18, 2005, at 01:36  PM, Graham wrote:

On 17 Jun 2005, at 6:14 pm, Tim Bray wrote:
Uh, has Mark spotted a dumb bug here that we should fix?  Do we care 
if *remote* content is of a composite MIME type?  My feeling was that 
we ruled out composite types in *local* content, for fairly obvious 
reasons.  The fix is obvious, in 4.1.3.1
I would have no objection to this, since the spec already creates the 
expectation that remote content will be less widely supported than 
local content.


The better way to do this is to use atom:link rel=alternate to 
reference the messages.
This is certainly a better solution than multipart local content, and 
I would hope that people would do remote content this way too unless they 
have a really good reason for multipart remote content.  But I could 
live with allowing multipart remote content if it's really needed in 
some case.




Re: Polling Sucks! (was RE: Atom feed synchronization)

2005-06-17 Thread Antone Roundy


On Friday, June 17, 2005, at 12:32  PM, Bob Wyman wrote:

This is *not* simpler than taking a push feed using Atom over XMPP.
For a push feed, all you do is:
1. Open a socket
2. Send a login XML Stanza
3. Process the stanzas as they arrive.

...

For your solution, you need to:
1. Poll the feed to get a pointer to the first link. (each poll
will cost you a TCP/IP connection).
2. If you got a new first link then go to step 5
3. Wait some period of time (the polling interval)
4. GoTo Step 1
5. Open a new TCP/IP socket to get the next link
6. Form and send an HTTP request for the next entry
7. Catch the response from the server
8. Parse the response to determine if its time stamp is something
you've already seen.
9. If you haven't seen the current entry before, then go to step 5
10. Go to step 1 to start over.


Not to get into a big argument (each method has its advantages 
depending on circumstances), but allow me to revise the above a little. 
 The following assumes applications that attempt to keep you up-to-date 
on changes to the feed that occurred while you were offline:


XMPP:
1. Open a socket
2. Request and get the feed
3. Parse the XML
4. Process the entries (Determine whether each is new/updated or 
not--if so, do the appropriate thing)

5. If the feed had entries that were old and not updated, go to step 7
6. If the feed has a first or next or whatever link, go to step 1 
using that link

7. Open a socket
8. Send login XML stanza
9. Wait for a stanza (sending keep-alive packets periodically), and 
when it arrives...

10. Parse the XML
11. Process it (Determine whether the entry is new/updated or not and 
do the appropriate thing)

12. Go to step 9

Polling:
1. Open a socket
2. Request and get the feed
3. Parse the XML
4. Process the entries (Determine whether the entry is new/updated or 
not and do the appropriate thing)

5. If the feed had entries that were old and not updated, go to step 7
6. If the feed has a first or next or whatever link, go to step 1 
using that link

7. Wait some period of time
8. Go to step 1

The XMPP app will need to contain a superset of the polling app's code. 
My assessment of which method wins on various issues:


Latency: XMPP
Implementation complexity: Polling
Bandwidth consumption: XMPP
Resource consumption between polls or pushes: Polling
Getting all feed changes while online: XMPP if you're trying to archive 
the feed, otherwise no difference

Getting feed changes that occurred while offline: no difference



If we're not concerned about ensuring that we get all changes, the 
story is different:


XMPP:
1. Open a socket
2. Send login XML stanza
3. Wait for a stanza (sending keep-alive packets periodically), and 
when it arrives...

4. Parse the XML
5. Process it (Determine whether the entry is new/updated or not and do 
the appropriate thing)

6. Go to step 3

Polling:
1. Open a socket
2. Request and get the feed
3. Parse the XML
4. Process the entries (Determine whether the entry is new/updated or 
not and do the appropriate thing)

5. Wait some period of time
6. Go to step 1

My assessment:

Latency: XMPP
Implementation complexity: similar
Bandwidth consumption: XMPP
Resource consumption between polls or pushes: Polling
Getting all feed changes while online: XMPP
Getting feed changes that occurred while offline: Polling

XMPP could achieve parity in getting feed changes that occurred while 
offline, at the expense of implementation complexity parity, by polling 
the feed once upon startup.
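
The shared part of both approaches, deciding whether each entry is new or updated and stopping the catch-up once old entries appear, might look roughly like this. It is a sketch under assumptions: entries are dicts with "id" and "updated" keys, timestamps are strings in one uniform UTC format so that string comparison orders them, and `fetch` and the "next" key are hypothetical.

```python
def process_feed(entries, seen):
    """Classify entries; seen maps atom:id -> last atom:updated value observed."""
    new, updated, unchanged = [], [], []
    for entry in entries:
        eid, stamp = entry["id"], entry["updated"]
        if eid not in seen:
            new.append(eid)
        elif stamp > seen[eid]:
            updated.append(eid)
        else:
            unchanged.append(eid)
            continue
        seen[eid] = stamp  # remember the newest version we have handled
    return new, updated, unchanged


def catch_up(fetch, uri, seen):
    """Follow 'next' links until a document contains entries already seen."""
    changed = []
    while uri is not None:
        doc = fetch(uri)  # hypothetical: returns a parsed feed document
        new, updated, unchanged = process_feed(doc["entries"], seen)
        changed.extend(new + updated)
        if unchanged:  # reached entries that were old and not updated
            break
        uri = doc.get("next")
    return changed
```

A polling client would call `catch_up` on a timer; a push client would call it once at startup and then run `process_feed` on each stanza as it arrives.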




Re: I-D ACTION:draft-ietf-atompub-format-09.txt

2005-06-08 Thread Antone Roundy



4.1.1:
o  atom:feed elements MUST NOT contain more than one atom:image
  element.

Should be atom:logo.

4.1.2 says:

   o  atom:entry elements MUST NOT contain more than one atom:link
  element with a rel attribute value of alternate that has the
  same combination of type and hreflang attribute values.

4.1.1 says:

   o  atom:feed elements MUST NOT contain more than one atom:link
  element with a rel attribute value of alternate that has the
  same type attribute value.

Should 4.1.1 also mention hreflang?
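
For illustration, the 4.1.2 rule could be checked with something like the following sketch. The function name is made up, the namespace used is the one from the final RFC 4287 (an assumption here, since drafts may differ), and a link with no rel attribute is treated as an alternate link.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"  # assumed namespace (RFC 4287)


def duplicate_alternates(element):
    """Return (type, hreflang) pairs used by more than one alternate link child."""
    counts = Counter(
        (link.get("type"), link.get("hreflang"))
        for link in element.findall(ATOM + "link")
        if link.get("rel", "alternate") == "alternate"  # missing rel counts as alternate
    )
    return [key for key, n in counts.items() if n > 1]
```

Applying the 4.1.1 wording as quoted would mean keying on type alone, which is the discrepancy being asked about.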

4.1.2 puts this in a separate bullet, but 4.1.1 does not:

   o  atom:entry elements MAY contain additional atom:link elements
  beyond those described above.


Nit pick: 4.1.2 says:

   o  atom:entry elements MUST have exactly one atom:title element.

   o  atom:entry elements MUST contain exactly one atom:updated 
element.

Do we want to be consistent in saying contain i/o have?


4.1.3.2  The src attribute

   atom:content MAY have a src attribute, whose value MUST be an IRI
   reference [RFC3987].  If the src attribute is present, atom:content
   MUST be empty.  Atom Processors MAY use the IRI to retrieve the
   content, and MAY NOT process or present remote content in the same
   manner as local content.
It took me a second to realize that "MAY NOT" means "don't have to" 
rather than "aren't allowed to".  The technical meaning of the terms is 
perfectly clear, but it's quite different from the usual meaning of 
those words, and may be misunderstood.  It might be better to say "Atom 
Processors MAY use the IRI to retrieve the content, and MAY process or 
present remote content in a different manner from local content."


Appendix A.  Contributors doesn't appear to have been updated to add 
more names.




Re: PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)

2005-05-26 Thread Antone Roundy


On Wednesday, May 25, 2005, at 06:14  PM, James M Snell wrote:
Ignoring the overhead that it adds for now, isn't this the kind of 
situation digital signatures are designed to handle?
Sure, but how many publishers are going to be using digital signatures 
in the near term (and more importantly, how many aren't?), and who 
knows how many consuming applications will support them.  Until digital 
signatures start providing more help with this kind of thing, let's 
provide a warning to developers so that they can at least consider what 
they might do to safeguard the quality of their users' experience.


And I just thought of another thing (I don't know how digital 
signatures work in this case, so I may be missing something, but I'm 
pretty sure the following is at least partially valid): if I get an 
entry with a valid digital signature and one with no signature (both 
with the same atom:id, of course), then what?  Do I always accept the 
one with the signature?  If so, then DOSing/spoofing unsigned entries 
will be even easier, because all you'd have to do is sign your fake 
entry.  So even in that case, some extra checking might have to be done 
before concluding that the entries are duplicates, and that the 
unsigned one is the one that's disposable.


Without any kind of cryptographic guarantee of this sort, the best you 
could do is make an educated guess.
Wouldn't that be better than nothing until digital signatures become 
more ubiquitous?



Would it make sense to include some language along these lines?

Sure.



Re: PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)

2005-05-26 Thread Antone Roundy


On Thursday, May 26, 2005, at 08:04  AM, A. Pagaltzis wrote:

* Graham [EMAIL PROTECTED] [2005-05-25 23:00]:

How is this a Denial of service attack? Isn't it just
ordinary spoofing/impersonation?


Indeed; I’d like to see this reworded to refer to spoofing, as
that’s what it is.


I presume the specific wording can be left to the discretion of the 
editors.




Re: PaceDuplicateIdsEntryOrigin posted (was Re: Consensus snapshot, 2005/05/25)

2005-05-26 Thread Antone Roundy


On Wednesday, May 25, 2005, at 01:06  PM, Antone Roundy wrote:

== Abstract ==

State that atom:entries from the same feed with the same ID are the 
same entry, whether simultaneously in the feed document or not.


I'm retracting this proposal in preference for PaceAtomIdDos, which I 
like better and is getting more support.




Re: Consensus snapshot, 2005/05/25

2005-05-25 Thread Antone Roundy


On Wednesday, May 25, 2005, at 12:03  PM, Tim Bray wrote:
The level of traffic in recent days has been ferocious, and reading 
through it, we observe the WG has consensus on changing the format 
draft in a surprisingly small number of areas.  Here they are:


All looks good (or at least entirely acceptable) to me.  One question 
though:


3. Change to previous consensus call.  The phrase that begins If 
multiple atom:entry elements with the same atom:id value appear in an 
Atom Feed document, they describe the same entry... loses the 
language about how software MUST treat them as such.


A few people appeared to support[1][2] this[0]:

* State that multiple entries originating in the same feed with the 
same atom:id are instances of the same entry [yes, they're SUPPOSED to 
be, even REQUIRED to be universally unique, but let's live in the 
real world]


...but there was no Pace written (oops), and little or no comment 
directed specifically toward this detail, either for or against.  This 
wording got no response when suggested[3] two days ago:


If multiple atom:entry elements originating in the same Atom feed have 
the same atom:id value, whether they exist simultaneously in one 
document or in different instances of the feed document, they describe 
the same entry.


I'm going to write a Pace right now, in case that will make any 
difference.  Comments?


Antone

[0] http://www.imc.org/atom-syntax/mail-archive/msg15517.html
[1] http://www.imc.org/atom-syntax/mail-archive/msg15518.html
[2] http://www.imc.org/atom-syntax/mail-archive/msg15526.html
[3] http://www.imc.org/atom-syntax/mail-archive/msg15644.html



PaceDuplicateIdsEntryOrigin posted (was Re: Consensus snapshot, 2005/05/25)

2005-05-25 Thread Antone Roundy


On Wednesday, May 25, 2005, at 12:35  PM, Antone Roundy wrote:
I'm going to write a Pace right now, in case that will make any 
difference.
Here it is--now comments on that particular detail can be directed at a 
proper Pace:


http://www.intertwingly.net/wiki/pie/PaceDuplicateIdsEntryOrigin

== Abstract ==

State that atom:entries from the same feed with the same ID are the same 
entry, whether simultaneously in the feed document or not.


== Status ==

New

== Rationale ==

 * The accepted language for allowing duplicate IDs in a feed document 
speaks only of multiple atom:entry elements with the same atom:id 
describing the same entry if they exist in the same document--of 
course, we intend for them to describe the same entry whether they're 
simultaneously in the feed document or not
 * The accepted language does not speak of the origin feed of the 
entries. Ideally, an atom:id should be universally unique to one entry 
resource, and we rightly require publishers to mint them with that 
goal. However, in reality, malicious or undereducated publishers might 
duplicate the IDs of others. Therefore, it is proposed to modify the 
specification to state that the atom:entry elements describe the same 
entry (resource) if they originate in the same feed.
 * Aggregators wishing to protect against DOS attacks are not unlikely 
to perform some sort of safety checks to detect malicious atom:id 
duplication, regardless of whether the specification authorizes them 
to or not.


== Proposal ==

in format-08:

1. Remove this bullet point from 4.1.1:

atom:feed elements MUST NOT contain atom:entry elements with identical 
atom:id values.


2. Add the following paragraph, either to atom:entry or atom:feed, at 
the editors' discretion (instead of the first sentence proposed by 
PaceAllowDuplicateIDs, if accepted):


If multiple atom:entry elements originating in the same Atom feed have 
the same atom:id value, whether they exist simultaneously in one 
document or in different instances of the feed document, they describe 
the same entry.


== Impacts ==

 * Aggregators wishing to both perform duplicate detection and protect 
against DOS attacks will be justified by the specification in applying 
their judgement regarding whether entries with the same atom:id come 
from the same source or not.


== Notes ==

 * Because we are unlikely to agree on a method for determining whether 
the atom:entry elements originate in the same feed or not, no 
particular method will be specified.
 * The proposed language does not preclude the possibility of 
aggregators applying their own judgement regarding whether two 
atom:entry elements with the same atom:id which originate in different 
feeds might describe the same entry resource, which they might if 
someone posts the same entry to, for example, a category feed and a 
feed of all their categories, and doesn't present one as having been 
aggregated from the other by including an atom:source element.




PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)

2005-05-25 Thread Antone Roundy


On Wednesday, May 25, 2005, at 01:20  PM, Graham wrote:

On 25 May 2005, at 7:35 pm, Antone Roundy wrote:
If multiple atom:entry elements originating in the same Atom feed 
have the same atom:id value, whether they exist simultaneously in one 
document or in different instances of the feed document, they 
describe the same entry.


What about when they don't? I don't see any value here. A line saying 
that when two matching entry ids found in different feeds is fine, but 
(apparently) saying it's completely meaningless goes way, way too far.


In my grand tradition (...I'm sure I've done this before), I've posted 
an alternative to my own proposal.  The following would legitimize 
considering more than just atom:id in doing duplicate detection in 
order to protect against DOS, but without risking anyone thinking we've 
weakened the requirement for universal uniqueness of atom:id.  I'd vote 
for this over PaceDuplicateIdsEntryOrigin, and 
PaceDuplicateIdsEntryOrigin over no change.


http://www.intertwingly.net/wiki/pie/PaceAtomIdDos

== Abstract ==

Point out the potential for denial of service by duplicating others' 
atom:id values.


== Status ==

New

== Rationale ==

 * We want atom:id to be universally unique to a particular entry 
resource.
 * However, depending on such uniqueness could lead to denial of 
service attacks where the attacker publishes an entry with an atom:id 
value used by someone else.
 * Restricting the uniqueness scope of atom:id entirely to a single 
feed would make it much less valuable, since entries are often copied 
from feed to feed, and sometimes simultaneously published in multiple 
feeds.
 * Only requiring entries with the same atom:id to be considered the 
same if coming from the same feed, but allowing the consuming 
application to exercise judgement with respect to entries originating 
in different feeds is a much better match with reality.
 * Still, pointing out the potential for DOS attacks in the Security 
Considerations section may be preferable to loosening the scope of 
atom:id uniqueness elsewhere in the spec in either of the ways described 
by the preceding bullet points.


== Proposal ==

Add the following to format-08:

8.5 Denial of Service Attacks

Atom Processors should be aware of the potential for denial of service 
attacks where the attacker publishes an atom:entry with the atom:id 
value of an entry from another feed, and perhaps with a falsified 
atom:source element duplicating the atom:id of the other feed. Atom 
Processors which, for example, suppress display of duplicate entries by 
displaying only one entry with a particular atom:id value or 
combination of atom:id and atom:updated values, might also take steps 
to determine whether the entries originated from the same publisher 
before considering them to be duplicates.
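
One way a processor might act on this, sketched under assumptions: entries are dicts, timestamps are uniformly formatted strings, and "origin" is a hypothetical field recording where the processor itself fetched the entry from (not the entry's own atom:source, which could be falsified).

```python
def deduplicate(entries):
    """Keep the newest entry per (origin, atom:id) rather than per atom:id alone."""
    latest = {}
    for entry in entries:
        # "origin" is assumed to be set by the processor at fetch time.
        key = (entry["origin"], entry["id"])
        kept = latest.get(key)
        if kept is None or entry["updated"] > kept["updated"]:
            latest[key] = entry
    return list(latest.values())
```

An entry that reuses someone else's atom:id but arrives from a different origin is then kept alongside the original instead of replacing it; how origin is actually established is left open, as in the proposal.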




Re: Semantics and Belief contexts - was: PaceDuplicateIdsEntryOrigin posted

2005-05-25 Thread Antone Roundy


On Wednesday, May 25, 2005, at 02:26  PM, Henry Story wrote:
Since the referents of Superman and Clark Kent are the same, what 
is true of the one,
is true of the other. When speaking directly about the world, we can 
replace any occurrence

of Superman with Clark Kent, and still say something true.
Clark Kent is the secret identity of Superman. - Superman is the 
secret identity of Superman.  Whether they're perfectly 
interchangeable or not depends on whether the name is referring to the 
object or some facet of the object.  The second sentence actually 
works if the first Superman refers to the persona, and the second to 
the person.  But getting back to Atom...



Autistic children have great difficulty understanding the difference
between what is and how people perceive things to be.
They sure don't have a monopoly on this!  Really getting back to 
Atom!...



So to prevent a DOS attack, best is to have aggregator feeds such as:

<feed>
  <!-- aggregator feed -->
  <feed src="http://true.org">
    <id>tag://true.org,2005/feed1</id>
    <entry>
      <title>Enter your credit card number here</title>
      ...
    </entry>
  </feed>
  <feed src="http://false.org">
    <id>tag://true.org,2005/feed1</id>
    <entry>
      <title>Enter your credit card number here</title>
      ...
    </entry>
  </feed>
</feed>

Here all the aggregator feed is claiming is that he has seen entries 
inside other

feeds.

...
It will be up to the consumer of such aggregated feeds to decide which 
to trust.


From the end user's point of view, it's not much different.  Somebody 
still has to make the decision, and the end user doesn't want to be the 
one doing it--they want the super aggregator or their feed reader or 
somebody else to do it for them.  The feed reader should be doing it 
anyway, since they won't be getting all of their data through a super 
aggregator.  But the super aggregator is likely to want to do it too, 
both to reduce how much data they forward to their clients, and because 
many feed readers aren't going to do it very well, so handling part of 
the job for them will improve the end user's experience.


I'm not a fan of feeds of feeds (though I have proposed and still like 
a one-level embedding of feeds in a different top-level element).  
Plus, I think it's inconceivable that the WG would make this drastic a 
change at this point.  Let's focus on doing what's actually possible, 
given the WG schedule and temperament, to mitigate this problem.




Re: PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)

2005-05-25 Thread Antone Roundy


On Wednesday, May 25, 2005, at 02:49  PM, Graham wrote:

On 25 May 2005, at 9:01 pm, Antone Roundy wrote:

8.5 Denial of Service Attacks

Atom Processors should be aware of the potential for denial of 
service attacks where the attacker publishes an atom:entry with the 
atom:id value of an entry from another feed, and perhaps with a 
falsified atom:source element duplicating the atom:id of the other 
feed. Atom Processors which, for example, suppress display of 
duplicate entries by displaying only one entry with a particular 
atom:id value or combination of atom:id and atom:updated values, 
might also take steps to determine whether the entries originated 
from the same publisher before considering them to be duplicates.


How is this a Denial of service attack? Isn't it just ordinary 
spoofing/impersonation?


Apart from that, +1.


I don't particularly care whether we call it a DOS or something else, 
as long as we point it out and give implementers something to point to 
if asked why they're not simply accepting atom:id at face value.


But is it not potentially a DOS?  The Good Guy publishes an entry.  The 
Bad Guy copies the atom:id of that entry into an entry with different 
content, gives it a later atom:updated, and publishes it.  The 
aggregator stops publishing/displaying the Good Guy's entry and instead 
publishes/displays the Bad Guy's entry.  Thus, the subscriber doesn't 
see the Good Guy's entry (unless they saw it before it was replaced).


But you're also right--if they saw it before it was replaced and then, 
when they see the updated version, they think it was updated by The 
Good Guy, it becomes a spoof/impersonation.  Perhaps we should say 
"Denial of Service and Spoofing Attacks" and "...potential for denial 
of service and spoofing attacks..."?  How that's worded doesn't really 
matter to me.



