Re: Feed History -02
On 18/07/2005, at 2:17 PM, Stefan Eissing wrote:

On a more semantic issue: The described sync algorithm will work. In most scenarios the abort condition (e.g. all items on a historical feed are known) will also do the job. However, this still means that clients need to check the first fh:prev document to see whether they know all entries there, if my understanding is correct.

This is one of the unanswered questions that I left out of scope. The consumer can examine the previous archive's URI and decide whether it has seen it before, and therefore avoid fetching it again. However, with this approach, it won't see changes that are made in the archive (e.g., if a revision -- even a spelling correction -- is made to an old entry); to do that it either has to walk back the *entire* archive each time, or the feed has to publish all changes -- even to old entries -- at the head of the feed.

I left it out because it has more to do with questions about entry deletion and ordering than with recovering state. It's an arbitrary decision (I had language about this in the original Pace I made), but it seemed like a good trade-off between complexity and capability. Does that make sense, or am I way off-base?

Is it worth thinking of something to spare clients and servers this lookup? Are the HTTP caching and If-* header mechanisms good enough to save network bandwidth? An alternate strategy would be to require that fh:prev documents never change once created. Then a client can terminate the sync once it sees a URI it already knows. And most clients would not do more lookups than they are doing now...

-- Mark Nottingham http://www.mnot.net/
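The stop-at-a-known-URI strategy discussed above can be sketched roughly as follows. This is only an illustration, not text from the draft: the `sync` function, the `fetch` callback and the `(entries, prev)` tuple shape are assumptions made for the example. As noted in the thread, it will miss later edits to archives it has already seen unless archive documents never change.

```python
from typing import Callable, List, Optional, Set, Tuple

# A fetched feed document reduced to what the sync needs:
# (entry ids in the document, URI of the previous archive or None).
# This shape is an assumption for the sketch, not part of the draft.
FeedDoc = Tuple[List[str], Optional[str]]


def sync(head_uri: str,
         fetch: Callable[[str], FeedDoc],
         seen_archives: Set[str]) -> List[str]:
    """Walk fh:prev links back from the head until a known archive URI appears."""
    entries: List[str] = []
    # The head document is always fetched, since it changes as entries are added.
    doc_entries, prev = fetch(head_uri)
    entries.extend(doc_entries)
    # Stop when there is no previous archive or when its URI is already known.
    while prev is not None and prev not in seen_archives:
        seen_archives.add(prev)
        doc_entries, prev = fetch(prev)
        entries.extend(doc_entries)
    return entries
```

On a first run with an empty `seen_archives` set the whole chain is walked; on later runs only the head is fetched as long as its fh:prev URI is already known.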
Re: Feed History -02
On 18/07/2005, at 1:29 PM, Stefan Eissing wrote:

I agree that special URIs are not that great either. Another idea might be nested elements:

stateful feed: <fh:history><fh:prev>http://example.org/thingie1.1/</fh:prev></fh:history>
stateful initial feed: <fh:history/>
stateless feed: <fh:history><fh:none/></fh:history>

Hmm. My thinking was that allowing stateful to be omitted would be concise and unambiguous; to compare,

stateful feed: <fh:prev>http://example.org/thingie1.1/</fh:prev>
stateful initial feed: <fh:stateful>true</fh:stateful>
stateless feed: <fh:stateful>false</fh:stateful>

-- Mark Nottingham http://www.mnot.net/
Feed History -02
The Feed History draft has been updated to -02:

http://ftp.ietf.org/internet-drafts/draft-nottingham-atompub-feed-history-02.txt

The most noticeable change in this version is the inclusion of a namespace URI, to allow implementation. I don't intend to update it for a while, so as to gather implementation feedback.

Cheers,

-- Mark Nottingham http://www.mnot.net/
Re: Atom for RDF transfer?
Hi Danny,

What does Atom bring to the table here? Sounds like straight REST would do the trick admirably for you...

On 09/07/2005, at 3:18 AM, Danny Ayers wrote:

While everyone's waiting for ratification of the Atom format, maybe there are a few brain-cycles that can be harnessed...

What I (and others [1]) am looking for is a standard means of interfacing with an RDF store like Redland or Jena over HTTP. The operations required are:

1. query the store
2. add statements to the store
3. delete statements from the store
4. make an update to the store
(5. sync stores)

The first of these is reasonably well provided for already with SPARQL. There appears to be some commonality of approaches to 2, 3 and 4, but no single standard seems yet to have emerged. 5. is something needed sometime before long, but could probably be fulfilled with the other points and an appropriate algorithm (possible approaches at [4],[5]).

Now ideally all the operations would be done directly over HTTP, but there are one or two issues, and I'm wondering if the Atom format/protocol could form a consistent, easily implemented lightweight delivery mechanism as an alternative. It'd be tunneling, but without all the WS-* overhead. The interchange format specified for RDF is XML (i.e. RDF/XML) so that's the first hurdle hopped.

This kind of thing is being covered to some extent by the W3C Data Access WG (DAWG) and Semantic Web Best Practices and Deployment WG, but they are tied to their charters. A good solution from left-field (built on good standards) could fast-forward development.

So first the status quo, as far as I'm aware:

1. ask the store a query

The SPARQL protocol and RDF query language [2] is emerging as the standard for queries, i.e. read-only operations. The query language itself is fairly SQL-like. The protocol as it currently stands has a generic WSDL 2.0 expression, but in practice *the* binding so far is to HTTP, using the query itself as a parameter in a GET:

GET http://example.com/sparqlendpoint?query=...bunch of sparql...

The results are returned as an XML doc in the response body, the format of which will depend on the nature of the query (there's a simple result-set format for SELECTs, RDF/XML for CONSTRUCT etc).

The operations of 2, 3 and 4 can be covered by a protocol in a similar fashion: by supplying an RDF graph to add, a graph to delete, or a combined operation for update supplying a graph to delete followed by a graph to add. I'll just expand that a little before describing existing protocol support -

2. add statements to the store

Data can be added to an RDF store by supplying a list of statements, that is the graph, as an RDF/XML doc. This is something all triplestores should support.

3. delete statements from the store

This is a little trickier: there isn't any operation common to stores that says delete(graph). In practice this may mean listing and deleting the individual statements. I think there may be issues where the graph to delete matches a subgraph in the store where the nodes aren't sufficiently bound to URIs to make matching unambiguous. Frankly I'm not sure, but I don't think it would impact on the protocol.

4. make an update to the store

A two-phase operation, delete(graphA), add(graphB). It would be nice for this to happen as an atomic transaction.

Ok, existing protocols/proposals include the NetAPI [6,7], and Joseki: RDF WebAPI [8] (I think this is currently moving over to SPARQL for queries, not sure about updates). The NetAPI uses a two-part mime/multipart message with a HTTP POST, the first part containing the graph to delete, the second containing the graph to add. Adding and deleting alone are special cases. (I seem to remember there being some issues relating to the use of mime/multipart around Atom; I can't for the life of me remember what they were.)

There's also URIQA, which can provide the operations listed, but is more aligned towards working with authoritative sources, i.e. where the host owns the resources identified. Technically it looks good, but involves the addition of extra HTTP methods, which has caused some controversy.

So how might all this be done in Atom? I don't really know, beyond thinking perhaps that many of the interfacing operations with a triplestore may be expressed nicely as a sequence of entries, content as graphs, each entry representing an add/delete operation.

Cheers,
Danny.

[1] http://lists.gnomehack.com/pipermail/redland-dev/2005-July/001019.html
[2] http://www.w3.org/TR/rdf-sparql-query/
[3] http://www.w3.org/TR/rdf-sparql-protocol/
[4] http://www.w3.org/DesignIssues/Diff
[5] http://www.dbin.org/
[6] http://www.w3.org/Submission/2003/SUBM-rdf-netapi-20031002/
[7] http://www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi/tutorial/netapi.html
[8] http://www.joseki.org/protocol.html
[9] http://sw.nokia.com/uriqa/URIQA.html

-- http://dannyayers.com

-- Mark Nottingham http://www.mnot.net/
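As a rough illustration of operations 1 to 4 in the quoted message, the sketch below builds the query-in-a-GET-parameter URL and models add, delete and update as set operations on triples. The `TripleStore` class and the function names are hypothetical and do not correspond to Redland, Jena, NetAPI or Joseki APIs. It also assumes fully ground triples, so it sidesteps the blank-node matching problem mentioned under point 3.

```python
from typing import Iterable, Set, Tuple
from urllib.parse import urlencode

Triple = Tuple[str, str, str]  # (subject, predicate, object), all ground terms


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a GET URL carrying the query as the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


class TripleStore:
    """Toy in-memory store; a hypothetical stand-in for a real RDF store."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, graph: Iterable[Triple]) -> None:
        self.triples |= set(graph)

    def delete(self, graph: Iterable[Triple]) -> None:
        self.triples -= set(graph)

    def update(self, to_delete: Iterable[Triple], to_add: Iterable[Triple]) -> None:
        # Compute the new state first and swap it in with one assignment,
        # so the store never holds the half-applied (deleted-only) state.
        self.triples = (self.triples - set(to_delete)) | set(to_add)
```

Treating add and delete as special cases of update (one of the two graphs empty) mirrors the two-part NetAPI message described above.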
Re: format-10 draft editorial request
+1

It may be a good idea to change the descriptive text about the examples; e.g.,

A brief, single-entry Atom Feed Document -> A single-entry Atom Feed Document that conforms to all MUST-level requirements

A more extensive, single-entry Atom Feed Document -> A single-entry Atom Feed Document that conforms to all MUST- and SHOULD-level requirements

On 08/07/2005, at 6:14 PM, Sam Ruby wrote:

I truly believe that this request is completely editorial. I would like to request that an attempt be made to ensure that the second, more extensive example in section 1.1 complies with all SHOULDs. In particular, the feed SHOULD contain a link with a rel='self'. If I find other deviations from the recommended practices, I'll note them here.

- Sam Ruby

-- Mark Nottingham Principal Technologist Office of the CTO BEA Systems

-- Mark Nottingham http://www.mnot.net/
Re: Major backtracking on canonicalization
We should go into a little more detail.

Are we specifying exclusive c14n with or without comments? My preference would be without.

As I understand it, inherited xml:lang and xml:base attributes aren't signed when you're using exclusive c14n. If we end up allowing per-entry signatures, we need to give guidance that xml:lang and xml:base should be explicitly included in the signed content if they are important.

It may be helpful to give guidance about the usage of the InclusiveNamespaces PrefixList, especially with default namespaces.

It also might be good to give guidance to extension authors and publishers about use of namespaces that aren't visibly bound; e.g., QNames in content. More information at: http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#sec-Limitations

Another good reference is the WS-I Basic Security Profile: http://www.ws-i.org/Profiles/BasicSecurityProfile-1.0.html#xmlSignatureAlgorithms

On 06/07/2005, at 6:28 PM, Paul Hoffman wrote:

Greetings again. I gravely misunderstood XML Canonicalization, and as it has been explained to me now, XML Canonicalization would be a disaster for Atom: what we want is Exclusive XML Canonicalization. See http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/.

What I didn't get was that in normal XML Canonicalization, the canonicalized version gets all the external definitions added as text; that doesn't happen in Exclusive XML Canonicalization. I thought that in normal XML Canonicalization, those definitions got assumed; I didn't realize that they got actually put in as text. Yuck. (I cannot understand how the folks who put together XMLDigSig could allow normal XML Canonicalization to be even thought of, much less the only required form. What a mess.)

Now that I understand this better, I believe that our text should read:

[[ NEW ]] Section 6.5.1 of [W3C.REC-xmldsig-core-20020212] requires support for Canonical XML.
However, many people believe that Canonical XML may be deprecated in the future, and many implementers do not use it because signed XML documents enclosed in other XML documents have their signatures broken. Thus, Atom Processors that verify signed Atom Documents MUST be able to canonicalize with Exclusive XML Canonicalization. Does anyone object to that? --Paul Hoffman, Director --Internet Mail Consortium -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
Re: Major backtracking on canonicalization
On 07/07/2005, at 11:36 AM, Paul Hoffman wrote:

At 10:23 AM -0400 7/7/05, Mark Nottingham wrote:

Are we specifying exclusive c14n with or without comments? My preference would be without.

Without. That is explicitly the default for http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/.

Where does it state that explicitly? There are two identifiers in section four; it would be best to reference the spec and the applicable identifier by name.

As I understand it, inherited xml:lang and xml:base attributes aren't signed when you're using exclusive c14n. If we ended up allowing per-entry signatures, we need to give guidance that xml:lang and xml:base should be explicitly included in the signed content if they are important.

Why? We are signing bags of bits. Why add those from the outside?

Exclusive canonicalisation itself says, in section 5.1:

applications must carefully specify the XML (i.e., source, fragment, and target) or define the node-set processing (i.e., removal, replacement, and insertion) with respect to default namespace declarations (e.g., xmlns="") and XML attributes (e.g., xml:lang, xml:space, and xml:base).

Imagine that you sign an entry that relies on a feed-level xml:base of http://www.example.com/. If you exclusively canonicalise, an attacker could introduce an xml:base of http://www.evil-attacker.net/ into the signed entry without invalidating the signature, effectively rewriting all of the URIs inside it. Similar problems would occur with xml:lang, since we depend heavily on it (although I see xml:base as a more serious problem).

It may be helpful to give guidance about the usage of the InclusiveNamespaces PrefixList, especially with default namespaces.

The whole purpose of using exclusive XML is to not need to guess about what is and is not in the bag of bits being hashed.

Right. People need to understand the implications of including or excluding particular pieces of information from that bag, as per above.

-- Mark Nottingham http://www.mnot.net/
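The xml:base concern can be made concrete with plain relative-URI resolution. The sketch below does not perform canonicalization or signing; it only shows that the same relative reference resolves to a different absolute URI when the base in scope changes. The relative path and the `resolve` helper are made up for the example; the two base URIs are the ones used in the message.

```python
from urllib.parse import urljoin

# A relative link as it might appear inside a signed entry (hypothetical value).
SIGNED_RELATIVE_LINK = "2005/07/entry.html"

# Base inherited from the feed element, which is outside the signed entry.
FEED_BASE = "http://www.example.com/"
# Base an attacker might introduce outside the signed content.
ATTACKER_BASE = "http://www.evil-attacker.net/"


def resolve(base: str, reference: str) -> str:
    """Resolve a relative reference against the xml:base in scope."""
    return urljoin(base, reference)


intended = resolve(FEED_BASE, SIGNED_RELATIVE_LINK)
rewritten = resolve(ATTACKER_BASE, SIGNED_RELATIVE_LINK)
# The signed text of the entry is identical in both cases; only the
# unsigned context differs, yet the effective link target changes.
```

This is why the guidance suggests carrying xml:base (and xml:lang) explicitly inside the signed content when they matter.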
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
On 01/07/2005, at 5:28 AM, A. Pagaltzis wrote:

No, that's not necessary. For fetches which don't supply any RFC3229 headers, you can simply return the partial feed that you always return.

How do you figure that? HTTP delta encoding is a generic mechanism layered into HTTP; if you do a bare GET against a resource, the server has to return the whole representation.

The only problem to solve in that case is how one would go about downloading the entire archive when one has never polled the feed before. A simple solution that lets anyone define their own server-side mechanism is to use a different URL given by @rel='full-archive' (or something like that). But this is problematic in that there's no defense if a non-RFC3229-compliant client ever stumbles upon this URL. This could be avoided by an extension element that provides an RFC3229-compliant client with the details it needs to supply the headers for which the server will return the entire feed. (In the simplest form it could simply be an element that contains the ETag itself as its content, though that feels like a kludge along the lines of HTML's meta.) Clients without RFC3229 support will then never be able to request the entire archive. (The point is not to defend against abuse, which is not possible; it is to prevent harm from hapless users with dumb aggregators.)

Where does it talk about this in RFC3229? A delta-capable intermediary is going to be completely unaware of this, and very likely to return erroneous responses.

Thinking about it, the cleanest way to solve this is actually not the use of an extension element in Atom, but adding support for this scenario to RFC3229+feed. The original RFC3229 obviously does not assume it might be used in an environment where partial representations are, in fact, the default, rather than the exception, as is generally the case with feeds. RFC3229+feed could, for example, suggest that for requests that include RFC3229 headers, the server return headers with the response which inform the client how to request the entirety of the feed; maybe Resource-Initial-ETag or something like that.

Maybe it shouldn't be called RFC3229, but what it is: a format-specific extension to HTTP.

Which begs the question: why do it in HTTP at all, if it's format-specific? Format-specific protocol extensions are generally considered bad practice. Why not do it in the format?

Regards,

-- Mark Nottingham http://www.mnot.net/
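To make the distinction in this exchange concrete, the sketch below builds (but does not send) two requests: a bare GET, and a delta request carrying RFC 3229 style headers. The A-IM header comes from RFC 3229; the "feed" instance-manipulation token is taken from the RFC3229+feed proposal as understood here and should be treated as an assumption, as should the `feed_request` helper and the example URL.

```python
from typing import Dict, Optional
from urllib.request import Request

FEED_URL = "http://example.org/feed.atom"  # hypothetical feed URI


def feed_request(url: str, last_etag: Optional[str] = None) -> Request:
    """Build a bare GET, or a delta request when an earlier ETag is known."""
    headers: Dict[str, str] = {}
    if last_etag is not None:
        # Ask for a delta relative to the representation identified by the ETag.
        headers["A-IM"] = "feed"          # assumed token from RFC3229+feed
        headers["If-None-Match"] = last_etag
    return Request(url, headers=headers)


bare = feed_request(FEED_URL)             # no delta headers at all
delta = feed_request(FEED_URL, '"abc"')   # delta request against ETag "abc"
```

The objection in the reply is about the first case: with no delta headers, generic HTTP semantics say the response is the whole representation, so an intermediary has no way to know that a feed server intends something partial.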
Re: Clearing a discuss vote on the Atom format
+1 On 01/07/2005, at 8:36 AM, Paul Hoffman wrote: At 4:44 PM +0900 7/1/05, Martin Duerst wrote: At 10:26 05/07/01, Paul Hoffman wrote: To be added near the end of Section 5.1 of atompub-format: Section 6.5.1 of [W3C.REC-xmldsig-core-20020212] requires support for Canonical XML. Atom Processors that sign Atom Documents MUST use Canonical XML. The rest of your changes looked reasonable, but the MUST above looks too strong to me. Good catch. I think we can make the requirement for canonicalization like we do for encryption and signing: put the onus on the receiver. That would change the wording to: Atom Processors that verify signed Atom Documents MUST be able to canonicalize with Canonical XML. --Paul Hoffman, Director --Internet Mail Consortium -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems -- Mark Nottingham http://www.mnot.net/
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
Hi James,

On 29/06/2005, at 10:09 AM, James M Snell wrote:

1. This appears to be aimed at solving the same problem as Bob Wyman's RFC3229+feed proposal [http://bobwyman.pubsub.com/main/2004/09/using_rfc3229_w.html]. Do you have any empirical data similar to what Bob provides @ http://bobwyman.pubsub.com/main/2004/10/massive_bandwid.html that would indicate that your approach is a better solution to this problem? These are actually not mutually exclusive solutions, they're just different and could be used for different scenarios -- e.g. Bob's tends to make a lot of sense for blog dashboard feeds like what we use within IBM to show all post and commenting activity within our internal blogs server, while your mechanism would work rather well for things like Top Ten lists, etc. I would just like to see a bit of a compare/contrast on the two approaches.

It's orthogonal to RFC3229. The problem I'm solving is how to reconstruct the *entire* state of the logical feed, not just one partial representation of it; although RFC3229 could be used to do that, it would require feed authors to post the entire content of their feed (potentially, many megabytes). This would incur a huge load, because any clients that don't support RFC3229 would have to GET the entire feed, leading to severe bandwidth problems.

To give a concrete example, Dave Winer would have to post one RSS file containing every entry he's made in Scripting News for the past 10+ years to use RFC3229 to meet the same goal; with this proposal, he'd just have to add a 'prev' to each archived feed (assuming he has archives around; if he doesn't, I imagine he could reconstruct them).

2. Is the feed state mechanism a way of paging through the current contents of a collection or a snapshot-in-time view of a feed? That is... is it

A) Collection has a bunch of entries. Each feed representation has 15 entries and the prev link acts like a paging mechanism similar to what we currently see in search results. Deleting the first ten entries out of the collection would cause all of the entries in the feed to shift backwards in the feeds.

B) Each prev link is representative of how the feed looked at a given point in time. E.g. the feed as it would have appeared at a given hour of a given day.

If it's A, then Bob's RFC3229+feed solution seems much more efficient (see #1). If it's B, then I'm wondering why you don't just use an ETag based approach, e.g.

<fs:Stateful>1</fs:Stateful>
<fs:prev>{ETag}</fs:prev>

This would allow clients to only ever have to deal with a single URI for a feed and use conditional GETs with ETag to differentiate which snapshot of the feed they want to get, and would likely make it easier to remediate potential recursive reference attacks (e.g. feed A references feed B which references feed C which is a blind redirect to Feed A).

This proposal doesn't handle deletion or other aspects of identity in feeds; I tried to introduce language like that earlier in Atom itself, but we failed to gain consensus around it.

How does an ETag help you locate a previous feed to reconstruct state? Even if it could, I'm not sure intermingling HTTP protocol details with application semantics is a good idea; although there's nothing to prevent this theoretically, in many implementations it might be problematic to predict what the ETag is.

3. Microsoft's RSS Lists spec uses <cf:treatAs /> to attach behavioral semantics to a feed. This proposal uses <fs:Stateful /> to attach behavioral semantics. It would be nice if we could come up with a relatively simple and standardizable way of attaching behavioral semantics. For example, a standardized <treatAs /> element:

<atomex:treatAs>stateful</atomex:treatAs>

The value of the treatAs element would be a list of tokens with defined semantics. Each token SHOULD be registered with IANA. Unknown tokens would be ignored. Incompatible tokens would be ignored with first-in-the-list takes precedence semantics. For example:

<atomex:treatAs>stateful list</atomex:treatAs>

indicates that the feed should be treated as a list whose past states can be queried using the kind of mechanism you've defined.

That seems like an awfully heavyweight solution. What does defining the container and an IANA registry add?

-- Mark Nottingham http://www.mnot.net/
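The token rules proposed in point 3 (unknown tokens ignored; incompatible tokens ignored, with the first in the list taking precedence) could be read roughly as below. This is a sketch of one interpretation only: the registry contents, the "stateless" token and the incompatibility table are invented for the example and were not proposed in the thread.

```python
from typing import FrozenSet, List, Set

# Hypothetical registry of known tokens; only "stateful" and "list" appear
# in the thread, "stateless" is added here to demonstrate a conflict.
KNOWN_TOKENS: Set[str] = {"stateful", "stateless", "list"}

# Hypothetical pairs of tokens that cannot both apply to one feed.
INCOMPATIBLE: Set[FrozenSet[str]] = {frozenset({"stateful", "stateless"})}


def effective_tokens(treat_as_value: str) -> List[str]:
    """Return the tokens a consumer would act on, in document order."""
    accepted: List[str] = []
    for token in treat_as_value.split():
        if token not in KNOWN_TOKENS:
            continue  # unknown tokens are ignored
        if token in accepted:
            continue  # repeated tokens add nothing
        if any(frozenset({token, earlier}) in INCOMPATIBLE for earlier in accepted):
            continue  # conflicts with an earlier token, which takes precedence
        accepted.append(token)
    return accepted
```

Even in this small form, a consumer needs both a token registry and a conflict table, which gives some sense of the weight the reply is questioning.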
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
On 30/06/2005, at 1:41 PM, James M Snell wrote: The value is that I would really like to see a common and consistent way of attaching behavioral semantics to the feed rather than each individual vendor / spec defining their own app and impl specific methods. It could be done without IANA support, of course, but it's just annoying to see relatively similar tasks done in completely different ways. I totally agree that we should have neutral, non-vendor-specific semantics defined. I just don't see how having this container defined, along with the IANA registry, helps; if it was the intent of the WG to forbid all vendor-specific mechanisms, we should have disallowed all extensions except for those that are in an IANA registry (for example). That's an extreme, of course, but it points out that Atom -- and RSS, for that matter -- is still in the period of its lifetime where vendors and individuals have to experiment to figure out what's valuable, and let the market sort out what becomes commonly deployed. It's not pretty, but it works pretty well in the long run. Cheers, -- Mark Nottingham http://www.mnot.net/
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
You need to be able to figure out which documents you've seen before and which ones you haven't, so you don't recurse down the entire stack. Although you can come up with some heuristics to determine when you've seen a document before, most (if not all) of them can be fooled by particular sequences of entries. Remembering which ones you've seen (using their 'this' URI) allows you to easily figure this out. On 28/06/2005, at 8:48 PM, Antone Roundy wrote: Thinking a little more about this, I'm not sure what the this link would be used for. The prev link seems to be doing all the work, and especially assuming a batches of 15 sort of model, the this link seems likely to end up pointing to a document that's going to disappear soon 14 times out of 15. -- Mark Nottingham http://www.mnot.net/
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
Hi Danny, Thanks for the comments. On 29/06/2005, at 1:57 AM, Danny Ayers wrote: Trivial: might 1/0 be confusing compared to something clearly binary: true/false or yes/no, and difficult to extend: true/false/unknown ack What is the Stateful nature of a feed *without* a Stateful element, the default if you will? (Could be per-case or 'indeterminate', but I think some comment on this would be helpful). Yes, that's the intent; if you don't have a flag, there isn't any information about what it is (or isn't), and you act as you would today. If you're talking about feed reconstruction, might it not make sense to have something like an all (which could appear as a minimal list of dated URIs of entries) to avoid countless GET/interpret cycles? I put that forth in the original Pace http://www.intertwingly.net/ wiki/pie/PaceFeedState (see history), and it got very negative reviews, because it requires a lot of work to maintain, and can be a bandwidth hog. I'm of two minds about it. -- Mark Nottingham http://www.mnot.net/
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
I think you're right... I was concerned about servers doing redirection, so that a client might miss the fact that it's already seen an archive, but as long as it uses the same identifiers to locate documents, it should be fine. On 29/06/2005, at 6:21 AM, Antone Roundy wrote: If it's for identification rather than retrieval, maybe it could be an Identity Construct...except Identity Constructs got nuked in format-06...not necessarily dereferencable. Another option would be to identify whether you need to continue by checking whether you've seen the prev link before. Would not that be as reliable as checking the this link? On Wednesday, June 29, 2005, at 12:10 AM, Mark Nottingham wrote: You need to be able to figure out which documents you've seen before and which ones you haven't, so you don't recurse down the entire stack. Although you can come up with some heuristics to determine when you've seen a document before, most (if not all) of them can be fooled by particular sequences of entries. Remembering which ones you've seen (using their 'this' URI) allows you to easily figure this out. On 28/06/2005, at 8:48 PM, Antone Roundy wrote: Thinking a little more about this, I'm not sure what the this link would be used for. The prev link seems to be doing all the work, and especially assuming a batches of 15 sort of model, the this link seems likely to end up pointing to a document that's going to disappear soon 14 times out of 15. -- Mark Nottingham http://www.mnot.net/ -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems -- Mark Nottingham http://www.mnot.net/
FWD: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
A New Internet-Draft is available from the on-line Internet-Drafts directories.

Title : Feed History: Enabling Stateful Syndication
Author(s) : M. Nottingham
Filename : draft-nottingham-atompub-feed-history-00.txt
Pages : 6
Date : 2005-6-27

This document specifies mechanisms that allow feed publishers to give hints about the nature of the feed's statefulness, and a means of retrieving "missed" entries from a stateful feed.

A URL for this Internet-Draft is: http://www.ietf.org/internet-drafts/draft-nottingham-atompub-feed-history-00.txt

To remove yourself from the I-D Announcement list, send a message to i-d-announce-request at ietf.org with the word unsubscribe in the body of the message. You can also visit https://www1.ietf.org/mailman/listinfo/I-D-announce to change your subscription settings.

Internet-Drafts are also available by anonymous FTP. Login with the username anonymous and a password of your e-mail address. After logging in, type cd internet-drafts and then get draft-nottingham-atompub-feed-history-00.txt.

A list of Internet-Drafts directories can be found in http://www.ietf.org/shadow.html or ftp://ftp.ietf.org/ietf/1shadow-sites.txt

Internet-Drafts can also be obtained by e-mail. Send a message to: mailserv at ietf.org. In the body type: FILE /internet-drafts/draft-nottingham-atompub-feed-history-00.txt.

NOTE: The mail server at ietf.org can return the document in MIME-encoded form by using the mpack utility. To use this feature, insert the command ENCODING mime before the FILE command. To decode the response(s), you will need munpack or a MIME-compliant mail reader. Different MIME-compliant mail readers exhibit different behavior, especially when dealing with multipart MIME messages (i.e. documents which have been split up into multiple messages), so check your local documentation on how to manipulate these messages.
___ I-D-Announce mailing list I-D-Announce at ietf.org https://www1.ietf.org/mailman/listinfo/i-d-announce -- Mark Nottingham http://www.mnot.net/
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
Hi Antone,

Thanks for the comments. This draft was admittedly rough, but I wanted to get an idea of people's general reactions before refining it too much. In particular, I'd be interested in implementors' reactions (e.g., aggregators, publishers). Responses below.

Then we add an entry. The old 'this' link can't be used to point to the new instance of the feed, right? Because that would violate this requirement:

The value of the 'this' link relation's href attribute MUST be a URI indicating a permanent location that is unique to that Feed Document instance; i.e., the content obtained by dereferencing that URI SHOULD NOT change over time.

This is poorly worded; your 'batches of 15' scenario is what I had in mind. I'll work on better language for -01.

Now let's say someone tries to fetch the original 'this' feed. The draft says:

Note that publishers are not required to make all previous Feed Documents available.

This seems like a likely circumstance where the publisher might not want to bother to continue making the original instance available. If that's what they decide, then what? Do they return a 410 (gone)? Presumably, some will return a 404 (not found), even though 410 would be better. What should a client do if it receives a 404 or 410? Is there a way for them to find the new instance? Should there be? (Presumably they're subscribed to the feed from a URI different than the one in the 'this' link, so in this case, it's probably not such a big deal, but read on, and you'll see where it could become an issue).

I'm not sure what you're looking for; the semantics of 404 and 410 are clearly defined by HTTP. If the server says it can't find it, or it's gone, the client is unable to reconstruct the full state of the feed, and SHOULD warn the user.

Also, I just noticed that in some places, the word "representation" is used, and in some places "instance" is used, apparently to mean the same thing. In my opinion, "instance" is better.

I'll take a look. Thanks again!
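The client behaviour described in the reply (a missing archive means the full state cannot be reconstructed, and the user should be told) might look roughly like this. The `reconstruct` function, the `fetch` callback and the `(status, entries, prev)` tuple are assumptions for the sketch and are not defined by the draft.

```python
from typing import Callable, List, Optional, Set, Tuple

# (HTTP status, entry ids in the document, URI of the previous archive or None)
Response = Tuple[int, List[str], Optional[str]]


def reconstruct(prev_uri: Optional[str],
                fetch: Callable[[str], Response]) -> Tuple[List[str], bool]:
    """Follow prev links; return (entries found, whether the state is complete)."""
    entries: List[str] = []
    visited: Set[str] = set()
    while prev_uri is not None and prev_uri not in visited:
        visited.add(prev_uri)
        status, doc_entries, next_prev = fetch(prev_uri)
        if status != 200:
            # 404 and 410 are the cases discussed above; any other failure is
            # treated the same way here. The caller should warn the user.
            return entries, False
        entries.extend(doc_entries)
        prev_uri = next_prev
    return entries, True
```

A caller would typically keep whatever entries were recovered and surface a warning when the second element of the result is False.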
-- Mark Nottingham http://www.mnot.net/
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
Thanks Garrett, I'll take this into account if I do another draft.

Cheers,

On 28/06/2005, at 1:39 PM, Garrett Rooney wrote:

Mark Nottingham wrote:

A New Internet-Draft is available from the on-line Internet-Drafts directories.

Title : Feed History: Enabling Stateful Syndication
Author(s) : M. Nottingham
Filename : draft-nottingham-atompub-feed-history-00.txt
Pages : 6
Date : 2005-6-27

This document specifies mechanisms that allow feed publishers to give hints about the nature of the feed's statefulness, and a means of retrieving "missed" entries from a stateful feed.

A URL for this Internet-Draft is: http://www.ietf.org/internet-drafts/draft-nottingham-atompub-feed-history-00.txt

I imagine someone else has already noticed this, but there's a typo in the example. The open tag uses the prefix 'fh', but the close tag uses 'fs'. Also, why is Stateful capitalized? All the other parts of Atom use lower case tag names, IIRC.

-garrett

-- Mark Nottingham http://www.mnot.net/
Re: PaceOtherVocabularies
Does this imply that this (or another IETF) Working Group cannot mint an Atom extension without putting it into a new namespace, or changing the version number? If so, I think it's a needless constraint. This WG might come up with a backwards-compatible extension and want to put it in the Atom namespace for convenience; that shouldn't necessitate bumping the version number (which would cause a lot of compatibility problems). Cheers, On May 16, 2005, at 3:35 PM, Robert Sayre wrote: http://www.intertwingly.net/wiki/pie/PaceOtherVocabularies == Abstract == Ban non-IETF use of the Atom namespace. == Status == Open == Rationale == Keep extensions in other namespaces, so the Atom namespace can be safely extended by the IETF. == Proposal == === 6.1 Extensions From Non-Atom Vocabularies === This specification describes Atom's XML markup vocabulary. Markup from other vocabularies (foreign markup) can be used in an Atom document, but MUST be namespace-qualified and in a namespace other than Atom's. Note that the atom:content element is designed to support the inclusion of arbitrary foreign markup. === 6.2 Extensions To the Atom Vocabulary === Future versions of this specification could add new elements and attributes to the Atom markup vocabulary. Software written to conform to this version of the specification will not be able to process such markup correctly and, in fact, will not be able to distinguish it from markup error. For the purposes of this discussion, unrecognized markup from the Atom vocabulary will be considered foreign markup. == Impacts == == Notes == -- Mark Nottingham http://www.mnot.net/
Re: On SHOULD, MUST, and semantics
Well said, Paul; this articulates the reasons for a profiling mechanism much more effectively than I ever did. Cheers, On Apr 27, 2005, at 9:28 AM, Paul Hoffman wrote: A few brief notes for the WG to chew on. - Literal interpretation of 2119 for a document format such as Atom would make nearly every element of an entry a MAY. It is a tad dishonest to say why is element Xyz even a SHOULD? when we have a *long* list of MUSTs already. It is no wonder that people are changing their opinion of any particular element from MUST to MAY to SHOULD and around again. - The format document, like many IETF Applications Area specs, is steeped in hidden semantics. If I suggested that atom:title go from MUST to MAY, there would be outrage. The reason is that most of us think of entries as having some semantics that include a title. We don't say in the format document why we made atom:title a MUST: it's just obvious. - Differentiating the importance of the semantics of various elements is really, really hard, and people's opinions change over time. I strongly suspect that many of the less-vocal-but-still-active members of this list have changed their view of the importance of atom:category at least once. This is the hardest one to align with making a simple-to implement spec. The fewer cross-element links we have, the easier it will be for an implementer. - We have, unfortunately, linked the semantics of two or more elements together. atom:content and atom:summary are linked because of their semantics (if you don't have a readable content, you MUST/SHOULD/MAY have a summary). - We have, fortunately, not required linked semantics for extensions. If I came out with hoffman:content-plus-plus, I can't assume that whatever linkage there is between atom:summary and atom:content will apply to my extension. But, of course, if someone considers hoffman:content-plus-plus to be like atom:content, they may *want* the same linkages. 
- Every SHOULD, by definition, adds logic into the software that creates a message in the spec'd format. In the long history of the IETF, this also tends to prevent interop, because it gives two differing parties a soapbox to argue from. - Person A may think that an optional element that is missing means W. Person B may think that an optional element that is present but null means X. Person C may think that W == X, Person D may think that W != X. Person C and Person D will probably code very differently. Proposal for thinking about: to simplify the spec, atom:summary should either be a MUST in all cases or a MAY in all cases. If it is just semantic like atom:category, it should be a MAY. If it is inherently valuable like atom:title, it should be a MUST. --Paul Hoffman, Director --Internet Mail Consortium -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
Re: For review: application/atom+xml
+1 On Apr 25, 2005, at 5:10 PM, Tim Bray wrote: On Apr 25, 2005, at 3:49 PM, Mark Nottingham wrote: Comments on the media type template. He's got a point on the namespace being mentioned, which creates some semi-circular dependencies, sigh. As to whether it's currently in use, largely due to lobbying from us, recent releases of both Apache and IIS tie the application/atom+xml media type to the .atom extension, so if people are creating 0.3 files and calling them whatever.atom, this could be right. Having said that, I think we should push back. Because any such current usage is unlicensed by any spec, because we plan to aggressively deprecate Atom 0.3 once we get 1.0 out and since I suspect that 90% of 0.3 feeds come from one place we should have some success. -Tim Begin forwarded message: From: Mark Baker [EMAIL PROTECTED] Date: April 20, 2005 11:13:54 AM PDT To: [EMAIL PROTECTED] Cc: iesg@ietf.org Subject: Re: For review: application/atom+xml [ CC: IESG, since I suppose this counts as a last call comment ] Mark, Congrats on going to last call with the atompub format! - http://www.ietf.org/internet-drafts/draft-ietf-atompub-format-08.txt But I note that my comment[1] on the lack of any mention of the existing use of application/atom+xml with the incompatible Atom 0.3 content didn't make it in. Though not widespread, my understanding (though I'd be happy to be corrected!), at least at the time that you first drew up this media type[2], was that it was in use[3]. I certainly don't feel that this is a big enough problem to warrant a new media type, but I do think it should be flagged to implementors. Also, I noticed something else that I missed my first time through ... On Wed, Apr 06, 2005 at 08:41:08PM -0700, Mark Nottingham wrote: Additional information: Magic number(s): As specified for application/xml in [RFC3023], section 3.2. 
Based on my understanding of the purpose of magic numbers, I think referencing another spec's magic number algorithm to be prima facie incorrect. What I think should be there is information that, as uniquely as possible, identifies Atom documents, and RFC 3023 can, of course, only help with identifying XML documents. So I'd suggest either adding something about the root namespace value (if indeed the spec requires that the root element always be from the Atom namespace?) as I did for XHTML[4], or else just saying "Magic Number: None" (as we did for SOAP[5], in effect, by not providing an algorithm). Cheers, [1] http://eikenes.alvestrand.no/pipermail/ietf-types/2005-April/000678.html [2] http://www.mnot.net/drafts/draft-nottingham-atom-format-02.html [3] http://diveintomark.org/archives/2003/12/13/atom03 [4] http://www.ietf.org/rfc/rfc3236.txt [5] http://www.ietf.org/rfc/rfc3902.txt Mark. -- Mark Baker. Ottawa, Ontario, CANADA. http://www.markbaker.ca -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems - Tim Bray, Director of Web Technologies, Sun Microsystems +1-877-305-0889 Sun ext. 60561 http://www.tbray.org/ongoing/ AIM: MarkupPedant -- Mark Nottingham http://www.mnot.net/
Fwd: For review: application/atom+xml
Comments on the media type template. Begin forwarded message: From: Mark Baker [EMAIL PROTECTED] Date: April 20, 2005 11:13:54 AM PDT To: [EMAIL PROTECTED] Cc: iesg@ietf.org Subject: Re: For review: application/atom+xml [ CC: IESG, since I suppose this counts as a last call comment ] Mark, Congrats on going to last call with the atompub format! - http://www.ietf.org/internet-drafts/draft-ietf-atompub-format-08.txt But I note that my comment[1] on the lack of any mention of the existing use of application/atom+xml with the incompatible Atom 0.3 content didn't make it in. Though not widespread, my understanding (though I'd be happy to be corrected!), at least at the time that you first drew up this media type[2], was that it was in use[3]. I certainly don't feel that this is a big enough problem to warrant a new media type, but I do think it should be flagged to implementors. Also, I noticed something else that I missed my first time through ... On Wed, Apr 06, 2005 at 08:41:08PM -0700, Mark Nottingham wrote: Additional information: Magic number(s): As specified for application/xml in [RFC3023], section 3.2. Based on my understanding of the purpose of magic numbers, I think referencing another spec's magic number algorithm to be prima facie incorrect. What I think should be there is information that, as uniquely as possible, identifies Atom documents, and RFC 3023 can, of course, only help with identifying XML documents. So I'd suggest either adding something about the root namespace value (if indeed the spec requires that the root element always be from the Atom namespace?) as I did for XHTML[4], or else just saying Magic Number: None (as we did for SOAP[5], in effect, by not providing an algorithm). 
Cheers, [1] http://eikenes.alvestrand.no/pipermail/ietf-types/2005-April/000678.html [2] http://www.mnot.net/drafts/draft-nottingham-atom-format-02.html [3] http://diveintomark.org/archives/2003/12/13/atom03 [4] http://www.ietf.org/rfc/rfc3236.txt [5] http://www.ietf.org/rfc/rfc3902.txt Mark. -- Mark Baker. Ottawa, Ontario, CANADA. http://www.markbaker.ca -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
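Mark Baker's suggestion in this thread, identifying Atom documents by the namespace of the root element rather than by the XML magic numbers of RFC 3023, might be sketched roughly as follows. This is an illustration only: looks_like_atom is a hypothetical helper, and both namespace URIs are assumptions (the one later published for Atom 1.0 and the one commonly seen in Atom 0.3 feeds); neither is quoted in the messages above.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; neither appears in the thread itself.
ATOM_10_NS = "http://www.w3.org/2005/Atom"
ATOM_03_NS = "http://purl.org/atom/ns#"


def root_namespace(xml_bytes):
    """Return the namespace URI of the root element, or None if it has none."""
    root = ET.fromstring(xml_bytes)
    # ElementTree spells qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


def looks_like_atom(xml_bytes, namespace=ATOM_10_NS):
    """True if the document is well-formed XML whose root is in `namespace`."""
    try:
        return root_namespace(xml_bytes) == namespace
    except ET.ParseError:
        # Not well-formed XML at all, so certainly not Atom.
        return False
```

Because the check keys on the namespace, a 0.3 document served as application/atom+xml would be told apart from a 1.0 document only if the two formats really do use different namespaces, which is the assumption baked into the constants above.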
Re: IRI/URI
Welcome ;) With the caveat that I'm not an i18n expert; what do you mean by 'different location'? IRIs don't have a separate level of %-encoding on top of that used by URIs; rather, as I understand it, they leverage the URI %-encoding mechanism, by just standardising on UTF-8 for the character set of encoded characters. Thus, there would be no difference between an IRI and a URI containing a '%' as payload, rather than an escape (as you show). There's another underlying issue here, which I think Martin brushed up against at one point in the past. Basically, it's whether there is information in the distinction between something being an IRI vs. it being a URI. Compared bit-for-bit, there is no difference, and there isn't any difference if you dereference them; it's just whether you put any information in the difference. I'm of the opinion that there is no difference; although it's useful to have different terms for them to build conformant systems appropriately, purposely saying that there's a difference between two character-for-character identical identifiers isn't useful, and would be harmful in many cases. Cheers, On Apr 11, 2005, at 7:10 PM, Porges wrote: OK, first-time poster :) I was just thinking about IRIs recently and thought about a possible source of ambiguity. If the URI element can be EITHER an IRI or a URI, then: <uri>http://example.com/200%25equalsZero</uri> This is both a valid IRI and a valid URI, but if it is considered to be an IRI it will point to a different location than if it is considered to be a URI. IRI (as URI): http://example.com/200%2525equalsZero URI: http://example.com/200%25equalsZero Reading through the draft, it seems like IRIs might be the only thing allowed. If so, could this be made clearer - through a renaming of the element for instance? I'm not sure if this is even a problem, perhaps there's something I've missed in the stack of protocols sitting out there.
Feel free to castrate me if this is the case, I've gotten myself rather confused trying to work through this by myself and thought I'd ask some more knowledgeable people ;) -- Mark Nottingham http://www.mnot.net/
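The point about %-encoding in the reply above can be illustrated with a small sketch. It assumes that mapping an IRI to a URI amounts to percent-encoding the UTF-8 bytes of non-ASCII characters and leaving everything else, including existing '%' escapes, untouched. iri_to_uri is a hypothetical helper, and it skips details such as special handling of internationalised host names.

```python
def iri_to_uri(iri):
    """Percent-encode only the non-ASCII characters of an IRI, as UTF-8."""
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII, including "%", passes through unchanged, so an
            # existing escape such as %25 is never escaped a second time.
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)
```

Under that assumption, http://example.com/200%25equalsZero maps to itself, so reading it as an IRI or as a URI names the same location; only an identifier that actually contains non-ASCII characters changes.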
ABNF, Validity, Relation Registry [was: AD Review Comments and Questions: draft-ietf-atompub-format-07]
Does anybody have feedback on the suggestions/questions below? If I don't get any feedback on the ABNF or validity discussions, I'll proceed as outlined. I think there needs to be *some* feedback regarding the link relation registry; I'm proposing substantial changes there (my preferred approach is to change it so that IETF Consensus, rather than IESG approval, is required to register a link relation). Anybody have thoughts? On Apr 6, 2005, at 5:13 PM, Mark Nottingham wrote: Section 1.2: please reference draft-crocker-abnf-rfc2234bis-00.txt instead of RFC 2234 and confirm that everything that was valid before is still valid. The IESG approved this document as a Draft Standard last week. Rob/Mark? Hmm. As far as I can tell, the *only* place where we actually define a rule is 4.2.9.2, and that's just combining two rules by reference. I wonder if we can save complexity here (and remove one normative reference) by just doing this in prose; the text is currently: [[[ ABNF for the rel attribute: rel_attribute = isegment-nz-nc / IRI The value of rel MUST be string that is non-empty, does not contain any colon (:) characters, and matches the isegment-nz-nc or IRI ABNF forms in [RFC3987]. ]]] So it seems like the ABNF is extraneous here, and if we dropped it, we could also drop the reference. (Just a suggestion, I'm also happy to change the reference.) Section 2 describes a requirement for well-formedness, but it doesn't mention validity. I suspect that validity isn't a requirement given that the RELAX NG schema is informative, but it would be better if a specific statement were included to note that validity is not a requirement. Hmm, I would say that validity isn't a requirement because the syntactic constraints are (we think) fully given in the text. The group consciously decided not to make the schema normative, for that reason. 
We do currently say that the schema is non-normative; having said that, a statement that there is no DTD and no validity requirement couldn't hurt. Rob/Mark? Suggest adding the following after Atom Documents MUST be well-formed XML.; [[[This specification does not define a DTD for Atom Documents, and hence does not require them to be valid (in the sense used by XML).]]] Section 7.1: what process is the IESG supposed to use to review registration requests? Please see section 2 of RFC 2434/BCP 26 for mechanisms that might be used and please specify one in the document. Hmm. Looking over this, I wonder why IESG approval was the path chosen, given that URIs can also be used. It seems like the natural bar for getting something into the registry would be IETF consensus; can someone comment as to why this was chosen (I didn't participate in the discussions surrounding this registry)? If we remain on an IESG approval path, such text would probably look something like; Registered link relations SHOULD be widely implemented, since they effectively serve as shortcuts for URIs; as such, proposals need to demonstrate that there is community value in minting such a shortcut. Also, it may be good to replace the list of suggested topics with a registration template, to get more uniformity and make IANA's life easier (sorry I didn't notice this earlier). Finally, has someone double-checked with IANA that the http://www.iana.org/assignments/relation/ URI is available and appropriate? -- Mark Nottingham http://www.mnot.net/
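The prose rule for the rel attribute quoted in this message could be approximated in code along these lines. This is a rough sketch, not the rule from the draft: classify_rel is a hypothetical helper, it assumes the "no colon" restriction is meant for the short-name form only, and it recognises an IRI merely by a leading scheme rather than by the full RFC 3987 grammar.

```python
import re

# Rough approximation of a URI scheme followed by ":"; a complete IRI
# check against RFC 3987 is deliberately out of scope for this sketch.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def classify_rel(value):
    """Return 'name', 'iri', or 'invalid' for a rel attribute value."""
    if not value or any(c.isspace() for c in value):
        return "invalid"
    if ":" not in value:
        # A single path segment cannot contain "/", "?" or "#".
        return "invalid" if any(c in "/?#" for c in value) else "name"
    return "iri" if _SCHEME.match(value) else "invalid"
```

For example, classify_rel("alternate") gives 'name' and classify_rel("http://example.org/rel/foo") gives 'iri'. If a check this short captures the intent, that arguably supports dropping the ABNF and stating the rule in prose.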
Re: AD Review Comments and Questions: draft-ietf-atompub-format-07
Oops; I meant draft-freed-media-type-reg. On Apr 6, 2005, at 5:13 PM, Mark Nottingham wrote: Section 4: RFC 2045 is referenced. 2045 is on its way to being obsoleted by draft-freed-mime-p4 (in the RFC Editor queue) and draft-freed-media-type-reg (in last call). Can the more recent documents be referenced instead of 2045? Rob/Mark? I think all of the references would go to draft-freed-mime-p4. As an aside -- it appears we may reference a few documents that are in the RFC editor queue, or about to be there. It might be good to set expectations within the WG as to what that means for our publication schedule. -- Mark Nottingham http://www.mnot.net/
Re: What is a media type?
In my experience, "media type" is colloquially used to mean the type/subtype construct, without parameters; a particular context specifies whether parameters are allowed (e.g., Content-Type). That said, it's not clear in the specs; there is no ABNF rule or even terms that I can find that we could refer to to disambiguate this. We should probably be more precise. As previously mentioned, the updated reference will be to: http://www.ietf.org/internet-drafts/draft-freed-media-type-reg-03.txt I might make a last call comment to the effect that it would be handy to have some terminology disambiguated here. Cheers, On Apr 11, 2005, at 5:03 PM, David Powell wrote: The type attribute of atom:content can be a MIME media type: 4.1.3 The atom:content Element [...] 4.1.3.1 The type attribute [...] [...] Failing that, it MUST be a MIME media type [RFC2045] with a discrete top-level type (see Section 5 of [RFC2045]). After looking at RFC2045, I wasn't very clear about what a media type is. Does it include parameters? Parts of 2045 suggest that a media type might include parameters: 5. Content-Type Header Field The purpose of the Content-Type field is to describe the data contained in the body [...] The value in this field is called a media type. Other parts (most of the document in fact), suggest that a media type is only the top level element, such as "text": After the media type and subtype names, the remainder of the header field is simply a set of parameters Is "media type" an accurate term for us to use? I'm asking this because I really don't know whether parameters are supposed to be allowed in the type attribute or not. -- Dave -- Mark Nottingham http://www.mnot.net/
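The distinction drawn above between the bare type/subtype construct and a value that also carries parameters can be made concrete with Python's standard email package. This is a sketch only; split_media_type is a hypothetical helper, and nothing here settles whether the type attribute of atom:content should accept parameters.

```python
from email.message import Message


def split_media_type(value):
    """Split a Content-Type style value into (type/subtype, parameters)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() lists the type/subtype itself first, then the parameters.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params
```

For instance, split_media_type("text/html; charset=utf-8") separates "text/html" from the charset parameter, which is the colloquial reading of "media type" described in the reply.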
Re: AD Review Comments and Questions: draft-ietf-atompub-format-07
On Apr 5, 2005, at 9:26 AM, Tim Bray wrote: Section 1.2: please reference draft-crocker-abnf-rfc2234bis-00.txt instead of RFC 2234 and confirm that everything that was valid before is still valid. The IESG approved this document as a Draft Standard last week. Rob/Mark? Hmm. As far as I can tell, the *only* place where we actually define a rule is 4.2.9.2, and that's just combining two rules by reference. I wonder if we can save complexity here (and remove one normative reference) by just doing this in prose; the text is currently: [[[ ABNF for the rel attribute: rel_attribute = isegment-nz-nc / IRI The value of rel MUST be string that is non-empty, does not contain any colon (:) characters, and matches the isegment-nz-nc or IRI ABNF forms in [RFC3987]. ]]] So it seems like the ABNF is extraneous here, and if we dropped it, we could also drop the reference. (Just a suggestion, I'm also happy to change the reference.) Also, looking into this popped up a few more small editorial issues; - 3.2.3 mentions BNF in relation to 2822, but RFC2822 uses ABNF - 4.2.9.2 nominates rules by surrounding them with quotation marks; they are bare elsewhere in the spec Section 2 describes a requirement for well-formedness, but it doesn't mention validity. I suspect that validity isn't a requirement given that the RELAX NG schema is informative, but it would be better if a specific statement were included to note that validity is not a requirement. Hmm, I would say that validity isn't a requirement because the syntactic constraints are (we think) fully given in the text. The group consciously decided not to make the schema normative, for that reason. We do currently say that the schema is non-normative; having said that, a statement that there is no DTD and no validity requirement couldn't hurt. Rob/Mark? 
Suggest adding the following after Atom Documents MUST be well-formed XML.; [[[This specification does not define a DTD for Atom Documents, and hence does not require them to be valid (in the sense used by XML).]]] Section 4: RFC 2045 is referenced. 2045 is on its way to being obsoleted by draft-freed-mime-p4 (in the RFC Editor queue) and draft-freed-media-type-reg (in last call). Can the more recent documents be referenced instead of 2045? Rob/Mark? I think all of the references would go to draft-freed-mime-p4. As an aside -- it appears we may reference a few documents that are in the RFC editor queue, or about to be there. It might be good to set expectations within the WG as to what that means for our publication schedule. Section 7.1: what process is the IESG supposed to use to review registration requests? Please see section 2 of RFC 2434/BCP 26 for mechanisms that might be used and please specify one in the document. Hmm. Looking over this, I wonder why IESG approval was the path chosen, given that URIs can also be used. It seems like the natural bar for getting something into the registry would be IETF consensus; can someone comment as to why this was chosen (I didn't participate in the discussions surrounding this registry)? If we remain on an IESG approval path, such text would probably look something like; Registered link relations SHOULD be widely implemented, since they effectively serve as shortcuts for URIs; as such, proposals need to demonstrate that there is community value in minting such a shortcut. Also, it may be good to replace the list of suggested topics with a registration template, to get more uniformity and make IANA's life easier (sorry I didn't notice this earlier). Finally, has someone double-checked with IANA that the http://www.iana.org/assignments/relation/ URI is available and appropriate? -- Mark Nottingham http://www.mnot.net/
Re: summary of editors' action items ...so far
I can do that later tonight. On Apr 6, 2005, at 5:37 PM, Tim Bray wrote: On Apr 5, 2005, at 2:53 PM, Robert Sayre wrote: Anything to add? No, I think Rob's got it. Sooner is better. Who's going to take care of submitting the MIME type registration? A volunteer would be welcome. -Tim Julian Reschke wrote: 05-C05, 4.15.3 processing model Update -06: I'm still confused by the text. For instance... I agree that this section is gnarly. The editors will attempt to clarify that section without making any normative changes, and will check with the WG to verify that no normative changes have been unintentionally introduced. 06-C01, 3.1.1 type Attribute thus *any* kind of change that encourages xhtml would be appreciated. While there is no consensus in favor of changing the document to 'encourage' XHTML, more than one person has questioned the use of XHTML-Basic. I don't remember where this decision was made (Sam?). While I disagree that XHTML has a definite advantage over HTML, I am concerned that this choice will portray XHTML as somehow less capable, since the HTML section cites the more general HTML 4.01. Graham wrote: A quick bit of rewording would help. Currently it basically says The type attribute may have the values... three times with three different rules. Changing it to On the summary element, the type atrribute may have the values... stops the spec being apparently self-contradicting. The editors erred in failing to incorporate this suggestion in -07. We'll get it this time. Scott Hollenbeck wrote: Section 1.2: please reference draft-crocker-abnf-rfc2234bis-00.txt instead of RFC 2234 and confirm that everything that was valid before is still valid. The IESG approved this document as a Draft Standard last week. Will do. Section 4: RFC 2045 is referenced. 2045 is on its way to being obsoleted by draft-freed-mime-p4 (in the RFC Editor queue) and draft-freed-media-type-reg (in last call). Can the more recent documents be referenced instead of 2045? I think so. 
Will do. Tim Bray wrote: We do currently say that the schema is non-normative; having said that, a statement that there is no DTD and no validity requirement couldn't hurt. Will add text to this effect. Paul Hoffman wrote: I'd really like to see some guidance in the document to describe what the IESG should look for. We're not atom experts, so it's going to be hard to determine what we should and shouldn't approve. Another paragraph will help future IESGs understand what they need to consider when reviewing requests. Sounds reasonable. (I was erring on the side of not micro-managing the IESG.) Rob/Mark: please take a shot at some guidance for them. Will do. Scott Hollenbeck wrote w.r.t. the namespace: Yes, we can do it this way. Just please add some text so that so that IESG reviewers understand that this is a known issue that we have a plan to address. OK, will do. Robert Sayre - Tim Bray, Director of Web Technologies, Sun Microsystems +1-877-305-0889 Sun ext. 60561 http://www.tbray.org/ongoing/ AIM: MarkupPedant -- Mark Nottingham http://www.mnot.net/
Re: AD Review Comments and Questions: draft-ietf-atompub-format-07
Done; http://eikenes.alvestrand.no/pipermail/ietf-types/2005-April/000676.html Just curious; when/how does the ietf-types list switch over to @iana.org (as per draft-freed-media-type-reg)? On Apr 5, 2005, at 8:39 AM, Scott Hollenbeck wrote: The MIME media type registration template included in section 7 MUST be submitted to the ietf-types list ([EMAIL PROTECTED]) for review. A two-week review period is standard for requests to register new types in the standards tree. Please see the list archives [1] for samples if help is needed in crafting a review request and please send the request ASAP. -- Mark Nottingham http://www.mnot.net/
Re: PaceAlternateLinkWeakening - was Managing entries/entry state
On Mar 31, 2005, at 10:07 AM, Henry Story wrote: The value alternate signifies that the containing element is an alternative representation of the IRI in the value of the href attribute. ...an alternate representation of the resource identified by the IRI in the value...? -- Mark Nottingham http://www.mnot.net/
Re: application/rss+xml
I tried; the official response [1] was that the IESG wanted to see a stable and available spec -- by their standards -- for RSS before putting it in the standards tree. Just doing a registration doesn't cut it. I worked on an RSS 2.0 I-D [2] for a while and then stopped when I got nervous about change control and copyright issues. Given that RSS 2.0 is now hosted at Harvard, it may be that the IESG will consider that a stable enough ref; if not, I'm not nearly as nervous now that it's under a CC license, and I think I could take a crack at the I-D again... I'll do a bit of asking around... 1. https://datatracker.ietf.org/public/pidtracker.cgi?command=view_id&dTag=7792&rfc_flag=0 2. http://www.mnot.net/drafts/draft-nottingham-rss2-00.txt On Mar 29, 2005, at 9:05 PM, Tim Bray wrote: On Mar 29, 2005, at 8:55 PM, Bjoern Hoehrmann wrote: IESG approval of an Internet-Draft with a media type registration would register the type, yes. Whether we should try to register application/rss+xml is a different question though. D'oh, Randy wanted rss+xml, not atom+xml. Missed the point. -Tim -- Mark Nottingham http://www.mnot.net/
Re: application/rss+xml
It's more an issue of whether the CC Attribution + ShareAlike 1.0 license terms are satisfied by the I-D boilerplate. I've just asked CC that very question... On Mar 29, 2005, at 10:01 PM, Robert Sayre wrote: Mark Nottingham wrote: I tried; the official response [1] was that the IESG wanted to see a stable and available spec -- by their standards -- for RSS before putting it in the standards tree. Just doing a registration doesn't cut it. I worked on an RSS 2.0 I-D [2] for a while and then stopped when I got nervous about change control and copyright issues. Given that RSS 2.0 is now hosted at Harvard, it may be that the IESG will consider that a stable enough ref; if not, I'm not nearly as nervous now that it's under a CC license, and I think I could take a crack at the I-D again... Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one. Are IETF drafts compatible with these terms? Robert Sayre -- Mark Nottingham http://www.mnot.net/
Re: s/url/web/
+1 to the "just pick something and ship it" position On Mar 18, 2005, at 2:44 AM, Dan Brickley wrote: * Bjoern Hoehrmann [EMAIL PROTECTED] [2005-03-18 11:13+0100] * Tim Bray wrote: There are a couple of places where we use uri in the markup, specifically the atom:uri element (3.2.2) and the uri attribute of atom:generator (4.2.5). In both cases they're not actually URIs, they're IRIs, so the name is WRONG, except for nobody knows what an IRI is so renaming them iri would be confusing, and anyhow everyone thinks of URLs not *RIs, but naming them url would be wrong too, so why don't we actually change them to say what they're there for not what their syntax is and use web in both cases? -Tim We can call those at or about or internet but certainly not web. While we're at it, we can relive 10-15 years of URN vs URI debates on the Atom list instead of shipping product. Are you appealing to some notion of 'online' versus 'offline' resource? A spec could be cited from the formal Atom spec? Such distinctions are notoriously hard to maintain... If you want to add an implicit (and imho ill-advised) notion of 'URI dereferenceability' into the spec, it'd be good to see candidate text for inclusion, rather than doing it via attribute/element name choice. Note that the dereferenceability of identifiers changes over time, as infrastructure is deployed (or rots away); eg. DOIs, gopher:, java: URIs... Dan -- Mark Nottingham http://www.mnot.net/
Re: PaceProfile updated
On Feb 13, 2005, at 2:52 AM, Eric Scheid wrote: On 13/2/05 11:49 AM, Mark Nottingham [EMAIL PROTECTED] wrote: The biggest change from the previous approach is that cardinality of metadata elements is specified by those elements, in the form When present, atom:title MUST occur exactly once. Profiles only constrain what metadata elements are required in the feed. does this mean then that someone can't make a profile for bilingual feeds? It depends on the definitions of the individual metadata elements. If they allow multiple instances, yes; otherwise, you'd need to define new ones. -- Mark Nottingham http://www.mnot.net/
Re: PaceProfile updated
On Feb 13, 2005, at 2:03 AM, Anne van Kesteren wrote: Mark Nottingham wrote: Apologies for the delay; I've been sick since Monday. I've revised PaceProfile to make it more complete, as requested. http://www.intertwingly.net/wiki/pie/PaceProfile The biggest change from the previous approach is that cardinality of metadata elements is specified by those elements, in the form When present, atom:title MUST occur exactly once. Profiles only constrain what metadata elements are required in the feed. Do I understand it correctly that the profiles themselves still need to be discussed? That's the purpose of this Pace. Are you asking something else? Also, you can not mix profiles like you can in HTML 4.01? (Although there it is underdefined.) That's my intention in this proposal. I'm amenable to allowing multiples, though, as mentioned in the notes. I would also like to know what is meant exactly with metadata elements. Which elements does that include and which are excluded. All of those that are children of atom:head and atom:entry are metadata. -- Mark Nottingham http://www.mnot.net/
Re: PaceProfile updated
of profiles should at least touch on the topic of compatibility with the core profile PaceProfileAttribute, for instance, strongly recommends that any new profiles be created so that they are backwards compatible with the core while acknowledging that incompatible profiles could be created. I don't think this is a good idea; parts of the core won't make any sense for some applications. If we allow multiple profiles to be advertised, someone can *say* that they're compatible with both FooProfile and the Core. The proposal also states that @profile is informational only and does not impose any implementation requirements on the part of implementors. Things aren't so clear with PaceProfile. Is the @profile attribute normative? Are implementations *required* to validate against the profile? What should an implementor do if an unknown profile is found? I thought this was clear (keeping in mind this is pre-spec text); Atom Processors MAY change their behaviour based upon it, but are not required to do so. I'll beef this text up to be more explicit, along the lines above. Thanks! -- Mark Nottingham http://www.mnot.net/
PaceProfile updated
Apologies for the delay; I've been sick since Monday. I've revised PaceProfile to make it more complete, as requested. http://www.intertwingly.net/wiki/pie/PaceProfile The biggest change from the previous approach is that cardinality of metadata elements is specified by those elements, in the form When present, atom:title MUST occur exactly once. Profiles only constrain what metadata elements are required in the feed. Regards, -- Mark Nottingham http://www.mnot.net/
mustUnderstand, mustIgnore [was: Posted PaceEntryOrder]
On Feb 5, 2005, at 6:26 AM, Joe Gregorio wrote: On Thu, 3 Feb 2005 20:25:50 -0800, Mark Nottingham [EMAIL PROTECTED] wrote: My preference would be something like This specification assigns no significance to the order of atom:entry elements within an Atom Feed Document. Atom Processors MAY present entries in any order, unless a specific ordering is required by an extension. (I.e., I could come up with the UseLexicalOrdering extension, and require processors to understand it to use the feed, assuming our extensibility model supports that, which I very much hope it will). -1 Atom is a Must Ignore format. What does that mean? SOAP is a Must Ignore format, but it also has a way of saying that you have to understand a particular extension; as I said before, this is one of the big problems with HTTP. mustUnderstand should be used sparingly, but sometimes it's necessary. -- Mark Nottingham http://www.mnot.net/
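The difference between a pure "must ignore" rule and one that also allows an extension to be marked as required can be sketched as below. Everything here is hypothetical: Atom defines no mustUnderstand flag, the attribute name and the http://example.org/ext namespace are invented for illustration (loosely modelled on SOAP's mustUnderstand), and the Atom namespace URI is an assumption.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # assumed namespace URI
# Hypothetical flag, loosely modelled on SOAP's mustUnderstand; not part of Atom.
MUST_UNDERSTAND = "{http://example.org/ext}mustUnderstand"


def check_extensions(entry_xml, understood):
    """Return the tags of unknown extensions that may safely be ignored.

    Raises ValueError if an unknown extension is flagged as required.
    """
    ignored = []
    for child in ET.fromstring(entry_xml):
        if child.tag.startswith("{" + ATOM_NS + "}") or child.tag in understood:
            continue  # core Atom markup, or an extension this processor knows
        if child.get(MUST_UNDERSTAND) == "true":
            raise ValueError("cannot process required extension " + child.tag)
        ignored.append(child.tag)  # plain "must ignore" behaviour
    return ignored
```

A processor that does not know an ordering extension flagged this way would refuse the entry instead of silently presenting entries in an order the publisher did not intend, which is the case argued for in the message above.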
Re: Posted PaceEntryOrder (was Entry order)
On Feb 5, 2005, at 4:38 AM, Henry Story wrote: You put this in terms of databases and I put the question in terms of graphs (which if you have an rdf database to store your triples comes to the same thing). And my feeling is here that we should not have to keep the sequence numbers of the order of the entries in the document. Very well said, with emphasis on have to; they shouldn't be prohibited from doing so if they want to (and I don't think you're saying otherwise). Display order on the client will also completely depend on what the client is trying to do. If the client is just interested in archiving all the entries, then any new feed be it an old one or a new one will be of interest: it will just be added to the database. +1 -- Mark Nottingham http://www.mnot.net/
Re: PaceProfile - new
Bill, I'm sorry, I don't think I get what you're saying; the words all make sense, but I don't know how you got here. Atom currently constrains feed data (e.g., you MUST have a title, there MUST only be one) based on the most common use case; blogging/news syndication. How does this move towards agents enforcing policy on one another? If anything, the Pace makes it *less* likely that publishers will make assumptions about how data will be processed downstream; as Atom is currently specified, there are lots of such assumptions built in. The Pace doesn't place any requirements on Atom Processors WRT @profile; it's just an advisory flag that tells it what kinds of metadata it can count on appearing in the feed. On Feb 4, 2005, at 1:53 AM, Bill de hÓra wrote: One concern I have is that I don't want to see feed data constrained according to usecases (ie system monitoring); that makes the data less useful and goes down the route of publishers telling clients how to write their applications or making assumptions on how data will be processed downstream. While I believe that claiming Atom is a container for metadata is good, I doubt that profiling represents a more flexible approach - it moves us away from agents coordinating together to agents enforcing policy on one another via a profiling mechanism. -- Mark Nottingham http://www.mnot.net/
Re: Call for final Paces for consideration: deadline imminent
On Feb 4, 2005, at 12:29 PM, Paul Hoffman wrote: At 2:56 PM -0500 2/4/05, Bob Wyman wrote: Although I can't find it specified in the current draft, there used to be a rule that you weren't supposed to use the same atom:id more than once in a single feed. The current draft says: 5.8 atom:id Element The atom:id element is an Identity construct that conveys a permanent, universally unique identifier for an entry. atom:entry elements MUST contain exactly one atom:id element. That means that you're not allowed to use the same atom:id in any two entries, ever. I don't read it that way, although I understand how you might infer that; there's too much wiggle room in the current text for that intent to be clear. I.e., just because it's a permanent, universally unique identifier doesn't mean you're not able to use it twice to talk about a single entry; to RDF people, this will seem quite natural. If you want to only see one instance of an atom:id's content in the set of all entries ever published in any feed, you need to say that explicitly. -- Mark Nottingham http://www.mnot.net/
Re: PaceExtendingAtom
It certainly gives the impression that there's a preference; it's like saying The language of the feed SHOULD be English; there are lots of options, and we don't require one, but it does call one out. Why is this a normative requirement, and what does adding this sentence bring to the spec? On Feb 3, 2005, at 11:27 PM, Tim Bray wrote: On Feb 3, 2005, at 8:17 PM, Mark Nottingham wrote: This specification describes Atom's XML markup vocabulary. Markup from other vocabularies (foreign markup) can be used in Atom in a variety of ways. Text Constructs may contain foreign markup which SHOULD be HTML or XHTML. What does this statement add? Why is HTML or XHTML normatively preferred over text, MathML, or any other vocabulary? It's not preferred over text, you can say type='TEXT', but since this is a *text* construct and quite likely to be presented in rows and columns (title, author, etc) simpler is better, so I'd think the SHOULD is sensible here. -Tim -- Mark Nottingham http://www.mnot.net/
Re: PaceExtendingAtom
Baking this as a normative requirement -- even a SHOULD -- into a standards-track RFC is a bad idea. These formats are not the only interoperable formats on the planet, and in fact they all have interop problems to some degree. In five years, this requirement isn't going to make any sense. I've said my piece on this one; I'm interested in responses to my other questions and points. Cheers, On Feb 4, 2005, at 2:15 PM, Robert Sayre wrote: Mark Nottingham wrote: It certainly gives the impression that there's a preference; it's like saying The language of the feed SHOULD be English; there are lots of options, and we don't require one, but it does call one out. Why is this a normative requirement, and what does adding this sentence bring to the spec? TEXT, HTML, and XHTML are preferred because they interoperate. there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. Seems right to me. If MathML titles become wildly popular, the spec can be revised later on. Robert Sayre -- Mark Nottingham http://www.mnot.net/
Re: Call for final Paces for consideration: deadline imminent
When you talk about characters being the same or different, are you saying in the entry, or in the id? On Feb 4, 2005, at 2:18 PM, Graham wrote: On 4 Feb 2005, at 10:09 pm, Mark Nottingham wrote: The term version seems out of place here. What you're saying, in effect, is that the ID acts as a hash of the entry's content, correct? If so, what value does it bring? Pardon: Any two versions of the same entry must use the same id, [which requires that all characters are the same]. Any two different entries must have different ids, [which requires that at least one character is different]. It says different versions must have the same id. How is that a hash? Graham -- Mark Nottingham http://www.mnot.net/
Re: PaceProfile - new
Hmm. I'm thinking of profiles as fairly coarse-grained things; so coarse-grained, it wouldn't make sense to mix-and-match them in a single document (or, if you do, you either don't use a profile, or you invent a new one). I.e., does it make sense to mix a stock quote entry with a systems monitoring or blog entry? How would a UA present this? On Feb 4, 2005, at 8:15 AM, Bill de hÓra wrote: Mark Nottingham wrote: Bill, I'm sorry, I don't think I get what you're saying; the words all make sense, but I don't know how you got here. [../] The Pace doesn't place any requirements on Atom Processors WRT @profile; it's just an advisory flag that tells it what kinds of metadata it can count on appearing in the feed. Ok, I'll calm down and try again. If the advisory @profile scopes at the level of the feed, I think that's too broad a scope. It needs to scope at the level of the entry, or it's liable to become meaningless when entries are mixed and matched. I can barely figure out how to class individual entries, never mind entire feeds. Maybe there's a use case I'm not getting. -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
Re: Call for final Paces for consideration: deadline imminent
Just a note; I'm planning to remove the Identity Construct from -06, because it's only used in one place (the definition of atom:id). Otherwise, this sounds like a reasonable start. On Feb 4, 2005, at 3:23 PM, Antone Roundy wrote: On Friday, February 4, 2005, at 03:12 PM, Antone Roundy wrote: An Identity construct is an element whose content conveys an unchanging identifier which MUST be universally unique within Atom Documents to the set of all versions and instantiations of the resource that the construct's parent instantiates. Okay, getting serious now, there is room for clarification. How about we replace this: 3.5 Identity Constructs An Identity construct is an element whose content conveys a permanent, universally unique identifier for the construct's parent. with this: 3.5 Identity Constructs An Identity construct is an element whose content conveys a permanent, universally unique identifier for the resource (instantiated|described) by the construct's parent element. An Atom Document MAY contain multiple (revisions|versions) of the same resource, in which case the content of the Identity construct for each would be identical. Applications MAY decline to display more than one version of each resource. Comments? Preferences? Better ideas? Is it ready for a Pace? -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
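For readers following the thread, here is a minimal sketch of the behaviour Antone's proposed text permits: a document may carry several versions of one resource, all sharing the same identifier, and an application may decline to show more than one. It assumes entries have already been parsed into dictionaries with an "id" key. The function names and the choice to keep the last version in document order are hypothetical and are not taken from any draft.

```python
from collections import defaultdict


def group_versions(entries):
    """Group entries by id; entries sharing an id are treated as
    versions of the same resource (assumption of this sketch)."""
    versions = defaultdict(list)
    for entry in entries:
        versions[entry["id"]].append(entry)
    return dict(versions)


def one_per_resource(entries):
    """Keep a single entry per id. Picking the last one in document
    order is an arbitrary choice made for illustration only."""
    return [group[-1] for group in group_versions(entries).values()]


if __name__ == "__main__":
    doc = [
        {"id": "tag:example.org,2005:1", "title": "First draft"},
        {"id": "tag:example.org,2005:2", "title": "Other entry"},
        {"id": "tag:example.org,2005:1", "title": "Corrected"},
    ]
    for entry in one_per_resource(doc):
        print(entry["id"], entry["title"])
```

Ids are compared as exact strings here, with no normalization, which matches the "all characters are the same" reading discussed earlier in the thread.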
Re: PaceEnclosuresAndPix status
Just a point of data; most logos are designed to look good at a 1-to-1 aspect ratio. On Jan 24, 2005, at 5:25 PM, Tim Bray wrote: On Jan 24, 2005, at 5:18 PM, Joe Gregorio wrote: +1 Should there be a suggested size for images? A suggested aspect ratio, actually. Drat. Brent Simmons loved this idea, and I meant to update the draft. Would anyone be upset if I updated the draft to say an aspect ratio of 2 (horizontal) to 1 (vertical)? -Tim -- Mark Nottingham http://www.mnot.net/
Re: Posted PaceEntryOrder (was Entry order)
My preference would be something like This specification assigns no significance to the order of atom:entry elements within an Atom Feed Document. Atom Processors MAY present entries in any order, unless a specific ordering is required by an extension. (I.e., I could come up with the UseLexicalOrdering extension, and require processors to understand it to use the feed, assuming our extensibility model supports that, which I very much hope it will). Seem reasonable? On Feb 2, 2005, at 4:18 PM, David Powell wrote: This specification assigns no significance to the order of atom:entry elements within the feed. Processors MAY present entries in a different order to that in which they appear in an Atom Feed Document. -- Mark Nottingham http://www.mnot.net/
Re: PaceRemoveVersionAttr
In its present form, I want to get rid of the version attribute. That's not to say that I don't want something with more useful semantics, on a separate axis from the namespace. So, +1 to the Pace. On Feb 3, 2005, at 1:18 PM, Norman Walsh wrote: Some thoughts - It seems very likely to me that Atom will evolve over time. - For some applications, changing the namespace name with every version is entirely impractical. Atom may or may not be in this category. Do feeds become legacy? Are people storing entries with the expectation that the readers of 2014 will be able to display them, albeit perhaps imperfectly? Changing the namespace makes that *hard*. XSLT transformations and XML Queries for Namespace1 flatly will not work with Namespace2. - When I look at a feed, I am comforted on some emotional level by my ability to know what version the creator intended it to be. - Perhaps YAGNI applies. I think, on balance, I'm +1 for keeping it, but I doubt I'd lie down in the road over it. Be seeing you, norm -- Norman Walsh [EMAIL PROTECTED] | Life is a great bundle of little http://nwalsh.com/| things.--Oliver Wendell Holmes -- Mark Nottingham http://www.mnot.net/
Re: PaceRemoveInfoAndHost
+1 Welcome to the club :) On Feb 2, 2005, at 10:29 AM, Robert Sayre wrote: http://www.intertwingly.net/wiki/pie/PaceRemoveInfoAndHost Proposal --- Remove atom:info and atom:host from The Atom Syndication Format. Remove atom:host --- No one seems to like the atom:host element. It doesn't do what its original proponents wanted and many of its detractors still oppose it. Design by committee at its worst. Remove atom:info --- Back when we were arguing about IETF vs. W3C, mnot said in the IETF, its easier to shut down a solo raving loony. In the threads on atom:info, it seems I am playing the role of solo raving loony. So, let's have the process take over. Robert Sayre -- Mark Nottingham http://www.mnot.net/
Re: Call for final Paces for consideration: deadline imminent
Walter brings up an important point; has, or when will, the drafts be compared to the requirements in the charter? Cheers, On Feb 2, 2005, at 8:36 PM, Walter Underwood wrote: The charter says that Atom will work for archiving. We don't know that it will, and it hasn't been discussed for months. Is the current Atom spec sufficient for archiving? If not, we aren't done. wunder --On February 2, 2005 5:46:51 PM -0800 Paul Hoffman [EMAIL PROTECTED] wrote: Greetings again. And, thanks again for all the work people did on the last work queue rotation. We now have the end of the format draft squarely in sight. The WG still has a bunch of finished Paces that have not been formally considered, a (thankfully) much smaller number of unfinished Paces, and a couple of promises that I'll write that up as a Pace soon. We need to finish soon in order to make our milestone, and I believe we can do so gracefully. On Monday, Feb. 7, the Working Group's final queue rotation will consist of all Paces open at that time. Any Paces that have obvious holes in them (to be filled in later, more needs to go here, etc.) will be ignored. We have had over a year of time here, and many weeks since the previous attempt to close things out. On Monday, Feb. 14, we will assess WG consensus and ask the document authors to put together a final draft. Note that this is not the last opportunity for work on the Atom format. For one thing, there are plenty of non-core extensions that folks have been mulling over; having the core draft finally finished will help those to emerge. Further, we need to do the final work on the protocol document. Also, during the formal IETF Last Call, discussion of the format draft will be welcome from everyone (including people who have not read any of the earlier drafts). Please do *not* rush out to write a Pace unless it is for something that is *truly* part of the Atom core, and you really believe that it is likely that there will be consensus within a week. 
If your idea is appropriate as an extension, or is for something that is quite similar to something else that has explicitly gotten lack of consensus, please do not write a Pace. In the former case, please hold your extensions for a few weeks; in the latter case, please recognize that asking the WG to focus on something that they don't want will likely cause us to do a worse job at carefully reviewing things that we all want. So, if you have an incomplete Pace now, you have a few more days to complete it. Of course, everyone should feel free to continue talking about the current Paces now, and to continue to suggest editorial changes to the current Internet Draft. --Paul Hoffman, Director --Internet Mail Consortium -- Walter Underwood Principal Architect, Verity -- Mark Nottingham http://www.mnot.net/
Re: Feed State [was: Work Queue Rotation #16]
This is now PaceNoFeedState; http://www.intertwingly.net/wiki/pie/PaceNoFeedState On Jan 31, 2005, at 3:46 PM, Mark Nottingham wrote: x. Managing Feed State Atom Processors MAY keep state (e.g., metadata in atom:head, entries) sourced from Atom Feed Documents and combine them with other Atom Feed Documents, in order to facilitate a contiguous view of the contents of the feed. The manner in which Atom Feed Documents are combined in order to reconstruct a feed (including methods of identifying duplicate entries, updating metadata, and dealing with missing entries) is out of the scope of this specification, but may be defined by an extension to Atom. -- Mark Nottingham http://www.mnot.net/
Re: PaceFeedState status
I'm OK with dropping wholefeed; I'll edit the pace to accommodate that. WRT warning users; you realise that putting something in the user manual, README or box of the consumer (e.g., This product does not manage feed state.) would satisfy that requirement? Would SHOULD make you feel better? Cheers, On Jan 24, 2005, at 5:09 PM, Joe Gregorio wrote: I am +1 on the //atom:head/atom:[EMAIL PROTECTED]'prev'], but -1 on //atom:head/atom:[EMAIL PROTECTED]'wholefeed'] and -1 on any of the verbiage that makes demands on clients, for example, Atom consumers MUST warn users when they do not have the complete state of a feed -joe On Mon, 24 Jan 2005 16:17:42 -0800, Tim Bray [EMAIL PROTECTED] wrote: If there were no further discussion: Like PaceSupersede, this model of publishing does not (so far) enjoy consensus support. -Tim -- Joe Gregorio http://bitworking.org -- Mark Nottingham http://www.mnot.net/
Re: On organization and abstraction
+1 - someone else made a comment about OPML which really hit the spot; if you try to make a format do all things, it does most of them badly... On Feb 3, 2005, at 6:37 PM, Tim Bray wrote: On reflection, I am growing very negative on almost all of the Organization Paces, including FeedRecursive, PaceEntriesElement, PaceCollection. Here's why: they represent to increase the level of abstraction in Atom, and in my experience, when the goal is interoperability across the network, abstraction hurts. I would like it if the markup found in Atom documents was as close as possible to a typical human mental model. The word feed has entered the vocabulary, even the non-geek vocabulary, and the notion that there are things (entries, stories, items, whatever) in feeds likewise. -- Mark Nottingham http://www.mnot.net/
Re: Comments on format-05
On Jan 31, 2005, at 10:31 AM, Antone Roundy wrote: Another option would be to allow one content with inline content, and alternative content by reference, eg. (not being careful about getting language tags correct): <content type="TEXT" xml:lang="en_US">This is a pen</content> <content type="text/html" hreflang="en_US" src="http://foo.com/abc" /> <content type="text/html" hreflang="jp" src="http://foo.com/aiu" /> I think that's the same option as allowing multiple content elements (unless this implies some fairly strict rules about the combinations of content elements allowable, which IMO would be difficult to specify and enforce). If the concern about multiple content is solely that it will result in more bandwidth use, I think it's misplaced; people who are concerned about bandwidth won't publish multiple representations inline; forcing them not to by legislation is misguided. -- Mark Nottingham http://www.mnot.net/
Feed State [was: Work Queue Rotation #16]
On Jan 31, 2005, at 11:45 AM, Tim Bray wrote: PaceFeedState: If no further discussion: Like PaceSupersede, this model of publishing does not (so far) enjoy consensus support. Partially pro: 2 Contra: 0 Conclusion: Not enough interest. Close it. If this is the direction we go in on this, that's fine with me, but I think that the spec needs to say *something* about managing feed state, even if it's just this: [[[ x. Managing Feed State Atom Processors MAY keep state (e.g., metadata in atom:head, entries) sourced from Atom Feed Documents and combine them with other Atom Feed Documents, in order to facilitate a contiguous view of the contents of the feed. The manner in which Atom Feed Documents are combined in order to reconstruct a feed (including methods of identifying duplicate entries, updating metadata, and dealing with missing entries) is out of the scope of this specification, but may be defined by an extension to Atom. ]]] So, if we drop PaceFeedState, I propose the text above. -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
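The proposed text deliberately leaves the combining method out of scope, so what follows is only one possible illustration of what "keeping state" could look like in a consumer, not something the draft specifies. It assumes entries are dictionaries with "id" and "updated" keys, that "updated" is an ISO 8601 timestamp with an explicit offset, and that a newer timestamp should win; the function names are hypothetical.

```python
from datetime import datetime


def _timestamp(entry):
    # Assumes an ISO 8601 string with an explicit UTC offset,
    # e.g. "2005-01-31T15:46:00-08:00".
    return datetime.fromisoformat(entry["updated"])


def merge_feed_state(stored, fetched):
    """Combine entries from a newly fetched feed document with stored state.

    stored:  dict mapping entry id -> entry dict
    fetched: iterable of entry dicts
    An incoming entry replaces a stored one only when it is strictly newer;
    entries missing from the fetched document are kept, not deleted.
    """
    merged = dict(stored)
    for entry in fetched:
        known = merged.get(entry["id"])
        if known is None or _timestamp(entry) > _timestamp(known):
            merged[entry["id"]] = entry
    return merged


if __name__ == "__main__":
    state = {"e1": {"id": "e1", "updated": "2005-01-30T10:00:00+00:00"}}
    latest = [
        {"id": "e1", "updated": "2005-01-31T09:00:00+00:00"},
        {"id": "e2", "updated": "2005-01-31T09:30:00+00:00"},
    ]
    print(sorted(merge_feed_state(state, latest)))
```

Duplicate detection by id and last-writer-wins by timestamp are exactly the kind of policy the text says an extension might define; other consumers could reasonably choose differently.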
Comments on format-05
A few comments as I read the latest draft; apologies if I've missed relevant discussion, a pointer would be greatly appreciated in that case.

* 3.1 Text Constructs -- Since the atom:content no longer references this construct, my preference would be to remove this section altogether and make atom:title, atom:copyright, atom:summary, atom:tagline and atom:info have simple textual content. Do people have use cases where there needs to be markup in the atom:copyright, and will user agents really be able to use this information? If so, why doesn't atom:author allow markup for the Person's name as well? It would be odd, for example, to allow publishers to affect the presentation of the title, but not the author's name. My gut feeling is that removing the markup from these elements will make the spec much simpler and easier to implement, without sacrificing many (if any) use cases. If I'm not aware of someone's use case here, I'm sorry and I'd love to hear about it.

* 4.2.1 Usage of atom:head within atom:entry -- This doesn't seem clean; I know that people have use cases for it, but I'd like to register my preference to get rid of it, FWIW.

* 4.11 The atom:host Element -- I'm surprised to see this in an IETF specification; people are going to make bad assumptions about the content of this, and violate layering to populate it. At the VERY least, I'd expect to see text in Security Considerations about it.

* 4.13.2 The scheme Attribute -- I have a slight preference for getting rid of atom:category/@scheme and making atom:category a generic categorisation mechanism; if people want to use a specific scheme, they can add an Atom extension. I can see arguments for how it's done, however.

* 4.15 The atom:content Element -- I strongly believe that more than one atom:content should be allowed; there are use cases when there are multiple representations of the item, and it is useful and necessary to communicate this in the feed. I suggest that multiple atom:content elements be allowed, as long as they have different type attributes.

* 4.21 The atom:info Element -- If it's not considered meaningful for processors, why does there need to be a standard element for it? At the very least, some sort of information about its semantics should be documented. My preference would be to drop it.

-- Mark Nottingham http://www.mnot.net/
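The suggestion for 4.15 amounts to a simple constraint that a validator could check. A rough sketch, assuming the type attribute values of an entry's atom:content elements have already been collected into a list; comparing them case-insensitively is an assumption of this sketch (media type names are case-insensitive, but the draft's TEXT/HTML/XHTML keywords may be treated differently), and the function name is hypothetical.

```python
def content_types_distinct(content_types):
    """Return True if no two atom:content elements share a type value.

    content_types: list of type attribute values for one entry.
    Values are compared after trimming whitespace and lowercasing,
    which is an assumption of this sketch rather than draft text.
    """
    seen = set()
    for value in content_types:
        key = value.strip().lower()
        if key in seen:
            return False
        seen.add(key)
    return True


if __name__ == "__main__":
    print(content_types_distinct(["TEXT", "text/html"]))
    print(content_types_distinct(["text/html", "TEXT/HTML"]))
```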
Re: Comments on format-05
On Jan 30, 2005, at 9:07 AM, Robert Sayre wrote: Mark Nottingham wrote: My gut feeling is that removing the markup from these elements will make the spec much simpler and easier to implement, without sacrificing many (if any) use cases. If I'm not aware of someone's use case here, I'm sorry and I'd love to hear about it. It doesn't really matter, and I'd wager everyone has a pet element that they want to use HTML in. (Blogger's titles...) I'll explain why it's not a problem. In Thunderbird's RSS reader, titles are displayed in the messages pane where the subject: header would reside, thus there can be no markup. However, RSS2 items are not required to have titles, so sometimes the description is excerpted. This requires stripping HTML. The code is already there. In a newspaper view in something like bloglines or feeddemon, more html could pass through. Fair enough. I think that everyone will end up stripping, and therefore everyone will end up having to implement that; for some people, that will raise the cost considerably (e.g. escaping broken, escaped HTML in XSLT). Sigh. * 4.11 The atom:host Element -- I'm surprised to see this in an IETF specification; people are going to make bad assumptions about the content of this, and violate layering to populate it. At the VERY least, I'd expect to see text in Security Considerations about it. I don't understand your objections here. I understood them when they were in the Person construct, but I don't anymore. Because people are going to do the same thing; well, this comment post came from that IP address via HTTP, so I'll put that into atom:host; after all, it's in the spec, so it must be important. * 4.21 The atom:info Element -- If it's not considered meaningful for processors, why does there need to be a standard element for it? At the very least, some sort of information about its semantics should be documented. My preference would be to drop it. People use this heavily already. 
One example is FeedBurner feeds that incorporate Atom feeds. They know they can show the info element as an explanation. Without a standard element, they will have to write 90% similar code for every blogging vendor. It should be standardized. It's great that they're using it, but from reading the spec, I have no idea what atom:info is supposed to contain, or why. What is a human-readable explanation of the feed format? My understanding was that publishers could include a PI that pointed to a transform that, in XSL-capable browsers that aren't aware of Atom, present a nice HTML page. That's great, but why does it need to be standardized? If they control the transform and the feed, they can do the exact same thing with a foo:blag extension element, with no loss of function, interoperability, or ease of implementation. What am I missing here? -- Mark Nottingham http://www.mnot.net/
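To make the "everyone will end up stripping" cost concrete, here is a small sketch of tag stripping using Python's standard html.parser module. It is not the Thunderbird code mentioned above, and it assumes the input is an already-unescaped HTML fragment; the class and function names are hypothetical. Badly broken or doubly escaped markup, which is the expensive case raised in the reply, is not handled.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects character data and discards all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(fragment):
    """Return the text of an HTML fragment with tags removed and
    runs of whitespace collapsed to single spaces."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    print(strip_markup("<b>Hello</b> <i>world</i>"))
```

Even this short version leaves policy questions open (for example, what to do with script content or images), which is part of why the implementation cost varies so much between toolchains.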
Re: Comments on format-05
Big +1! On Jan 30, 2005, at 12:34 PM, Tim Bray wrote: On Jan 30, 2005, at 12:09 PM, Robert Sayre wrote: We should either explicitly allow application/xml in section 2, or remove this element. I'm not sure which I prefer. atom:info is useful during transformations. Tossing atom:info will result in interoperability problems. I don't see how application/xml relates, but if I were forced to make the choice, I would drop atom:info. I have ABSOLUTELY NO IDEA what it is that Sam and Rob are arguing about. I suspect I'm not the only one. Could someone please explain the problem in basic terms? -Tim -- Mark Nottingham http://www.mnot.net/
atom:host [was: Comments on format-05]
I'm not going to lie down in the road to get rid of atom:host, if there are a lot of people that want it badly. However, it should be more completely specified; i.e., some mention in security considerations, and also, more information about the association. Right now, it's just domain name or network address associated with an entry's origin. What is that association? It feels very much like atom:date, when it had a date, but no indication of what the date represented; i.e., it's just a datatype (in this case, a hostname or ip address), but no information about *how* it relates to the entry or feed. Cheers, On Jan 30, 2005, at 3:58 PM, Bill de hÓra wrote: Robert Sayre wrote: Mark Nottingham wrote: * 4.11 The atom:host Element -- I'm surprised to see this in an IETF specification; people are going to make bad assumptions about the content of this, and violate layering to populate it. At the VERY least, I'd expect to see text in Security Considerations about it. I don't understand your objections here. I understood them when they were in the Person construct, but I don't anymore. I kind of agree with Mark on this, but I'd be reluctant to re-open that debate. I think getting atom:host out of Person was a decent trade-off. cheers Bill -- Mark Nottingham http://www.mnot.net/
Re: Comments on format-05
OK. So, why is it necessary to standardise this element? Look at http://www.mnot.net/test/atom.xml which is the same feed, but with atom:info replaced by a 'foo' element. Because the Atom document has to reference the CSS anyway, it's entirely reasonable to have the CSS specify what element to use for the info. No functionality is lost, and no interoperability is lost. BTW, if you don't want to put a link in the text, you don't need to put anything in the XML at all; it can be done with :after. And, if you use XSLT, it's also possible to do it all in-stylesheet, with or without links. Cheers, On Jan 30, 2005, at 2:57 PM, Sam Ruby wrote: Here is a live example of atom:info in use: http://www.shellen.com/atom.xml View source. View in your favorite browser. - Sam Ruby -- Mark Nottingham http://www.mnot.net/
Re: Obs on format-05
Hey Bill, Thanks for the detailed, well-justified and precise comments; this is a very helpful format to submit them in (hint, hint). Some selective feedback; On Jan 30, 2005, at 7:58 AM, Bill de hÓra wrote: Replace: [[[ Note that the choice of any namespace prefix is arbitrary and not semantically significant. ]]] with: Note that the choice of any namespace prefix is arbitrary. reason: semantically significant is redundant and/or not true depending on your processing . I tend to disagree; we can choose to say that it's not significant if we don't want it to be, and even if it is redundant, it's good to make sure this often-misunderstood point gets home. ** 3.1.1 type Attribute Replace: [[[ If the value is TEXT, the content of the Text construct MUST NOT contain child elements. Such text is intended to be presented to humans in a readable fashion. Thus, software MAY display it using normal text rendering techniques such as proportional fonts, white-space collapsing, and justification. ]]] with: If the value is TEXT, the content of the Text construct MUST NOT contain child elements. Such text is intended to be presented to humans in a readable fashion. (I have more to say on @type later) reason: we don't need to tell people what they can do with TEXT once we tell them what it is. I could go either way on this; I tend to agree with you, but it's not a big deal. ** 3.5.1 Dereferencing Identity Constructs [[[ ... and it is suggested that the Identity construct be stored along with the associated resource. ]]] problem: I didn't understand what that meant. With you there, but I know this was a contentious point. 
Replace: [[[ Because of the risk of confusion between URIs that would be equivalent if dereferenced, the following normalization strategy is strongly encouraged when generating Identity constructs: ]]] with: Because of the risk of confusion between URIs that would be equivalent if dereferenced, the following normalization strategy SHOULD be applied when generating Identity constructs: reason: change the passage to become a specification. Consensus IIRC was to make this non-normative text, and that's where personally I'd like to see it stay. ** 4.1.1 The version Attribute It bugged me that the version identifier has whitespace (some kind of personal neurosis no doubt). Replace: [[[ the content of this attribute is unstructured text ]]] with: the content of this attribute is unstructured text and is not intended for interpretation by software agents. reason: it's evidently structured (I can read it) and I'm sure software will want to be able to distinguish it from another version as a string, but it's not intended to be interpreted with respect to another version for version control purposes. I think we should wait for a definite direction WRT versioning and extensibility here. ** 4.20 The atom:generator Element Replace: [[[ The atom:generator element's content identifies the software agent used to generate a feed, for debugging and other purposes. ]]] with: The atom:generator element's content identifies the software agent used to generate a feed. reason: we don't need to say what can be done with this metadata. Doesn't seem harmful to give examples of what this might be used for; what if an 'e.g.' were inserted before 'for'? ** 5 Managing Feed State proposal: I think this section can be dropped. Also, I'm not sure what Pace we have in the pipeline that are going to provide the content. (pace forthcoming) I had a proposal in e-mail, will put into a Pace. 
** 6 Securing Atom Documents ** 10 Security Considerations proposal: Perhaps we can move everything security related into section 10 and drop section 6. (pace forthcoming) Sounds like a good idea, but I don't feel strongly about it if anyone wants it the way it is. Cheers, -- Mark Nottingham http://www.mnot.net/
Re: atom:info [was Re: Comments on format-05]
On Jan 30, 2005, at 7:03 PM, Graham wrote: On 31 Jan 2005, at 2:40 am, Mark Nottingham wrote: which is the same feed, but with atom:info replaced by a 'foo' element. Even better, you can drop foo and put the xhtml div as a direct child of feed. Then use feed div as the selector. Nice! And, if you use XSLT, it's also possible to do it all in-stylesheet, with or without links. Safari (and probably other things) don't do XSLT. Fair enough. -- Mark Nottingham http://www.mnot.net/
Re: Dereferencing Identity Constructs
On Jan 30, 2005, at 7:10 PM, Paul Hoffman wrote: I'm +1 on this, but would be fine if the WG doesn't want to change. Graham's wording is more useful to an implementer who wasn't on the mailing list last year (or was on and skipped over the permathreads). Ditto. (Actually I think even having a section with this name is asking for trouble. We could change it to Not Dereferencing Identity Constructs...) How about Dereferenceability of Identity Constructs? -- Mark Nottingham http://www.mnot.net/
Re: atom:info [was Re: Comments on format-05]
RFC 3023, Section 7: This document recommends the use of a naming convention (a suffix of '+xml') for identifying XML-based MIME media types, whatever their particular content may represent. This allows the use of generic XML processors and technologies on a wide variety of different XML document types at a minimum cost, using existing frameworks for media type registration. [...] Some areas where 'generic' processing is useful include: o Browsing - An XML browser can display any XML document with a provided [CSS] or [XSLT] style sheet, whatever the vocabulary of that document. [...] This convention will allow applications that can process XML generically to detect that the MIME entity is supposed to be an XML document, verify this assumption by invoking some XML processor, and then process the XML document accordingly. On Jan 30, 2005, at 7:53 PM, Robert Sayre wrote: Mark Nottingham wrote: So, the relevant question seems to be whether any browsers do something interesting with +xml media types; No, the relevant question is whether +xml media types can be reliably dispatched without any knowledge of a specific scheme. I don't know the answer, but I do know that it's a question for the RFC that supersedes 3023, not Atom. If I tell NetNewsWire to GET something in the subscribe dialog, my dispatching instructions are clear. Everything is a feed. Making up rules for application/xml, text/xml, and application/octet-stream will require superseding some RFCs that I'd rather not mess with. Robert Sayre -- Mark Nottingham http://www.mnot.net/
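The '+xml' detection step quoted from RFC 3023 above can be sketched roughly as follows. This is only an illustration, not anything proposed in the thread: the function name `is_xml_media_type` is made up, and `application/atom+xml` in the usage note is just an example value.

```python
def is_xml_media_type(content_type: str) -> bool:
    """Return True if a Content-Type value names an XML media type.

    Follows the RFC 3023 convention: text/xml, application/xml,
    or any type whose subtype ends in '+xml'.
    (Hypothetical helper for illustration only.)
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in ("text/xml", "application/xml"):
        return True
    return media_type.endswith("+xml")
```

A generic processor could call this on, say, `application/atom+xml; charset=utf-8` to decide whether to hand the entity to an XML parser. It says nothing about whether the document is a feed, which is the dispatching question raised above.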
Re: PaceIRI status
Not to advocate one position or another, but RFC 3987 doesn't obsolete RFC 3986; we have a choice. On Jan 24, 2005, at 4:17 PM, Tim Bray wrote: If there were no further discussion: It's hard to see how to avoid adopting this now that IRIs are a standards-track RFC. -Tim -- Mark Nottingham http://www.mnot.net/
Re: PaceFeedState
Hi Joe, I think a simple <link rel="prev"/> in the head of a feed which points to the 'previous' feed would be all that is required. The client can then, at their discretion, keep following 'prev's back until they are satisfied. Leave it up to the client what to do with duplicate entry IDs if it encounters them (but note in the Pace that it could happen). The entire discussion of Feed State Model can be dropped, the heart of the Pace being: [...] But I would drop the part about "until it encounters a link to a document it already has seen". That may not be a good metric to go by. I disagree. If clients have their own criteria for how far back they should look, or for how they combine the entries they see into a set, they'll act differently, and consistency is important here. One of the biggest complaints I have about RSS is that different aggregators have different concepts of what my feed is. By having a well-specified model of how to reconstruct the feed, as well as a model for what a feed is, we can assure that all consumers see the same set of entries. If we just leave it up to the consumer to decide whether they've seen all of the entries, they'll use heuristics to do it, and they'll fall into traps in figuring it out. I'd rather have one algorithm that's well-tested and known to work. For example, if a client decides that it's satisfied if the set of entries is the same as the last time it saw the feed, it won't go and look one further back. However, what if there were a series of snapshots that looked like this? entry1 entry2 entry3 --- entry4 entry5 entry6 --- entry1 entry2 entry3 A client that only saw the first one would look at the last one and miss the fact that 4, 5 and 6 were in the middle. Likewise, if we don't say how to combine entries into a set, clients will use different rules. I actually think we need more guidance here; e.g., how to detect changed entries.
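A minimal sketch of the client-side walk argued for above: fetch the head, then follow 'prev' links until reaching a document already seen, combining entries by ID. Everything here is an assumption for illustration: the `FeedDoc` tuple shape, the `fetch` callable, and the choice to mark a document as seen only once it has been reached through a 'prev' link (on the guess that the head document may still be growing).

```python
from typing import Callable, List, Optional, Set, Tuple

# Hypothetical shape of a fetched feed document:
# ('this' URI, 'prev' URI or None, entry IDs in document order)
FeedDoc = Tuple[str, Optional[str], List[str]]


def sync(head_uri: str,
         fetch: Callable[[str], FeedDoc],
         seen_archives: Set[str]) -> List[str]:
    """Collect entry IDs from the head and every unseen 'prev' document."""
    entry_ids: List[str] = []

    def add_all(ids: List[str]) -> None:
        # Combine entries into a set keyed on entry ID, keeping first sight.
        for entry_id in ids:
            if entry_id not in entry_ids:
                entry_ids.append(entry_id)

    # The head is always fetched; it is the only document assumed to change.
    _, prev_uri, entries = fetch(head_uri)
    add_all(entries)

    # Walk back until there is no 'prev' or it points at a known document.
    while prev_uri is not None and prev_uri not in seen_archives:
        archive_uri = prev_uri
        _, prev_uri, entries = fetch(archive_uri)
        add_all(entries)
        seen_archives.add(archive_uri)

    return entry_ids
```

Because the stopping rule depends on document URIs rather than on comparing entry sets, a client that has been away cannot skip over an intermediate snapshot the way the set-comparison heuristic in the entry1..entry6 example can. As noted earlier in the thread, this walk still will not notice edits made inside an archive it has already seen.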
For example, what if I have my top level feed with the last 10 items in it, and each feed's 'prev' link points back to the previous 10 entries? That means that if I have 100 entries on my site then I've got 9 'prev' links. http://example.org/feed.cgi?start=100 http://example.org/feed.cgi?start=90 http://example.org/feed.cgi?start=10 Now what if I add another entry to my site, 101? Then I have 10 *new* 'prev' links: http://example.org/feed.cgi?start=101 http://example.org/feed.cgi?start=91 http://example.org/feed.cgi?start=11 http://example.org/feed.cgi?start=1 Not the most efficient mechanism, but certainly plausible and it causes problems with your spidering heuristic. I agree that this is a problem with the approach I described earlier; thanks for pointing it out. Rather than take that approach, a fully dynamic server will need to keep a table in this form: [ 'snapshot15': ['entry111','entry112'], 'snapshot14': ['entry100' ... 'entry110'], ... ] where each snapshot corresponds to a Feed Document Resource (FDR?). Once enough entries are added to the most recent snapshot (15 here), another is created. So, when someone requests the latest feed, it will get a 'this' of http://www.example.com/feeddb?id=snapshot15 and a 'prev' of http://www.example.com/feeddb?id=snapshot14. (Once again, a server doesn't have to keep all of the snapshots back in time) Best leave it up to the client. I don't think this follows, for reasons explained above. Only the server can determine what a complete set of entries is. Cheers, -- Mark Nottingham http://www.mnot.net/
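The snapshot table described in the message above might be kept along these lines. This is a rough sketch under stated assumptions, not the implementation behind any real feed: the class name `SnapshotTable`, the default size of 10, and the zero-based snapshot numbering are all invented here; only the `?id=snapshotN` URI pattern comes from the message.

```python
from typing import Dict, List, Optional


class SnapshotTable:
    """Groups entry IDs into fixed-size snapshots with stable identifiers.

    Hypothetical sketch: snapshot numbers start at 0 and the size limit
    is an assumed parameter.
    """

    def __init__(self, size: int = 10) -> None:
        self.size = size
        self.snapshots: List[List[str]] = [[]]

    def add_entry(self, entry_id: str) -> None:
        # Once the most recent snapshot is full, start a new one;
        # older snapshots are never renumbered or modified.
        if len(self.snapshots[-1]) >= self.size:
            self.snapshots.append([])
        self.snapshots[-1].append(entry_id)

    def links(self, base: str) -> Dict[str, Optional[str]]:
        # 'this' names the latest snapshot, 'prev' the one before it.
        latest = len(self.snapshots) - 1
        prev = f"{base}?id=snapshot{latest - 1}" if latest > 0 else None
        return {"this": f"{base}?id=snapshot{latest}", "prev": prev}
```

The point of the design is that adding an entry changes at most the latest snapshot, so every older 'prev' URI keeps naming the same set of entries. With the `start=` offset scheme, by contrast, one new entry shifts every 'prev' link.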
PaceFeedState server-side proof-of-concept implementation
FYI, I've put 'this' and 'prev' elements on my RSS feed as a proof-of-concept on the server side; see http://www.mnot.net/blog/index.rdf This was done with templates only on stock Movable Type 2.6. -- Mark Nottingham http://www.mnot.net/