Re: Comment on process
On Jan 8, 2005, at 2:03 PM, Danny Ayers wrote: No-one gains anything from overly protracted discussion. But I don't see any extraordinary circumstances that might justify the imposition of cloture. Is there something related to the (still unexplained) deadline mentioned in Tim's recent post? Tim already said (before yet another round of RDF waffling started) that the chairs were seeking to produce a final draft. What he did was extend his own deadline to give you time to make a concrete, forward proposal on what needs to be done, either in a Pace (like everyone else) or in a separate draft. If you can't do that in four days then it isn't going in the next draft, which is the only horizon that the chairs have control over. As far as process goes, Tim (and Paul) have been exceedingly kind. I would have simply said "Put up or shut up" some six months ago, and further discussion would be out of order until such time as there is a concrete proposal to be discussed. IETF WGs move forward on the basis of rough consensus and running code. Roy
PaceFeedRecursive
I just created a starting point for a proposal on making the feed element recursive, eliminating the need for RDF syntax, atom:head, and a bunch of things proposed in other Paces regarding aggregators and digital signatures. http://www.intertwingly.net/wiki/pie/PaceFeedRecursive Unfortunately, I have a paper deadline on Tuesday and can't procrastinate any longer, so someone else can finish the details or I'll finish it later in the week. Cheers, Roy T. Fielding http://roy.gbiv.com/ Chief Scientist, Day Software http://www.day.com/
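For illustration only (the element arrangement and the namespace URI here are placeholders of mine, not the Pace's actual syntax), a recursive feed element might nest source feeds directly, so that shared metadata lives on an enclosing feed instead of being replicated into every entry:

```xml
<feed xmlns="http://purl.org/atom/ns#">
  <title>Aggregate of two sources</title>
  <feed>
    <title>Source A</title>
    <link href="http://example.org/a/"/>
    <entry><id>http://example.org/a/1</id></entry>
  </feed>
  <feed>
    <title>Source B</title>
    <link href="http://example.org/b/"/>
    <entry><id>http://example.org/b/1</id></entry>
  </feed>
</feed>
```

An entry under Source A would inherit that source's metadata from its nearest enclosing feed, which is the containment argument behind the proposal: aggregators could compose feeds without rewriting the entries.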
Re: PaceFeedRecursive
On Jan 9, 2005, at 5:38 PM, Graham wrote: -1; Conceptual masturbation. Wow, you like it that much? It must be a really good idea, then, since we are all just doing this to keep you stimulated. Roy
Re: PaceFeedRecursive
On Jan 10, 2005, at 6:54 AM, Graham wrote: Seriously, though, all you've done is not bothered to flatten the data at any point. Why is that a goal? We know all of the data is in the leaves, so flattening them is what happens when the entries are processed depth-first. While this may please you, all it means is the software at the far end has to wade through a few more sets of angle brackets and implement a couple of stacks. Actually, a few fewer angle brackets (count them), and any client that tries to process XML without some awareness of hierarchy is not my concern. Design for the clients that matter. Don't think for a second that your format contains more data than the current flat format, I would hope not -- the point was to make that data available such that it would not have to be replicated in every entry. because since it's all implicit, it's worthless. Implicit? Did you mean inherited? Hierarchical containment is explicit in XML. If, say, entries appear in two separate groups in a feed, it might mean that they came from separate sources, or it might not mean anything. Why not just make it explicit? You are not talking about my proposal, or at least it has been fundamentally misunderstood. Wait until it gets filled out. Roy
Re: PaceFeedRecursive
On Jan 13, 2005, at 3:55 PM, Sam Ruby wrote: I realize that the proposal isn't fleshed out, but a question nevertheless. The proposal apparently is for feeds to contain other feeds by containment. My question is whether it would make sense to also support feeds containing other feeds by reference, perhaps via a link element or via a src= attribute (analogous to content/@src defined in 5.12.2). Yes, and I was thinking the same would apply to entry, would it not? A use case can be found here: http://www.decafbad.com/blog/2005/01/13/feed_playlists_versus_feed_urls Also, the Nature feed of feeds that has been referenced a couple of times. The only additional requirement I can see is that it might make sense to have a separate mime type for feeds which only reference other feeds, and feeds which contain content. Hmm, my employer is fond of saying "everything is content" -- feeds are just another name for collections, and most systems agree that collections create an association between members which is content in itself. Roy
Re: URI canonicalization
On Jan 31, 2005, at 7:10 PM, Martin Duerst wrote: 5) Add a note saying something like "Comparison functions provided by many URI classes/implementations make additional assumptions about equality that are not true for Identity Constructs. Atom processors therefore should use simple string functions for comparing Identity Constructs." I think such a note could be a good balance to the normalization advice. That would be a falsehood. Identifiers are not subject to simplification -- they are either equivalent or not. We can add all of the implementation requirements we like to prevent software from detecting false negatives, but that doesn't change the fact that equivalent identifiers always identify the same resource. It is the author's responsibility to use URIs (or IRIs) that are actually different, not the responsibility of the protocol or implementation. I am disappointed that a MUST requirement was added to IRI in the last draft without working group review. This part: "Applications using IRIs as identity tokens with no relationship to a protocol MUST use the Simple String Comparison (see section 5.3.1). All other applications MUST select one of the comparison practices from the Comparison Ladder (see section 5.3) or, after IRI-to-URI conversion, select one of the comparison practices from the URI comparison ladder in [RFC3986], section 6.2" is completely missing the point of the ladder. The identifiers may or may not be equivalent and there is absolutely no reason for protocols to require inaccurate comparisons. The reason for simplification of comparison is ONLY that false negatives are an acceptable fact of life and their elimination is an implementation-specific decision that has no impact on interoperable use of identifiers. That is why there is no such requirement for URIs. Roy
Re: URI canonicalization
There is no reason to require any particular comparison algorithm. One application is going to compare them the same way every time. Two different applications may reach different conclusions about two equivalent identifiers, but nobody cares because AT WORST the result is a bit of inefficient use of storage. The guidance, if any, should simply state that identifier constructs must be unique. It is not our responsibility to prevent people from assigning the same (equivalent) identifiers to two different resources, nor do I care how many errors occur when they violate such a basic requirement. Roy
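The false-negative argument can be made concrete with a sketch (function names are mine, not from any spec): simple string comparison is cheap and can only err by failing to equate two equivalent identifiers, while a step up RFC 3986's comparison ladder, such as case-normalizing the scheme and host, removes some of those false negatives at extra cost. Neither ever equates two genuinely different identifiers.

```python
def simple_compare(a: str, b: str) -> bool:
    """Codepoint-for-codepoint comparison: cheap, and it can never
    produce a false positive -- at worst it misses an equivalence."""
    return a == b

def normalized_compare(a: str, b: str) -> bool:
    """One rung up the comparison ladder: lowercase the scheme and
    host before comparing (both are case-insensitive per RFC 3986).
    This removes some false negatives; doing so is purely an
    implementation choice, not a protocol requirement."""
    from urllib.parse import urlsplit, urlunsplit

    def norm(u: str) -> str:
        p = urlsplit(u)
        return urlunsplit((p.scheme.lower(), p.netloc.lower(),
                           p.path, p.query, p.fragment))

    return norm(a) == norm(b)

# Equivalent identifiers that simple comparison fails to equate:
assert not simple_compare("HTTP://Example.COM/a", "http://example.com/a")
assert normalized_compare("HTTP://Example.COM/a", "http://example.com/a")
# Genuinely different identifiers are never equated by either method:
assert not simple_compare("http://example.com/aBCde", "http://example.com/aBCdE")
assert not normalized_compare("http://example.com/aBCde", "http://example.com/aBCdE")
```

Two applications using different rungs of the ladder may disagree about an equivalence, but never about a difference, which is why the worst case is only some redundant storage.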
Re: URI canonicalization
On Jan 31, 2005, at 8:40 PM, Tim Bray wrote: Graham's right, the word identical is wrong, because in fact you will commonly encounter two instances of the same entry which aren't identical (e.g. the one in your cache and the one you just fetched). I suggest "Software MUST treat any two entries which have the same ID as instances of the same entry, regardless of how they were obtained or any differences in their content." Or if you want it less prescriptive: "By definition, any two entries which have the same ID are instances of the same entry, regardless of how they were obtained or any differences in their content." Over-specification is just too fun. So that would mean I am required by the Atom format to treat two different entries with the id "http://tbray.org/uid/1000" as the same entry, even when I received the first one from tbray.org and the second from mymonkeysbutt.net? Oh yeah, that'll be fun. ;-) It is not necessary for the format to require world peace, or anything generally equivalent to it. Give implementations the freedom to do whatever they like with the format -- just tell them what the syntax means and the implementations will sort themselves out. Roy
Re: URI canonicalization
On Feb 1, 2005, at 4:48 AM, Sam Ruby wrote: Roy T. Fielding wrote: There is no reason to require any particular comparison algorithm. One application is going to compare them the same way every time. Two different applications may reach different conclusions about two equivalent identifiers, but nobody cares because AT WORST the result is a bit of inefficient use of storage. It is worse than that. To give a concrete example: Radio Userland's aggregator will not present to you an item that you have seen before, no matter how different the current content is. So, at worst, if two different feeds use the same id, then the first one received will eclipse all later ones. How does requiring a specific comparison algorithm change that? The goal should be different ids, not jumping through hoops to artificially differentiate between equivalent URIs. Besides, I don't think that existing sites with obvious security holes should be used as an example for format requirements, unless said requirements are going to close those holes. Roy
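Not a requirement of the format, just a sketch of how an aggregator could defend itself against the eclipse problem described above: key the seen-entry store on the feed it was obtained from as well as the entry id, so a colliding id in an unrelated feed cannot suppress anything (the class and method names here are hypothetical):

```python
class SeenStore:
    """Track entries per (feed_source, entry_id), so the same id
    appearing in two unrelated feeds is not treated as one entry."""

    def __init__(self):
        self._seen = set()

    def is_new(self, feed_source: str, entry_id: str) -> bool:
        """Return True the first time this (source, id) pair is seen."""
        key = (feed_source, entry_id)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

store = SeenStore()
assert store.is_new("http://tbray.org/feed", "http://tbray.org/uid/1000")
# The same id arriving from a different feed is not eclipsed:
assert store.is_new("http://mymonkeysbutt.net/feed", "http://tbray.org/uid/1000")
# A repeat from the same feed is recognized as already seen:
assert not store.is_new("http://tbray.org/feed", "http://tbray.org/uid/1000")
```

This is an implementation policy, not a comparison algorithm; it illustrates why the real fix is distinct ids, not a mandated way of comparing them.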
Re: URI canonicalization
On Feb 1, 2005, at 7:46 AM, Tim Bray wrote: On Jan 31, 2005, at 10:16 PM, Roy T. Fielding wrote: Over-specification is just too fun. So that would mean I am required by the Atom format to treat two different entries with the id "http://tbray.org/uid/1000" as the same entry, even when I received the first one from tbray.org and the second from mymonkeysbutt.net? Oh yeah, that'll be fun. ;-) Well, it would be much better than we have now. Anyone who subscribes to aggregations (for example, I subscribe to the planetsun.org aggregated feed) is used to seeing the same entry over and over and over again. This problem is only going to get worse. With Atom's ID semantics and compulsory updated timestamp, I would hope that my aggregator would have a chance of not showing me the same entry unless it's got a later timestamp than what I've seen. -Tim The problem is that the requirement you suggested removes their ability to use common sense in dealing with same-ID entries and remain compliant with the format. You don't need to tell aggregators how to implement their trust mechanism. Just tell them what the format means and they will either improve their presentation or lose to some other aggregator who does it right. If you do want to place requirements on aggregators, then please do so in a separate spec or in an appendix. That way, us folks who don't implement aggregators don't need to obey requirements that conflict with basic content management and security principles. Roy
Re: URI canonicalization
On Feb 1, 2005, at 5:12 PM, Graham wrote: On 2 Feb 2005, at 12:52 am, Roy T. Fielding wrote: There is no need to explain what different ids means -- any two URIs that are different identifiers will never compare as equivalent, regardless of the comparison algorithm used. Pardon? If I use case-sensitive ids (e.g., base64 style, "http://www.example.com/aBCde" and "http://www.example.com/aBCdE" are different), and the client is case-insensitive, that's not going to work. I meant regardless of the URI comparison algorithm used, as described in [RFC3986]. I wouldn't expect comparison based on the number of 1 bits to work either. Roy
Re: mustUnderstand, mustIgnore [was: Posted PaceEntryOrder]
On Feb 5, 2005, at 9:48 AM, Mark Nottingham wrote: What does that mean? SOAP is a Must Ignore format, but it also has a way of saying that you have to understand a particular extension; as I said before, this is one of the big problems with HTTP. mustUnderstand should be used sparingly, but sometimes it's necessary. The problem with that statement (about HTTP) is that absence of a must-understand in HTTP is not one of its big problems. Yes, I know lots of people have talked about it as a limiting factor in the syntax of HTTP, but to call it an actual problem would be to say that it prevented some good from being accomplished. Henrik spent an enormous amount of time devising a must-understand feature for HTTP, only to find that the features needing it were either not worth deploying in the first place or too risky to deploy even on a must-understand basis. The features that people actually needed to deploy were able to be standardized because those people were willing to work hard to come to a common agreement on the benefit of the feature. One problem is that the must-understand feature is intended to prevent dumb software from performing an action when it doesn't know how to do it right. In reality, software tends to have bugs and developers tend to be optimistic, and thus there is no way to guarantee the software is going to do it right even if it claims to know how. In the end, we just waste a bunch of cycles on unimplemented features and failed requests. Another problem is that the features that benefit from a must-understand bit tend to be socially reprehensible (and thus the only way they could be deployed is via artificial demand). As soon as one of those features gets deployed, the hackers come out and turn off the must-understand bit for that feature, defeating the protocol in favor of their own view of what is right on the Internet.
Things that a syndication format might want to make mandatory are copyright controls and micropayments, but both have been shown in practice to require either a willingness on the part of the recipient to accept that specific restriction (i.e., human intervention and understanding) or forceful requirement by the sender (i.e., encryption). In both cases, agreements have to be established with the user in advance, before they even receive the content, and thus do not need a must understand feature. In fact, must understand has no value in a network-based application except as positive guidance for intermediaries, which is something that can still be accomplished under mustIgnore with a bit of old-fashioned advocacy. Roy
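A must-ignore processor is simple to sketch: walk the tree, dispatch the elements you know, and skip foreign markup (and its descendants) rather than failing on it. The known-element set and the namespace URI below are illustrative, not taken from any Atom draft:

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"  # illustrative draft-era namespace
KNOWN = {f"{{{ATOM_NS}}}{name}" for name in ("feed", "entry", "title", "id", "link")}

def process(elem, handle):
    """Depth-first walk that dispatches known elements and silently
    skips any element from an unrecognized namespace, together with
    its descendants -- the essence of must-ignore."""
    if elem.tag in KNOWN:
        handle(elem)
        for child in elem:
            process(child, handle)
    # unknown element: fall through -- ignored, not an error

doc = ET.fromstring(
    '<feed xmlns="http://purl.org/atom/ns#" xmlns:x="http://example.org/ext">'
    '<title>t</title><x:unknown>ignored</x:unknown>'
    '<entry><id>tag:example.org,2005:1</id></entry></feed>'
)
seen = []
process(doc, lambda e: seen.append(e.tag.split("}")[1]))
assert seen == ["feed", "title", "entry", "id"]
```

A must-understand bit would instead require the processor to abort on the unknown element, which is exactly the behavior argued against above.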
Re: BAG Question: What is a feed? Sliding-Window or Current-State?
On Feb 6, 2005, at 2:24 PM, Dan Brickley wrote: * John Panzer [EMAIL PROTECTED] [2005-02-06 13:58-0800] Since an entry is identified uniquely by its atom:id (though it can have different states at different times); As I understand the Web, the REST concepts that underpin HTTP are quite happy with there being multiple representations of the selfsame resource *at the time*. Also at multiple times. Yes, it's all about time, but also about resources. The entry is a resource, and that resource may have multiple representations at a single time (atom, rss1, rss2) and also over different times. A resource has a single state at one point in time, but may have different states over time (in some cases they must have different states over time because that is the essence of what makes them a resource, just as a clock is expected to have different state). URIs identify resources, not necessarily singular states. Feeds are sliding-window resources, but their representations do not slide -- they are fixed at a given point in time. If it is reasonable to say that, at any single point in time, only one representation of a given entry can appear in the feed's representation, then the only valid representation of a feed is one that does not contain any duplicate entry ids. An atom:feed document is a representation of a single feed resource at a single point in time. Aggregators do not consume feed resources -- they consume an iterative set of overlapping feed representations. Aggregators are therefore required by Atom to only include the latest version of an entry within their own resource representations. I believe that these requirements reflect the desires of most of the participants in this working group, so it seems to me that the question has been answered. Cheers, Roy T. Fielding http://roy.gbiv.com/ Chief Scientist, Day Software http://www.day.com/
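The consequence for an aggregator can be sketched as collapsing overlapping feed representations into one view that keeps only the latest state of each entry, keyed by id. This is a sketch of the idea, not prescribed behavior; it assumes atom:updated values are RFC 3339 timestamps in the same offset, which compare correctly as strings:

```python
def merge_representations(feed_snapshots):
    """Collapse a sequence of overlapping feed representations into
    the latest state of each entry, keyed by id.  Each entry is a
    dict with at least 'id' and 'updated' (RFC 3339, same offset,
    so lexicographic comparison matches temporal order)."""
    latest = {}
    for snapshot in feed_snapshots:
        for entry in snapshot:
            prev = latest.get(entry["id"])
            if prev is None or entry["updated"] > prev["updated"]:
                latest[entry["id"]] = entry
    return list(latest.values())

snap1 = [{"id": "tag:example.org,2005:1", "updated": "2005-02-06T10:00:00Z"}]
snap2 = [{"id": "tag:example.org,2005:1", "updated": "2005-02-06T12:00:00Z"},
         {"id": "tag:example.org,2005:2", "updated": "2005-02-06T11:00:00Z"}]
merged = merge_representations([snap1, snap2])
assert len(merged) == 2
assert max(e["updated"] for e in merged
           if e["id"] == "tag:example.org,2005:1") == "2005-02-06T12:00:00Z"
```

Note the window itself never appears here: each snapshot is just a fixed representation at one point in time, and the sliding happens only across successive snapshots.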
Re: BAG Question: What is a feed? Sliding-Window or Current-State?
On Feb 6, 2005, at 6:42 PM, Bob Wyman wrote: Roy T. Fielding wrote: Aggregators do not consume feed resources -- they consume an iterative set of overlapping feed representations. This is only true of pull-based aggregators that poll for feeds. None of the aggregators that I use are polling based. I use the PubSub Sidebars and the Gush aggregator built by 2entwine.com. These aggregators consume streams of entries that are pushed to them using the Atom over HTTP protocol. No, they consume feed representations of length=1, each of which contains an entry representation. They are neither streams nor entries, and if we stop confusing the messages received with the rather abstract notion of what the author considers to be their entry, then it is much easier to understand what the id tells the recipient. This is not specific to the transfer protocol. It is an aspect of message-passing architectures. Most network-based systems build tremendously complex synchronization and update mechanisms in an attempt to make message passing match what an ideal world would consider reality. Unfortunately for them, the theory of relativity is just as applicable to software as it is to us. HTTP (or at least the use of HTTP based on REST) changes the perspective by acknowledging that messages are not the same as what is identified. It seems a little odd, at first, but it makes a huge difference because clients stop assuming that they have a complete understanding, servers supply more explicit information about what they do send, and the overall system becomes less brittle during failures. However, HTTP only tries to match what is already true of message passing -- it does not make the rules. Regardless of the protocol used to receive an atom feed, the only things that will actually be received are representations. Computers can't transfer a temporal mapping that has yet to be defined. Roy
withdraw PaceFeedRecursive
Please withdraw PaceFeedRecursive because forcing everything to be an entry is sufficient justification to forbid inclusion by anything other than reference. The other (still needed) bits are in PaceHeadless. Cheers, Roy T. Fielding http://roy.gbiv.com/ Chief Scientist, Day Software http://www.day.com/
Re: mustUnderstand, mustIgnore [was: Posted PaceEntryOrder]
On Feb 6, 2005, at 6:50 PM, Mark Nottingham wrote: On Feb 5, 2005, at 6:01 PM, Roy T.Fielding wrote: The problem with that statement (about HTTP) is that absence of a must-understand in HTTP is not one of its big problems. Yes, I know lots of people have talked about it as a limiting factor in the syntax of HTTP, but to call it an actual problem would say that it prevented some good from being accomplished. It arguably tipped some people towards SOAP when HTTP would have been adequate. That's not a prevention of good, but we've already seen enough fragmentation in the syndication world. Well, arguably, those same applications should have been tipped into the waste basket in the first place. But I don't think you followed my main point: must understand is a mechanism to enable fragmentation -- its very presence leads away from standardization. Lack of mU is one of the reasons that HTTP is not fragmented (along with me being a stubborn pain in the ass). Hence, it is only a problem for some applications that were of questionable character, and it remains unclear whether HTTP would have benefitted by having a mU feature or if its presence would have led to a complete meltdown. Things that a syndication format might want to make mandatory are copyright controls and micropayments, but both have been shown in practice to require either a willingness on the part of the recipient to accept that specific restriction (i.e., human intervention and understanding) or forceful requirement by the sender (i.e., encryption). In both cases, agreements have to be established with the user in advance, before they even receive the content, and thus do not need a must understand feature. I don't think mU is intended for such things; rather, the case for mU could be characterised as extensions that change the operation of the protocol in a manner that renders it useless or misleading to try to use the feed if you don't know what to do with the extension. It's advisory. 
Right, but look at my examples and try to think of any others that would require changes in operation on the behalf of recipients. There may be others, but I am not aware of any more. In fact, must understand has no value in a network-based application except as positive guidance for intermediaries, which is something that can still be accomplished under mustIgnore with a bit of old-fashioned advocacy. So, if I can restate your position, you're saying that you don't dispute that understanding some extensions may be required, but that it isn't necessary to make that visible to the processor, because it'll be co-ordinated beforehand (e.g., through market forces, out-of-band-communication), correct? No, my position is that it isn't necessary to include mU in the format. Within the control data of an interaction protocol, sure, but not within the payload of completed actions, wherein any such requirements are far too late and susceptible to abuse. Just to be clear that I am not completely against mU in all protocols, that feature does exist in waka because it is useful when talking through intermediaries. Roy
Re: BAG Question: What is a feed? Sliding-Window or Current-State?
On Feb 7, 2005, at 5:15 AM, Henry Story wrote: This is true only if you have the [Equivalence ID] interpretation of the id relation. (I.e., you think of id as equivId.) Yes, and to make myself perfectly clear, that means functional ID, as you call it, would conflict with the design of Atom and any reasonable design of a system wherein the things being identified are allowed to be updated over time. I wonder if you read the rest of my message... Yes, though it seems to me that you swapped the meanings of funcId and equivId somewhere between your definitions and your example, so I lost interest. I gave an example that did not conflict with a reasonable design. On the contrary, you gave an example in which a feed contained two entries with the same id (where id here is defined as meaning "the author intends this entry to supplant any prior entries with the same id"). It is not reasonable to include more than one entry with the same id within the same feed representation, since all entry representations in the feed representation must validly represent the state of the feed at the time that the feed representation was generated (otherwise, it is not a feed representation at all -- it is an archive, history, ... whatever). This whole discussion presumes that it is not desirable to send all versions of all entries as part of the feed; i.e., the author specifically intends old versions of an entry to never appear in later versions of the feed. If you don't agree with that presumption, then I think you are talking about something other than what most bloggers call a feed. So, please, stop trying to make a real system fit an artificial model of graph theory that isn't even capable of describing the Web. Fix your model instead. Do you have a good paper that proves your incredibly strong statement above? I would be happy to understand this position, as it would save me a lot of time.
Okay, but only if you promise not to extend this discussion on the atom list -- take it elsewhere if you wish to puncture my opinion -- atom should be allowed to progress in peace without invoking the defenders of RDF universality. From http://www.w3.org/TR/rdf-mt/#intro: "There are several aspects of meaning in RDF which are ignored by this semantics; in particular, it treats URI references as simple names, ignoring aspects of meaning encoded in particular URI forms [RFC 2396] and does not provide any analysis of time-varying data or of changes to URI references. It does not provide any analysis of indexical uses of URI references, for example to mean 'this document'. Some parts of the RDF and RDFS vocabularies are not assigned any formal meaning, and in some cases, notably the reification and container vocabularies, it assigns less meaning than one might expect. These cases are noted in the text and the limitations discussed in more detail. RDF is an assertional logic, in which each triple expresses a simple proposition. This imposes a fairly strict monotonic discipline on the language, so that it cannot express closed-world assumptions, local default preferences, and several other commonly used non-monotonic constructs." RDF has no capacity for temporally qualified assertions, no conceptual understanding of sink-like services, and very little habit of distinguishing between resources and representations (though it does have the capacity to deal with representations as b-nodes). Likewise, it does not differentiate between identification and use, which means it cannot be used to describe the many times in which a single URI can be used for many different things even if it only identifies (directly) one thing. The same comments apply to OWL. As previously stated, understanding the meaning of a resource is all about time.
Trying to describe the Web without that dimension is a hopelessly futile effort that leads to such absurdities as declaring some URIs as being invalid or inappropriate simply because their use does not fit the artificial model of a timeless world. But I can't just take your word for it, as there are numerous people who don't seem to think the way you do, and in just as senior or more senior positions than you are. My position is not based on seniority. But perhaps I have missed some important development recently. (I am not being ironic here though it may sound like it. I really would like to understand more fully your position.) No, as far as I know, there have been no recent developments that would cause the semantic web to reconsider its central assumptions. Cheers, Roy T. Fielding http://roy.gbiv.com/ Chief Scientist, Day Software http://www.day.com/
Re: Regarding Graphs
On Jan 9, 2005, at 4:23 AM, Danny Ayers wrote: There were a couple of points made in recent discussions that might have been misleading. One was that Atom is a tree. The XML may use that structure, and in its simplest form the information being represented may be tree-shaped, but that isn't necessarily the case. No, the element containment hierarchy that establishes default relationships between data and metadata is a tree. For example:

<feed xmlns...>
  <head>
    <link href="http://example.org/feed/"/>
    ...
  </head>
  <entry>
    <id>http://example.org/entry/id</id>
    <link rel="related" href="http://example.org/feed"/>
    ...
  </entry>
</feed>

If resources are viewed as nodes, then http://example.org/feed has two parents. The containment tree is violated. Nonsense. The containment tree tells us what the tail of the link is, not the target of those links. The information is going to have far more relationships just by virtue of the content text, and thus the information will be a graph even if the relations are not made explicit via links. Their containment is only a very small subset of those relationships and cannot be violated by the mere presence of other types of links. The reason this topic came up was because people wanted to know how the format is extensible given the implicit relationship between the feed and entry elements. Containment is the only answer needed because all other relationships are explicitly targeted by a URI. Personally, what I would change in the format is elimination of head and make feed a recursive element. Then there would be no doubt as to the hierarchical relations and feed vacuums can compose multiple feeds to their heart's delight without changing the entries at all. Roy
Re: Madonna
On Feb 16, 2005, at 12:09 AM, Henry Story wrote: Now I would like to show how the Madonna example is relevant to Atom. In a mail recently Roy Fielding pointed to the following passage of the RDF Semantics document: No, you pointed to that passage. I pointed to the RDF Semantics intro. Moreover, just prior to stupidly answering your question, I wrote: "Okay, but only if you promise not to extend this discussion on the atom list ..." and since you chose to ignore that, I cannot sensibly continue this discussion other than to say I was talking about RDF, not graph theory in general (obviously). Roy