Re: Fyi, Apache project proposal
--On May 23, 2006 3:18:18 PM +0200 Ugo Cei <[EMAIL PROTECTED]> wrote: Demokritos might be quite well advanced but unfortunately Python code is not very suited for us poor souls who still have to struggle with java environments ;-) The goal is a reference implementation. The goal is to be exactly correct. Being in a particular language, or even being fast enough to be usable, is beside the point. In particular, a reference implementation should always choose code readability over speed. If the goal is to have a standard, free implementation that everyone uses, that is different from a reference implementation and the goals should say that. wunder -- Walter Underwood Principal Software Architect, Autonomy (Ultraseek)
Re: Atom syndication schema
--On March 15, 2006 4:25:40 PM +1100 Eric Scheid <[EMAIL PROTECTED]> wrote: > Since the original discussion I've stumbled across something extra that > makes xml:lang relevant for atom:name. > > Seems that in writing Hungarian names, the pattern is always surname > followed by forename - e.g. Bartók Béla, where Béla is the personal name and > Bartók is the family name. Or Margittai Neumann János vs. John von Neumann. It can be more complicated than first/last or last/first. I'm pretty sure that I brought this up and the WG decided to punt. Representing personal names well means starting with X.500 and asking around to see what could be improved. That is well outside the Atom charter. Punting was the right thing to do, but it means that atom:name is minimal. xml:lang isn't enough information to sort out given name and family name. About all you can do with atom:name is print it out. xml:lang could be useful in deciding between Chinese and Japanese variants of a character for names. wunder -- Walter Underwood Principal Software Architect, Autonomy
Re: wiki mime type
It isn't "wiki". Those are used in blogs, and I use Markdown for simple HTML memos. Don't use "x-", either. Register a real type. wunder --On March 7, 2006 5:51:42 PM +0100 Henry Story <[EMAIL PROTECTED]> wrote: > > On 6 Mar 2006, at 18:54, James Tauber wrote: >> Agreed that this would be very useful and also that it needs to be >> done >> on a per wiki format basis. > > Is there a forum a large number of them tend to hang out on, so that one > could ask them to think about this? What would be the best to do in the > meantime? Something like > > text/x-wiki+textile > text/x-wiki+markdown > > perhaps? > > >> I think, however, that this is something the format creators should be >> encouraged to register, or at least suggest a convention for. >> >> James >> >> On Mon, 06 Mar 2006 07:59:10 -0800, "Walter Underwood" >> <[EMAIL PROTECTED]> said: >>> >>> --On March 6, 2006 3:59:39 PM +0100 Henry Story >>> <[EMAIL PROTECTED]> >>> wrote: >>>> >>>> Silly question probably, but is there a wiki mime type? >>>> I was thinking of "text/wiki" or "text/x-wiki" or something. >>>> >>>> I want people to be able to edit their blogs in wiki format in >>>> BlogEd and be able >>>> to distinguish when they do that from when they enter plain >>>> text, html or xhtml. >>>> Perhaps this is also useful for the protocol. >>> >>> It would be really useful, especially for feeds that archive the >>> content >>> of a blog. It would be best to use the official names of the formats, >>> like >>> "text/markdown" or "text/textile". The wikis and blogs that I use >>> can be >>> configured to accept different formats, so "text/wiki" doesn't work. >>> >>> wunder >>> -- >>> Walter Underwood >>> Principal Software Architect, Autonomy >>> >> -- >> James Tauber http://jtauber.com/ >> journeyman of some http://jtauber.com/blog/ > > -- Walter Underwood Principal Software Architect, Autonomy
Re: Atom logo where?
--On March 6, 2006 7:02:23 PM +0100 "A. Pagaltzis" <[EMAIL PROTECTED]> wrote: > > For that matter, who has seen Mena Trott’s alternative Atom logo > design and what do people think about it? 1. I don't see why Atom needs a logo. 2. The proposed logo is probably too close to the Autonomy logo. I cannot speak for Autonomy lawyers, but companies are faced with "defend it or lose it" on their trademarks. Autonomy is in the unstructured info business, so there is probably a conflict. It also looks like the logo for the Austin-Bergstrom International Airport, but that doesn't conflict. wunder -- Walter Underwood Principal Software Architect, Autonomy
Re: wiki mime type
--On March 6, 2006 3:59:39 PM +0100 Henry Story <[EMAIL PROTECTED]> wrote: > > Silly question probably, but is there a wiki mime type? > I was thinking of "text/wiki" or "text/x-wiki" or something. > > I want people to be able to edit their blogs in wiki format in BlogEd and be > able > to distinguish when they do that from when they enter plain text, html or > xhtml. > Perhaps this is also useful for the protocol. It would be really useful, especially for feeds that archive the content of a blog. It would be best to use the official names of the formats, like "text/markdown" or "text/textile". The wikis and blogs that I use can be configured to accept different formats, so "text/wiki" doesn't work. wunder -- Walter Underwood Principal Software Architect, Autonomy
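Walter's point, that the concrete format name should be the media type, is easy to sketch. A minimal client-side dispatch in Python, with hypothetical renderer stubs standing in for real converters (note that text/markdown was eventually registered by RFC 7763; text/textile never was):

```python
# Sketch of per-format dispatch in a client like BlogEd, assuming one
# concrete MIME type per wiki format rather than a catch-all "text/wiki".
# The renderer functions are illustrative stand-ins, not real converters.

def render_markdown(src: str) -> str:
    # stand-in for a real Markdown-to-HTML converter
    return "<p>" + src + "</p>"

def render_textile(src: str) -> str:
    # stand-in for a real Textile-to-HTML converter
    return "<p>" + src + "</p>"

RENDERERS = {
    "text/markdown": render_markdown,
    "text/textile": render_textile,
    "text/plain": lambda src: src,
}

def render(media_type: str, src: str) -> str:
    if media_type not in RENDERERS:
        raise ValueError("no renderer for " + media_type)
    return RENDERERS[media_type](src)

print(render("text/markdown", "hello"))  # <p>hello</p>
```

With "text/wiki" the table above would need out-of-band configuration per source; with concrete types the feed itself carries enough information.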
Re: atom:updated handling
It doesn't hurt to point it out. It could catch some developer errors. But it doesn't make an invalid feed. --wunder --On February 15, 2006 4:25:35 PM -0800 James M Snell <[EMAIL PROTECTED]> wrote: > > I personally think that the feedvalidator is being too anal about > updated handling. Entries with the same atom:id value MUST have > different updated values, but the spec says nothing about entries with > different atom:id's. > > - James > > James Yenne wrote: >> I'm using the feedvalidator.org to validate a feed with entries >> containing atom:updated that may have the same datetime, although >> different atom:id. The validator complains that two entries cannot have >> the same value for atom:updated. I generate these feeds and the >> generator uses the current datetime, which may be exactly the same. I >> don't understand why the validator should care about these >> updated values from different entries per atom:id - these are totally >> unrelated entries. Is the validator wrong? It seems that otherwise I >> have to play tricks to make these entries have different updated within >> the feed. >> >> I'm not sure how this relates to the thread "More on atom:id handling" >> >> Thanks, >> James > > -- Walter Underwood Principal Software Architect, Autonomy
Re: atom:updated handling
--On February 15, 2006 4:07:35 PM -0800 James Yenne <[EMAIL PROTECTED]> wrote: > > I'm using the feedvalidator.org to validate a feed with entries containing > atom:updated that may have the same datetime, although different atom:id. > The validator complains that two entries cannot have the same value for > atom:updated. I got the same spurious warning. My feed is search results, so it is perfectly OK for them to have the same atom:updated. It is OK for the validator to point this out, but it should be informational, not a warning. wunder -- Walter Underwood Principal Software Architect, Autonomy
Re: [Fwd: Re: todo: add language encoding information]
--On December 23, 2005 11:31:22 PM +0100 Henry Story <[EMAIL PROTECTED]> wrote: > > So you can't have a link pointing from an entry to an id, without losing some > very important information. We need something more specific. We need a link > pointing from A to C as shown by the blue line. Some people will need that in the guts of their publishing system. Why do we need it in Atom? Is there something essential that subscribers cannot do because this isn't represented? This sounds like something needed for the publishing/translation workflow, not for the general readership. Extended provenance information is sometimes needed, but there is almost no limit to that. It certainly does not stop at translation, source, and translator. I'm reading a new translation of Andersen's tales where "Thumbelina" is "Inchelina" because the translator knew the right dialect of Danish. That is significant, but does it need to be in Atom? The semantics here should be exactly the same as for dates -- the date means what the publisher thinks it means. Same for language info. Trying to get more exact means that the model will be wrong for some publishers that generate completely legal Atom. wunder -- Walter Underwood Principal Software Architect, Verity
Re: ACE - Atom Common Extensions Namespace
--On October 2, 2005 9:35:28 AM +0200 Anne van Kesteren <[EMAIL PROTECTED]> wrote: > > Having a file and folder of the same name is not technically possible. > (Although > you could emulate the effect of course with some mod_rewrite.) Namespaces aren't files, only names. So the limitations of some particular file name implementation are meaningless for namespaces. Also, some filesystem implementations do allow a file and a folder with the same name. wunder -- Walter Underwood Principal Software Architect, Verity
Re: Arr! Avast me hearties!
I think we just got a nomination for an April 1 RFC. Nice job. More accurate than the x-hacker locale on Google, because that is really still English, not some other "hacker" language. Besides, they didn't make the spell suggest work in l33t. wunder --On September 20, 2005 3:09:56 AM +0100 James Holderness <[EMAIL PROTECTED]> wrote: > A conforming client SHOULD perform an HTTP request for the feed with the > Accept-Language header set to "en-pirate" (or whatever the standard RFC 3066 > language tag for the pirate dialect of English). A conforming server SHOULD > return the pirate version of the feed with the Content-Language header set to > "en-pirate" and/or the xml:lang attribute set to "en-pirate" in the root > element. -- Walter Underwood Principal Software Architect, Verity
Re: "Top 10" and other lists should be entries, not feeds.
--On August 30, 2005 3:50:45 PM -0600 Peter Saint-Andre <[EMAIL PROTECTED]> wrote: >> One could read that to mean that feeds are fundamentally unordered or that >> Atom doesn't say what the order means. > > Is not logical order, if any, determined by the datetime of the published > (or updated) element? That is one kind of order. Other kinds are relevance to a search term (A9 OpenSearch), editorial importance (BBC News feeds), or datetime of original publication (nearly all blog feeds, not the same as last update). wunder -- Walter Underwood Principal Software Architect, Verity
Re: "Top 10" and other lists should be entries, not feeds.
--On August 30, 2005 3:50:45 PM -0600 Peter Saint-Andre <[EMAIL PROTECTED]> wrote: >> Otherwise, it is not possible to go from Atom to RSS 1.0. > > I assume you mean from RSS 1.0 to Atom. :-) No. You can go from a Bag to List by ignoring the order. RSS 1.0 is a List, so you would need to invent an order to put unordered items in it. wunder -- Walter Underwood Principal Software Architect, Verity
Re: "Top 10" and other lists should be entries, not feeds.
--On August 30, 2005 1:49:57 AM -0400 Bob Wyman <[EMAIL PROTECTED]> wrote: > I’m sorry, but I can’t go on without complaining. Microsoft has proposed > extensions which turn RSS V2.0 feeds into lists and we’ve got folk who are > proposing much the same for Atom (i.e. stateful, incremental or partitioned > feeds)… I think they are wrong. Feeds aren’t lists and Lists aren’t feeds. The Atom spec says: This specification assigns no significance to the order of atom:entry elements within the feed. One could read that to mean that feeds are fundamentally unordered or that Atom doesn't say what the order means. Other RSS formats are ordered, either implicitly or explicitly (RSS 1.0). For interoperability, lots of software is going to treat Atom as ordered. Otherwise, it is not possible to go from Atom to RSS 1.0. > What is a search engine or a matching engine supposed to return as a result > if it finds a match for a user query in an entry that comes from a list-feed? Maybe the list feed should have a noindex flag. > Should it return the entire feed or should it return just the entry/item > that contained the stuff in the users’ query? I'd return the entry. It is all about the entries. If the list position is semantically important to the entry, then include a link from the entry to the list. "This is movie 312 in wunder's queue." wunder -- Walter Underwood Principal Software Architect, Verity
Re: Don't Aggregrate Me
--On August 29, 2005 7:05:09 PM -0700 James M Snell <[EMAIL PROTECTED]> wrote: > x:index="no|yes" doesn't seem to make a lot of sense in this case. It makes just as much sense as it does for HTML files. Maybe it is a whole group of Atom test cases. Maybe it is a feed of reboot times for the server. wunder -- Walter Underwood Principal Software Architect, Verity
Re: Don't Aggregrate Me
--On August 30, 2005 11:39:04 AM +1000 Eric Scheid <[EMAIL PROTECTED]> wrote: > > Someone wrote up "A Robots Processing Instruction for XML Documents" > http://atrus.org/writings/technical/robots_pi/spec-199912__/ > That's a PI though, and I have no idea how well supported they are. I'd > prefer a namespaced XML vocabulary. That was me. I think it makes perfect sense as a PI. But I think reuse via namespaces is oversold. For example, we didn't even try to use Dublin Core tags in Atom. PI support is required by the XML spec -- "must be passed to the application." wunder -- Walter Underwood Principal Software Architect, Verity
Re: Don't Aggregrate Me
--On Monday, August 29, 2005 10:39:33 AM -0600 Antone Roundy <[EMAIL PROTECTED]> wrote: > As has been suggested, to "inline images", we need to add frame documents, > stylesheets, Java applets, external JavaScript code, objects such as Flash > files, etc., etc., etc. The question is, with respect to feed readers, do > external feed content (), enclosures, etc. fall into the same exceptions > category or not? Of course a feed reader can read the feed, and anything required to make it readable. Duh. And all this time, I thought robots.txt was simple. robots.txt is a polite hint from the publisher that a robot (not a human) probably should avoid those URLs. Humans can do any stupid thing they want, and probably will. The robots.txt spec is silent on what to do with URLs manually added to a robot. The normal approach is to deny those, with a message that they are disallowed by robots.txt, and offer some way to override that. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: Don't Aggregrate Me
--On August 26, 2005 9:51:10 AM -0700 James M Snell <[EMAIL PROTECTED]> wrote: > Add a new link rel="readers" whose href points to a robots.txt-like file that > either allows or disallows the aggregator for specific URI's and establishes > polling rate preferences > > User-agent: {aggregator-ua} > Origin: {ip-address} > Allow: {uri} > Disallow: {uri} > Frequency: {rate} [{penalty}] > Max-Requests: {num-requests} {period} [{penalty}] No, on several counts. 1. Big, scalable spiders don't work like that. They don't do aggregate frequencies or rates. They may have independent crawlers visiting the same host. Yes, they try to be good citizens, but you can't force WWW search folk to redesign their spiders. 2. Frequencies and rates don't work well with either HTTP caching or with publishing schedules. Things are much cleaner with a single model (max-age and/or expires). 3. This is trying to be a remote-control for spiders instead of describing some characteristic of the content. We've rejected the remote control approach in Atom. 4. What happens when there are conflicting specs in this file, in robots.txt, and in a Google Sitemap? 5. Specifying all this detail is pointless if the spider ignores it. You still need to have enforceable rate controls in your webserver to handle busted or bad citizen robots. 6. Finally, this sort of thing has been proposed a few times and never caught on. By itself, that is a weak argument, but I think the causes are pretty strong (above). There are some proprietary extensions to robots.txt: Yahoo crawl-delay: <http://help.yahoo.com/help/us/ysearch/slurp/slurp-03.html> Google wildcard disallows: <http://www.google.com/remove.html#images> It looks like MSNbot does crawl-delay and an extension-only wildcard: <http://search.msn.com/docs/siteowner.aspx?t=SEARCH_WEBMASTER_REF_RestrictAccessToSite.htm> wunder -- Walter Underwood Principal Software Architect, Verity
Re: Don't Aggregrate Me
I'm adding robots@mccmedia.com to this discussion. That is the classic list for robots.txt discussion. Robots list: this is a discussion about the interactions of /robots.txt and clients or robots that fetch RSS feeds. "Atom" is a new format in the RSS family. --On August 26, 2005 8:39:59 PM +1000 Eric Scheid <[EMAIL PROTECTED]> wrote: > While true that each of these scenarios involve crawling new links, > the base principle at stake is to prevent harm caused by automatic or > robotic behaviour. That can include extremely frequent periodic re-fetching, > a scenario which didn't really exist when robots.txt was first put together. It was a problem then: In 1993 and 1994 there have been occasions where robots have visited WWW servers where they weren't welcome for various reasons. Sometimes these reasons were robot specific, e.g. certain robots swamped servers with rapid-fire requests, or retrieved the same files repeatedly. In other situations robots traversed parts of WWW servers that weren't suitable, e.g. very deep virtual trees, duplicated information, temporary information, or cgi-scripts with side-effects (such as voting). <http://www.robotstxt.org/wc/norobots.html> I see /robots.txt as a declaration by the publisher (webmaster) that robots are not welcome at those URLs. Web robots do not solely depend on automatic link discovery, and haven't for at least ten years. Infoseek had a public "Add URL" page. /robots.txt was honored regardless of whether the link was manually added or automatically discovered. A crawling service (robot) should warn users that the URL, Atom or otherwise, is disallowed by robots.txt. Report that on the status page for that feed. wunder -- Walter Underwood Principal Software Architect, Verity
Re: Don't Aggregrate Me
There are no wildcards in /robots.txt, only path prefixes and user-agent names. There is one special user-agent, "*", which means "all". I can't think of any good reason to always ignore the disallows for *. I guess it is OK to implement the parts of a spec that you want. Just don't answer "yes" when someone asks if you honor robots.txt. A lot of spiders allow the admin to override /robots.txt for specific sites, or better, for specific URLs. wunder --On August 25, 2005 11:47:18 PM -0500 "Roger B." <[EMAIL PROTECTED]> wrote: > > Bob: It's one thing to ignore a wildcard rule in robots.txt. I don't > think its a good idea, but I can at least see a valid argument for it. > However, if I put something like: > > User-agent: PubSub > Disallow: / > > ...in my robots.txt and you ignore it, then you very much belong on > the Bad List. > > -- > Roger Benningfield > > -- Walter Underwood Principal Software Architect, Verity
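The per-agent matching Walter describes (path prefixes, named agents, and the special "*" agent) is exactly what the standard-library parser implements. A small sketch using Roger's example rules; the agent and host names here are just illustrations:

```python
from urllib import robotparser

# Path-prefix disallows per user-agent, as in Roger's example. "PubSub"
# is named explicitly; "*" covers everyone else.
rules = """\
User-agent: PubSub
Disallow: /

User-agent: *
Disallow: /private/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("PubSub", "http://example.com/feed.atom"))    # False
print(rp.can_fetch("OtherBot", "http://example.com/feed.atom"))  # True
print(rp.can_fetch("OtherBot", "http://example.com/private/x"))  # False
```

A robot named in its own record is bound by that record; everyone else falls through to the "*" record, which is why silently skipping "*" means you are not really honoring robots.txt.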
Re: Don't Aggregrate Me
I would call desktop clients "clients" not "robots". The distinction is how they add feeds to the polling list. Clients add them because of human decisions. Robots discover them mechanically and add them. So, clients should act like browsers, and ignore robots.txt. Robots.txt is not very widely deployed (around 5% of sites), but it does work OK for general web content. wunder --On August 25, 2005 10:25:08 PM +0200 Henry Story <[EMAIL PROTECTED]> wrote: > > Mhh. I have not looked into this. But is not every desktop aggregator a > robot? > > Henry > > On 25 Aug 2005, at 22:18, James M Snell wrote: >> At the very least, aggregators should respect robots.txt. Doing so >> would allow publishers to restrict who is allowed to pull their feed. >> >> - James >> > > -- Walter Underwood Principal Software Architect, Verity
Re: Don't Aggregrate Me
--On August 25, 2005 3:43:03 PM -0400 Karl Dubost <[EMAIL PROTECTED]> wrote: > Le 05-08-25 à 12:51, Walter Underwood a écrit : >> /robots.txt is one approach. Wouldn't hurt to have a recommendation >> for whether Atom clients honor that. > > Not many honor it. I'm not surprised. There seems to be a new generation of robots that hasn't learned much from the first generation. The Robots mailing list is silent these days. That is why we should make a recommendation about it. wunder -- Walter Underwood Principal Software Architect, Verity
Re: Don't Aggregrate Me
I can see reasonable uses for this, like marking a feed of local disk errors as not of general interest. I would not be surprised to see RSS/Atom catch on for system monitoring. Search engines see this all the time -- just because it is HTML doesn't mean it is the primary content on the site. Log analysis reports are one good example. /robots.txt is one approach. Wouldn't hurt to have a recommendation for whether Atom clients honor that. A long time ago, I proposed a Robots PI, similar to the Robots meta tag. That would get around the "only webmaster can edit" problem with /robots.txt. The Robots PI did not catch on, but I've still got the proposal somewhere. wunder --On August 24, 2005 11:25:12 PM -0700 James M Snell <[EMAIL PROTECTED]> wrote: > > Up to this point, the vast majority of use cases for Atom feeds is the > traditional syndicated content case. A bunch of content updates that are > designed to be distributed and aggregated within Feed readers or online > aggregators, etc. But with Atom providing a much more flexible content model > that allows for data that may not be suitable for display within a feed > reader or online aggregator, I'm wondering what the best way would be for a > publisher to indicate that a feed should not be aggregated? > > For example, suppose I build an application that depends on an Atom feed > containing binary content (e.g. a software update feed). I don't really want > aggregators pulling and indexing that feed and attempting to display it > within a traditional feed reader. What can I do? > > Does the following work? > > > ... > no > > > Should I use a processing instruction instead? > > > > ... > > > I dunno. What do you all think? Am I just being silly or does any of this > actually make a bit of sense? > > - James > > -- Walter Underwood Principal Software Architect, Verity
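A Robots PI of the kind Walter proposed would sit in the XML prolog, analogous to the Robots META tag in HTML. A minimal sketch of reading one with the standard library; the "robots" target and its pseudo-attributes are illustrative, not a registered standard:

```python
from xml.dom import minidom

# A hypothetical robots processing instruction in the prolog, in the
# spirit of the Robots META tag. Neither the target name nor the
# pseudo-attributes are standardized; this just shows the mechanics.
doc_text = """<?xml version="1.0"?>
<?robots index="no" follow="yes"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>reboot times</title></feed>
"""

doc = minidom.parseString(doc_text)
pis = [node.data for node in doc.childNodes
       if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
       and node.target == "robots"]
print(pis)  # ['index="no" follow="yes"']
```

Because a PI lives in the prolog, it works for any XML document and does not need the "only webmaster can edit" access that /robots.txt requires.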
Re: If you want "Fat Pings" just use Atom!
--On August 23, 2005 9:40:44 AM +0300 Henri Sivonen <[EMAIL PROTECTED]> wrote: > There's nothing in the XML spec requiring the app to throw away the data > structures it has already built when the parser reports the error. There is also nothing requiring it. It is optional. The only required behavior is to report the error and stop creating parsed information. Otherwise, "results are undefined" according to the spec. The spec does require that normal processing stop at the error. The parser can make data past the error available, but it "must not continue to pass character data and information about the document's logical structure to the application in the normal way". This still feels like a hack to me. An unterminated document is not well-formed, and is not XML or Atom. Doing this should require another RFC that says, "we didn't really mean that it had to be XML." wunder -- Walter Underwood Principal Software Architect, Verity
Re: If you want "Fat Pings" just use Atom!
--On August 23, 2005 12:01:11 PM +0900 Martin Duerst <[EMAIL PROTECTED]> wrote: > > Well, modulo character encoding issues, that is. An FF will > look differently in UTF-16 than in ASCII-based encodings. Fine. Use two NULs. That is either one illegal UTF-16(BE or LE) character or two illegal characters in ASCII or UTF-8. Of course, a transport level multi-payload system would be preferred. wunder -- Walter Underwood Principal Software Architect, Verity
Re: If you want "Fat Pings" just use Atom!
--On August 22, 2005 2:01:45 PM -0400 Joe Gregorio <[EMAIL PROTECTED]> wrote: > Interestingly enough the FF separated entries method would also work > when storing a large quantity of entries in a single flat file where > appending an entry needs to be fast. The original application was logfiles in XML. wunder -- Walter Underwood Principal Software Architect, Verity
Re: If you want "Fat Pings" just use Atom!
--On August 22, 2005 12:36:17 AM -0400 Sam Ruby <[EMAIL PROTECTED]> wrote: > With a HTTP client library and SAX, the "absolute simplest solution" is > what Bob is describing: a single document that never completes. Except that an endless document can't be legal XML, because XML requires the root element to balance. An endless document never closes it. So, the endless document cannot be legal Atom. Worse, there is no chance for error recovery. One error, and the rest of the stream might not be parsable. So, it is simple, but busted. The standard trick here is to use a sequence of small docs, separated by ASCII form-feed characters. That character is not legal within an XML document, so it allows the stream to resynchronize on that character. Besides, form-feed actually has almost the right semantics -- start a new page. wunder -- Walter Underwood Principal Software Architect, Verity
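The form-feed framing is easy to demonstrate: each frame is a complete little XML document, and because U+000C cannot appear in well-formed XML, a damaged frame is dropped and parsing resumes at the next separator. A sketch with illustrative sample data:

```python
import xml.etree.ElementTree as ET

# Form-feed-separated stream of small XML documents. One frame is
# deliberately truncated to show that the error stays contained.
stream = (
    "<entry><id>urn:a</id></entry>\f"
    "<entry><id>urn:b</id>"              # truncated frame: parse error
    "\f<entry><id>urn:c</id></entry>"
)

good, bad = [], 0
for frame in stream.split("\f"):
    try:
        good.append(ET.fromstring(frame))
    except ET.ParseError:
        bad += 1  # drop the frame and resynchronize at the next form feed

print(len(good), bad)  # 2 1
```

Contrast this with the single endless document: there, the truncated entry would have poisoned everything after it.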
Re: FYI: Expires Extension Draft
RSS 3? Eh? The RSS ttl element is a mess. RSS 3 Lite (could we spell that word correctly?) specifies it not as information about the feed, but as an attempt to remotely control robots. RSS 2 specifies it as a caching hint, but in minutes, not seconds. Regardless, it is useless for a feed with a dedicated update schedule, because it requires updating the feed every second (or minute) as the publish time approaches. For more detail, see: <http://www.intertwingly.net/wiki/pie/PaceCaching> That was a proposal, and is *not* part of Atom, but it does have some useful discussion of cache hints. For caching, use the native HTTP cache features. wunder --On August 18, 2005 2:20:21 PM -0400 Elias Torres <[EMAIL PROTECTED]> wrote: > > I tried commenting on your site, but I have to register to comment. :-( > > You linked to RSS3 [1] and I spotted something related to this > extension that could be used instead. > > 7 > > It seems more elegant than having to convert to whatever you specified > in your spec. > > Just a thought. > > Elias > > > [1] http://www.rss3.org/rss3lite.html > > On 8/17/05, James M Snell <[EMAIL PROTECTED]> wrote: >> >> http://www.ietf.org/internet-drafts/draft-snell-atompub-feed-expires-00.txt >> >> Example: >> >> >> ... >> 2005-08-16T12:00:00Z >> ... >> >> >> or >> >> >> ... >> 2005-08-16T12:00:00Z >> 2 >> ... >> >> >> This is not to be used for caching of Atom documents; nor is it to be >> used as a mechanism for scheduling updates of local copies of Atom >> documents. >> >> - James >> >> > > -- Walter Underwood Principal Software Architect, Verity
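"Use the native HTTP cache features" in practice means the client computes a freshness lifetime from the response headers. A simplified sketch of those rules (Cache-Control max-age takes precedence over Expires; real HTTP caching has more cases than this):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers, response_time):
    """Seconds a response may be served from cache. Simplified from the
    HTTP caching rules: Cache-Control max-age wins over Expires."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name == "max-age":
            return float(value)
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return (expires - response_time).total_seconds()
    return 0.0

now = datetime(2005, 8, 18, 12, 0, tzinfo=timezone.utc)
print(freshness_lifetime({"Cache-Control": "max-age=3600"}, now))  # 3600.0
```

Unlike ttl, this handles the scheduled-publication case naturally: the server just sets Expires to the next publish time, and never has to rewrite the feed body as the deadline approaches.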
Re: Spec explanations for Pebble?
--On August 13, 2005 8:34:49 AM + Simon Brown <[EMAIL PROTECTED]> wrote: > If Tim *moves* his blog to www.timbray.com/ongoing, would you expect his Atom > IDs to remain the same? Spec aside, this has some implications for storing > Atom > IDs next to content they identify, which I imagine doesn't happen in most CMS > tools at the moment. Of course they stay the same. At the risk of being rude, "duh". It is an ID, not an href. ID, ID, ID. If we need to clarify the spec further, though, let's do it now. I don't mind specifically saying that the ID stays the same when content is relocated. wunder -- Walter Underwood Principal Software Architect, Verity
Re: Spec explanations for Pebble?
--On August 12, 2005 6:52:28 AM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: > > Except for, a bunch of blogs might agree to share a categorization > scheme, so probably not "unique to each blog". For example, as libraries start delivering literature monitoring with feeds, we'll see LCSH or some other standard category system in those. wunder -- Walter Underwood Principal Software Architect, Verity
Re: Finishing up on whitespace in IRIs and dates
--On August 11, 2005 9:04:21 PM -0700 Paul Hoffman <[EMAIL PROTECTED]> wrote: > Note that there MUST be no whitespace in a Date construct or in any IRI. Some > XML-emitting implementations erroneously insert whitespace around values by > default, and such implementations will emit invalid Atom. Nice clear wording. +1 with "MUST be no" changed to "MUST NOT be", as suggested by Aristotle. wunder -- Walter Underwood Principal Software Architect, Verity
Re: Expires extension draft (was Re: Feed History -02)
--On August 10, 2005 1:56:05 PM +1000 Eric Scheid <[EMAIL PROTECTED]> wrote: > Aside: a perfect example of what sense of 'expires' is in the I-D itself... > > Network Working Group > Internet-Draft > Expires: January 2, 2006 Especially perfect because the HTTP header does not reflect the expiration. Honestly, another reason to put expiration inside the feed is that HTTP caching is just not used. Well, except to force reloads and show you new ads. But it is extremely rare to see per-document cache information. wunder -- Walter Underwood Principal Architect, Verity
Re: Feed History -02
--On August 9, 2005 9:28:52 AM -0700 James M Snell <[EMAIL PROTECTED]> wrote: >> I made some proposals for cache control info (expires and max-age). >> That might work for this. >> > I missed these proposals. I've been giving some thought to an expires > and max-age extension myself and was getting ready to write up a draft. > Expires is a simple date construct specifying the exact moment (inclusive) > that the entry/feed expires. Max-age is a non negative integer specifying > the number of milliseconds (inclusive) from the moment specified by > atom:updated when the entry/feed expires. The two cannot appear together > within a single entry/feed and follow the same basic rules as atom:author elements. Here it is: <http://www.intertwingly.net/wiki/pie/PaceCaching> Adding max-age also means defining IntegerConstruct and disallowing white space around it. Formerly, it was OK as a text construct, but the white space issues change that. Also, we should decide whether cache information is part of the signature. I can see arguments either way. wunder -- Walter Underwood Principal Architect, Verity
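The max-age arithmetic James describes is simple date math: the expiry moment is atom:updated plus the max-age duration. A sketch under the draft's stated semantics (element names and the milliseconds unit come from James's description, not a finished spec):

```python
from datetime import datetime, timedelta

# Hypothetical expires/max-age computation from the draft under
# discussion: max-age counts milliseconds from atom:updated.
updated = "2005-08-16T12:00:00Z"   # atom:updated value
max_age_ms = 7200000               # hypothetical max-age: two hours

base = datetime.strptime(updated, "%Y-%m-%dT%H:%M:%SZ")
expires = base + timedelta(milliseconds=max_age_ms)
print(expires.isoformat() + "Z")  # 2005-08-16T14:00:00Z
```

Note the integer parse is where the IntegerConstruct whitespace question bites: "7200000" with surrounding whitespace would need stripping before int() could be applied strictly.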
Re: Feed History -02
--On August 9, 2005 1:07:29 PM +0200 Henry Story <[EMAIL PROTECTED]> wrote: > > But I would really like some way to specify that the next feed document is an > archive (ie. won't change). This would make it easy for clients to know when > to stop following the links, ie, when they have cought up with the changes > since they last looked at the feed. I made some proposals for cache control info (expires and max-age). That might work for this. wunder -- Walter Underwood Principal Architect, Verity
Re: spec bug: can we fix for draft-11?
--On August 4, 2005 9:31:55 AM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: > > So for now, I'm -1 on any weakening or removing "The element's content MUST be > an IRI" or analogous text in any other section. I'll stop shouting if I'm in > a small minority here. -Tim Wow, this thread has made my "away on vacation" mailbox fatter. I strongly favor making white space around IRIs illegal in Atom, whether they are an ID or somewhere else. Same for dates. This follows the robustness principle, where we are conservative in what we generate. Atom processors are free to be liberal in what they accept, so they can strip whitespace. Or not, I don't care. Note that a feed with whitespace around an IRI can never be aggregated into another feed, because a) the ID IRI cannot be changed, and b) the new feed cannot contain whitespace. Making every single processor strip whitespace smells too much like the HTML tag soup processors that we all have to maintain. Yuk. wunder -- Walter Underwood Principal Architect, Verity
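A conservative generator or validator can flag the problem instead of silently papering over it. A sketch of such a check; the element list and sample feed are illustrative:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Elements whose content is an IRI or a Date construct and so, under the
# strict reading, must not carry surrounding whitespace.
STRICT = {ATOM + "id", ATOM + "updated", ATOM + "published"}

feed = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<id>  http://example.org/feed  </id>"
    "<updated>2005-08-04T09:31:55Z</updated>"
    "</feed>"
)

# Flag, rather than strip: the conservative-in-what-you-generate side
# of the robustness principle.
bad = [el.tag for el in feed.iter()
       if el.tag in STRICT and (el.text or "") != (el.text or "").strip()]
print(bad)  # ['{http://www.w3.org/2005/Atom}id']
```

The check is trivial for generators to run; requiring every consumer to strip instead is the tag-soup path.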
Re: FormatTests
--On July 17, 2005 3:45:26 PM +0100 Graham <[EMAIL PROTECTED]> wrote: > Now do you see why canonical ids are stupid and irrelevant? Not unless the robustness principle is stupid and irrelevant. Canonical IDs are more robust. Feeds that use them will work better in the quick-and-dirty, "Desperate Perl Hacker" environment of the internet. The updated warning is just right. Thank you for using Atom, here is how you can do a better job. wunder -- Walter Underwood Principal Architect, Verity
Re: Evangelism, etc.
--On July 16, 2005 11:16:44 AM -0400 Robert Sayre <[EMAIL PROTECTED]> wrote: > I found the criticism pathetic. A little lame, at least. You can't add precision and interoperability with innovation and extension. But there is a point buried under all that. What are the changes required to support Atom? It looks complicated, but how hard is it? Here is a shot at that information. For publishers, you need to be precise about the content. There are fallbacks, where if it is any sort of HTML, send it as HTML, and if it isn't, send it as text. The XHTML and XML options are there for extra control. Also, add an ID. It is OK for this to be a URL to the article as long as it doesn't change later. That is, the article can move to a different URL, but keep the ID the same. Add a modified date. The software probably already has this, and you can fall back to the file last-modified if you have to. But if there is a better date available, use it. The ID and date are required because they allow Atom clients and aggregators to "get it right" when tracking entries, either in the same feed or when the same entry shows up in multiple feeds. Extending Atom is different from extending RSS, because there are more options. The mechanical part of extensions is covered in the spec, to guarantee that an Atom feed is still interoperable when it includes extensions. The political part of extensions has two options: free innovation and standardization. Anyone can write an extension to Atom and use it. Or, they can propose a standard to the IETF (or another body). The standards process usually means more review, more interoperability, and more delay in deploying it. Sometimes, the delay is worth it, and we hope that is true for Atom. wunder -- Walter Underwood Principal Architect, Verity
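The publisher checklist above (a permanent ID, a modified date, an explicit content type) can be sketched with Python's standard ElementTree. The element names come from the Atom format draft; the ID and date values are invented examples, not anything from the spec.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"  # Atom 1.0 namespace
ET.register_namespace("", ATOM)

# A minimal entry following the checklist: an ID that never changes,
# an updated date, and content with an explicit type.
entry = ET.Element(f"{{{ATOM}}}entry")
# Hypothetical permanent ID -- a URL is fine as long as it never changes.
ET.SubElement(entry, f"{{{ATOM}}}id").text = "http://example.org/2005/07/16/entry"
ET.SubElement(entry, f"{{{ATOM}}}updated").text = "2005-07-16T11:16:44Z"
content = ET.SubElement(entry, f"{{{ATOM}}}content")
content.set("type", "html")  # "if it is any sort of HTML, send it as HTML"
content.text = "<p>Hello</p>"

xml = ET.tostring(entry, encoding="unicode")
```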
Re: The Atomic age
--On July 14, 2005 11:37:05 PM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: > > So, implementors... to work. Do we have a list of who is implementing it? That could be used in the "Deployment" section of <http://www.tbray.org/atom/RSS-and-Atom>. Ultraseek will implement Atom. We need to think more about exactly what it means for a search engine to implement it, but we'll at least spider it. wunder "Creature with the Atom Brain, why is he acting so strange?" Roky Erickson -- Walter Underwood Principal Architect, Verity
Re: Mystery abbreviations in draft 9
--On July 6, 2005 11:05:33 AM -0700 Paul Hoffman <[EMAIL PROTECTED]> wrote: > > Spelling out the abbreviations as "Unicode Normalization Form C" and "Unicode > Normalization Form KC" is fine; referencing them is *not*. A reference to the > Unicode Standard inherently points to a particular version, and the tables > used > for NFC and NFKC change from version to version. XML already has normative references to Unicode, so we can't exactly avoid those without dropping XML. Of course, the correct choice of normalization rules for atom:id is not the ones from the XML spec, but the IRI rules from RFC 3987. We really could end up with two sets of "standard" normalization rules in one document, one for XML and one for atom:id URIs, so I think it is worth pointing to RFC 3987 for indirect references to NFC/NFKC. Without clarification, this is a legitimate chance for confusion. wunder -- Walter Underwood Principal Architect, Verity
Mystery abbreviations in draft 9
In 4.2.6 atom:id, the last sentence is: o Ensure that all components of the IRI are appropriately character- normalized, e.g. by using NFC or NFKC. "NFC" and "NFKC" need to be defined, with a reference to the Unicode spec. wunder -- Walter Underwood Principal Architect, Verity
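For readers who don't know the abbreviations either: NFC composes combining character sequences into precomposed characters, and NFKC additionally folds compatibility characters. A quick sketch with Python's standard unicodedata module:

```python
import unicodedata

# "é" written as base letter + combining accent vs. the precomposed form.
decomposed = "e\u0301"   # 'e' + COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE

# The two spellings are different code point sequences...
assert decomposed != precomposed
# ...but NFC maps them to the same canonical form.
assert unicodedata.normalize("NFC", decomposed) == precomposed

# NFKC additionally folds compatibility characters, e.g. the "fi" ligature.
assert unicodedata.normalize("NFKC", "\ufb01") == "fi"
```

This is why two IRIs that print identically can fail a byte-for-byte comparison unless both sides agree on a normalization form.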
RE: Roll-up of proposed changes to atompub-format section 5
--On Tuesday, July 05, 2005 11:48:44 AM -0700 Paul Hoffman <[EMAIL PROTECTED]> wrote: At 2:24 PM -0400 7/5/05, Bob Wyman wrote: I find it hard to imagine what harm could be done by providing this recommendation. Timing. If we change text other than because of an IESG note, there is a strong chance we will have to delay being finalized by two weeks, possibly more. I'm fine with the delay. Two or three weeks on top of 18 months is not a big deal. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: Roll-up of proposed changes to atompub-format section 5
--On Tuesday, July 05, 2005 10:45:29 AM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: Still -1, despite Bob's arguments, at least in part because we have no idea what kind of applications are going to be using signed entries and we shouldn't try to micromanage a future we don't understand. -Tim I'm +1, because this is a "when features collide!" issue for Atom. We don't have to make it a SHOULD or a MUST, just point out that signed entries need to be standalone if they will ever be used outside of their feed context. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: Roll-up of proposed changes to atompub-format section 5
--On July 5, 2005 9:53:42 AM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: >>> >> Bob can clarify exactly what he means but from my perspective it >> comes down to an aggregation problem. If a signature is generated >> over an entry that does not contain an author element or a source >> element, that entry cannot be re-enveloped into an aggregate feed >> that does not contain a top level author element without breaking >> the signature > > Well, yes. Anyone who understands digsig, even someone such as myself with > only a surface knowledge, can see this. You can't change a signed object > without breaking the sig, that's the point. If I want to sign an entry and > also want to make it available for aggregation then yes, I'd better put in > an atom:source. But this is inherent in the basic definition of digsig; not > something we need to call out. -Tim But it is an interoperability consequence of the Atom format and cascaded values. It would be worth commenting that signed entries need to be standalone in order to be aggregated in another feed and keep their signature. wunder -- Walter Underwood Principal Architect, Verity
Re: Clearing a "discuss" vote on the Atom format
--On July 1, 2005 4:44:23 PM +0900 Martin Duerst <[EMAIL PROTECTED]> wrote: > > The reason for this is to make sure we have interoperability > with a mandatory-to-implement (and default-to-use) canonicalization, > but that we don't disallow other canonicalizations that for one > or the other as of now not yet clear reason may be preferable in > some cases in the future (but in your wording would prohibit > the result to be called Atom at all). A potential future reason that we can't even characterize isn't enough reason for me to support this. If we discover weaknesses in the canonicalization, we'll need to change Atom anyway. Explicitly making room for future incompatible canonicalizations doesn't make any sense to me. What is the point of calling something "Atom" when it uses a canonicalization which prevents interop with legal Atom implementations? wunder -- Walter Underwood Principal Architect, Verity
Re: Google Sitemaps: Yet another "RSS" or site-metadata format and Atom "competitor"
--On June 7, 2005 3:17:04 AM -0700 gstein <[EMAIL PROTECTED]> wrote: > > "proprietary" connotes closed. We published the spec and encourage > other search engines to use it. There is no intent to close or control it. "Proprietary" means "owned". Google clearly owns "Google Sitemaps". The license requires derivative works to keep the same license. That is control. It was designed in isolation, for Google's use. That is a closed spec. For example, the priority element is not specified well enough for another engine to implement it compatibly. Does it apply to ranking, crawl order or duplicate preference? An open process would have at least looked at the proposed extensions for robots.txt and earlier formats like Infoseek sitelist.txt. wunder -- Walter Underwood Principal Architect, Verity
RE: Google Sitemaps: Yet another "RSS" or site-metadata format and Atom "competitor"
--On June 3, 2005 6:48:31 PM -0400 Bob Wyman <[EMAIL PROTECTED]> wrote: > I do still think it unfortunate that Google felt compelled to invent > yet-another-format for Sitemaps. Yep. The could have used good ol' Infoseek sitelist.txt. Here is a copy from eight years ago: <http://web.archive.org/web/19970529104229/http://software.infoseek.com/products/ultraseek/docs/sitelist.html> wunder -- Walter Underwood Principal Architect, Verity
Re: OpenSearch RSS
--On Tuesday, May 31, 2005 09:46:39 AM +0100 James Aylett <[EMAIL PROTECTED]> wrote: We were also a little concerned that the OpenSearch model was very simplistic ... ... This is kind of orthogonal to the OpenSearch issue, but if people are interested in discussing a richer search extension we can try to clear some time to pull it into shape. That was my feeling, too. OpenSearch is so limited that it is not very interesting. That's too bad, because most of the hard design work was done nearly ten years ago at the STARTS project at Stanford. A couple of years ago, we put together a fairly general search web service. It is time to update that, so maybe I'll look at doing it on Atom. STARTS is here: <http://www-db.stanford.edu/~gravano/starts_home.html> wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: Last and final consensus pronouncement
The atom:author element name is embarrassing. Make it atom:creator. There were no objections to that. wunder --On May 26, 2005 10:26:54 AM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: > > > On behalf of Paul and myself: This is it. The initial phase of the WG's > work in designing the Atompub data format specification is finished over, > pining for the fjords, etc. Please everyone reach around and pat yourselves > on the back, I think the community will generally view this as a fine piece > of work. > > Stand by for announcements on buckling down on Atom-Protocol. > > Note that this is a pronouncement, not a "call for further debate". Here > are the next steps: > > 1. Editors take the assembled changes and produce a format-09 I-D. Sooner > is better. > 2. They post the I-D. > 3. Paul sends Scott a message, cc'ing the WG, that we're done. > 4. At this point there may be objections from the WG. We decide whether to > accept the objections and pull the draft back, or tell the objectors they'll > have to pursue the appeal process. > 5. The IESG process takes over at this point and we'll eventually hear back > from them. > > Last two draft changes: > > 1. PaceAtomIdDOS > > We think that the WG has consensus that it is of benefit to add a warning to > section 8 "Security Considerations". The language from PaceAtomIdDos is > mostly OK, except that the late suggestion of talking about spoofing instead > of DOS seemed to get general support. I reworded slightly. We'll leave it > up to the editors to decide whether a new subsection of section 8 is > required. > > "Atom Processors should be aware of the potential for spoofing attacks where > the attacker publishes an atom:entry with the atom:id value of an entry from > another feed, perhaps with a falsified atom:source element duplicating the > atom:id of the other feed. 
Atom Processors which, for example, suppress > display of duplicate entries by displaying only one entry with a particular > atom:id value, perhaps by selecting the one with the latest atom:updated > value, might also take steps to determine whether the entries originated from the same publisher before considering them to be duplicates." > > 2. PaceAtom10 > > http://www.intertwingly.net/wiki/pie/PaceAtom10 > > We just missed this one in the previous consensus call; seeing lots of +1's > and no pushback, it's accepted. > > > > -- Walter Underwood Principal Architect, Verity
Re: Consensus snapshot, 2005/05/25
--On Wednesday, May 25, 2005 11:03:46 AM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: Have I missed any? Yes, there has been high-volume debate on several other issues; but have there been any other outcomes where we can reasonably claim consensus exists? Changing atom:author to atom:creator? No objections, so far. I'll paste together a PACE with the official Dublin Core definition. Should we mention DC for atom:contributor? wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: inheritance issues
--On May 24, 2005 7:39:40 AM -0600 Antone Roundy <[EMAIL PROTECTED]> wrote: > On Tuesday, May 24, 2005, at 01:52 AM, Henry Story wrote: >> Simplify, simplify. I am for removing all inheritance mechanisms... +1. Inheritance has very minor advantages and very serious disadvantages. Inheriting values saves typing. It does not save bandwidth, because HTTP compression will do nearly as well. It is confusing and tricky to specify and implement. It makes the entry different when it is standalone or in a feed. It multiplies the number of test cases needed for validation. Any one of those problems is serious. wunder -- Walter Underwood Principal Architect, Verity
Re: inheritance issues
--On May 24, 2005 1:02:54 AM +0100 Bill de hÓra <[EMAIL PROTECTED]> wrote: > > Inheritance suggests a programming model to allow the evaluator to be coded > for it. Which is why it shouldn't be called "inheritance". I'd prefer something like "cascading values". wunder -- Walter Underwood Principal Architect, Verity
Re: posted PaceAuthorContributor
--On May 23, 2005 10:52:47 AM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: > > If you're worried, one good way to address the issue would be to say that > "the semantics of this element are based on the Dublin Core's [dc:creator]", > DC is pretty clear as I recall. I've been thinking that would be a good idea > anyhow. Let's call it atom:creator, then, and actually use the DC definition. Not because DC is better, but because it makes the metadata crosswalks (interoperability) work smoothly. wunder -- Walter Underwood Principal Architect, Verity
PaceCaching
--On Tuesday, May 17, 2005 09:13:37 PM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: PaceCaching Multiple -1's, it fails. I'll address the objections anyway, because I (still) think this is important. 1. This introduces multiple caching schemes. Wrong. Right now we have multiple schemes, with HTTP caching, ad hoc client caching, and ad hoc server-side load shedding. This recommends one consistent scheme, which we know will work. The current multi-scheme approach is a mess, and we can be sure that it will have problems. 2. This applies protocol caching to a client. True, but not really an issue. HTTP caching does work when used to manage a client cache. Compare a client working through an HTTP cache to one which checks the cache information internally before issuing HTTP requests. The HTTP server will see the same series of requests. Effectively, the client will run a virtual HTTP cache internally. 3. Server-side parsing is too much overhead. Maybe with 90 MHz Pentiums, but XML parsing is really fast these days. Parse the file, cache the values, and toss them if the file has changed when you stat it. Or, the blog server software can set the cache info out-of-band to the server. 4. This requires synchronized clocks. Those are a SHOULD for HTTP, too. And they ought to be a SHOULD for Atom anyway, because you cannot date-sort entries from two servers with unsynchronized clocks. 5. This is just like HTTP-EQUIV and that has failed. Yes and no. Most HTTP servers ignore HTTP-EQUIV, but it is still useful for passing through things like content-language when there is no HTTP header present. For Atom, the caching info would be valid when there is no HTTP cache header. This is exactly where HTTP-EQUIV is effective today. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: multiple atom:author elements?
--On Friday, May 20, 2005 09:33:01 AM -0400 Robert Sayre <[EMAIL PROTECTED]> wrote: Those are three terrible use cases. Shall we go through every element in the format and evaluate their fitness for scientific journals, legal documents, and legislation? Here is a list of 341 scientific journals with RSS feeds. Some of these use a single author element with multiple authors crammed in, some use multiple author elements. The author elements have other problems, like "Binks, J. J." vs. "Jar Jar Binks", but that is something that the WG has ruled out of scope. http://www.library.unr.edu/ejournals/alphaRSS.aspx We really should use "creator" instead of "author". Author is nonsense for photoblogs. We can do a lot of things with Atom, but reinventing Dublin Core badly should not be one of those. +1 for multiple author elements. +1 for "creator" instead of "author", if anyone wants to go there. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: PaceAllowDuplicateIdsWithModified
--On Thursday, May 19, 2005 01:12:22 AM +1000 Eric Scheid <[EMAIL PROTECTED]> wrote: (See the wiki for a survey of tools and the dates they support.) hmmm ... Blogger, Moveable Type, JournURL, bloxsom, ExpressionEngine, ongoing, Roller, Macsanomat, WordPress, and BigBlogTool all provide dates which represent the last date/time the entry was modified, and there is no info for LiveJournal. We abandoned full LiveJournal compatibility a long time ago by requiring time zones. Older LJ posts do not have time zones. Don't know about the current ones. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: Atom 1.0?
--On Tuesday, May 10, 2005 09:12:09 AM -0700 Paul Hoffman <[EMAIL PROTECTED]> wrote: At 9:09 PM -0700 5/9/05, Walter Underwood wrote: Seriously, I don't mind "Atom 1.0" as long as the next version is "Atom 2.0". +12 I'd also be happy with just "Atom" and saying "RFC Atom" when pressed for a version. Even with "Atom 1.0" we'll need to say which RFC. If we choose a specific name, it *must* be in the RFC. Because the RFC must be a hit for that search. wunder -- Walter Underwood Principal Architect Verity Ultraseek
RE: Last Call: 'The Atom Syndication Format' to Proposed Standard
--On May 10, 2005 8:57:47 AM -0400 Scott Hollenbeck <[EMAIL PROTECTED]> wrote: > > I have to agree with Paul. I don't believe that the issue of white space in > the syndicated content is really an Atompub issue. It might be an issue for > the content creator. It might be an issue for the reader. As long as the > pipe between the two passes the content as submitted, though, the pipe has > done its job. If publishers and subscribers have obstacles to using Atom, that sounds like a problem to me. "Everyone has this problem" is not a good reason to ignore it. Someone has to be the first to solve it, might as well be us. It is not acceptable to build formats for the "English Wide Web". That doesn't exist any more. wunder -- Walter Underwood Principal Architect, Verity
Re: Atom 1.0?
--On May 9, 2005 7:29:58 PM -0700 Tim Bray <[EMAIL PROTECTED]> wrote: > > Anyone have a better idea? --Tim Hey, let's vote on a *new* name. I'm +1 on "Naked News", because it delivers the news without chrome and crap. Or maybe that is what you get when Atom (Adam?) goes public. Or because sex sells. Seriously, I don't mind "Atom 1.0" as long as the next version is "Atom 2.0". Please don't increment the right-of-the-dot part forever, because I just had to fix some software that made the (reasonable) assumption that 5.10==5.1, even though "5.10" is really Solaris 10. wunder -- Walter Underwood Principal Architect, Verity
Re: Last Call: 'The Atom Syndication Format' to Proposed Standard
--On May 7, 2005 11:29:07 AM +0300 Henri Sivonen <[EMAIL PROTECTED]> wrote: > > Why would you put line breaks in the CJK source, then? Isn't the "problem" > solved with the least heuristics by the producer not putting breaks there? It would be even better if they would just speak English. :-) White space is not particularly meaningful in some of these languages, so we cannot expect them to suddenly pay attention to that just so they can use Atom. There will be plenty of content from other formats with this linguistically meaningless white space. If we get this wrong, Atom-delivered content will look broken in some languages, and a bunch of extra-spec practice will build up about how to fix it. Much better to get it right in 1.0. wunder -- Walter Underwood Principal Architect, Verity
Re: PaceCaching
--On May 6, 2005 4:28:44 PM -0700 Paul Hoffman <[EMAIL PROTECTED]> wrote: > > -1. Having two mechanisms in two different layers is a recipe for disaster. > If HTTP headers are good enough for everything else on the web, they're good > enough for Atom. That would be a problem. But this is one mechanism with two ways to specify it. One is out-of-band in a server-specific way, the other is in the document in a standard way. Either way, it is HTTP rules for caching at all intermediate caches and at the client. Architecturally, this is exactly the same as HTTP-EQUIV meta tags for HTTP headers, and very similar to the ROBOTS meta tag for /robots.txt. In both cases, they provide a way for the document author to specify something without having permissions on the server software config. Further, these should be implemented exactly like HTTP-EQUIV, where the server software reads them and sets the header. The HTTP-EQUIV meta tag is proof that "put it in the header" is not good enough for everything else. If that wasn't needed, it would be deprecated by now. There is a problem here, though. We need to specify the priority of the in-document specs vs. the HTTP header specs. I propose following the HTTP standard, in saying that the HTTP headers trump anything in the body. I'll even assume that following the HTTP spec is non-controversial, and go update the PACE. wunder -- Walter Underwood Principal Architect, Verity
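The proposed priority rule (HTTP headers trump in-document cache info) is small enough to sketch. This is an illustration of the rule as described above, not code from any implementation; the function name and argument shapes are made up.

```python
def effective_max_age(http_max_age, document_max_age):
    """Pick the cache lifetime for a fetched feed, in seconds.

    Per the proposal: HTTP cache headers trump anything in the feed
    body, and the in-document value only applies when the HTTP layer
    supplied nothing. An argument is None when that layer said nothing.
    """
    if http_max_age is not None:
        return http_max_age
    return document_max_age

assert effective_max_age(3600, 600) == 3600   # header wins over the body
assert effective_max_age(None, 600) == 600    # fall back to the document
assert effective_max_age(None, None) is None  # no cache info at all
```

The same rule would be applied by a server that reads the in-document value and emits it as a real HTTP header, HTTP-EQUIV style.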
RE: Selfish Feeds...
--On May 6, 2005 4:37:23 PM -0400 Bob Wyman <[EMAIL PROTECTED]> wrote: > Frankly, I really wish that we had done the "blog architecture" work > many months ago so that we would all have a shared understanding of the > system-wide issues and components rather than the widely divergent personal > and partial views that are obvious in many our conversations today... Agreed. "A conceptual model of a resource" is up there at the front of our charter, and if we don't have that, it doesn't seem like the WG is done. wunder -- Walter Underwood Principal Architect, Verity
Re: Atom feed refresh rates
--On May 5, 2005 10:53:48 AM -0700 John Panzer <[EMAIL PROTECTED]> wrote: > > I assume an HTTP Expires header for Atom content will work and play well with > caches such as the Google Accelerator (http://webaccelerator.google.com/). > I'd also guess that a syntax-level tag won't. Is this important? The syntax-level tag is useful inside a client program with a cache. It can reduce the number of requests at the source, rather than reducing them in the middle of the network at an HTTP cache. There is extra benefit from putting that info into the HTTP headers, because the HTTP cache is shared between multiple clients. The source webserver sees one GET per HTTP cache instead of one GET per Atom client. The syntax-level tag also provides a way for the feed author to specify the info without depending on webserver-specific controls. It does depend on some extra bit of software to take that info and put it in the HTTP Expires or Cache-control headers. wunder -- Walter Underwood Principal Architect, Verity
Re: AtomPubIssuesList for 2005/05/05
--On May 5, 2005 7:17:00 AM -0400 Sam Ruby <[EMAIL PROTECTED]> wrote: > > Demonstrate that you have revisited the previous discussion, and that you > either > have something new to add, or can point out some evidence that the previous > consensus call was made in error. PaceCaching was not really discussed; it was rejected based on false information. It was rejected because it was HTTP-specific (it is not), and because it was non-core (similar features are common in other RSS specs). It does not interact with other features, so it should be a fairly clean, quick discussion. wunder -- Walter Underwood Principal Architect, Verity
Re: Atom feed refresh rates
--On May 5, 2005 8:07:15 AM -0500 Mark Pilgrim <[EMAIL PROTECTED]> wrote: > > Not to be flippant, but we have one that's widely available. It's > called the Expires header. You need the information outside of HTTP. To quote from the RSS spec for ttl: This makes it possible for RSS sources to be managed by a file-sharing network such as Gnutella. Caching information is about knowing when your client cache is stale, regardless of how you got the feed. wunder -- Walter Underwood Principal Architect, Verity
RE: Atom feed refresh rates
--On May 5, 2005 8:15:10 AM +0100 Andy Henderson <[EMAIL PROTECTED]> wrote: > > There is no RSS2 feature I can see that allows feed providers to tell > aggregators the minimum refresh period. There's the ttl tag. That was, I > believe, introduced for a different purpose and determines the Maximum time > a feed should be cached in a certain situation. We need both a ttl (max-age) and expires. One or the other is appropriate for different publishing needs. We also need to specify what you do with those values, or you end up with a mess, like the RSS2 ttl meaning reversing over an undocumented value (Yikes!). > What has yet to be tried is a specific tag in the core feed standard that > promotes and determines good behaviour for aggregators refreshing their > feeds. Even if it were to prove only a limited benefit, it would still be a > benefit. It has been tried several ways, originally in robots.txt extensions and also in RSS. It doesn't work. The model is not rich enough for publishers or for spiders/aggregators. Max-age/expires is already designed and proven. By page count, 20% of the HTTP 1.1 spec is about caching. If we want to write a new caching/scheduling approach, we can expect it to be a 20 page spec, plus an additional 10 pages on how to work with the HTTP model. See the Notes section here for details on when to use max-age or expires, and on the problems with calendar-based schemes. <http://www.intertwingly.net/wiki/pie/PaceCaching> wunder -- Walter Underwood Principal Architect, Verity
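The "ttl (max-age) and expires" pair above can be sketched as a scheduling rule for an aggregator. The function below is illustrative only; the precedence (a relative max-age wins over an absolute expiry when both are present) follows the HTTP/1.1 expiration model.

```python
from datetime import datetime, timedelta

def next_fetch(fetched_at, max_age=None, expires=None):
    """When may a polite aggregator check this feed again?

    max_age is relative (seconds, like HTTP Cache-Control: max-age);
    expires is an absolute datetime (like the HTTP Expires header).
    max_age takes precedence when both are given, as in HTTP/1.1.
    """
    if max_age is not None:
        return fetched_at + timedelta(seconds=max_age)
    if expires is not None:
        return expires
    # No cache info at all: the feed may have changed already.
    return fetched_at

t0 = datetime(2005, 5, 5, 8, 0, 0)
assert next_fetch(t0, max_age=1800) == datetime(2005, 5, 5, 8, 30, 0)
assert next_fetch(t0, expires=datetime(2005, 5, 6)) == datetime(2005, 5, 6)
```

A "daily news summary" would publish an absolute expiry; a feed updated at unpredictable intervals would publish a relative max-age.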
Re: Atom feed refresh rates
PaceCaching uses the HTTP model for Atom, whether Atom is used over HTTP or some other protocol. PaceCaching was rejected by the editors because it was too late (two months ago) and non-core. I think that: a) it is never too late to get it right, and b) scalability is core. The PACE describes why refresh rates do not solve the problem adequately. wunder --On May 4, 2005 5:44:18 AM -0500 Brett Lindsley <[EMAIL PROTECTED]> wrote: > > > Andy, I recall bringing up the same issue with respect to portable devices. > My angle > was that firing up the transmitter, making a network connection and > connecting to > the server is still an expensive operation in time and power (for a portable > device) - even if the server returns nothing . There is no reason to check > feeds > that are not being updated, but then, there currently is no way to know this. > > I recall there was a proposal on cache control. That seemed like a good > direction, > but I don't recall it being discussed. As you indicated, if the feed had some > element that indicated it won't be updated (for example) for another day (e.g. > a "daily news summary"), then the end client would need to only check once > a day. > > Brett Lindsley, Motorola Labs > > Andy Henderson wrote: > >> If I'm asking this in the wrong place, sorry; please redirect me if you can. >> >> I am the author of an Aggregator and I'm looking for advice on refresh >> rates. There was some discussion in this group back in June about a >> possible 'Refresh rate' element. That seems to have been dismissed in >> favour of bandwidth throttling techniques, notably etag, last-modified and >> compression. I already support all these plus some additional ones. I am >> uncomfortable, though, with the implication that refresh rates don't matter >> and should be left to the end-user to decide. >> >> I am adding Atom support to my Agg. 
For RSS feeds, I have used the ttl and >> sy:updatePeriod / sy:updateFrequency elements to allow feed providers to >> limit refresh rates. I have, in any case, imposed a minimum refresh rate of >> one hour - because that seemed the decent thing to do. However, I'm coming >> under pressure to reduce that minimum limit for feeds that are clearly >> designed for shorter refresh periods - such as the Gmail Atom feeds. I'm >> reluctant to implement a free-for-all so I'm looking for guidance on how I >> should tackle this issue. >> >> Andy Henderson >> Constructive IT Advice >> >> >> > > > -- Walter Underwood Principal Architect, Verity
Re: FYI: More on duplicates in feeds: DoubleClick does ads the WRONG way!
How to make money from ads is off-topic, sorry for starting in that direction. I don't quite get this, though. What about e-mails and RSS is being compared? --On Monday, May 02, 2005 11:43:28 AM +0100 James Aylett <[EMAIL PROTECTED]> wrote: (2) but it's not that uncommon for people to need to track static instances - consider, for instance emails; HTTP level cache control is being used as well. There's no reason RSS feeds can't be considered in this way. The business of ads in feeds is pretty confused. How does a feed relate to an impression, when it might be viewed later or not at all? I would hope that advertisers figure out that impression-based approaches don't really mesh with feeds, but in the meantime, we are probably stuck with hacks like Bob Wyman describes. We punted explicit support for ads, so they will continue to show up in content and cause more work for Bob. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: FYI: More on duplicates in feeds: DoubleClick does ads the WRONG way!
--On May 2, 2005 5:32:22 PM +1000 Eric Scheid <[EMAIL PROTECTED]> wrote: > > Counting impressions is essential to their trade, and you'll find that it is > industry standard practice. Make that "was essential", and "should be a dying practice." Ads have moved to results-based billing, paying for clickthrough and conversion. wunder -- Walter Underwood Principal Architect, Verity
Re: PaceOptionalFeedLink
--On April 30, 2005 3:03:50 PM -0400 Robert Sayre <[EMAIL PROTECTED]> wrote: > > "atom:feed elements MUST NOT contain more than one atom:link element > with a rel attribute value of "alternate" that has the same > combination of type and hreflang attribute values." That actually specifies something different, the duplication, without saying whether atom:link is recommended. I recommend adding this text: "An atom:feed element SHOULD/MAY contain one such atom:link element." I'll let other people contribute on whether it is SHOULD or MAY. wunder -- Walter Underwood Principal Architect, Verity
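The uniqueness constraint quoted above (no two alternate links with the same type/hreflang combination) is easy to check mechanically. A sketch, with made-up helper and data; the default of rel="alternate" when the attribute is absent follows the Atom draft.

```python
def duplicate_alternates(links):
    """Return the (type, hreflang) combinations that appear on more than
    one rel="alternate" link -- the combinations the draft text forbids.
    `links` is a list of dicts with optional rel/type/hreflang keys."""
    seen, dups = set(), set()
    for link in links:
        # Per the Atom draft, a missing rel defaults to "alternate".
        if link.get("rel", "alternate") != "alternate":
            continue
        key = (link.get("type"), link.get("hreflang"))
        if key in seen:
            dups.add(key)
        seen.add(key)
    return dups

links = [
    {"rel": "alternate", "type": "text/html", "hreflang": "en"},
    {"rel": "alternate", "type": "text/html", "hreflang": "en"},  # violation
    {"rel": "alternate", "type": "text/html", "hreflang": "fr"},  # fine
]
assert duplicate_alternates(links) == {("text/html", "en")}
```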
Re: PaceOptionalSummary
--On Wednesday, April 27, 2005 10:38:03 AM -0400 Robert Sayre <[EMAIL PROTECTED]> wrote: I am willing to concede that there are valid reasons in particular circumstances to ignore the requirement for a summary. Are you willing to concede that there are implications to such a decision that must be understood and carefully weighed before chosing to omit a summary? What are the interoperability considerations that must be carefully weighed? I think the "full implications" you're worried about are non-technical. It certainly makes the feed much less useful for some kinds of applications. I work on a search engine, and, for us, a titles-only feed is very different from one with content or summaries. A titles-only feed is basically a pretty version of "ls". The engine follows the links, but does not index the document. With summaries or content, the feed is a document in its own right, and it makes sense to index it under its own URL. A search engine won't ignore a titles-only feed, but it is less likely to treat it as a first-class document. Now I'm going to go re-read the PACE to see where I am on +/-. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: PaceOptionalSummary
--On Wednesday, April 27, 2005 02:02:43 AM +0200 "A. Pagaltzis" <[EMAIL PROTECTED]> wrote: So far I haven't seen a cogent explanation of the significant semantics offered by an empty atom:summary inside an otherwise valid minimum atom:entry. It should be obvious. It means that when you summarize the content, you get nothing at all. It is an effectively, but not literally, content-free entry. I'm not being entirely silly here. We could distinguish between "I am not providing a summary" (no element) and "the summary is void" (empty summary). wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: NoIndex, again
A long time ago, I proposed a robots processing instruction that could be used in any XML format. I can find that again. An in-document robots directive is useful because it is controlled by the document author, rather than by the webmaster. "nofollow" is not particularly useful, because there is almost always another path to the document. Still, it can be a polite hint to the robot that all the links in the doc are junk. I would use exactly the model in the HTML robots meta tag, because: a) that is what robots already know how to deal with, and b) it has proven good enough. wunder --On April 19, 2005 11:15:16 PM -0400 Nikolas 'Atrus' Coukouma <[EMAIL PROTECTED]> wrote: > > Hi, > I've recently ended up in an argument about what to do with feeds that > don't want to be reproduced. I e-mailed Dave Winer in the hope of > getting some information about the RSS end of things. That resulted in a > blog entry with interesting comments [1], and I now know that Creative > Commons has an RDF schema for describing licensing [2]. > > The only common feature I want to include, and haven't found, is the > "noindex" type of behavior (do not include in search engines). I > searched the archives of this list and found an old thread discussing > this very issue [3]. It seems to have fizzled out and I haven't found > any more recent documents or discussions. > > Was the issue simply forgotten or purposefully dropped? > > In the RSS discussion, it was suggested by Roger Benningfield that > search engines and syndication sites use atom:summary instead of > atom:content to avoid the noarchive issue. The rationale is that > summaries are meant to be reproduced, much like an abstract for a paper. > > I'm not sure about nofollow, I think noindex is definitely needed. The > latter could be used to opt-out of services such as Feedster, > Technorati, and PubSub. > > Thoughts and comments? 
> > [1] http://www.reallysimplesyndication.com/2005/04/19#a445 > [2] http://web.resource.org/cc/ > [3] http://www.imc.org/atom-syntax/mail-archive/msg00183.html > > Regards, > -Nikolas 'Atrus' Coukouma > > -- Walter Underwood Principal Architect, Verity
Re: HTML/XHTML type issues, was: FW: XML Directorate Reviewer Comments
--On April 13, 2005 9:06:59 AM +0300 Henri Sivonen <[EMAIL PROTECTED]> wrote: > > Instead of saying "XHTML" it would be clearer to say "XHTML 1.x" or defining > it > in terms of the XHTML 1.x namespace URI. This could work. "XHTML 1.0" will not be confused with a media type. When XHTML 2.0 is ready, we can add a supplemental RFC which defines a new attribute value for that. wunder -- Walter Underwood Principal Architect, Verity
Re: PaceCoConstraintsAreBad
--On April 8, 2005 8:29:52 PM -0400 Robert Sayre <[EMAIL PROTECTED]> wrote: > > Please don't respond to me by saying that accessibility is important. I would never say that. Required or essential, but not merely important. wunder -- Walter Underwood Principal Architect, Verity
Re: PaceCoConstraintsAreBad
--On April 8, 2005 6:59:47 PM -0400 Robert Sayre <[EMAIL PROTECTED]> wrote: > > Walter, you are missing my point. You've said it yourself: > > "Maybe summaries are optional, but not because accessibility is optional."[0] That was in reply to a proposal to make accessibility an optional profile, and to make summaries required only in that profile. That approach is unacceptable. I would read my comment as "regardless of your position on summaries, accessibility is required." Local textual summaries are rather common on the web. The <a> tag, for example. Current accessibility practice is to make the anchor text understandable out of context. In other words, to make it a summary of the linked resource. Even if the remote resource is text! For the <img> tag, the alt attribute is used to provide a local, textual equivalent. Again, this is required practice for accessibility. Same thing for graphs, charts, audio, and video. These are top-level requirements. They fit on the WAI pocket card. There are ten "quick tips" and five of them are about local textual equivalents: <http://www.w3.org/WAI/References/QuickTips/> wunder -- Walter Underwood Principal Architect, Verity
Re: PaceCoConstraintsAreBad
--On Friday, April 08, 2005 01:33:20 AM -0400 Robert Sayre <[EMAIL PROTECTED]> wrote: Accessibility is a non-starter absent expert opinion or substantially similar formats. Frankly, the notion that remote content constitutes an accessibility concern is absurd. Might as well write off the whole Web. No, non-accessible designs are non-starters. I am mystified when I see non-accessible web sites and technologies deployed. If a building were built with round doorknobs and steps, the architect would not get paid until it was fixed and made accessible. Why is discrimination OK on the web? Accessibility is required by law, and not just in the US. Plus, it is an "essential aspect" of the web. "The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect." -- Tim Berners-Lee <http://www.w3.org/WAI/> Is that expert enough for you? wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: Spaces supports slash:comments. Result = Duplicates Galore!
One way to look at this is to define what parts are local content as opposed to caches of remote, and base the Etag or other hash on that. I still think we should address caching in Atom 1.0. This would have been part of that. Scaling is an essential thing for syndication, and caching is the best known way to scale. wunder --On Thursday, April 07, 2005 02:48:07 PM -0400 Bob Wyman <[EMAIL PROTECTED]> wrote: Spaces.msn.com recently announced support for "slash:comments," an element which shows how many comments an RSS item has associated with it. As Dare Obasanjo explains[1]: "Another cool RSS enhancement is that the number of comments on each post is now provided using the slash:comments elements. Now users of aggregators like RSS Bandit can track the comment counts on various posts on a space. I've been wanting that since last year." Of course, the side effect of this change is that any aggregator that uses an MD5-like approach to detect changes will now think that an entry has been updated every time a new comment is made. This may or may not be what is desired by consumers of feeds... In any case, there are now millions of blogs whose entries are changed every time anyone comments on them. Should aggregators ignore changes that are limited to the "slash:comments" element? If so, are there other elements that should be ignored? Now, Spaces only publishes RSS feeds... However, if similar atom extensions were to be defined, the problem would appear with Atom feeds as well. bob wyman [1] http://spaces.msn.com/members/carnage4life/Blog/cns%211piiOwAp2SJRIfUfD95CnR Lw%21430.entry -- Walter Underwood Principal Architect Verity Ultraseek
Re: Date accuracy
--On March 25, 2005 1:47:29 PM + Graham <[EMAIL PROTECTED]> wrote: > > There are several RSS feeds out there that have dates where the day is > accurate > but the time is always the same (usually 10am for some reason), regardless of > the time of publication, ... > Proposal: Add to Date Construct section: > "Date values must have a granularity of one second" Precision and accuracy are very different things. Precise timestamps have a lot of numbers. Accurate timestamps are a correct measurement of a clock. Does this mean that they must have an accuracy of one second? That is, that the timestamp for the update or publish event must be correct within +/- 0.5s when compared to a trusted time standard? Atom already requires the timestamp to be precise to one second, but it is not practical to require (MUST) accuracy. We could do it, but we'd lock out the 99% of machines with bad clocks. Plus, some publications just don't have an accurate time -- archives digitized from paper, like old New Yorker issues or MIT AI lab tech notes. One approach to those is to choose a convention for the time portion, like noon UTC. That is most likely to be the same day in other time zones. We could mention that as a useful convention. The Atom spec should recommend that clocks be accurate. There is no point in having sortable timestamps without trustworthy clocks. This is already a SHOULD in the HTTP 1.1 spec, and you can grab that language from the caching Pace I put together. Or I can, if there is support for it. wunder -- Walter Underwood Principal Architect, Verity
Re: Alternative to the date regex
+1 on dropping the regex. It isn't from any of the other specs, it isn't specifically called out as explanatory and non-normative, and it is too long to be clear. Some examples would be nice, along with some examples of things which do not conform. wunder --On March 25, 2005 5:11:09 PM + Graham <[EMAIL PROTECTED]> wrote: > > Currently we have this > > "A Date construct is an element whose content MUST conform to the > date-time BNF rule in [RFC3339]. I.e., the content of this element > matches this regular expression: > > [0-9]{8}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+) > ?(Z|[\+\-][0-9]{2}:[0-9]{2}) > > As a result, the date values conform to the following specifications..." > > The problem with the regex is that it's entirely redundant. If we look at > Norm's message where the regex was suggested [1], he intends it as a profile > of xsd:dateTime, which allows a variety of date formats. However we're using > it as a profile of RFC3339, which already requires that date-times match the > regex 100%. Having the regex there as well is just confusing - until > preparing this email I was under the impression it made some additional > restrictions on RFC3339. > > The nearest thing I see to an additional restriction is that there must be a > capital T between the date and time, which the date-time BNF rule we mention > also requires, but the prose later mentions you might be allowed to use > something different. > > Proposal: > Replace the first para and regex with: > > A Date construct is an element whose content MUST conform to the > date-time BNF rule in [RFC3339]. Note this requires an uppercase letter T > between the date and time sections. > > Secondly, *all* RFC3339 date-times are compatible with the 4 specs mentioned, > so the wording of the second paragraph ("As a result...") is a bit strange, > since it's not as a result of anything we've done. Just say "Date values > expressed in this way are also compatible with...". 
> > Graham > > [1]http://www.imc.org/atom-syntax/mail-archive/msg13116.html > > -- Walter Underwood Principal Architect, Verity
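The message above asks for examples of conforming and non-conforming date-times in place of the redundant regex. A rough sketch of what RFC 3339's date-time grammar requires, with examples, follows; this regex is illustrative only (it checks surface syntax, not calendar validity, and ignores RFC 3339's allowance for lowercase 't' and 'z'):

```python
import re

# Approximation of the RFC 3339 date-time shape
RFC3339 = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}"    # full-date: YYYY-MM-DD
    r"T[0-9]{2}:[0-9]{2}:[0-9]{2}"   # uppercase T, then HH:MM:SS
    r"(\.[0-9]+)?"                   # optional fractional seconds
    r"(Z|[+-][0-9]{2}:[0-9]{2})"     # mandatory offset: Z or +/-HH:MM
)

def looks_like_rfc3339(s):
    """True if the string matches the approximate RFC 3339 shape."""
    return RFC3339.fullmatch(s) is not None
```

Conforming: `2005-03-25T17:11:09Z`, `2005-03-25T17:11:09.5+00:00`. Non-conforming: `2005-03-25 17:11:09Z` (space instead of T), `2005-03-25T17:11:09` (missing offset).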
Re: new issues in draft -06, was: Updated issues list
--On March 20, 2005 11:44:30 AM -0800 Tim Bray <[EMAIL PROTECTED]> wrote: > > Good point. My impression is that we do currently have SHOULD-level mandate > to > serve valid HTML; recognizing that most real-world implementors do make a > best-effort > with tag soup. Anyone who thinks that the language needs improving should > suggest > improvements. I support a SHOULD on that. The Robustness Principle would suggest exactly that. Consumers of Atom may make an attempt to parse arbitrary HTML-like content, but producers should make the effort to serve clean HTML. That free-range HTML is nasty stuff. In the past week, we had two customers freely mixing slash and backslash in their URL paths. Sigh. wunder -- Walter Underwood Principal Architect, Verity
Re: PaceRepeatIdInDocument solution
About logical clocks in atom:modified: --On February 21, 2005 3:30:13 AM +1100 Eric Scheid <[EMAIL PROTECTED]> wrote: > > Semantically, it would work ... for comparing two instances of one entry. It > wouldn't work for establishing if an entry was modified before or after > [some event moment] (eg. close of the stock exchange). Establishing sequences of events is rather tricky. See Leslie Lamport's "Time, Clocks, and the Ordering of Events in Distributed Systems" for how to do it with logical clocks. The core part of the paper is short, maybe five pages, and definitely worth reading if you care about this stuff. <http://research.microsoft.com/users/lamport/pubs/time-clocks.pdf> Synchronized clocks make this simpler. If Atom depends on comparing timestamps from different servers, then synchronized clocks are a SHOULD. See the text in PaceCaching for an example. Synchronized clocks are already a SHOULD for HTTP. wunder -- Walter Underwood Principal Architect, Verity
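For readers who haven't seen the Lamport paper cited above, the core rule fits in a few lines. This sketch is only an illustration of logical clocks in general; it is not part of any Atom proposal.

```python
class LamportClock:
    """Minimal logical clock per Lamport (1978).

    A process increments its clock on each local event (including
    sends), and on receipt takes max(local, received) + 1, so every
    receive is ordered after the corresponding send.
    """

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send."""
        self.time += 1
        return self.time

    def receive(self, sent_at):
        """Message receipt; sent_at is the sender's timestamp."""
        self.time = max(self.time, sent_at) + 1
        return self.time
```

Usage: if process A ticks to 1 and sends that timestamp to B, B's `receive(1)` yields 2, so the receive sorts after the send even if B's wall clock is behind.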
Re: Consensus call on last round of Paces
--On February 15, 2005 8:56:24 PM +0100 Anne van Kesteren <[EMAIL PROTECTED]> wrote: > Walter Underwood wrote: >> This also means that Atom cannot be used for BBC News, where order is >> significant and non-chronological. > > Could you elaborate on that? The BBC News feeds are ordered by "importance", not date. Since the order is not significant, intermediate nodes could re-order the feed and be perfectly legal Atom processors. A publishing date order can be recovered from the date information in Atom. Other orders cannot. wunder -- Walter Underwood Principal Architect, Verity
Re: Consensus call on last round of Paces
--On February 15, 2005 11:12:48 AM -0800 Tim Bray <[EMAIL PROTECTED]> wrote: > > PaceEntryOrder > One -1, but overwhelming support otherwise. > DISPOSITION: Accepted. I was the -1, and there is an open issue here. Accepting this means that Atom cannot represent RSS 1.0 feeds. Is that OK? If so, where do we state that in the spec? As far as I know, this is the only exception to interoperability with other RSS formats. This also means that Atom cannot be used for BBC News, where order is significant and non-chronological. wunder -- Walter Underwood Principal Architect, Verity
Re: "atom:entry elements MUST contain an atom:summary element in any of the following cases"
I don't think that accessibility is optional. It isn't a profile, it is a requirement. Maybe summaries are optional, but not because accessibility is optional. wunder --On February 14, 2005 8:48:08 PM -0800 James M Snell <[EMAIL PROTECTED]> wrote: > At the risk of beating the PaceProfile drum to death, I would think that an > Accessibility profile could be used to specify specific requirements for > accessible feeds. The core could do exactly as you suggest below -- not > require summary. -- Walter Underwood Principal Architect, Verity
RE: PaceHeadless
--On Tuesday, February 08, 2005 08:39:42 AM -0500 Bob Wyman <[EMAIL PROTECTED]> wrote: Linking to the feed is not an acceptable solution. It must be possible to embed feed metadata in an entry in a feed and in an Entry document. +1 The feed document *must* be standalone. Everything required to interpret the feed has to be in the feed. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: PaceProfile
--On February 7, 2005 7:13:21 PM -0500 Robert Sayre <[EMAIL PROTECTED]> wrote: > > So, you're looking for a way to include a "schema" association in the feed, > and you want a standard way to do it. The only processors that will do > anything > useful with this information are those that know about the "profile". Sounds like a job for . Or a processing instruction, but I seem to be the only person that likes those. wunder -- Walter Underwood Principal Architect, Verity
Re: PaceEntryOrder
--On February 7, 2005 4:27:12 PM -0500 Sam Ruby <[EMAIL PROTECTED]> wrote: > > Ultimately, the sentiment that I want conveyed is that publishers are not > safe to assume that clients will read anything into the order. And I think that the order should mean "the publisher put them in this order." The Pace forbids that interpretation. Clients can reorder things, show only a few, whatever. I'm not restricting client behavior. Do other specs in the RSS family say anything about order? If order is significant in those, then making it not significant in Atom will hurt interoperability. Hmm, I can't find any ordering restrictions in a quick read of RSS 0.91 and 2.0, but RSS 1.0 does specify ordering. From RSS 1.0: 5.3.5 An RDF Seq (sequence) is used to contain all the items rather than an RDF Bag to denote item order for rendering and reconstruction. <http://web.resource.org/rss/1.0/spec#s5.3.5> wunder -- Walter Underwood Principal Architect, Verity
Re: PaceEntryOrder
--On Monday, February 07, 2005 12:24:15 PM -0800 Paul Hoffman <[EMAIL PROTECTED]> wrote: At 11:07 AM -0800 2/7/05, Walter Underwood wrote: -1. I don't see the benefit. Clients MAY re-order them, but that doesn't mean they MUST ignore the order. The publisher may prefer an order which cannot be expressed in the attributes. The Macintouch and BBC News feeds cited before are good examples. I'm very confused. Clients that show the entries of those feeds in the received order are perfectly acceptable according to the wording of this Pace. Correct, clients may choose any order, including the original. This is about the publisher's order preference. The Pace says that the publisher cannot indicate a preferred order in the Atom format. The order is not significant. This is clearly counter to normal use, where the order does have some meaning. The meaning varies by publisher, but it is usually significant. wunder -- Walter Underwood Principal Architect Verity Ultraseek
Re: PaceEntryOrder
--On February 7, 2005 1:06:49 PM -0500 Robert Sayre <[EMAIL PROTECTED]> wrote: > Paul Hoffman wrote: >> >> +1. It is a simple clarification that shows the intention without >> restricting anyone. > > +1. Agree in full. -1. I don't see the benefit. Clients MAY re-order them, but that doesn't mean they MUST ignore the order. The publisher may prefer an order which cannot be expressed in the attributes. The Macintouch and BBC News feeds cited before are good examples. wunder -- Walter Underwood Principal Architect, Verity
Re: PaceCaching posted
This is not restricted to HTTP. It uses HTTP's cache age algorithms, because they are very carefully designed and have proven effective. But it can be used for any local copy in an Atom client. wunder --On Monday, February 07, 2005 10:08:48 AM -0800 Paul Hoffman <[EMAIL PROTECTED]> wrote: At 9:38 AM -0800 2/7/05, Walter Underwood wrote: I was holding this back as out of scope and too close to the deadline, but now that we are talking about sliding windows and delayed, cached state, it is quite relevant. Sorry, this is too late for consideration for the Atom core. Even if you had turned it in on time, I would give it a -1 for not being essential to the core for the Atom format. Atom will be distributed over many protocols, HTTP being one of them. Having said that, I think this would be an excellent extension, one that might keep the folks who don't understand HTTP scalability but feel free to talk about it anyway at bay. --Paul Hoffman, Director --Internet Mail Consortium -- Walter Underwood Principal Architect Verity Ultraseek
PaceCaching posted
I was holding this back as out of scope and too close to the deadline, but now that we are talking about sliding windows and delayed, cached state, it is quite relevant. This proposal uses HTTP caching algorithms, but does not require an HTTP transport. Atom over other transports can use these algorithms. <http://www.intertwingly.net/wiki/pie/PaceCaching> wunder -- Walter Underwood Principal Architect, Verity
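The "HTTP caching algorithms" the Pace above leans on center on the age calculation from RFC 2616 section 13.2.3. As a rough illustration (all times in seconds since a common epoch; the parameter names are mine, not the Pace's or the RFC's):

```python
def current_age(age_header, date_header, request_time, response_time, now):
    """Approximate the RFC 2616 13.2.3 current-age computation.

    Combines the received Age header with the clock-based apparent age
    (response receipt time minus the origin Date header), then adds the
    request/response delay and the time the copy has been resident.
    """
    apparent_age = max(0, response_time - date_header)
    corrected_received_age = max(apparent_age, age_header)
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    resident_time = now - response_time
    return corrected_initial_age + resident_time
```

The point of the max() steps is robustness against skewed clocks: the computed age can overestimate but never underestimate, which is why the same arithmetic transfers to non-HTTP transports carrying Atom.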
RE: PaceArchiveDocument posted
I agree, but I would put it another way. The charter requires support for archives, but we don't have a clear model for those. Without a model, we can't spec syntax. So, it is not possible for the current doc to fulfill the charter, and this document is not ready for last call. wunder --On February 6, 2005 2:00:20 AM -0500 Bob Wyman <[EMAIL PROTECTED]> wrote: > > -1. > The use cases for archiving have not been well defined or well > discussed on this list. It is, I believe, inappropriate and unwise to try to > rush through something this major at the last moment before a pending Last > Call. > > bob wyman > > > -- Walter Underwood Principal Architect, Verity
Re: PaceClarifyDateUpdated
--On February 6, 2005 1:07:42 PM +0200 Henri Sivonen <[EMAIL PROTECTED]> wrote: > > Yes. Also as a spec expectation--that is, how often is the "SHOULD NOT" > expected > to be violated. Will the SHOULD NOT be violated so often that it dilutes the > meaning of all SHOULD NOTs? Roughly, a SHOULD or SHOULD NOT can be violated when the implementer understands and accepts the interoperability limitations of that decision. So, the spec should (must?) explain what those are. wunder -- Walter Underwood Principal Architect, Verity
Re: Entry order
--On February 4, 2005 4:28:53 PM -0600 "Roger B." <[EMAIL PROTECTED]> wrote: >> If clients are told to ignore the order, and given only an updated timestamp, >> there is no way to show "most recent headlines"... > > At a single moment within a feedstream, sure... but the next time an > entry is added to that feed, I'll have no problem letting the user > know that this is new stuff. But if three are added, you can't order those three. wunder -- Walter Underwood Principal Architect, Verity
Re: Entry order
--On February 4, 2005 11:44:31 AM -0800 Tim Bray <[EMAIL PROTECTED]> wrote: > On Feb 4, 2005, at 11:27 AM, Walter Underwood wrote: > >> Is this a joke? This is like saying that the order of the entries in my >> mailbox is not significant. Note that ordering a mailbox by date is not >> the same thing as its native order. > > Except for, Atom entries have a *compulsory* date. So I have no > idea what semantics you'd attach to the "natural" order... -Tim Order the publisher wants to present them in. Conventionally, most recently published first. Entries may be updated without being reordered. If clients are told to ignore the order, and given only an updated timestamp, there is no way to show "most recent headlines", which is the primary purpose of the whole family of RSS formats. Right now, you can shuffle the entries and Atom says it is the same feed. Either we need a published date stamp or we need to honor the order. wunder -- Walter Underwood Principal Architect, Verity
RE: Entry order
--On February 3, 2005 11:21:50 PM -0500 Bob Wyman <[EMAIL PROTECTED]> wrote: > David Powell wrote: >> It looks like this might have got lost accidently when the >> atom:head element was introduced. Previously Atom 0.3 said [1]: >>> Ordering of the element children of atom:feed element MUST NOT be >>> considered significant. > +1. > The order of entries in an Atom feed should NOT be significant. This > is, I think, a very, very important point to make. -1 Is this a joke? This is like saying that the order of the entries in my mailbox is not significant. Note that ordering a mailbox by date is not the same thing as its native order. Feed order is the only way we have to show the publication order of items in a feed. I just looked at all my subscriptions, and there is only one where the order might not be relevant, a security test for RSS readers. That is clearly not within Atom's charter, so it doesn't count. wunder -- Walter Underwood Principal Architect, Verity
Re: xsd:dateTime vs. RFC 3339
--On February 4, 2005 6:46:33 PM +0100 Julian Reschke <[EMAIL PROTECTED]> wrote: >> Also, we have an unresolved issue with historic Livejournal entries, >> which do not have timezones. XML Schema explains exactly how to > > So what does it recommend? > >> handle those. We can have a SHOULD for timezone info, with an explanation >> of what you lose without that. Treating the datetime value as if it has an uncertainty equal to the maximum possible timezone offset. The other advantage of using XML Schema is that it defines how to order timestamps, which is the main thing we want to do with them. I think the section is pretty clear, and I'm picky about specs: <http://www.w3.org/TR/xmlschema-2/#dateTime> wunder -- Walter Underwood Principal Architect, Verity
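Concretely, the XML Schema rule referenced above treats a timezone-less value as uncertain by the maximum possible offset (14 hours), so some comparisons come out indeterminate. A toy sketch, under the assumption that the timezone-qualified value has already been normalized to UTC and is passed as a naive datetime:

```python
from datetime import datetime, timedelta

MAX_TZ = timedelta(hours=14)  # XML Schema's maximum timezone offset

def compare(zoned_utc, naive):
    """Order a UTC-normalized datetime against a timezone-less one.

    Following XML Schema Part 2, the naive value is treated as lying
    anywhere within +/-14:00 of its face value; if the intervals
    overlap, the order is indeterminate.
    """
    if zoned_utc < naive - MAX_TZ:
        return "before"
    if zoned_utc > naive + MAX_TZ:
        return "after"
    return "indeterminate"
```

So a Livejournal entry stamped `2005-02-05T00:00:00` with no zone is definitely after anything more than 14 hours earlier in UTC, but its order against nearby timestamps cannot be decided.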
Re: xsd:dateTime vs. RFC 3339
--On February 4, 2005 11:18:17 AM -0500 Norman Walsh <[EMAIL PROTECTED]> wrote: > > I know we're writing an IETF document, but I think there's going to be > a lot of off-the-shelf XML software that understands xsd:dateTimes and > I think it would be a lot better if we defined Date Constructs in > terms of W3C XML Schema Part 2 than RFC 3339. Strongly agree. Also, we have an unresolved issue with historic Livejournal entries, which do not have timezones. XML Schema explains exactly how to handle those. We can have a SHOULD for timezone info, with an explanation of what you lose without that. wunder -- Walter Underwood Principal Architect, Verity
Re: Atom for Archives (was:Re: Call for final Paces for consideration: deadline imminent)
--On February 3, 2005 1:31:45 PM +0200 Henri Sivonen <[EMAIL PROTECTED]> wrote: > On Feb 3, 2005, at 08:09, James Snell wrote: > >> What is the model for archiving with Atom? > > What's the *point* in archiving with Atom compared to eg. a zip archive with > some HTML or XHTML files in it (with relative links and a stipulation that > index.html and index.xhtml are magic names)? Cross-platform dump and load. Saving data that is in the database and not in the HTML. Backups. Dump and reload for an upgrade with a DB schema change. Consistent save from a live database (hold a read lock while you dump the archive). Insurance against your blog service going away on short notice. Sarbanes-Oxley compliance for corporate blogs (internal and external). And of course, so Brewster Kahle can keep a copy. The Wayback Machine has saved my butt a couple of times. wunder -- Walter Underwood Principal Architect, Verity
Re: Call for final Paces for consideration: deadline imminent
The charter says that Atom will work for archiving. We don't know that it will, and it hasn't been discussed for months. Is the current Atom spec sufficient for archiving? If not, we aren't done. wunder --On February 2, 2005 5:46:51 PM -0800 Paul Hoffman <[EMAIL PROTECTED]> wrote: > > Greetings again. And, thanks again for all the work people did on the last > work queue rotation. We now have the end of the format draft squarely in > sight. > > The WG still has a bunch of finished Paces that have not been formally > considered, a (thankfully) much smaller number of unfinished Paces, and a > couple of promises that "I'll write that up as a Pace soon". We need to > finish soon in order to make our milestone, and I believe we can do so > gracefully. > > On Monday, Feb. 7, the Working Group's final queue rotation will consist of > all Paces open at that time. Any Paces that have obvious holes in them ("to > be filled in later", "more needs to go here", etc.) will be ignored. We have > had over a year of time here, and many weeks since the previous attempt to > close things out. On Monday, Feb. 14, we will assess WG consensus and ask the > document authors to put together a final draft. > > Note that this is not the last opportunity for work on the Atom format. For > one thing, there are plenty of non-core extensions that folks have been > mulling over; having the core draft finally finished will help those to > emerge. Further, we need to do the final work on the protocol document. Also, > during the formal IETF Last Call, discussion of the format draft will be > welcome from everyone (including people who have not read any of the earlier > drafts). > > Please do *not* rush out to write a Pace unless it is for something that is > *truly* part of the Atom core, and you really believe that it is likely that > there will be consensus within a week. 
If your idea is appropriate as an > extension, or is for something that is quite similar to something else that > has explicitly gotten lack of consensus, please do not write a Pace. In the > former case, please hold your extensions for a few weeks; in the latter case, > please recognize that asking the WG to focus on something that they don't want will likely cause us to do a worse job at carefully reviewing things that we all want. > > So, if you have an incomplete Pace now, you have a few more days to complete > it. Of course, everyone should feel free to continue talking about the > current Paces now, and to continue to suggest editorial changes to the > current Internet Draft. > > --Paul Hoffman, Director > --Internet Mail Consortium > > -- Walter Underwood Principal Architect, Verity
Re: Format spec vs Protocol spec
Correct. --wunder --On February 2, 2005 12:35:31 PM -0700 Antone Roundy <[EMAIL PROTECTED]> wrote: > Let me make sure I understand you correctly--are you saying that it's fine > for the format and protocol to have their own elements in their own > namespaces, but 1.0 of each should be finished at the same time, to ensure > that we don't run into any surprises while finishing protocol 1.0 which > require a format revision (eg. 1.1) in order to make protocol 1.0 work? -- Walter Underwood Principal Architect, Verity