date:20050505

Re: Autodiscovery - different cases should use different rel

2005-05-05 Thread Eric Scheid

On 6/5/05 1:07 PM, "Nikolas 'Atrus' Coukouma" <[EMAIL PROTECTED]> wrote:

> Hrm. This is an interesting point. I'm not too concerned about "find
> every feed, regardless of relevance" because I think only search engines
> will be interested in it, especially if all the other cases are marked.

finding every feed is not my concern either.

> They can bear to check the feed and see what the root element is.

this won't work ... see below.

> This also makes rel="alternate" seem like an even worse choice for
> *feed* autodiscovery because it would make sense to link to an atom
> *entry* as rel="alternate" from the page for an individual entry.

absolutely!

> I really don't think @rel is the place to address concerns about type.
> That's really the job of @type (of course). If we need to declare more
> mime-types, then so be it.

Just to throw more fuel on the fire:

It is quite conceivable for an Atom Feed Document (AFD) to contain a set of
entries which won't grow or be updated, such as an AFD which contains all
postings for a calendar period, or an AFD which contains one entry for each
chapter of a book, and so on.

Thus, neither mime-types nor root-element-sniffing will be reliable enough
to discover the resource which is appropriate for "subscribing" to - ie.
discovering which Atom Feed Document is the one which will be updated as
time goes by in the usual sliding window manner, and not the monthly archive
that page happens to be contained within.

e.

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Robert Sayre


On 5/5/05, Graham <[EMAIL PROTECTED]> wrote:

> 
> PaceOptionalSummary simply says remove the "MUST", it doesn't say
> what it should be replaced with.

What part of OPTIONAL don't you understand? 

Robert Sayre

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Graham

On 6 May 2005, at 4:26 am, Robert Sayre wrote:
PaceTextShouldBeProvided: SHOULD have a summary
PaceOptionalSummary: MAY have a summary
No, Robert:
Current situation: MUST have a summary
PaceOptionalSummary: No explicit opinion
PaceTextShouldBeProvided: SHOULD have a summary
PaceOptionalSummary simply says remove the "MUST", it doesn't say  
what it should be replaced with.

Graham

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Robert Sayre


On 5/5/05, Sam Ruby <[EMAIL PROTECTED]> wrote:
> 
> The notes section is now gone.  PaceTextShouldBeProvided now contains a
> proper superset of the instructions contained in PaceOptionalSummary.
> 
> Perhaps now we can get beyond discussing alleged incompatibilities.

PaceTextShouldBeProvided: SHOULD have a summary
PaceOptionalSummary: MAY have a summary


Pick one.

Robert Sayre

Autodiscovery discussion & editorship

2005-05-05 Thread Tim Bray


Given the volume of debate, it's obvious there may be more work to do 
here.  Paul and I have asked Phil Ringnalda to co-edit the 
autodiscovery spec, and he's accepted.  Between Mark and Phil we should 
have no trouble getting the editing done.  Paul, is there any reason 
Mark or Phil shouldn't submit the most recent autodiscovery-01 as a 
committee draft?

The discussion in recent days has been lively but unstructured.  If I 
were forced to make a consensus call right now, I'm pretty sure I 
wouldn't be able to pick out any one spec change that I could say 
clearly has consensus.  So argue as much as you want, but if you really 
want a specific change to the draft, I think you need to draw up a Pace 
with specific change suggestions and see if you can use that to focus 
the discussion and gather support.

If possible, I'd like us to focus on getting the format draft done just 
now; we promise to have a focused discussion on autodisco once format 
is put to bed.   -Tim

Re: PaceOptionalFeedLink

2005-05-05 Thread Graham


On 6 May 2005, at 3:50 am, Sam Ruby wrote:
FYI: we have an instance proof of this requiring an existing tool  
to do additional work:

  http://www.imc.org/atom-syntax/mail-archive/msg13983.html
Tools will have to be updated to work with Atom? Scandalous.
+1 to the Pace
Graham

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Robert Sayre


On 5/5/05, Tim Bray <[EMAIL PROTECTED]> wrote:
> 
> Speaking not as the chair but as an interested WG member,  I read them
> about eight times and I do not understand why they are in conflict.
> Someone please explain, as simply as possible, what the problem is,
> because I just don't get it.  On the face of it, I am inclined to be +1
> to both PaceOptionalSummary and PaceTextShouldBeProvided. 

Everything in the proposal section is fine with me, as well. It's that
"Notes" section that's the problem.

> Note: I totally fail to understand the "Notes" bit at the end of
> PaceTextShouldBeProvided.  It is underspecified to the extent that I
> can't figure out what language change it is actually saying is
> necessary.

That section says "If PaceOptionalSummary is 'accepted', this Pace
changes summary to SHOULD." That's OK to propose, but you can't accept
both of them. They conflict.

> 
> Basically, allowing title-only feeds seems OK to me, and encouraging
> people to provide text also seems OK to me, so what's the problem? 

Current spec: MUST contain a summary
after PaceOptionalSummary: MAY contain a summary
after PaceTextShouldBeProvided: SHOULD contain a summary

Robert Sayre

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Sam Ruby

Robert Sayre wrote:
On 5/5/05, Tim Bray <[EMAIL PROTECTED]> wrote:
No it doesn't, it says something about inserting the phrase "...is
either not present or..." which, by the way, I don't understand.  Are
we looking at the same document?
Ah, it's been updated since I last looked. The proposed text for 4.1.2
didn't used to account for an absent content element.
The notes section is now gone.  PaceTextShouldBeProvided now contains a 
proper superset of the instructions contained in PaceOptionalSummary.

Perhaps now we can get beyond discussing alleged incompatibilities.
- Sam Ruby

Re: Autodiscovery - different cases should use different rel

2005-05-05 Thread Nikolas 'Atrus' Coukouma


Eric Scheid wrote:

>On 6/5/05 7:22 AM, "Nikolas 'Atrus' Coukouma" <[EMAIL PROTECTED]> wrote:
>
>  
>
>>I've basically concluded that the keys to autodiscovery of feeds, in the
>>general sense, should not be three (rel, type, and href), but two (type
>>and href). Type is plenty of specification that it's a feed. Claiming
>>its relationship as "feed" doesn't seem correct. There are a few
>>mime-types used, and the one for atom (application/atom+xml) will be an
>>official standard as soon as the draft is accepted by the IETF.
>>
>>
>
>Using @type only is not sufficient, since both Atom Feed Documents *and*
>Atom Entry Documents use the same mime-type. One is a feed, the other is
>not.
>
>Similarly, RSS 1.0 isn't clearly distinguished by mime-type - there are lots
>of other resources which are 'application/rdf+xml' (eg. FOAF)
>
>e.
>  
>
Hrm. This is an interesting point. I'm not too concerned about "find
every feed, regardless of relevance" because I think only search engines
will be interested in it, especially if all the other cases are marked.
They can bear to check the feed and see what the root element is.

This also makes rel="alternate" seem like an even worse choice for
*feed* autodiscovery because it would make sense to link to an atom
*entry* as rel="alternate" from the page for an individual entry.

I really don't think @rel is the place to address concerns about type.
That's really the job of @type (of course). If we need to declare more
mime-types, then so be it.

-Nikolas 'Atrus' Coukouma

Re: PaceFeedIdOrSelf

2005-05-05 Thread Sam Ruby

+1
From prior discussion, people have indicated that they want *something* 
that they can use to track feed identity, and the consensus seemed to 
be that id and/or self was more appropriate than link.

- Sam Ruby

Re: PaceOptionalFeedLink

2005-05-05 Thread Sam Ruby

Tim Bray wrote:
+1
There are people who want to publish feeds without rel="alternate" 
links.  I'm against telling people they can't do something they want to 
do without strong reasons, as in loss of interoperability.  I don't see 
the reasons here as strong enough.  -Tim
FYI: we have an instance proof of this requiring an existing tool to do 
additional work:

  http://www.imc.org/atom-syntax/mail-archive/msg13983.html
- Sam Ruby

RE: Autodiscovery via body elements like a, div, span, etc.

2005-05-05 Thread Bob Wyman


Sjoerd Visscher wrote:
> Why not support hyperlinks too?
>  href="/xml/index.atom">Main Atom feed
One very interesting side-effect of putting the autodiscovery data
in elements of the body is that it then becomes visible in many more
environments. For instance, if I were to cut some text from a web page and
paste it into a blog post or another web page, then if that text had links
with rel="alternate" in it, those links would travel along with the quote.
If I built a page of quotes from many sites, I might end up with "alternate"
links to many different sites, feeds, etc. This might not be as "bad" as it
sounds...
This might argue against using "alternate" in body elements that
might be cut out and pasted into or quoted in other pages since clearly, the
thing which is an alternate is an alternate of the page which originally
contained the quote -- not the page into which the quote has been pasted or
an alternate form of the quote itself.
While there are some problems to be worked on, I very much like the
idea of a chunk of HTML carrying around some indication of its provenance,
original source and alternate sources. 

bob wyman

Re: Autodiscovery - different cases should use different rel

2005-05-05 Thread Nikolas 'Atrus' Coukouma

fantasai wrote:

>
> Nikolas 'Atrus' Coukouma wrote:
>
>> fantasai wrote:
>>
>> An excellent point. Perhaps these should use rel="home" :)
>>
>> 
>
> ...
>
>>
>> The value of rel, if present, will vary based on relation
>> * the feed for *this* page - rel="alternate"
>> * the main feed for this blog, in general - rel="home"
>> * other feeds the author reads or recommends - rel="suggested"
>> * any other feeds linked to for any reason at all - no rel, just the
>> type and href
>>
>> Is this acceptable? I'm not completely happy with "home" and "suggested"
>> because they're not specified as link types in the HTML specs [1].
>> Sadly, it seems the HTML authors didn't consider these cases. "home"
>> seems to be an informal standard. Close matches in the HTML list are
>> "index", "contents", and "start". All of these are inaccurate, but I
>> think "contents" is the best fit.
>
>
> Actually, I think "start" is the best fit. The main feed is often not a
> table of contents to the entire weblog, but something partial. It is,
> however, the "starting point of the collection".

Actually, I disagree with start because of the first sentence in the
HTML spec:
"Refers to the first document in a collection of documents."
This indicates that start should point to the first post in a weblog.
end would be the most recent (not that end exists in the HTML spec)

"This link type tells search engines which document is considered by the
author to be the starting point of the collection."
This is a completely different meaning and I'm not sure why it's bundled
with the first. According to this, start pointing to the homepage is fine.

 The end or last would be the most recent (not that the HTML specs have
an end or last rel)

>
> BTW, you might want to take a look at
>
>   http://fantasai.tripod.com/qref/Appendix/LinkTypes/ltdef.html
>   http://fantasai.tripod.com/qref/Appendix/LinkTypes/alphindex.html
>
> ~fantasai

No offense, but with all the tripod ads, I would have much preferred a
link to the "Hypertext links in HTML" draft [1]. Section four is what I
want. It's not indexed alphabetically and doesn't combine other
documents, but it covers everything pretty well.

[1] http://www.w3.org/MarkUp/draft-ietf-html-relrev-00.txt

-Nikolas 'Atrus'

Re: Autodiscovery

2005-05-05 Thread A. Pagaltzis


* fantasai <[EMAIL PROTECTED]> [2005-05-04 19:15]:
> How is a link from the top of my homepage to my friend's weblog
> feed designating a "substitute version for the document in
> which the link occurs"?

It's not. And it's offtopic. It's called the "autodiscovery"
spec, not the "feed directory" spec. If you want to link your
friend's blog's feed, supply an OPML or FOAF blogroll. Covering
this ground is not within the scope of autodiscovery.

Regards,
-- 
Aristotle

Re: Autodiscovery

2005-05-05 Thread fantasai

Sjoerd Visscher wrote:
Why not support hyperlinks too?
So besides:


also:
Main Atom feed

Most webpages already have a hyperlink to the feed, so they'd only need 
to add two attributes. It would be a waste to have to duplicate the 
information in the document head.
The difference between <link> and <a> is that
  - <link> applies to the document as a whole: it indicates a relationship
between this document and the href destination.
  - <a> is a contextual link: it indicates a relationship between the
linking context and the href destination.
They have different purposes. It is imho perfectly reasonable to limit
autodiscovery to <link>s only. It is also perfectly reasonable to link
to feeds with <a>, and expect that the UA will recognize it as a feed
rather than a generic XML document.
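
A minimal sketch of autodiscovery limited to link elements in the document
head, assuming Python's standard html.parser. The class name HeadFeedLinks,
the sample page, and the choice to require rel="alternate" are illustrative
assumptions, not text from any autodiscovery draft.

```python
from html.parser import HTMLParser


class HeadFeedLinks(HTMLParser):
    """Collect href values of <link> elements in <head> that look like feeds."""

    def __init__(self, feed_types=("application/atom+xml",)):
        super().__init__()
        self.feed_types = feed_types
        self.in_head = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attributes = dict(attrs)
            # rel is a space-separated list of link types.
            rels = (attributes.get("rel") or "").lower().split()
            if ("alternate" in rels
                    and attributes.get("type") in self.feed_types
                    and attributes.get("href")):
                self.hrefs.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


# Hypothetical page: one feed link in the head, one feed hyperlink in the body.
page = """<html><head>
<link rel="alternate" type="application/atom+xml" href="/xml/index.atom">
<link rel="stylesheet" type="text/css" href="/style.css">
</head><body>
<a rel="alternate" type="application/atom+xml" href="/xml/other.atom">Other feed</a>
</body></html>"""

parser = HeadFeedLinks()
parser.feed(page)
print(parser.hrefs)
```

Only the head link is collected; the body hyperlink is left for the UA to
handle as an ordinary contextual link.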
~fantasai

Re: Autodiscovery - different cases should use different rel

2005-05-05 Thread fantasai

Nikolas 'Atrus' Coukouma wrote:
fantasai wrote:
An excellent point. Perhaps these should use rel="home" :)

...
The value of rel, if present, will vary based on relation
* the feed for *this* page - rel="alternate"
* the main feed for this blog, in general - rel="home"
* other feeds the author reads or recommends - rel="suggested"
* any other feeds linked to for any reason at all - no rel, just the
type and href
Is this acceptable? I'm not completely happy with "home" and "suggested"
because they're not specified as link types in the HTML specs [1].
Sadly, it seems the HTML authors didn't consider these cases. "home"
seems to be an informal standard. Close matches in the HTML list are
"index", "contents", and "start". All of these are inaccurate, but I
think "contents" is the best fit.
Actually, I think "start" is the best fit. The main feed is often not a
table of contents to the entire weblog, but something partial. It is,
however, the "starting point of the collection".
BTW, you might want to take a look at
  http://fantasai.tripod.com/qref/Appendix/LinkTypes/ltdef.html
  http://fantasai.tripod.com/qref/Appendix/LinkTypes/alphindex.html
~fantasai

Re: Autodiscovery - different cases should use different rel

2005-05-05 Thread Eric Scheid

On 6/5/05 7:22 AM, "Nikolas 'Atrus' Coukouma" <[EMAIL PROTECTED]> wrote:

> I've basically concluded that the keys to autodiscovery of feeds, in the
> general sense, should not be three (rel, type, and href), but two (type
> and href). Type is plenty of specification that it's a feed. Claiming
> its relationship as "feed" doesn't seem correct. There are a few
> mime-types used, and the one for atom (application/atom+xml) will be an
> official standard as soon as the draft is accepted by the IETF.

Using @type only is not sufficient, since both Atom Feed Documents *and*
Atom Entry Documents use the same mime-type. One is a feed, the other is
not.

Similarly, RSS 1.0 isn't clearly distinguished by mime-type - there are lots
of other resources which are 'application/rdf+xml' (eg. FOAF)
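
A minimal sketch of the root-element check discussed in this thread, assuming
Python's xml.etree.ElementTree. The function name and the namespace URI in the
sample documents are placeholders for illustration. Both samples would be
served with the same mime-type, so only the root element tells them apart.

```python
import xml.etree.ElementTree as ET


def atom_document_kind(xml_text):
    """Return 'feed', 'entry', or 'other' from the root element's local name."""
    root = ET.fromstring(xml_text)
    # ElementTree renders namespaced tags as '{namespace}local'; keep the local part.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name if local_name in ("feed", "entry") else "other"


# Placeholder namespace, not the real Atom namespace URI.
feed_doc = '<feed xmlns="urn:example:atom"><title>Example</title></feed>'
entry_doc = '<entry xmlns="urn:example:atom"><title>One post</title></entry>'

print(atom_document_kind(feed_doc))
print(atom_document_kind(entry_doc))
```

This separates feed documents from entry documents, but it cannot tell a
feed that keeps updating from one that is a fixed archive.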

e.

Re: Atom feed refresh rates

2005-05-05 Thread A. Pagaltzis


* Lance Lavandowska <[EMAIL PROTECTED]> [2005-05-04 19:00]:
> In the toy aggregator I wrote I played with a scheduler that
> tried to throttle itself based on the feed's response.

I believe that is the right way to do this, though your algorithm
is a little too simple IMHO. A better approach for Atom consumers
in absence of applicable HTTP headers would be to intelligently
calculate an average update interval based on atom:published /
atom:updated / etc.

[ Of course, for RSS feeds without pubDate and any applicable
HTTP headers, the algorithm is a *lot* more complicated.
Something like exponential backoff with a low radix reducing (not
resetting) the backoff depending on the number of new items you
got would keep the update interval close to an ideal. ]
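
A rough sketch of the two ideas above, assuming entry timestamps have already
been parsed into datetime objects. The function names, the radix of 1.5, and
the floor and ceiling values are illustrative assumptions, not a recommended
policy.

```python
from datetime import datetime, timedelta


def average_update_interval(timestamps):
    """Mean gap between consecutive entry timestamps, or None if fewer than two."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return None
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def next_backoff(current, new_items, radix=1.5,
                 floor=timedelta(minutes=15), ceiling=timedelta(days=1)):
    """Grow the polling interval after an empty fetch; shrink it (rather than
    reset it) by one factor of `radix` per new item otherwise."""
    if new_items == 0:
        proposed = current * radix
    else:
        proposed = current / (radix ** new_items)
    # Clamp so the interval never collapses to zero or grows without bound.
    return max(floor, min(ceiling, proposed))


# Hypothetical atom:updated values for three entries.
stamps = [
    datetime(2005, 5, 1, 12, 0),
    datetime(2005, 5, 1, 14, 0),
    datetime(2005, 5, 1, 18, 0),
]
print(average_update_interval(stamps))
print(next_backoff(timedelta(hours=1), new_items=0))
```

The first function covers the case where timestamps are available; the second
covers feeds that provide none.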

Furthermore I'd suggest not giving users any option to manually
change intervals -- only a way to force a refresh immediately.
This is a usability win. When I started using an aggregator, I
never had any idea what value to realistically supply when the
aggregator asked me how often I wanted a feed to be refreshed.
How is my grandfather supposed to make an intelligent decision?
And why should he? The software has all the information it needs
to refresh about as often as it can expect to find new content.

This entire issue is just a matter of lazy aggregator
implementors, IMO.

Regards,
-- 
Aristotle

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Robert Sayre

On 5/5/05, Tim Bray <[EMAIL PROTECTED]> wrote:

> 
> No it doesn't, it says something about inserting the phrase "...is
> either not present or..." which, by the way, I don't understand.  Are
> we looking at the same document?

Ah, it's been updated since I last looked. The proposed text for 4.1.2
didn't used to account for an absent content element.

> >> Basically, allowing title-only feeds seems OK to me, and encouraging
> >> people to provide text also seems OK to me, so what's the problem?
> >
> > Current spec: MUST contain a summary
> > after PaceOptionalSummary: MAY contain a summary
> > after PaceTextShouldBeProvided: SHOULD contain a summary
> 
> So what you're actually objecting to is the last part of the Pace
> before the "Impacts" section, that wants 4.1.2 to say that summary
> SHOULD be there if Content is absent. 

Yes. If that SHOULD goes through, it becomes OK to write an Atom
Processor that catches fire when summary and content are both absent.
That is not what the folks who supported PaceOptionalSummary were
advocating. They conflict.

Robert Sayre

Re: PaceTextShouldBeProvided

2005-05-05 Thread Graham

On 5 May 2005, at 6:23 pm, Robert Sayre wrote:
It would be deeply bogus to accept a Pace whose sole action was to
remove a normative requirement, and simultaneously accept a Pace that
puts it back in. Seems obvious to me.
Not really. Assuming PaceOptionalSummary is accepted, there are two  
completely valid outcomes:

1. PaceTextShouldBeProvided rejected => summaries are not required,  
and textual content is not encouraged
2. PaceTextShouldBeProvided accepted => summaries are not required,  
but textual content is encouraged

I don't see a conflict there. What's wrong with accepting two similar  
paces because one corrects the flaws in the other?

We know exactly what issues optional content has, because all of the
other formats have it.
Yes, and we don't like it. A basic title only feed gives you fuck all  
to work with - it displays poorly when mixed with rich-content feeds,  
it isn't searchable because there aren't many keywords in the title,  
etc etc. They cause all sorts of problems.

So, we're looking for some way to say "provide as much information as
you can." The problem with saying SHOULD is that we purport to know
how much information the publisher can provide. It would be very easy
to explain this issue in the spec, and I have no objection to doing
so.
SHOULD here means "must unless you absolutely can't". That seems like  
a perfectly dandy explanation of the intention of encouraging high  
quality feeds.

Graham

Re: PaceOriginalAttribute

2005-05-05 Thread Robert Sayre


On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:

> >
> Yeah, they think they are, or at least claim to think so.  But isn't
> that the same thing that is stated if you see the following in two
> feeds?
> 
> 
> bar:bar
> 
> foo:bar
> 
> foo:foo
> 
> I may be an imposter
> 
> 

> This says that this feed is (or at least claims it is) forwarding the
> entry with the id "foo:bar" from the feed "foo:foo".
> 
> I am honestly trying to see more in this, but as yet, I don't.

OK, now let's say you're subscribed to "imposter" in PubSub.


  bar:bar
  
 quux:quux1
 
 foo:foo
 
 I may be an imposter

  
 quux:quux2
 
 baz:baz
 
 I may be an imposter

 

Robert Sayre

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Tim Bray

On May 5, 2005, at 3:52 PM, Robert Sayre wrote:
Everything in the proposal section is fine with me, as well. It's that
"Notes" section that's the problem.
Note: I totally fail to understand the "Notes" bit at the end of
PaceTextShouldBeProvided.  It is underspecified to the extent that I
can't figure out what language change it is actually saying is
necessary.
That section says "If PaceOptionalSummary is 'accepted', this Pace
changes summary to SHOULD." That's OK to propose, but you can't accept
both of them. They conflict.
No it doesn't, it says something about inserting the phrase "...is 
either not present or..." which, by the way, I don't understand.  Are 
we looking at the same document?

Basically, allowing title-only feeds seems OK to me, and encouraging
people to provide text also seems OK to me, so what's the problem?
Current spec: MUST contain a summary
after PaceOptionalSummary: MAY contain a summary
after PaceTextShouldBeProvided: SHOULD contain a summary
So what you're actually objecting to is the last part of the Pace 
before the "Impacts" section, that wants 4.1.2 to say that summary 
SHOULD be there if Content is absent.  Or am I missing something? -Tim

PaceSourceRecs

2005-05-05 Thread Tim Bray

-1
Irrespective of whether I agree with this or not, I think the material 
belongs in an implementor's guide, not the specification.  -Tim

PaceEntryState

2005-05-05 Thread Tim Bray

-1
I want to decouple this stuff from the format spec, i.e. I believe in 
PacePubControl.  -Tim

PaceRecommendPlainTextContent

2005-05-05 Thread Tim Bray

-1, ill-formed.
The sentiment is worthy but this does not suggest specific language for 
the draft. -Tim

PaceDuplicateIDWithSource2

2005-05-05 Thread Tim Bray

-1
Either we're OK with duplicates or we're not.  If we're not, we 
shouldn't relax that condition just because they come from different 
feeds.  The definition of atom:id says it's supposed to be 
*universally* unique, not unique per-source.  -Tim

PaceOriginalAttribute

2005-05-05 Thread Tim Bray

+1
I'm not 100% convinced it solves the problems Rob says it does, but it 
seems cheap, lightweight, and unlikely to cause any harm. -Tim

PaceFeedIdOrSelf

2005-05-05 Thread Tim Bray

-1
Too complicated, not enough benefit.  -Tim

PaceDuplicateIDWithSource

2005-05-05 Thread Tim Bray

-1
Either we're OK with duplicates or we're not.  If we're not, we 
shouldn't relax that condition just because they come from different 
feeds.  The definition of atom:id says it's supposed to be 
*universally* unique, not unique per-source.  -Tim

PacePubControl

2005-05-05 Thread Tim Bray

+1 kind of, there was all sorts of discussion over in Atom-Protocol 
around improvements and extensibility of the various fields; so I don't 
think work is done.  In any case, this is the right place for this 
stuff, so its discussion is orthogonal to consideration of the Atompub 
Format draft. -Tim

PaceCoConstraintsAreBad

2005-05-05 Thread Tim Bray

Uh, this one is redundant, right?  It's covered by various combinations 
of other Paces, or am I missing something? -Tim

PaceOptionalFeedLink

2005-05-05 Thread Tim Bray

+1
There are people who want to publish feeds without rel="alternate" 
links.  I'm against telling people they can't do something they want to 
do without strong reasons, as in loss of interoperability.  I don't see 
the reasons here as strong enough.  -Tim

PacePubStatusResource

2005-05-05 Thread Tim Bray

Mu
This is interesting but orthogonal to finalizing the format spec. -Tim

PaceBriefExample

2005-05-05 Thread Tim Bray

+1.  Precision is good. -Tim

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Tim Bray

On May 5, 2005, at 11:02 AM, Robert Sayre wrote:
I don't see a conflict there. What's wrong with accepting two similar
paces because one corrects the flaws in the other?
Graham, that's just not true. It wasn't called
PaceSummariesAreNotRequired, was it? It materially changes the only
action PaceOptionalSummary takes. They are not compatible.
In fact, let's get the chairs to clarify this.
Speaking not as the chair but as an interested WG member,  I read them 
about eight times and I do not understand why they are in conflict.  
Someone please explain, as simply as possible, what the problem is, 
because I just don't get it.  On the face of it, I am inclined to be +1 
to both PaceOptionalSummary and PaceTextShouldBeProvided.

Note: I totally fail to understand the "Notes" bit at the end of 
PaceTextShouldBeProvided.  It is underspecified to the extent that I 
can't figure out what language change it is actually saying is 
necessary.

Basically, allowing title-only feeds seems OK to me, and encouraging 
people to provide text also seems OK to me, so what's the problem?  In 
fact, these seem pretty orthogonal; it seems quite plausible that one 
could like either of these without liking both.  -Tim

Re: Autodiscovery

2005-05-05 Thread Sjoerd Visscher


Supporting general hyperlinks starts making more sense if we have cases
other than alternate (I've written elsewhere about this) because the
amount of duplicated information is much greater. If you're only
supporting feeds that serve as an alternate form of the content, then it
makes sense to repeat one link once just to make the programmer stuck
writing the user agent. I'd hope that whatever library/toolkit they're
using supports XPath queries. Using them makes it easy to pluck out
anything with type="application/atom+xml" and an href property.
Maybe atom needs only one link with a rel attribute, but there are 
others. I have a lot of hyperlinks with rel attributes on my weblog 
homepage, and I refuse to repeat them all as link elements.

--
Sjoerd Visscher
http://w3future.com/weblog/

RE: Autodiscovery, real-world examples

2005-05-05 Thread James Tauber


On Thu, 5 May 2005 16:35:21 -0400, "Bob Wyman" <[EMAIL PROTECTED]> said:
> Being able to distinguish between "alternates" for the current
> page and "just other feeds" that are linked to from the page would be
> very useful.

+1

> Also, in the case where there are multiple real alternates to the
> page, it would be useful to be able to mark which feed is "preferred."

+0.5

James

Re: status paces

2005-05-05 Thread Tim Bray


On May 5, 2005, at 6:13 AM, Robert Sayre wrote:
Throw them all back. No reason we can't do this in an extension
element. It would be a big mistake to put these in right now.
+1.  -Tim

Re: Autodiscovery

2005-05-05 Thread Nikolas 'Atrus' Coukouma


Nikolas 'Atrus' Coukouma wrote:

>I'm on the fence about whether or not a link element should be
>*required*, even when a hyperlink is present in the body.
>
>Supporting general hyperlinks starts making more sense if we have cases
>other than alternate (I've written elsewhere about this) because the
>amount of duplicated information is much greater. If you're only
>supporting feeds that serve as an alternate form of the content, then it
>makes sense to repeat one link once just to make the programmer stuck
>writing the user agent.
>  
>
correction:
"just to make [things easier for] the programmer stuck writing the user
agent."

-Nikolas 'Atrus' Coukouma

Re: Autodiscovery

2005-05-05 Thread Nikolas 'Atrus' Coukouma

Sjoerd Visscher wrote:

> Nikolas 'Atrus' Coukouma wrote:
>
>> Sjoerd Visscher wrote:
>>
>>> Why not support hyperlinks too?
>>>
>>> So besides:
>>>
>>> 
>>>
>>> also:
>>>
>>> >> href="/xml/index.atom">Main Atom feed
>>>
>>> Most webpages already have a hyperlink to the feed, so they'd only
>>> need to add two attributes. It would be a waste to have to duplicate
>>> the information in the document head.
>>>
>>
>> The intent of the head element is to indicate a feed that serves as a
>> substitute for the page you're currently viewing.
>>
>> This other case is locating all feeds linked to from a page. For that,
>> the type attribute should be sufficient to indicate that you're linking
>> to a feed.
>
>
> No, a hyperlink with a rel attribute means the same as a link element.
> The HTML spec describes the rel attribute on the a element thus:
>
> This attribute describes the relationship from the current document to
> the anchor specified by the href attribute. The value of this
> attribute is a space-separated list of link types.

What I was getting at is that link elements in the head are usually a
kind of metadata intended for the user agent. Hyperlinks are usually
meant to be displayed. This proposal is aimed at providing metadata for
the user agent, so it makes sense to put it in a link element in the head.

I'm on the fence about whether or not a link element should be
*required*, even when a hyperlink is present in the body.

Supporting general hyperlinks starts making more sense if we have cases
other than alternate (I've written elsewhere about this) because the
amount of duplicated information is much greater. If you're only
supporting feeds that serve as an alternate form of the content, then it
makes sense to repeat one link once just to make the programmer stuck
writing the user agent. I'd hope that whatever library/toolkit they're
using supports XPath queries. Using them makes it easy to pluck out
anything with type="application/atom+xml" and an href property.
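
A minimal sketch of such a query, assuming the page is well-formed XHTML and
using the limited XPath support in Python's xml.etree.ElementTree. The function
name and the sample document, including /xml/comments.atom, are hypothetical.

```python
import xml.etree.ElementTree as ET


def atom_hrefs(xhtml_text):
    """Return href values of every element typed as application/atom+xml."""
    root = ET.fromstring(xhtml_text)
    # '*' matches any element name, so link and a elements are both found.
    matches = root.findall(".//*[@type='application/atom+xml'][@href]")
    return [element.get("href") for element in matches]


# Hypothetical page with one head link, one feed hyperlink, one plain hyperlink.
sample = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><link rel="alternate" type="application/atom+xml" href="/xml/index.atom" /></head>
<body><p><a type="application/atom+xml" href="/xml/comments.atom">Comments</a>
<a href="/about.html">About</a></p></body></html>"""

print(atom_hrefs(sample))
```

Tag-soup HTML would need a forgiving parser first; this only works on
documents that parse as XML.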

It's worth noting that in recent versions of XHTML, anything with an
href property is a hyperlink. There's no requirement to use an anchor or
an xlink:link element.

-Nikolas 'Atrus' Coukouma

Re: Autodiscovery - different cases should use different rel

2005-05-05 Thread Nikolas 'Atrus' Coukouma

fantasai wrote:

>
> Nikolas 'Atrus' Coukouma wrote:
>
> > I think you have three separate cases of autodiscovery:
> > * the feed for *this* page - handled by this autodiscovery proposal
> > * other feeds the author reads or recommends - usually done by linking
> > to a separate file. Some quick searching reveals one suggestion to use
> > rel="blogroll" for this
> > * any other feeds linked to for any reason at all - seems to be little
> > interest in
> >
> > I don't think combining these three into one case will do any good. In
> > fact, I think it's confusing and unusable.
>
> That makes sense.
>
> I think that you're missing one key use case, though: autodiscovery of
> a blog's main feed from sub-parts of it. A lot of websites link to the
> main blog feed from individual entries, for example, and they're doing
> it with rel="alternate", which is not appropriate. It frustrates me that
> there is no way of changing these links to not use rel="alternate".

An excellent point. Perhaps these should use rel="home" :)

>
> And for linking to other pages.. Here's a real-world example:
> The mozilla.org main page is an example
> of where rel="alternate" is a problem. There were three feeds on
> it: "Announcements", "mozillaZine News", and "Mozilla Weblogs"
> (now only two). Each one is an alternate of a web page, but of
> _other_ pages (http://www.mozilla.org/news.html,
> http://www.mozillazine.org/, and http://planet.mozilla.org/
> respectively), not the mozilla.org
> front page. The last few headlines for each feed are listed on
> the front page, and the designer felt it was appropriate for
> autodiscovery to work on this page -- but it is not appropriate
> for rel="alternate" to be used for those autodiscovery links.
> They are not alternate representations of the front page.

These other feeds are suggestion/blogroll cases.

>
> Here's another example:
> LiveJournal creates a "Friends" page, where it aggregates the
> blogs of all the users you've designated as "friends". It could
> create an Atom feed representing this aggregation, and mark that
> as rel="alternate". 

Actually, a patch was just committed to do this ;)

> What could also be useful, however, would be
> linking to each of these blogs' feeds individually as well so
> that they're represented individually in my aggregator and it
> can aggregate them itself. Unlike the pre-aggregated feed,
> however, these are not alternate representations of the Friends
> page, and shouldn't be marked as such.

I think this is a suggestion/blogroll case.

>
> Making it possible for pages to link to non-alternate autodiscoverable
> feeds without using rel="alternate" -- and encouraging this practice --
> would make it possible for UAs to actually /discriminate/ between
> alternate and non-alternate feeds. Right now they can't, because
> everything is indiscriminately marked as "alternate".

>
> ~fantasai

I've basically concluded that the keys to autodiscovery of feeds, in the
general sense, should not be three (rel, type, and href), but two (type
and href). Type is plenty of specification that it's a feed. Claiming
its relationship as "feed" doesn't seem correct. There are a few
mime-types used, and the one for atom (application/atom+xml) will be an
official standard as soon as the draft is accepted by the IETF.

The value of rel, if present, will vary based on the relationship:
* the feed for *this* page - rel="alternate"
* the main feed for this blog, in general - rel="home"
* other feeds the author reads or recommends - rel="suggested"
* any other feeds linked to for any reason at all - no rel, just the
type and href
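
The two-key idea above (type and href, with rel as an optional qualifier)
can be sketched roughly as follows. This is only an illustration, not part
of any proposal: the set of feed media types, the function names, and the
choice to scan both link and a elements are assumptions made for the example.

```python
from html.parser import HTMLParser

# Assumption: these media types count as "feed" types for illustration.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkParser(HTMLParser):
    """Collects (rels, href) for every <link> or <a> whose type names a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attr = dict(attrs)
        media_type = (attr.get("type") or "").strip().lower()
        href = attr.get("href")
        # The keys are type and href; an element lacking either is skipped.
        if media_type not in FEED_TYPES or not href:
            return
        # rel is a space-separated list of link types and may be absent.
        rels = tuple((attr.get("rel") or "").lower().split())
        self.feeds.append((rels, href))


def discover_feeds(html):
    """Return a list of (rels, href) pairs for all feed links in the page."""
    parser = FeedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.feeds


def feeds_with_rel(feeds, rel):
    """Filter discovered feeds down to those carrying the given rel value."""
    return [href for rels, href in feeds if rel in rels]
```

A client that only wants the feed for the current page would call
feeds_with_rel(feeds, "alternate"); one that wants everything just uses the
full list, including links with no rel at all.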

Is this acceptable? I'm not completely happy with "home" and "suggested"
because they're not specified as link types in the HTML specs [1].
Sadly, it seems the HTML authors didn't consider these cases. "home"
seems to be an informal standard. Close matches in the HTML list are
"index", "contents", and "start". All of these are inaccurate, but I
think "contents" is the best fit.

"suggested" is just my own idea. I mentioned the rel="blogroll" before,
but that seems overly specific. "bookmark" seems to be the closest match
in the HTML list. Not in the way it's defined in the list, but the way
people usually think of it. I'm not sure what the heck the HTML spec is
indicating with:
"Refers to a bookmark. A bookmark is a link to a key entry point within
an extended document. The title attribute may be used, for example, to
label the bookmark. Note that several bookmarks may be defined in each
document."
That definition makes it a close match to "home," I suppose. Really, the
definition there is so vague that it's useless.

I can think of a couple other cases:
- Comment feeds, which are only generated by a few pieces of software so
far. These are close to, but not quite, alternate. They're usually
missing the entry itself, from what I understand. I think more work
needs to be done with comment feeds in general before we

Re: Autodiscovery

2005-05-05 Thread Sjoerd Visscher

Nikolas 'Atrus' Coukouma wrote:

> Sjoerd Visscher wrote:
>
> > Why not support hyperlinks too?
> > So besides:
> >
> > also:
> > Main Atom feed
> > Most webpages already have a hyperlink to the feed, so they'd only
> > need to add two attributes. It would be a waste to have to duplicate
> > the information in the document head.
>
> The intent of the head element is to indicate a feed that serves as a
> substitute for the page you're currently viewing.
> This other case is locating all feeds linked to from a page. For that,
> the type attribute should be sufficient to indicate that you're linking
> to a feed.

No, a hyperlink with a rel attribute means the same as a link element.
The HTML spec describes the rel attribute on the a element thus:

This attribute describes the relationship from the current document to 
the anchor specified by the href attribute. The value of this attribute 
is a space-separated list of link types.

--
Sjoerd Visscher
http://w3future.com/weblog/

RE: Autodiscovery, real-world examples

2005-05-05 Thread Bob Wyman


fantasai wrote:
> Making it possible for pages to link to non-alternate 
> autodiscoverable feeds without using rel="alternate" -- and 
> encouraging this practice -- would make it possible for UAs to 
> actually /discriminate/ between alternate and non-alternate feeds.
> Right now they can't, because everything is indiscriminately marked 
> as "alternate".
+1. Being able to distinguish between "alternates" for the current
page and "just other feeds" that are linked to from the page would be very
useful. Also, in the case where there are multiple real alternates to the
page, it would be useful to be able to mark which feed is "preferred." My
concern here is the transition between Atom V0.3 and Atom V1.0. A page might
link to feeds in both formats (as well as RSS, RDF, etc.) but it would be
good to know which of these feeds is considered the "preferred" feed by the
producer. In this way, people could migrate off the older feeds and one day
we'd actually be able to stop producing multiple feeds on each site.
We should also consider providing such "preferred" links in Atom,
RSS, RDF, etc. feeds. I'd like to be able to publish something in my Atom
0.3 feeds that tells people "Don't keep reading this feed. Read the Atom 1.0
feed instead..."

bob wyman

Re: Autodiscovery

2005-05-05 Thread Nikolas 'Atrus' Coukouma


Sjoerd Visscher wrote:

>
> Why not support hyperlinks too?
>
> So besides:
>
> 
>
> also:
>
>  href="/xml/index.atom">Main Atom feed
>
> Most webpages already have a hyperlink to the feed, so they'd only
> need to add two attributes. It would be a waste to have to duplicate
> the information in the document head.
>
The intent of the head element is to indicate a feed that serves as a
substitute for the page you're currently viewing.

This other case is locating all feeds linked to from a page. For that,
the type attribute should be sufficient to indicate that you're linking
to a feed.

-Nikolas 'Atrus' Coukouma


Re: Autodiscovery

2005-05-05 Thread Sjoerd Visscher

Why not support hyperlinks too?
So besides:


also:
Main Atom feed

Most webpages already have a hyperlink to the feed, so they'd only need 
to add two attributes. It would be a waste to have to duplicate the 
information in the document head.

--
Sjoerd Visscher
http://w3future.com/weblog/

Re: Atom feed refresh rates

2005-05-05 Thread Mark Pilgrim

On 5/5/05, John Panzer <[EMAIL PROTECTED]> wrote:
> I assume an HTTP Expires header for Atom content will work and play well
> with caches such as the Google Accelerator
> (http://webaccelerator.google.com/).  I'd also guess that a syntax-level
> tag won't.  Is this important?

Yes, and yes.  This is exactly the sort of software that we're talking
about when we say that HTTP's native caching mechanism is widely
supported.  All the proxies in the world (which is what Google's Web
Accelerator is, except it runs on your own machine and listens on port
9100) are able to reduce network traffic and therefore make the end
user's experience faster because they understand and respect the HTTP
caching mechanism.  (Google Web Accelerator does other things too,
like proxying requests through Google's servers.  And what are those
servers running?  Another caching HTTP proxy.)  Many ISPs do this at
the ISP level, both to reduce their own upstream bandwidth costs and
to make their end users happier.  Many corporations do this as well (I
would bet good money that IBM does it).  At one time, I even had Squid
installed on my home network to do this. 

HTTP caching works.
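
As a rough illustration of what a cache or feed reader does with the Expires
header, here is a minimal sketch. The function name is_fresh is made up for
this example, and a real cache also has to consider Cache-Control, which this
sketch ignores.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header, now=None):
    """Return True if a cached copy with this Expires value is still fresh."""
    now = now or datetime.now(timezone.utc)
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # HTTP says an invalid date (including "0") means already expired.
        return False
    if expires.tzinfo is None:
        # Assumption: treat a date with no usable zone as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires
```

A reader that gets True can skip the network fetch entirely; on False it
would refetch (ideally with a conditional GET).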

> The HTML solution for people who could not implement Expires: seems to
> be META tags with in theory equivalent information.  Though in practice
> the whole thing is a mess, this seems like a conceptually simple
> workaround.  Is there something obviously wrong with it?

Other than being a God-awful mess?  No, there's nothing wrong with it. ;)

-- 
Cheers,
-Mark

Re: PaceAllowDuplicateIDs

2005-05-05 Thread John Panzer

Graham wrote:

> On 5 May 2005, at 5:38 pm, Eric Scheid wrote:
>
> > Many wikis offer options in displaying their change log with either
> > most recent changes only, or all changes. Both models are commonly
> > supported because some people want to see notifications of all
> > changes, while others just want to see the most recent change. That
> > is part of wiki culture, all the way back to Ward's wiki.
>
> OK that makes sense. I still think it's the wrong way to model a
> change log as a feed.
>
> My other two criticisms still stand:
> "atom:updated is used by the publisher to show what they consider a
> significant change. The user, on the other hand, probably wants to
> see the latest version, reliably, even if the publisher disagrees
> that the change was significant. This is the core problem with Tim's
> proposal. There is no way to create an aggregator that works in the
> way the user expects."

Just a thought:  On the other hand, perhaps this is an opportunity to 
operationally define "significant change":  A change which results in a 
new version being exposed on one's feed.  If you think your users would 
care about seeing the change, then change the atom:updated field and 
'republish' by adding to the feed.  If not, just change your content and 
don't republish.

Examples of changes that might not merit republishing:  fixing irrelevant
typos, changing character set encodings, or changing formatting to match a
new style guide.

-John

Re: PaceTextShouldBeProvided

2005-05-05 Thread Robert Sayre


On 5/5/05, Graham <[EMAIL PROTECTED]> wrote:
> On 5 May 2005, at 6:23 pm, Robert Sayre wrote:
> 
> > It would be deeply bogus to accept a Pace whose sole action was to
> > remove a normative requirement, and simultaneously accept a Pace that
> > puts it back in. Seems obvious to me.
> 
> Not really. Assuming PaceOptionalSummary is accepted, there are two
> completely valid outcomes:
> 
> 1. PaceTextShouldBeProvided rejected => summaries are not required,
> and textual content is not encouraged
> 2. PaceTextShouldBeProvided accepted => summaries are not required,
> but textual content is encouraged
> 
> I don't see a conflict there. What's wrong with accepting two similar
> paces because one corrects the flaws in the other?

Graham, that's just not true. It wasn't called
PaceSummariesAreNotRequired, was it? It materially changes the only
action PaceOptionalSummary takes. They are not compatible.

In fact, let's get the chairs to clarify this.

> > So, we're looking for some way to say "provide as much information as
> > you can." The problem with saying SHOULD is that we purport to know
> > how much information the publisher can provide. It would be very easy
> > to explain this issue in the spec, and I have no objection to doing
> > so.
> 
> SHOULD here means "must unless you absolutely can't". 

 We've covered lots of perfectly valid reasons not to include a
summary, and we've heard from implementors that actually prefer its
absence.

Robert Sayre

Re: Atom feed refresh rates

2005-05-05 Thread John Panzer

I assume an HTTP Expires header for Atom content will work and play well 
with caches such as the Google Accelerator 
(http://webaccelerator.google.com/).  I'd also guess that a syntax-level 
tag won't.  Is this important? 

The HTML solution for people who could not implement Expires: seems to 
be META tags with in theory equivalent information.  Though in practice 
the whole thing is a mess, this seems like a conceptually simple 
workaround.  Is there something obviously wrong with it?

-John

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 19:23, Graham wrote:

> On 5 May 2005, at 5:38 pm, Eric Scheid wrote:
>
> > Many wikis offer options in displaying their change log with either
> > most recent changes only, or all changes. Both models are commonly
> > supported because some people want to see notifications of all
> > changes, while others just want to see the most recent change. That
> > is part of wiki culture, all the way back to Ward's wiki.
>
> OK that makes sense. I still think it's the wrong way to model a
> change log as a feed.

Cool. Rational argument does sometimes work :-)

> My other two criticisms still stand:

What was wrong with the arguments I gave?

> "atom:updated is used by the publisher to show what they consider a
> significant change. The user, on the other hand, probably wants to
> see the latest version, reliably, even if the publisher disagrees
> that the change was significant. This is the core problem with
> Tim's proposal. There is no way to create an aggregator that works
> in the way the user expects."

I think this is relatively simple. If the publisher publishes a feed  
document with two entries containing the same id and the same  
atom:updated timestamp then he is claiming that there are no  
significant changes between them. In fact the same is true if he  
publishes those two entries in two different feed documents. So there  
really is no problem here that is particular to PaceAllowDuplicateIDs.

If the feed reader wants reliability he can't get more reliability by  
making the spec tighter
than the feed producer is able to give. If the feed producer is  
unreliable, then nothing in the
spec can change that.

The feed producer must understand that creating feeds where the
situation described above holds will lead to erratic behavior. If there
is a significant difference between those entries then he had better
change the time stamp. Otherwise his unreliable behavior will create
unreliable effects. Nothing new here.

> "Finally, at pubsub, what happens when they download an entry from
> one feed, then the user edits it, but doesn't modify atom:updated,
> then they download the new entry from a second feed associated with
> the site? Different content, identical atom:ids, identical
> atom:updated => Invalid feed. They're not in any better position
> than they were before. This doesn't even solve the problem it's
> meant to."

I imagine this is simple, but I will leave the full details for Bob  
to answer. My guess is that
if Bob comes across the above situation, then he will be confronted  
by a situation where he has
according to the publisher two entries that are not significantly  
different. He can therefore choose
between them randomly, and just add one of them. The publisher after  
all thinks that there are
no significant differences between them. If the initial publisher  
thought that there were significant differences between these  
entries, then he should have given them different time
stamps.


> Basically, atom:updated doesn't properly differentiate versions,
> and the way atom:updated is being used by the proposal doesn't gel
> with the actual spec of the element.

I think the above argument shows that there is in fact nothing wrong  
with atom:updated.
We just have to think in terms of interoperability and not in terms  
of Platonic forms.

Henry Story
http://bblfish.net/blog/

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham

On 5 May 2005, at 5:38 pm, Eric Scheid wrote:

> Many wikis offer options in displaying their change log with either
> most recent changes only, or all changes. Both models are commonly
> supported because some people want to see notifications of all
> changes, while others just want to see the most recent change. That
> is part of wiki culture, all the way back to Ward's wiki.

OK that makes sense. I still think it's the wrong way to model a
change log as a feed.

My other two criticisms still stand:
"atom:updated is used by the publisher to show what they consider a  
significant change. The user, on the other hand, probably wants to  
see the latest version, reliably, even if the publisher disagrees  
that the change was significant. This is the core problem with Tim's  
proposal. There is no way to create an aggregator that works in the  
way the user expects."

"Finally, at pubsub, what happens when they download an entry from  
one feed, then the user edits it, but doesn't modify atom:updated,  
then they download the new entry from a second feed associated with  
the site? Different content, identical atom:ids, identical  
atom:updated => Invalid feed. They're not in any better position than  
they were before. This doesn't even solve the problem it's meant to."

Basically, atom:updated doesn't properly differentiate versions, and  
the way atom:updated is being used by the proposal doesn't gel with  
the actual spec of the element.

Graham

Re: PaceTextShouldBeProvided

2005-05-05 Thread Robert Sayre

On 5/5/05, Sam Ruby <[EMAIL PROTECTED]> wrote:
>
> > This Pace is incompatible with PaceOptionalSummary and incomplete. -1.
> 
> Something a little less curt would be appreciated.
> 
> The stated abstract of PaceOptionalSummary (i.e., "removing the
> requirement for ") is met.  In your mind, this equates to
> completely optional.  That has yet to be conclusively established.

Well, consider the name of the Pace, and then consider this sentence:

5. MAY   This word, or the adjective "OPTIONAL", mean that an item is
truly optional.

It would be deeply bogus to accept a Pace whose sole action was to
remove a normative requirement, and simultaneously accept a Pace that
puts it back in. Seems obvious to me.

> What concerns me more, however, are the interoperability issues that
> PaceOptionalSummary not only creates, but also uncovered during its
> discussion.

We know exactly what issues optional content has, because all of the
other formats have it.

> Unless there is some plan for addressing these interoperability issues
> (and by that, I mean something more constructive than "That's fine, but
> we're not here to tailor the format to your app."), then perhaps BOTH
> paces are incomplete.

Let's enumerate the issues, rather than insist they exist. Frankly, I
seriously doubt that anyone with customers will outright reject a
title-only feed.

> There are a number of ways to finesse the identification of the issue
> into the spec.  For example, take a look at how Tim worded
> PaceAllowDuplicateIDs.  Producers are effectively put on notice
> that if they include multiple entries with the same ID, that some or all
> of them may be ignored.

I don't think PaceAllowDuplicateIDs successfully finessed the issue.

> How should we convey a similar sentiment about the reality that entries
> without a readily available textual representation may suffer the same fate?

So, we're looking for some way to say "provide as much information as
you can." The problem with saying SHOULD is that we purport to know
how much information the publisher can provide. It would be very easy
to explain this issue in the spec, and I have no objection to doing
so. Why do we need the RFC2119 words?

Robert Sayre

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 08:44  AM, Antone Roundy wrote:

> If we accept this Pace, are we going to do anything to address the DOS
> issue for aggregated feeds?

Bob, if I may direct a few questions to you, since you have the most
experience with this issue: if PaceAllowDuplicateIDs is adopted, how 
would you anticipate that PubSub would go about handling entries with 
the same atom:id coming from different feeds?  What if each appears to 
be claiming to be the original feed for the entry?  What if both are 
getting aggregated into the same feed, but your system doesn't think 
they're really the same entry?

I'm in favor of the Pace, as far as it goes, but was surprised to see 
that it doesn't talk about these issues, given that it was motivated by 
a conversation with you.

Re: PaceTextShouldBeProvided

2005-05-05 Thread Sam Ruby

Robert Sayre wrote:

> On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
>
> > +1 except for one thing: In section 4.1.2, I'd suggest something along
> > these lines:
> > atom:entry elements which do not contain an atom:content element, or
> > whose atom:content element's type attribute indicates a MIME media
> > type, SHOULD contain an atom:summary element.
>
> This Pace is incompatible with PaceOptionalSummary and incomplete. -1.

Something a little less curt would be appreciated.

The stated abstract of PaceOptionalSummary (i.e., "removing the 
requirement for ") is met.  In your mind, this equates to 
completely optional.  That has yet to be conclusively established.

What concerns me more, however, are the interoperability issues that
PaceOptionalSummary not only creates, but also uncovered during its
discussion.

Unless there is some plan for addressing these interoperability issues 
(and by that, I mean something more constructive than "That's fine, but 
we're not here to tailor the format to your app."), then perhaps BOTH 
paces are incomplete.

There are a number of ways to finesse the identification of the issue 
into the spec.  For example, take a look at how Tim worded 
PaceAllowDuplicateIDs.  Producers are effectively put on notice
that if they include multiple entries with the same ID, that some or all 
of them may be ignored.

How should we convey a similar sentiment about the reality that entries 
without a readily available textual representation may suffer the same fate?

- Sam Ruby

Re: PaceOriginalAttribute

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 10:14  AM, Robert Sayre wrote:

> On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
>
> > It may help people avoid
> > accidentally generating invalid feeds (if we stick to not allowing
> > duplication of atom:id within a feed), but it does it by simply
> > shunting the issue off into a different element which doesn't have
> > duplication constraints.
>
> Incorrect. Think harder about what PubSub services do. They take an
> entry, and munge it (people like that). They move the feed data to
> atom:source, and probably add their own extension elements to it. I
> think they are "forwarding" a message. My proposal preserves the
> identity of the original message, while requiring the service to mint
> an identifier for its forwarded message.

Okay, the forwarded message has its own all-in-one-element identifier.
This sounds useful...except that someone may accidentally or
intentionally duplicate that identifier too.

> > It doesn't address the DOS problem--neither
> > accidental nor intentional.
>
> Oh yes it does. Each entry's provenance is documented. The data format
> accurately states that the intermediary has munged the original entry.

Maybe I am missing something, but if so...well, by definition, I'm not 
seeing it.  Let's look at an example.  Say your aggregator sees these 
in different feeds:


foo:foo

foo:bar
I'm the real thing



bar:bar

bar:foo

foo:foo
foo:bar

I may be an imposter


Do you display one or both?  How would your decision making process 
differ from if you were to see the following in the second case?


bar:bar

foo:bar

foo:foo

I may be an imposter


And what if we added a third case?

qwerty:bar

bar:foo

foo:foo
foo:bar

I'm definitely an imposter


> > And it doesn't make it any easier to
> > determine whether or not entries in different feeds with the same
> > atom:id are really the same entry or not.  In fact, it just
> > complicates the task by requiring the inspection of two elements
> > instead of one.
>
> Incorrect. What it does is explicitly state that two different feeds
> think they are forwarding the same entry.

Yeah, they think they are, or at least claim to think so.  But isn't 
that the same thing that is stated if you see the following in two 
feeds?


bar:bar

foo:bar

foo:foo

I may be an imposter


This says that this feed is (or at least claims it is) forwarding the 
entry with the id "foo:bar" from the feed "foo:foo".

I am honestly trying to see more in this, but as yet, I don't.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 6/5/05 1:53 AM, "Graham" <[EMAIL PROTECTED]> wrote:

>> This proposal permits this, and it does not harm anyone else.
> 
> It harms everyone, by allowing a second, unrelated data model in Atom
> feeds. They may not be posting today, but I assure you, when other
> aggregator authors get the first user complaints about how Eric's
> wiki log displays incompletely in their program, they'll forgive Dave
> Winer everything.

Many wikis offer options in displaying their change log with either most
recent changes only, or all changes. Both models are commonly supported
because some people want to see notifications of all changes, while others
just want to see the most recent change. That is part of wiki culture, all
the way back to Ward's wiki.

It wouldn't be surprising to find the same options made available for wiki
logs in RSS. Hey, here's one right now:

http://www.intertwingly.net/wiki/pie/RecentChanges?action=rss_rc

Apparently, if you add an "unique=1" URL parameter you get a list of changes
where page names are unique, i.e. where only the latest change of each page
is reflected.

e.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 17:53, Graham wrote:

> On 5 May 2005, at 4:22 pm, Henry Story wrote:
>
> > If you don't want to keep a history of the entries all you need to
> > do is drop all but the latest entry with the same id. There is
> > nothing more to it. Just show the user the last one you came across.
>
> But, if we follow Eric's model of how a wiki changelog should be
> defined, I'll be missing entries in the log, because several
> different entries have the same id. Ergo, the user interface and
> data model for the new type of feed this proposal permits is very
> different.

If your tool (Is it Shrook2? [1]) only shows people the latest  
version available to you of
an entry, then by showing them only the latest version, Shrook2 will  
be giving the user what
he is expecting.

When your news reader currently reads feeds on the internet what does  
it do
with changed entries? Either it keeps the older version around, for  
the user to browse, or it
does not. If your users don't mind you throwing away the older  
versions of an entry, then
they won't mind you throwing away the older versions of the above  
entries either. There is no
difference in the behavior between allowing changed entries across  
feed documents and changed
entries inside a feed document. People who place two entries with the  
same id inside a feed
document should be aware that tools like yours will have the behavior  
they do, and that this is
ok.

Other people may be interested in looking at things historically.  
They will get a historical
viewer and be happy with it.

I think the current proposal is good exactly because it allows the  
wiki people to express
what they want to express correctly. Namely how their wiki entry is  
changing over time.

> > This proposal permits this, and it does not harm anyone else.
>
> It harms everyone, by allowing a second, unrelated data model in
> Atom feeds. They may not be posting today, but I assure you, when
> other aggregator authors get the first user complaints about how
> Eric's wiki log displays incompletely in their program, they'll
> forgive Dave Winer everything.

Again, has anyone yet complained to you that you have not kept a  
historical and browse-able track
record of how the entries Shrook2  is looking at have changed over  
time? Clearly they could,
as you sometimes let them know that an entry that they already have  
read has been updated. They
could ask you what the changes were, no? How it changed, etc.

If your users don't care that much about the history of an entry,  
then you can dump all but the
latest entry. Or you could just keep the last two entries, so that  
you can show them a diff.
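
To make the "dump all but the latest entry" behaviour concrete, here is a
minimal sketch. It assumes entries have already been parsed into dicts with
hypothetical keys "id" and "updated" (a datetime); it is not taken from
Shrook or any other reader.

```python
from datetime import datetime


def latest_per_id(entries):
    """Keep one entry per atom:id, preferring the greatest atom:updated.

    `entries` is an iterable of dicts with hypothetical keys "id" and
    "updated" (a datetime). On a tie, the entry seen later wins.
    """
    latest = {}
    for entry in entries:
        current = latest.get(entry["id"])
        if current is None or entry["updated"] >= current["updated"]:
            latest[entry["id"]] = entry
    return list(latest.values())
```

The same function works whether the duplicates arrive in one feed document
or across several polls of the feed, which is the point being argued here.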

> Graham

HJStory
http://bblfish.net/blog/
[1] http://www.fondantfancies.com/apps/shrook/

Re: Atom feed refresh rates

2005-05-05 Thread Mark Pilgrim


On 5/5/05, Dan Brickley <[EMAIL PROTECTED]> wrote:
> [googles a bit] OK it looks like Gnutella also uses HTTP for the file
> download part of its protocol, fwiw. (including Range: header)
> http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf

You mean RSS's  element is even more useless than I thought?  I
didn't think that was possible.

-- 
Cheers,
-Mark

Re: Atom feed refresh rates

2005-05-05 Thread Dan Brickley

* Henri Sivonen <[EMAIL PROTECTED]> [2005-05-05 18:35+0300]
> 
> On May 5, 2005, at 16:24, Walter Underwood wrote:
> 
> >--On May 5, 2005 8:07:15 AM -0500 Mark Pilgrim <[EMAIL PROTECTED]> 
> >wrote:
> >>
> >>Not to be flippant, but we have one that's widely available.  It's
> >>called the Expires header.
> >
> >You need the information outside of HTTP. To quote from the RSS spec
> >for ttl:
> >
> >  This makes it possible for RSS sources to be managed by a 
> >file-sharing
> >  network such as Gnutella.
> >
> >Caching information is about knowing when your client cache is stale,
> >regardless of how you got the feed.
> 
> Virtually everyone with IP connectivity can do HTTP, and HTTP has the 
> Expires header. If this feature is important to you, why would you 
> switch to a transfer protocol that doesn't have the feature? (I am not 
> claiming anything about the actual Gnutella features.) To me, the "what 
> if the feed is not over HTTP" argumentation seems theoretical 
> over-generalization.

+1 

FWIW various P2P/filesharing protocols use HTTP, eg. Kazaa and others make 
use of HTTP's ability to request a byte range, handy if you're
requesting chunks of the same file from different servers. Those who
care to have HTTP header semantics show up in other environments can 
do various things (eg. reflect into an XML namespace). But it doesn't 
seem to me to be core business of the AtomPub WG to do this work...

[googles a bit] OK it looks like Gnutella also uses HTTP for the file 
download part of its protocol, fwiw. (including Range: header)
http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf

Dan

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story

Hi Dave,
 nice to see you participate here. I understand your points, and  
I myself thought the
way you did for a while.

[Oops, I see now that you have retracted your point. Oh well. I had  
already started writing
the following]

On 5 May 2005, at 17:27, David M Johnson wrote:

> I'm -1 on PaceAllowDuplicateIDs

Please consider the following points before you vote.

> Reasons:
> 1) We're supposed to be standardizing current practice not
> inventing new things. Current best practice is to have unique IDs
> and current software (e.g. Javablogs.com) is predicated on this
> practice. I know, this practice is not followed widely enough, but
> that is another matter.

Atom is standardizing current practice, but it is also adding some
features, for example namespaces and ids. The Atom charter also
requires us to allow archives:

[[
  * a complete archive of all entries in a feed
 ]]
Graham himself thinks that archives are possible, since he supports  
the use of an
 head element.

> 2) I think it is *much* more useful to think of an Atom Entry as an
> event that occurred at a specific time. Typically, an event is the
> publication of an article or blog entry on the web. For example:
>
>    event: CNET published article
>    subject: CNET
>    object: article
>
> But an entry could also represent other events.
>
>    event: delivery van delivers package
>    subject: delivery van
>    object: package
>
>    event: alarm system sends warning
>    subject: alarm system
>    object: warning
>
>    event: server sends load warning
>    subject: server
>    object: load warning
>
> If you think of Atom Entries as events, then it makes sense to
> consider the Atom Entry ID to be the ID of the event, not the ID of
> the subject or object of the event.

You are right. There are two types of objects that we need to think  
about:
   A- the event/state of a resource at a particular time
   B- the thing that makes these different states the state of the  
same thing

Clearly we need (B) or else all the talk about an entry changing over  
time (atom:updated)
would not make sense.

So let us start off, as I did a long time ago, by thinking that the
id of an entry
uniquely identifies the event/state of the entry. For every id there
can be one and
only one "" representation. That id is that
representation. It is, if you
wish, the name of a state of something else... and that would be?

I think it is clear that one of the roles of the id is to make it  
possible for an
entry to be moved from one web site to another, so that if your blog  
service provider
lets you down, you can still refer to the entry even when you have  
moved it to a
different "alternate" position. Graham has made such a point quite
often. Entries, it
has often been said, can change, but the id remains the same. I think
this is clearly
the consensus on this list. So the id URI is what identifies the  
different
"..." representations as being representations of the  
same thing.


Events are unique (you can't have more than one version of an  
event) and can be assigned GUIDs and therefore you cannot have more  
than one entry with the same ID.
yes. But I don't think that this is the consensus on this group. The  
good thing is that
you can achieve the same identification of a state through the  
combination of the id and the
modification time.
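
To make the point above concrete, here is a minimal sketch of identifying one state of an entry by the pair (id, modification time). The dictionary layout and the example tag: URI are assumptions made for illustration; they are not taken from any draft.

```python
from datetime import datetime

def state_key(entry):
    """Key for one state of an entry: the id plus its modification time."""
    # Assumption: the timestamp is an ISO 8601 string with a UTC offset.
    return (entry["id"], datetime.fromisoformat(entry["updated"]))

# Hypothetical data: three received copies, two of which are the same state.
entries = [
    {"id": "tag:example.org,2005:entry-1", "updated": "2005-05-03T10:00:00-05:00"},
    {"id": "tag:example.org,2005:entry-1", "updated": "2005-05-03T11:00:00-05:00"},
    {"id": "tag:example.org,2005:entry-1", "updated": "2005-05-03T10:00:00-05:00"},
]

# Same id throughout, but only two distinct states.
distinct_states = {state_key(e) for e in entries}
```

The id alone says which thing an entry is about; the pair says which version of that thing was received.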

[here I noticed that you had changed your mind, anyway. I think I had  
exactly the same
thought as you did when I first started thinking about this. ]


In the case of earthquake data, each new data report is a new event.
   event: agency reports earthquake data
   subject: agency
   object: earthquake data
The ID is the ID of the "data reported" event not the ID of the  
earthquake.

We don't know what subjects and objects people are going to use in  
the future, so we can't specify Atom elements or IDs for subjects  
and objects -- that's what extensions are for. If you want to  
create a feed to syndicate information about earthquakes, then you  
introduce an extension for uniquely identifying earthquakes. The  
same goes for earthquakes.

- Dave

Re: PaceOriginalAttribute

2005-05-05 Thread Robert Sayre

On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
> 
> -1.
> 
> I don't see that this solves any problem.  

I suggest you reread it. Your analysis is deeply flawed.

> It may help people avoid
> accidentally generating invalid feeds (if we stick to not allowing
> duplication of atom:id within a feed), but it does it by simply
> shunting the issue off into a different element which doesn't have
> duplication constraints. 

Incorrect. Think harder about what PubSub services do. They take an
entry, and munge it (people like that). They move the feed data to
atom:source, and probably add their own extension elements to it. I
think they are "forwarding" a message. My proposal preserves the
identity of the original message, while requiring the service to mint
an identifier for its forwarded message.
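
A rough sketch of the "forwarding" idea described above, under stated assumptions: entries are plain dictionaries, and the field names "original" and "source" are placeholders chosen for this illustration. The actual syntax proposed in PaceOriginalAttribute is not quoted in this thread, so nothing below should be read as the Pace's wording.

```python
import uuid

def forward(entry, feed_meta):
    """Return a forwarded copy of entry with a freshly minted id.

    Hypothetical model: the intermediary keeps the original id in an
    'original' field and the originating feed's data in 'source'.
    """
    forwarded = dict(entry)
    forwarded["id"] = "urn:uuid:" + str(uuid.uuid4())  # minted by the service
    forwarded["original"] = entry["id"]                # identity of the original message
    forwarded["source"] = dict(feed_meta)              # feed data moved into the entry
    return forwarded

# Hypothetical input.
original = {"id": "tag:example.org,2005:post-7", "title": "Hello"}
copy = forward(original, {"title": "Example Blog"})
```

Two services forwarding the same entry would then produce different ids that share one "original" value.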

> It doesn't address the DOS problem--neither
> accidental nor intentional.  

Oh yes it does. Each entry's provenance is documented. The data format
accurately states that the intermediary has munged the original entry.

> And it doesn't make it any easier to
> determine whether or not entries in different feeds with the same
> atom:id are really the same entry or not.  In fact, it just complicates
> the task by requiring the inspection of two elements instead of one.

Incorrect. What it does is explicitly state that two different feeds
think they are forwarding the same entry.

Robert Sayre

Re: Atom feed refresh rates

2005-05-05 Thread Tim Bray


Warning: we are into the end-game.  What really counts is the set of 
outstanding Paces.  When Paul and I are going through the list to 
figure out consensus calls, emails that don't have a Pace in the 
Subject line are apt to get ignored.

So I'm not sure this endless thread entitled "feed refresh rates" is 
doing anyone any good unless it can coalesce around a Pace. -Tim

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham

On 5 May 2005, at 4:22 pm, Henry Story wrote:
If you don't want to keep a history of the entries all you need to  
do is drop all but the
latest entry with the same id. There is nothing more to it. Just  
show the user the last
one you came across.
But, if we follow Eric's model of how a wiki changelog should be  
defined, I'll be missing entries in the log, because several  
different entries have the same id. Ergo, the user interface and data  
model for the new type of feed this proposal permits is very different.

This proposal permits this, and it does not harm anyone else.
It harms everyone, by allowing a second, unrelated data model in Atom  
feeds. They may not be posting today, but I assure you, when other  
aggregator authors get the first user complaints about how Eric's  
wiki log displays incompletely in their program, they'll forgive Dave  
Winer everything.

Graham

Re: PaceRecommendPlainTextContent

2005-05-05 Thread Eric Scheid


-1 also.

> Abstract
>
> Text containers and content blocks specifically may contain rich-text, which
> must be down-stripped by more basic aggregators. Simply removing tags from
> X/HTML streams can however easily truncate meaning as well.

I am subscribed to multiple feeds which already down-strip their X/HTML
content for their feeds, and definitely agree that such down-stripping does
clobber a lot of meaning.

This pace, if the recommendation takes effect, would result in yet more
publishers down-stripping their rich content at the publishing end. Let a
thousand buggy implementations bloom :-(

I say leave the down-stripping in the aggregator - that way if it really is
awful the user can choose a different aggregator.


> The examples are of course construed. Most blogging software doesn't emit such
> exotic HTML features.

A far more common example (ie. less exotic html) is where  or
 get stripped. Reading the stripped-text version of that can be quite
confusing.

e.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Dave Johnson

Immediately after sending this message, I had a rush of second 
thoughts.

My point #2 is not very well thought out. I think it applies for things 
like earthquake data, but when Atom feeds represent blog entries or 
articles (in an archive or an Atom Protocol feed) the ID represents 
the article not an event in the blog entry's life.  So, you can 
discount my second reason against the pace.

- Dave

On May 5, 2005, at 11:27 AM, David M Johnson wrote:
I'm -1 on PaceAllowDuplicateIDs
Reasons:
1) We're supposed to be standardizing current practice not inventing 
new things. Current best practice is to have unique IDs and current 
software (e.g. Javablogs.com) is predicated on this practice. I know, 
this practice is not followed widely enough, but that is another 
matter.

2) I think it is *much* more useful to think of an Atom Entry as an 
event that occurred at a specific time. Typically, an event is the 
publication of an article or blog entry on the web. For example:

   event: CNET published article
   subject: CNET
   object: article
But it could also represent other events.
   event: delivery van delivers package
   subject: delivery van
   object: package
   event: alarm system sends warning
   subject: alarm system
   object: warning
   event: server sends load warning
   subject: server
   object: load warning
If you think of Atom Entries as events, then it makes sense to 
consider the Atom Entry ID to be the ID of the event, not the ID of 
the subject or object of the event. Events are unique (you can't have 
more than one version of an event) and can be assigned GUIDs and 
therefore you cannot have more than one entry with the same ID.

In the case of earthquake data, each new data report is a new event.
   event: agency reports earthquake data
   subject: agency
   object: earthquake data
The ID is the ID of the "data reported" event not the ID of the 
earthquake.

We don't know what subjects and objects people are going to use in the 
future, so we can't specify Atom elements or IDs for subjects and 
objects -- that's what extensions are for. If you want to create a 
feed to syndicate information about earthquakes, then you introduce an 
extension for uniquely identifying earthquakes. The same goes for 
earthquakes.

- Dave

On May 5, 2005, at 12:02 AM, Tim Bray wrote:

http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
This Pace was motivated by a talk I had with Bob Wyman today about 
the problems the synthofeed-generator community has.

Summary:
1. There are multiple plausible use-cases for feeds with duplicate IDs
2. Pro and Contra
3. Alternate Paces
4. Details about this Pace
1. Use-Cases
Here's a stream of stock-market quotes.
My Portfolio
 
 MSFT
  2005-05-03T10:00:00-05:00
  Bid: 25.20 Ask: 25.50 Last: 25.20
  
 MSFT
  2005-05-03T11:00:00-05:00
  Bid: 25.15 Ask: 25.25 Last: 25.20
  
 MSFT
  2005-05-03T12:00:00-05:00
  Bid: 25.10 Ask: 25.15 Last: 25.10
  

You could also imagine a stream of weather readings.  Bob's actual 
here-and-now today use-case from PubSub is earthquakes, an entry 
describes an earthquake and they keep re-issuing it as new info about 
strength/location comes in.

Some people only care about the most recent version of the entry, 
others might want to see all of them.  Basically, each atom:entry 
element describes the same Entry, only at a different point in time.

You could argue that in some cases, these are representations of the 
Web resources identified by the atom:id URI, but I don't think we 
need to say that explicitly.

Yes, you could think of alternate ways of representing stock quotes 
or any of the other use-cases but this is simple and direct and 
idiomatic.

2. Pro and Contra
Given that I issued the consensus call rejecting the last attempt to 
do this, which was  PaceRepeatIdInDocument, I felt nervous about 
revisiting the issue.  So I went and reviewed the discussion around 
that one, which I extracted and placed at 
http://www.tbray.org/tmp/RepeatID.txt for the WG's convenience.

Reviewing that discussion, I'm actually not impressed.  There were a 
few -1's but very few actual technical arguments about why this 
shouldn't be done.  The most common was "Software will screw this 
up".  On reflection, I don't believe that.  You have a bunch of 
Entries, some of them have the same ID and are distinguished by 
datestamp.  Some software will show the latest, some will show all of 
them, the good software will allow switching back and forth.  Doesn't 
seem like rocket science to me.

So here's how I see it: there are plausible use cases for doing this, 
and one of the leading really large-scale implementors in the space 
(PubSub) wants to do this right now.  Bob's been making strong claims 
about not being able to use Atom if this restriction remains in 
place.

I believe strongly that if there's something that implementors want 
to do, standards shouldn't get in the way unless there's real 
interoperability damage.  I'm certainly pre

Re: Atom feed refresh rates

2005-05-05 Thread Henri Sivonen

On May 5, 2005, at 16:24, Walter Underwood wrote:
--On May 5, 2005 8:07:15 AM -0500 Mark Pilgrim <[EMAIL PROTECTED]> 
wrote:
Not to be flippant, but we have one that's widely available.  It's
called the Expires header.
You need the information outside of HTTP. To quote from the RSS spec
for ttl:
  This makes it possible for RSS sources to be managed by a 
file-sharing
  network such as Gnutella.

Caching information is about knowing when your client cache is stale,
regardless of how you got the feed.
Virtually everyone with IP connectivity can do HTTP, and HTTP has the 
Expires header. If this feature is important to you, why would you 
switch to a transfer protocol that doesn't have the feature? (I am not 
claiming anything about the actual Gnutella features.) To me, the "what 
if the feed is not over HTTP" argumentation seems theoretical 
over-generalization.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 09:15  AM, Eric Scheid wrote:
Have a look here:
http://en.wikipedia.org/w/index.php?title=Main_Page&action=history
There you have a reverse chrono list, each with an author, date, and
summary. Looks an awful lot like each one is an entry to me.
and looks to me like a stream of meta-data concerning the one entry.

and not distinct and separable entries like you'd find in your every 
day
blog.

Henry has the right idea -- the spec should allow both kinds, rather 
than
trying to shoe-horn everything into the one viewpoint of what is an 
entry.

+1 -- allow the publisher to decide which model fits their intent.

PaceSourceRecs

2005-05-05 Thread Antone Roundy

Looks good, but perhaps the recommendations could be slightly 
different: Start by calculating the language of the atom:feed and 
the atom:entry.  Second, if the language of atom:entry isn't the same 
as the aggregate feed, set it.  Third, if the language of atom:feed 
isn't the same as the atom:entry, set it.  Same process with the Base 
IRI.  This process would prevent unnecessary duplication.

Re: PaceTextShouldBeProvided

2005-05-05 Thread Robert Sayre


On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
> 
> +1 except for one thing: In section 4.1.2, I'd suggest something along
> these lines:
> 
> atom:entry elements which do not contain an atom:content element, or
> whose atom:content element's type attribute indicates a MIME media
> type, SHOULD contain an atom:summary element.

This Pace is incompatible with PaceOptionalSummary and incomplete. -1.

Robert Sayre

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Henry Story

Can you add PaceAlternateLinkWeakening. It was discussed but I never  
put it on
the wiki.

Henry
On 5 May 2005, at 13:17, Sam Ruby wrote:

*** REMINDER **
** Use more specific subject lines when responding to this note! **
*** REMINDER **
First the meat... here's the new atom pub issues list, conveniently  
sorted into categories:

  EntryId:
http://intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
http://intertwingly.net/wiki/pie/PaceDuplicateIDWithSource2
http://intertwingly.net/wiki/pie/PaceDuplicateIDWithSource
http://intertwingly.net/wiki/pie/PaceExplainDuplicateIds
  FeedId:
http://intertwingly.net/wiki/pie/PaceFeedIdOrAlternate
http://intertwingly.net/wiki/pie/PaceFeedIdOrSelf
http://intertwingly.net/wiki/pie/PaceOptionalFeedLink
  Provenance:
http://intertwingly.net/wiki/pie/PaceOriginalAttribute
http://intertwingly.net/wiki/pie/PaceSourceRecs
  Status:
http://intertwingly.net/wiki/pie/PaceEntryState
http://intertwingly.net/wiki/pie/PacePubControl
http://intertwingly.net/wiki/pie/PacePubStatusResource
  Text:
http://intertwingly.net/wiki/pie/PaceBriefExample
http://intertwingly.net/wiki/pie/PaceCoConstraintsAreBad
http://intertwingly.net/wiki/pie/PaceOptionalSummary
http://intertwingly.net/wiki/pie/PaceRecommendPlainTextContent
http://intertwingly.net/wiki/pie/PaceTextShouldBeProvided
  Recommended for Closure:
http://intertwingly.net/wiki/pie/PaceXhtmlDivSuggestedOnly
http://intertwingly.net/wiki/pie/PaceXmlContentWrapper
Now for some administrivia.  No progress was made on the last  
published issues list[1], so I've gone ahead and marked those  
issues that were recommended for closure, closed; and those  
currently under discussion were moved back to Needing to Revisit.

Next, I'd like to remind everybody that last call for the Atom  
Format went out.  Operationally, what this means is that the  
secretary and co-chairs are going to be increasingly reluctant to  
revisit things simply because somebody wants to bring them up  
again.  What that means is that in order to successfully bring up  
an issue, you need to do a little homework.  Demonstrate that you  
have revisited the previous discussion, and that you either have  
something new to add, or can point out some evidence that the  
previous consensus call was made in error.

Tim has taken the opportunity to lead by example on this one with  
PaceAllowDuplicateIDs.  The secretary and co-chairs all are in  
agreement that the XhtmlDiv related paces don't meet these
criteria.  If anyone disagrees, what we would like to ask is that  
you follow Tim's lead.

Because we are in last call, I've scheduled everything related to  
the Format document.  As one of the status paces touches on the  
format, I've scheduled all three.  All we need to resolve now is  
the extent to which this is going to affect the format document.

I believe that PaceBriefExample is truly editorial, meaning that  
the editors can act on this at their discretion.

- Sam Ruby
[1] http://www.imc.org/atom-syntax/mail-archive/msg13691.html

Re: PaceTextShouldBeProvided

2005-05-05 Thread Sam Ruby

Antone Roundy wrote:
+1 except for one thing: In section 4.1.2, I'd suggest something along 
these lines:

atom:entry elements which do not contain an atom:content element, or 
whose atom:content element's type attribute indicates a MIME media type, 
SHOULD contain an atom:summary element.
Incorporated.  Thanks!
- Sam Ruby

Re: PaceAllowDuplicateIDs

2005-05-05 Thread David M Johnson

I'm -1 on PaceAllowDuplicateIDs
Reasons:
1) We're supposed to be standardizing current practice not inventing 
new things. Current best practice is to have unique IDs and current 
software (e.g. Javablogs.com) is predicated on this practice. I know, 
this practice is not followed widely enough, but that is another 
matter.

2) I think it is *much* more useful to think of an Atom Entry as an 
event that occurred at a specific time. Typically, an event is the 
publication of an article or blog entry on the web. For example:

   event: CNET published article
   subject: CNET
   object: article
But it could also represent other events.
   event: delivery van delivers package
   subject: delivery van
   object: package
   event: alarm system sends warning
   subject: alarm system
   object: warning
   event: server sends load warning
   subject: server
   object: load warning
If you think of Atom Entries as events, then it makes sense to consider 
the Atom Entry ID to be the ID of the event, not the ID of the subject 
or object of the event. Events are unique (you can't have more than one 
version of an event) and can be assigned GUIDs and therefore you cannot 
have more than one entry with the same ID.

In the case of earthquake data, each new data report is a new event.
   event: agency reports earthquake data
   subject: agency
   object: earthquake data
The ID is the ID of the "data reported" event not the ID of the 
earthquake.

We don't know what subjects and objects people are going to use in the 
future, so we can't specify Atom elements or IDs for subjects and 
objects -- that's what extensions are for. If you want to create a feed 
to syndicate information about earthquakes, then you introduce an 
extension for uniquely identifying earthquakes. The same goes for 
earthquakes.

- Dave

On May 5, 2005, at 12:02 AM, Tim Bray wrote:

http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
This Pace was motivated by a talk I had with Bob Wyman today about the 
problems the synthofeed-generator community has.

Summary:
1. There are multiple plausible use-cases for feeds with duplicate IDs
2. Pro and Contra
3. Alternate Paces
4. Details about this Pace
1. Use-Cases
Here's a stream of stock-market quotes.
My Portfolio
 
 MSFT
  2005-05-03T10:00:00-05:00
  Bid: 25.20 Ask: 25.50 Last: 25.20
  
 MSFT
  2005-05-03T11:00:00-05:00
  Bid: 25.15 Ask: 25.25 Last: 25.20
  
 MSFT
  2005-05-03T12:00:00-05:00
  Bid: 25.10 Ask: 25.15 Last: 25.10
  

You could also imagine a stream of weather readings.  Bob's actual 
here-and-now today use-case from PubSub is earthquakes, an entry 
describes an earthquake and they keep re-issuing it as new info about 
strength/location comes in.

Some people only care about the most recent version of the entry, 
others might want to see all of them.  Basically, each atom:entry 
element describes the same Entry, only at a different point in time.

You could argue that in some cases, these are representations of the 
Web resources identified by the atom:id URI, but I don't think we need 
to say that explicitly.

Yes, you could think of alternate ways of representing stock quotes or 
any of the other use-cases but this is simple and direct and 
idiomatic.

2. Pro and Contra
Given that I issued the consensus call rejecting the last attempt to 
do this, which was  PaceRepeatIdInDocument, I felt nervous about 
revisiting the issue.  So I went and reviewed the discussion around 
that one, which I extracted and placed at 
http://www.tbray.org/tmp/RepeatID.txt for the WG's convenience.

Reviewing that discussion, I'm actually not impressed.  There were a 
few -1's but very few actual technical arguments about why this 
shouldn't be done.  The most common was "Software will screw this up". 
 On reflection, I don't believe that.  You have a bunch of Entries, 
some of them have the same ID and are distinguished by datestamp.  
Some software will show the latest, some will show all of them, the 
good software will allow switching back and forth.  Doesn't seem like 
rocket science to me.
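
As an illustration of the behaviour described above (show the latest, show all, switch between the two), here is a small sketch. It assumes entries are dictionaries carrying an id and an ISO 8601 datestamp; the id value is invented for the example, since the ids in the stock-quote feed are not shown.

```python
from collections import defaultdict
from datetime import datetime

def group_by_id(entries):
    """Group entries by id, each group sorted oldest to newest by datestamp."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["id"]].append(entry)
    for versions in groups.values():
        versions.sort(key=lambda e: datetime.fromisoformat(e["updated"]))
    return dict(groups)

def view(entries, show_all):
    """Return every version, or only the newest version of each id."""
    groups = group_by_id(entries)
    if show_all:
        return [e for versions in groups.values() for e in versions]
    return [versions[-1] for versions in groups.values()]

# The quotes from the example feed, deliberately out of order.
MSFT = "tag:example.com,2005:quote/MSFT"  # hypothetical id
quotes = [
    {"id": MSFT, "updated": "2005-05-03T11:00:00-05:00",
     "content": "Bid: 25.15 Ask: 25.25 Last: 25.20"},
    {"id": MSFT, "updated": "2005-05-03T12:00:00-05:00",
     "content": "Bid: 25.10 Ask: 25.15 Last: 25.10"},
    {"id": MSFT, "updated": "2005-05-03T10:00:00-05:00",
     "content": "Bid: 25.20 Ask: 25.50 Last: 25.20"},
]
```

Calling view(quotes, show_all=False) keeps only the 12:00 quote, while show_all=True returns all three in time order. This sketch trusts the datestamp to order versions, which is exactly the assumption questioned elsewhere in the thread.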

So here's how I see it: there are plausible use cases for doing this, 
and one of the leading really large-scale implementors in the space 
(PubSub) wants to do this right now.  Bob's been making strong claims 
about not being able to use Atom if this restriction remains in place.

I believe strongly that if there's something that implementors want to 
do, standards shouldn't get in the way unless there's real 
interoperability damage.  I'm certainly prepared to believe that this 
could cause interoperability damage, but to date I haven't seen any 
convincing arguments that it will.  I think that if we nonetheless 
forbid it, people who want to do this will (a) use RSS instead of 
Atom, (b) cook up horrible kludges, or (c) ignore us and just do it.

So my best estimate is that the cost of allowing dupes is probably 
much lower than the cost of forbidding them.

Finally, our charter does say that we're al

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 16:38, Graham wrote:
On 5 May 2005, at 3:32 pm, Henry Story wrote:
As I explained in my lengthy reply to your lengthy post, I think  
one should be able to do either.
Each way has its advantages and disadvantages. Let the publisher  
decide which mechanism to use.
Well please flag it so that I can provide a consistent user  
interface to people's whims?
What is the problem with the user interface that you have exactly? I  
have pointed you to
BlogEd that keeps a history of all the changes to an entry. Try it out:
http://blogs.sun.com/roller/page/bblfish/
It's open source, so you can also copy the code.

If you don't want to keep a history of the entries all you need to do  
is drop all but the
latest entry with the same id. There is nothing more to it. Just show  
the user the last
one you came across.
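
A minimal sketch of "drop all but the latest entry with the same id", read literally as "keep the last one you came across". It relies on arrival order only and ignores any datestamp; the dictionary layout and ids are assumptions for illustration.

```python
def latest_seen(entries):
    """Keep, for each id, only the entry that arrived last."""
    latest = {}
    for entry in entries:
        latest[entry["id"]] = entry  # a later arrival replaces an earlier one
    return list(latest.values())

# Hypothetical arrival sequence: entry "a" is re-issued after "b" arrives.
arrivals = [
    {"id": "a", "body": "first draft"},
    {"id": "b", "body": "other post"},
    {"id": "a", "body": "corrected"},
]
shown = latest_seen(arrivals)
```

An aggregator that wants history would keep the replaced entries instead of discarding them.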

Since it does not cause any interoperability issues, what's the  
problem?
I have to come up with a new way to recognise and interpret such  
feeds where an entry (as defined by its id) isn't an entry but a  
feed of different entries.
No you don't. Just drop the old ones, if you don't care about the  
history. Really simple.
As Tim Bray's text says

 [[
   software MAY choose to display all of them or some subset of them
 ]]
So just drop the older versions.
I don't think that one would be using ids as a category system. If  
you go to
 you get today's front page. Tomorrow you get
tomorrow's front page.
What's the problem? Is  a hidden category system?
Charter: "Atom defines a feed format for representing resources  
such as Weblogs, online journals, Wikis,
and similar content"
yes, and it must also allow the representation of
[[
   * a complete archive of all entries in a feed
]]
This proposal permits this, and it does not harm anyone else.
Atom is not a replacement for HTTP. Google.com is a web page, not  
"similar content". It's not relevant here.
I don't know where you get the idea that I said atom is a replacement  
for HTTP. Take a breath
perhaps and relax before you answer.

Graham

PaceOriginalAttribute

2005-05-05 Thread Antone Roundy

-1.
I don't see that this solves any problem.  It may help people avoid 
accidentally generating invalid feeds (if we stick to not allowing 
duplication of atom:id within a feed), but it does it by simply 
shunting the issue off into a different element which doesn't have 
duplication constraints.  It doesn't address the DOS problem--neither 
accidental nor intentional.  And it doesn't make it any easier to 
determine whether or not entries in different feeds with the same 
atom:id are really the same entry or not.  In fact, it just complicates 
the task by requiring the inspection of two elements instead of one.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 6/5/05 12:45 AM, "Graham" <[EMAIL PROTECTED]> wrote:

> Have a look here:
> http://en.wikipedia.org/w/index.php?title=Main_Page&action=history
> 
> There you have a reverse chrono list, each with an author, date, and
> summary. Looks an awful lot like each one is an entry to me.

and looks to me like a stream of meta-data concerning the one entry.

and not distinct and separable entries like you'd find in your every day
blog.

Henry has the right idea -- the spec should allow both kinds, rather than
trying to shoe-horn everything into the one viewpoint of what is an entry.

e.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 6/5/05 12:32 AM, "Henry Story" <[EMAIL PROTECTED]> wrote:

> Sorry I don't understand why we need atom:modified.

Graham suggested it : a reliable way for an aggregator to discern the latest
version of an entry.

> atom:updated is used by the publisher to show what they consider a
> significant change. The user, on the other hand, wants to see the
> latest version, reliably, even if the publisher disagrees that the
> change was significant. This is the core problem with Tim's proposal.
> There is no way to create an aggregator that works in the way the
> user expects.

... no way, that is, unless we have atom:modified making each same-id entry
distinct, and not just distinct but also time-ordered and time-distanced (an
advantage over just using something similar to the NewsML RevisionID
mechanism).

e.

PaceAlternateLinkWeakening

2005-05-05 Thread Henry Story

I have put PaceAlternateLinkWeakening on the wiki, though it was
discussed on this
list, as it might not have caught the eye of the editors/secretary.

http://www.intertwingly.net/wiki/pie/PaceAlternateLinkWeakening
I think this is a very uncontroversial clarification.
Henry Story

PaceRecommendPlainTextContent

2005-05-05 Thread Antone Roundy

-1.
This is entirely up to the publisher.  I think enough publishers are 
going to want things like links and line/paragraph 
breaks in their content that this recommendation would be so routinely 
ignored as to be meaningless.

PaceOptionalSummary

2005-05-05 Thread Antone Roundy

+1.
...oh, and the wording I just suggested for part of 
PaceTextShouldBeProvided would depend on this also being accepted.

PaceTextShouldBeProvided

2005-05-05 Thread Antone Roundy

+1 except for one thing: In section 4.1.2, I'd suggest something along 
these lines:

atom:entry elements which do not contain an atom:content element, or 
whose atom:content element's type attribute indicates a MIME media 
type, SHOULD contain an atom:summary element.
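
The suggested wording can be read as a simple check. The sketch below is one possible reading, not text from any draft. Assumptions: the namespace URI is the one Atom was eventually published with (drafts at the time used a different one), a missing type attribute is treated as "text", and any type other than "text", "html" or "xhtml" is taken to be a MIME media type.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"   # assumption, see note above
TEXT_TYPES = {"text", "html", "xhtml"}    # assumption: everything else is a MIME type

def should_have_summary(entry):
    """True if the suggested rule says this entry SHOULD carry atom:summary."""
    content = entry.find(f"{{{ATOM_NS}}}content")
    if content is None:
        return True
    return content.get("type", "text") not in TEXT_TYPES

def missing_recommended_summary(entry):
    """True if a summary is recommended but absent."""
    has_summary = entry.find(f"{{{ATOM_NS}}}summary") is not None
    return should_have_summary(entry) and not has_summary

# Hypothetical entries.
no_content = ET.fromstring(
    f'<entry xmlns="{ATOM_NS}"><title>t</title></entry>')
image_with_summary = ET.fromstring(
    f'<entry xmlns="{ATOM_NS}"><content type="image/png" src="p.png"/>'
    f'<summary>A picture</summary></entry>')
html_content = ET.fromstring(
    f'<entry xmlns="{ATOM_NS}"><content type="html">&lt;p&gt;hi&lt;/p&gt;</content></entry>')
```

Because the rule is a SHOULD, a checker like this would warn rather than reject.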

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham


On 5 May 2005, at 3:23 pm, Eric Scheid wrote:
And what about the use case of a wiki's RecentChanges log? Each  
entry refers
to a specific page, and there may be multiple such entries for  
each  page as
it gets rapidly edited ... and wiki folks have found it important  
to be able
to monitor all change events.


Each log entry is an entry in itself, with its own id.
Sorry, that makes as much sense as changing the id for a blog entry  
if that
blog entry is updated.
Have a look here:
http://en.wikipedia.org/w/index.php?title=Main_Page&action=history
There you have a reverse chrono list, each with an author, date, and  
summary. Looks an awful lot like each one is an entry to me.

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Antone Roundy

If we accept this Pace, are we going to do anything to address the DOS 
issue for aggregated feeds?

Re: PaceCaching

2005-05-05 Thread James Aylett


On Thu, May 05, 2005 at 09:26:46AM -0500, Mark Pilgrim wrote:

> > seriously expect it to be interpreted as a promise that the feed
> > won't change for the next x minutes?
> 
> No, but I do seriously expect it to be interpreted that the feed
> publisher does not wish clients to check it for the next x minutes.

Indeed - Expires doesn't say it won't change, it says that you
shouldn't care whether it's changed.
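
A sketch of how a client might act on that reading of Expires: until the stated time it does not check again, whether or not the feed has changed. The function name and the fallback of refetching when the header is absent or unparseable are choices made for this example, not anything HTTP mandates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def should_refetch(expires_header, now=None):
    """Decide whether to fetch the feed again, given its last Expires value."""
    now = now or datetime.now(timezone.utc)
    if not expires_header:
        return True  # no hint from the publisher (example policy)
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return True  # unparseable date, treated as already expired
    if expires.tzinfo is None:
        # A naive result is treated as UTC so the comparison is valid.
        expires = expires.replace(tzinfo=timezone.utc)
    return now >= expires

# Hypothetical check times around an expiry of 15:00 GMT.
header = "Thu, 05 May 2005 15:00:00 GMT"
before = datetime(2005, 5, 5, 14, 0, tzinfo=timezone.utc)
after = datetime(2005, 5, 5, 16, 0, tzinfo=timezone.utc)
```

With these values, should_refetch(header, before) is False and should_refetch(header, after) is True.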

James

-- 
/--\
  James Aylett  xapian.org
  [EMAIL PROTECTED]   uncertaintydivision.org

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Sam Ruby

Walter Underwood wrote:
--On May 5, 2005 7:17:00 AM -0400 Sam Ruby <[EMAIL PROTECTED]> wrote:
Demonstrate that you have revisited the previous discussion, and that you either
have something new to add, or can point out some evidence that the previous
consensus call was made in error.
PaceCaching was not discussed and rejected based on false information.
It was rejected because it was HTTP-specific (it is not), and because
it was non-core (similar features are common in other RSS specs).
Actually, it never has been rejected.  I had miscategorized it as 
protocol.  I'm fixing that now, and have scheduled it for this cycle.

Sorry for the confusion.
- Sam Ruby

Re: status paces

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 07:13  AM, Robert Sayre wrote:
Throw them all back. No reason we can't do this in an extension
element.
+1

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham


On 5 May 2005, at 3:32 pm, Henry Story wrote:
As I explained in my lengthy reply to your lengthy post, I think  
one should be able to do either.
Each way has its advantages and disadvantages. Let the publisher  
decide which mechanism to use.
Well please flag it so that I can provide a consistent user interface  
to people's whims?

Since it does not cause any interoperability issues, what's the  
problem?
I have to come up with a new way to recognise and interpret such  
feeds where an entry (as defined by its id) isn't an entry but a feed  
of different entries.

I don't think that one would be using ids as a category system. If  
you go to
 you get today's front page. Tomorrow you get
tomorrow's front page.
What's the problem? Is  a hidden category system?
Charter: "Atom defines a feed format for representing resources such  
as Weblogs, online journals, Wikis,
and similar content"

Atom is not a replacement for HTTP. Google.com is a web page, not  
"similar content". It's not relevant here.

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Robert Sayre

On 5/5/05, Eric Scheid <[EMAIL PROTECTED]> wrote:
> 
> On 5/5/05 11:55 PM, "Graham" <[EMAIL PROTECTED]> wrote:
> >>
> > Each log entry is an entry in itself, with its own id.
> 
> Sorry, that makes as much sense as changing the id for a blog entry if that
> blog entry is updated.

Graham's got it exactly right. 

> The functional parallel is wiki-page = blog-entry, and if a blog-entry is
> updated then that is reflected in the feed as an updated entry - with the
> same id.

That's right, that is the functional parallel. No software I know of 
shows both revisions of the entry in the feed when it's updated. If
you are syndicating wiki changes, part of each entry is the diff and
revision id--each revision is a unique thing.

Another analogous use case would be a feed watching a certain file in
CVS. Every entry would be about the same file, but each would have its
own atom:id.

Once again, there remains a downstream problem for PubSub, etc. 

Robert Sayre
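
A minimal sketch of the per-revision approach described above, assuming a hypothetical id scheme in which the revision number is appended to the page URL as a fragment. Neither the scheme nor the helper names come from the thread or the Atom draft; they only illustrate that every change event (wiki edit, CVS commit) can carry its own atom:id, so no id repeats within the feed.

```python
from collections import Counter


def revision_entry_id(page_url, revision):
    """Build a distinct entry id for one revision of a page.

    Hypothetical scheme: the revision number rides in the fragment.
    """
    return f"{page_url}#rev-{revision}"


def duplicate_ids(entry_ids):
    """Return the ids that occur more than once, sorted."""
    counts = Counter(entry_ids)
    return sorted(entry_id for entry_id, n in counts.items() if n > 1)


# Three edits to the same (made-up) wiki page yield three distinct ids.
ids = [revision_entry_id("http://wiki.example.org/FrontPage", r) for r in (41, 42, 43)]
print(duplicate_ids(ids))
```

With one id per revision the duplicate check comes back empty, while reusing the page URL alone as the id for each edit would trip it.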

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 15:55, Graham wrote:
> On 5 May 2005, at 2:26 pm, Eric Scheid wrote:
>
>> perhaps we needed atom:modified after all :-(
>
> Yes we do, if we want to go down this route. I suggest appending
> the current time (or for old versions, the last time that version
> was current) at the source.

Sorry I don't understand why we need atom:modified.

>> And what about the use case of a wiki's RecentChanges log? Each
>> entry refers to a specific page, and there may be multiple such
>> entries for each page as it gets rapidly edited ... and wiki folks
>> have found it important to be able to monitor all change events.
>
> Each log entry is an entry in itself, with its own id. That seems a
> far better functional parallel to the basic blog feed.

As I explained in my lengthy reply to your lengthy post, I think one
should be able to do either.
Each way has its advantages and disadvantages. Let the publisher
decide which mechanism to use.

> As with the share price example, the topic of the entry (the
> company, or the wiki page) is far more analogous to a category that
> the entry belongs to, than to its identity.

Again let the publisher choose what the identity criteria of his
objects are. Some will stick, some will not. But it is not up to us
to decide for our users.

Since it does not cause any interoperability issues, what's the problem?

> Everyone stop trying to use ids as a category system.

I don't think that one would be using ids as a category system. If
you go to
 you get today's front page. Tomorrow you get
tomorrow's front page.
What's the problem? Is  a hidden category system?

Henry Story
http://bblfish.net/blog/

Re: Atom feed refresh rates

2005-05-05 Thread James Aylett

On Thu, May 05, 2005 at 08:07:15AM -0500, Mark Pilgrim wrote:

> Not to be flippant, but we have one that's widely available.  It's
> called the Expires header.  I spoke with Roy Fielding at Apachecon
> 2003 and asked him this exact question: "If I set an Expires header on
> a feed of now + 3 hours, does that mean that I don't want the client
> to fetch the feed again for at least 3 hours?"  And he said yes,
> that's exactly what it means.

I think the problem here may be that the HTTP/1.1 spec gives the
impression that the Expires header is not designed to affect end
clients (user agents).

For instance, from 13.2.1 ("Server-Specified Expiration"), we get the
sentence:

"The expiration mechanism applies only to responses taken from a cache
and not to first-hand responses forwarded immediately to the
requesting client."

Now many clients themselves contain caches, but this distinction may
still be the source of some confusion, especially as the number of
people who know about the distinction (by having written a user agent)
compared to the number who are affected by it (by writing server
components) is pretty small.

James

-- 
/--\
  James Aylett  xapian.org
  [EMAIL PROTECTED]   uncertaintydivision.org
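
A rough sketch of the reading Mark Pilgrim reports above: a polite client holds off re-fetching a feed until the time in the Expires header has passed. The function name and sample dates are illustrative assumptions. Treating a missing header as "no restriction" is a simplification made here, not something the thread prescribes; treating an unparseable value as already expired follows the HTTP/1.1 rule for invalid dates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def may_refetch(expires_header, now):
    """Return True if the client may fetch the feed again at `now`.

    `expires_header` is the raw Expires value from the last response
    (or None); `now` must be a timezone-aware datetime.
    """
    if not expires_header:
        return True  # assumption: no header means no stated restriction
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return True  # invalid dates such as "0" count as already expired
    if expires.tzinfo is None:
        # HTTP dates are GMT; make a naive result comparable with `now`.
        expires = expires.replace(tzinfo=timezone.utc)
    return now >= expires


header = "Thu, 05 May 2005 16:07:15 GMT"  # "now + 3 hours" in the example above
early = datetime(2005, 5, 5, 14, 0, 0, tzinfo=timezone.utc)
late = datetime(2005, 5, 5, 17, 0, 0, tzinfo=timezone.utc)
print(may_refetch(header, early), may_refetch(header, late))
```

Whether a user agent's own cache is bound by this is exactly the ambiguity in section 13.2.1 discussed above; the sketch simply takes the "yes" interpretation.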

Re: PaceCaching

2005-05-05 Thread Mark Pilgrim

On 5/5/05, Graham <[EMAIL PROTECTED]> wrote:
> seriously expect it to be interpreted as a promise that the feed
> won't change for the next x minutes?

No, but I do seriously expect it to be interpreted that the feed
publisher does not wish clients to check it for the next x minutes.

-- 
Cheers,
-Mark

Re: PaceCaching

2005-05-05 Thread Graham

On 5 May 2005, at 3:02 pm, Walter Underwood wrote:
> PaceCaching was not discussed and rejected based on false information.
> It was rejected because it was HTTP-specific (it is not), and because
> it was non-core (similar features are common in other RSS specs).
> It does not interact with other features, so it should be a fairly
> clean, quick discussion.

Unless we can make providing incorrect or misleading information in
either of these elements lead to the immediate purging of first born,
they're not at all useful to anyone. The "expires date" can't apply
to 99% of feeds since they don't work on a fixed or predictable
schedule. Meanwhile "max-age" doesn't provide any actionable
information to caches, beyond "I chose this random number off the top
of my head when I wrote my Atom script 3 years ago. Deal." Do you
seriously expect it to be interpreted as a promise that the feed
won't change for the next x minutes?

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 5/5/05 11:55 PM, "Graham" <[EMAIL PROTECTED]> wrote:

>> And what about the use case of a wiki's RecentChanges log? Each entry refers
>> to a specific page, and there may be multiple such entries for each  page as
>> it gets rapidly edited ... and wiki folks have found it important to be able
>> to monitor all change events.
>> 
> Each log entry is an entry in itself, with its own id.

Sorry, that makes as much sense as changing the id for a blog entry if that
blog entry is updated.

The functional parallel is wiki-page = blog-entry, and if a blog-entry is
updated then that is reflected in the feed as an updated entry - with the
same id.

e.

Re: Atom feed refresh rates

2005-05-05 Thread Mark Pilgrim

On 5/5/05, Walter Underwood <[EMAIL PROTECTED]> wrote:
> You need the information outside of HTTP. To quote from the RSS spec
> for ttl:
> 
>   This makes it possible for RSS sources to be managed by a file-sharing
>   network such as Gnutella.

Ignoring, for the moment, that this is a horrible idea and no one
supports it, Gnutella has its own caching and time-to-live mechanisms
that the RSS spec is ignoring.

-- 
Cheers,
-Mark

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Mark Pilgrim


On 5/5/05, Walter Underwood <[EMAIL PROTECTED]> wrote:
> It does not interact with other features, so it should be a fairly
> clean, quick discussion.

You must be new here.

/ducks

-- 
Cheers,
-Mark

Re: Autodiscovery, real-world examples

2005-05-05 Thread Robert Sayre

On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
> 
> On Thursday, May 5, 2005, at 01:27  AM, fantasai wrote:
> > front page. The last few headlines for each feed are listed on
> > the front page, and the designer felt it was appropriate for
> > autodiscovery to work on this page -- but it is not appropriate
> > for rel="alternate" to be used for those autodiscovery links.
> > They are not alternate representations of the front page.

I can see your point, but the autodiscovery spec isn't standardizing
all usages. Just the one we have a good grasp on. Secondly, there's
nothing stopping UAs from "discovering" other feeds. Safari 2.0
already does this.

Robert Sayre

Re: Last Call: 'The Atom Syndication Format' to Proposed Standard

2005-05-05 Thread A. Pagaltzis


* Thomas Broyer <[EMAIL PROTECTED]> [2005-05-03 19:35]:
> This means type="text" content is a single paragraph of text.
> If you need paragraphs, lists or any other "structural
> formatting", you have to use type="html" or type="xhtml" with
> the appropriate content.

Or type="text/plain", I'd assume?

Regards,
-- 
Aristotle

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Walter Underwood

--On May 5, 2005 7:17:00 AM -0400 Sam Ruby <[EMAIL PROTECTED]> wrote:
>
> Demonstrate that you have revisited the previous discussion, and that you
> either have something new to add, or can point out some evidence that the
> previous consensus call was made in error.

PaceCaching was not discussed and rejected based on false information.
It was rejected because it was HTTP-specific (it is not), and because
it was non-core (similar features are common in other RSS specs).

It does not interact with other features, so it should be a fairly
clean, quick discussion.

wunder
--
Walter Underwood
Principal Architect, Verity

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham


On 5 May 2005, at 2:26 pm, Eric Scheid wrote:
> perhaps we needed atom:modified after all :-(

Yes we do, if we want to go down this route. I suggest appending the
current time (or for old versions, the last time that version was
current) at the source.

> And what about the use case of a wiki's RecentChanges log? Each
> entry refers to a specific page, and there may be multiple such
> entries for each page as it gets rapidly edited ... and wiki folks
> have found it important to be able to monitor all change events.

Each log entry is an entry in itself, with its own id. That seems a
far better functional parallel to the basic blog feed. As with the
share price example, the topic of the entry (the company, or the wiki
page) is far more analogous to a category that the entry belongs to,
than to its identity. Everyone stop trying to use ids as a
category system.

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Robert Sayre

On 5/5/05, Eric Scheid <[EMAIL PROTECTED]> wrote:

> > Because feeds are feeds and archives are archives? They have
> > different audiences and different uses and different requirements.
> 
> And what about the use case of a wiki's RecentChanges log? Each entry refers
> to a specific page, and there may be multiple such entries for each page as
> it gets rapidly edited ... and wiki folks have found it important to be able
> to monitor all change events.

I'm much more sympathetic to the aggregate feed problem of multiple
IDs. People advocating this type of thing seem to think the default
action should be grouping, so they want to use the same ID. I think
that's a bad idea, and there are plenty of other ways to indicate the
fundamental sameness of entries. For example, NewsML URNs have a
NewsItemID and a RevisionID, which would allow smart aggregators to
group the entries without violating Atom's constraint.

Robert Sayre
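
A sketch of how an aggregator might group revisions without any two entries sharing an id, along the lines suggested above. It assumes identifiers laid out as urn:newsml:provider:date:item:revision; that layout is an assumption made for illustration and should be checked against the NewsML documentation, and the function names and sample ids are hypothetical.

```python
from collections import defaultdict


def split_newsml_urn(urn):
    """Split an id into (item key, revision) under the assumed layout."""
    parts = urn.split(":")
    if len(parts) != 6 or parts[0] != "urn" or parts[1] != "newsml":
        raise ValueError(f"not in the assumed NewsML-style layout: {urn!r}")
    provider, date_id, item_id, revision_id = parts[2:]
    return (provider, date_id, item_id), revision_id


def group_revisions(entry_ids):
    """Map each item key to the list of revision ids seen for it."""
    groups = defaultdict(list)
    for entry_id in entry_ids:
        key, revision = split_newsml_urn(entry_id)
        groups[key].append(revision)
    return dict(groups)


ids = [
    "urn:newsml:example.com:20050505:story-1:1",
    "urn:newsml:example.com:20050505:story-1:2",
    "urn:newsml:example.com:20050505:story-2:1",
]
for key, revisions in group_revisions(ids).items():
    print(key, revisions)
```

Every id stays unique, so the feed honours the one-id-per-entry constraint, while the shared item portion still lets a client collapse revisions if it chooses to.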

RE: Atom feed refresh rates

2005-05-05 Thread Andy Henderson


>>>You seem to want the ttl element so that you have the publisher's
permission to check less often. Why not just do so anyway if it causes so
many problems? If that degrades the user experience too much, you're free to
check more often. How is the ttl element useful to you?<<<

I allow anyone to specify any refresh interval higher than the greater of
ttl or 60 minutes.

The ttl allows me to extend the minimum refresh interval beyond 60 minutes.
'MSDN just published' at http://msdn.microsoft.com/rss.xml includes
<ttl>1440</ttl>.  I therefore set the refresh interval to 1 day when the
feed is added and I do not allow people to specify a lower refresh interval.

If the ttl tag simply described the minimum refresh interval, I would also
use it to allow people to specify refresh intervals less than 60 minutes
knowing that was acceptable to the feed provider.  Unfortunately, the
genesis of the ttl tag means that lower ttl values are unreliable.  The BBC,
for example, specifies a ttl of 5 which I'm sure refers to that tag's
original use, not a minimum refresh interval.

Andy
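
The rule Andy describes can be sketched as a simple clamp: the shortest interval a user may choose is the greater of the feed's ttl and 60 minutes. The function and constant names are illustrative assumptions, not taken from his software.

```python
FLOOR_MINUTES = 60  # the reader's own lower bound, per the message above


def minimum_interval(ttl_minutes=None):
    """Smallest refresh interval allowed for a feed, in minutes."""
    if ttl_minutes is None:
        return FLOOR_MINUTES
    return max(ttl_minutes, FLOOR_MINUTES)


def effective_interval(requested_minutes, ttl_minutes=None):
    """Clamp the user's requested interval to the allowed minimum."""
    return max(requested_minutes, minimum_interval(ttl_minutes))


print(effective_interval(30, 1440))  # feed with ttl 1440: held to one day
print(effective_interval(30, 5))     # feed with ttl 5: the 60-minute floor applies
```

Because low ttl values are treated as unreliable, the ttl can only raise the minimum above 60 minutes and never lower it.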
