Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Henry Story
I think I can prove that the two versions are perfectly compatible and 
orthogonal. I can prove that logically there is no inconsistency, and
I have some empirical backing that this is feasible. But I am not alone. Bob
Wyman, I believe, has a lot more empirical support.

You on the other hand, as usual I notice, have absolutely no argument to
defend your case.
Henry Story
On 18 Feb 2005, at 23:55, Graham wrote:
Allowing more than one version of the same entry in a syndication feed 
is unacceptable in itself, which is fundamentally incompatible with 
archive feeds, no matter what the conceptual definition of id is.

Graham



Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Henry Story

On 18 Feb 2005, at 23:55, Graham wrote:
Allowing more than one version of the same entry in a syndication 
feed is unacceptable in itself, which is fundamentally incompatible 
with archive feeds, no matter what the conceptual definition of id 
is.

Graham
Let me make my point even clearer. If something is fundamentally 
incompatible,
then it should be *dead-easy* to prove or reveal this incompatibility.

So develop your thought a little, and you can only come out the winner.
Henry


Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Graham
On 19 Feb 2005, at 11:23 am, Henry Story wrote:
Let me make my point even clearer. If something is fundamentally 
incompatible,
then it should be *dead-easy* to prove or reveal this incompatibility.
i) Syndication documents shouldn't ever contain multiple versions of 
the same entry*.
ii) Archive documents apparently need to be able to contain multiple 
versions of the same entry.

* for the simple reason that it makes them an order of magnitude harder 
to process and display correctly (and often impossible to display 
correctly, since it won't always be clear which is the latest version).

Your wittering on about conceptual models doesn't make you better than 
us.

Graham


Re: Consensus call on last round of Paces

2005-02-19 Thread Danny Ayers

Hmm, I've been a little distracted, but I thought
PaceExtensionConstruct did get a reasonable amount of support.
+1 from me anyway.



On Tue, 15 Feb 2005 11:12:48 -0800, Tim Bray [EMAIL PROTECTED] wrote:
 
 Methodology: Paul & I went through *all* the WG emails that directly
 commented on the currently open issues (see
 http://www.intertwingly.net/wiki/pie/AtomPubIssuesList); in most cases
 the calls were pretty clear.  As always, we may have mis-read the
 group, feel free to say so if you think so.
 
 The intent is that this email serve as guidance for the editors in
 preparing the format draft that we send out for IETF last call.  We do
 not expect further material discussion of format-related Paces, it's
 now over to the whole IETF.  On the other hand, discussion of editorial
 changes is always fair game, please keep reporting what you find to the
 list, and we wouldn't be surprised if the editors turned up some corner
 cases too.
 
 PaceAggregationDocument & PaceAggregationDocument2
 One -1, nobody unambiguously in favor.
 DISPOSITION: Close them.
 
 PaceAggregationInSeparateSpec
 Only a couple of voices in favor, some support conditional on profiles.
 DISPOSITION: Close it, but given that the other aggregation-related
 Paces seem to be failing, it seems like a separate spec is the only
 place that this kind of work gets done.
 
 PaceArchiveDocument
 A howling mob of sharp pointy -1's.
 DISPOSITION: Close it.
 
 PaceClarifyDateUpdated
 A couple of -1's, one fuzzy +1.
 DISPOSITION: Close it.
 
 PaceCollection
 One pro, one contra, not much discussion.
 DISPOSITION: Not enough support, close it.
 
 PaceCommentFeeds
 Two contra, one pro, some suggestion that we really need
 link/@rel=comment.
 DISPOSITION: Not enough support, close it.
 
 PaceDatesXSD
 Everyone seems to like it, enough people want the regex that it's
 accepted too.
 DISPOSITION: Accepted, Sayre suggested good wording calling in all the
 specs that are covered.
 
 PaceEntriesElement
 Drowning in -1's.
 DISPOSITION: Close it.
 
 PaceEntryOrder
 One -1, but overwhelming support otherwise.
 DISPOSITION: Accepted.
 
 PaceExtensionConstruct
 One -1, 1.5 +1's.
 DISPOSITION: Not enough support, close it.
 
 PaceFeedRecursive
 Lots of -1's.
 DISPOSITION: Close it.
 
 PaceFeedState
 Lots of -1's.
 DISPOSITION: Close it.
 
 PaceNoFeedState
 A few +1's, nothing negative.
 DISPOSITION: Accepted.
 
 PaceFormatSecurity
 Evenly split +1's and -1's.
 DISPOSITION: No consensus and we have PaceSecuritySection, close it.
 
 PaceHeadless
 Lots of talk, more -1's than +1's.
 DISPOSITION: No consensus, close it.
 
 PaceIconAndImage
 One -1, but broad support otherwise.
 DISPOSITION: Accepted.
 
 PaceLangSpecific
 Not a lot of discussion, but pretty positive.
 DISPOSITION: Borderline, but accepted.
 
 PaceLinkEnclosure
 A little bit of support, but with reservations.
 DISPOSITION: A messy Pace and not enough support, close it.
 
 PaceLinkRelVia
 Moderate support, no objections.
 DISPOSITION: Borderline, but accepted.
 
 PaceMultipleImages
 Lots of -1's.
 DISPOSITION: Close it.
 
 PaceMustBeWellFormed
 Very little discussion, no unambiguous support.
 DISPOSITION: Our longest-lived Pace is finally closed.
 
 PaceOrderSpecAlphabetically
 A bit of support, not much talk.
 DISPOSITION: This is editorial, let the editors run with it.
 
 PaceProfile
 Changed along the way, quite a few +1's but even more -1's.  A certain
 amount of +1 on concept, -1 on syntax which doesn't help.
 DISPOSITION: No consensus, close it.
 
 PaceProfileAttribute
 No significant support.
 DISPOSITION: Close it.
 
 PaceRemoveInfoAndHost
 Not much discussion, but no opposition.
 DISPOSITION: Borderline, but accepted.
 
 PaceRemoveVersionAttribute
 Quite a bit of support, quite a bit of grumbling but no loud -1's.
 DISPOSITION: Borderline, but accepted.
 
 PaceRepeatIdInDocument
 Lots of discussion, more -1's than +1's.
 DISPOSITION: No consensus, close it.  But now we have a problem, in
 that this removed ambiguity in one direction, just closing it leaves
 the ambiguity.  So the only logical conclusion is that the WG is
 directing the editors to put language in that explicitly forbids
 entries with duplicate atom:id in an atom:feed.
 
 PaceSecuritySection
 More support than opposition, but some feeling that the IETF is going
 to make us do something like this anyhow.
 DISPOSITION: Borderline, but accepted.
 
 PaceTextRules
 Only one (positive) comment.
 DISPOSITION: Not enough support, close it.
 
 PaceXhtmlNamespaceDiv
 This is tough.  There are some people who are really against this and
 who aren't moving.  On the other hand, there are way more who are in
 favor.  Reviewing the discussion, the contras are saying that this is
 sloppy and unnecessary and if it solves a problem, that problem really
 shouldn't be there, but they don't seem to be saying it's actively
 harmful.  But the people in favor are arguing that this will reduce
 errors and improve interop.  Also, the Pace was changed in 

Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Roger B.

 i) Syndication documents shouldn't ever contain multiple versions of
 the same entry*.

Graham: +1.

 ii) Archive documents apparently need to be able to contain multiple
 versions of the same entry.

I don't even buy that much, personally.

--
Roger Benningfield



Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Eric Scheid

On 20/2/05 2:46 AM, Graham [EMAIL PROTECTED] wrote:

 i) Syndication documents shouldn't ever contain multiple versions of
 the same entry*.
 
 * for the simple reason that it makes them an order of magnitude harder
 to process and display correctly (and often impossible to display
 correctly, since it won't always be clear which is the latest version).

Think of a feed as a stream of entry instances (not hard to do), and process
accordingly. The same thing with a feed document. Whether you read from the
top of the document to the bottom, or vice versa, shouldn't matter - you can
identify the more recent entry by atom:updated. If two instances with the
same atom:id have the same atom:updated, then there is no significant
difference between the two, so go with a random choice (that's not hard
either) (and lobby for atom:modified while you're at it).
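
The stream-style processing described above could be sketched roughly as follows. This is only an illustration, not code from any feed reader discussed in the thread: the dict keys "id" and "updated" are hypothetical stand-ins for atom:id and atom:updated, timestamps are assumed to carry an explicit UTC offset, and a tie simply keeps the first instance seen rather than making a truly random choice.

```python
from datetime import datetime


def latest_instances(entries):
    """Keep one instance per id, preferring the later "updated" value.

    `entries` is any iterable of dicts with hypothetical keys "id" and
    "updated" (an ISO 8601 timestamp with an explicit offset).
    On a tie the instance seen first is kept.
    """
    newest = {}
    for entry in entries:
        stamp = datetime.fromisoformat(entry["updated"])
        kept = newest.get(entry["id"])
        if kept is None or stamp > kept[0]:
            newest[entry["id"]] = (stamp, entry)
    return [entry for _, entry in newest.values()]


entries = [
    {"id": "tag:example.org,2005:1", "updated": "2005-02-19T10:00:00+00:00", "title": "old"},
    {"id": "tag:example.org,2005:2", "updated": "2005-02-19T09:00:00+00:00", "title": "other"},
    {"id": "tag:example.org,2005:1", "updated": "2005-02-19T12:00:00+00:00", "title": "new"},
]
top_down = latest_instances(entries)
bottom_up = latest_instances(reversed(entries))
```

Under these assumptions both reading orders keep the same instances whenever the atom:updated values differ; only the tie case depends on order.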

For feed readers that already support entry persistence and entry
replacement when an entry is updated from one document to the next, why is
this an order of magnitude more difficult to do in the one document?

e.



Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Graham
On 19 Feb 2005, at 11:06 pm, Eric Scheid wrote:
If two instances with the same atom:id have the same atom:updated, 
then there is no significant difference between the two, so go with a 
random choice
*that the author considered significant*. If you've told the user 
they're getting the latest version, and they see something else, that 
doesn't fit my definition of working correctly. A paradigm where the 
instance in the feed is always the newest version works much much 
better.

For feed readers that already support entry persistence and entry
replacement when an entry is updated from one document to the next, 
why is
this an order of magnitude more difficult to do in the one document?
I was talking about feed readers that don't.
And even those that do, you now need to look for duplicates within the 
feed instead of just comparing the new set to the old set. ie Instead 
of removing duplicates that exist between set A and set B, I now also 
have to look within set A as well. You seem to have suggested earlier 
that entries be added to the store one by one. This is not possible in 
Shrook because of the various layers of idiot proofing.
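
To make the "set A and set B" point concrete, here is a rough sketch of the extra pass being described. It is not how Shrook or any other reader works; the function name, the dict keys "id" and "updated", and the assumption that timestamps share one fixed format (so string comparison orders them) are all hypothetical.

```python
def merge_batch(store, batch):
    """Merge a freshly fetched batch of entries into a persistent store.

    `store` maps id -> entry dict; `batch` is a list of entry dicts with
    hypothetical keys "id" and "updated". Timestamps are assumed to be
    strings in one fixed format, so plain comparison orders them.
    """
    # Extra pass, needed only if a single document may repeat an id:
    # collapse duplicates inside the batch (set A against itself).
    collapsed = {}
    for entry in batch:
        kept = collapsed.get(entry["id"])
        if kept is None or entry["updated"] > kept["updated"]:
            collapsed[entry["id"]] = entry

    # The pass a persisting reader already performs:
    # compare the new set (A) with the stored set (B).
    for entry_id, entry in collapsed.items():
        old = store.get(entry_id)
        if old is None or entry["updated"] >= old["updated"]:
            store[entry_id] = entry
    return store


store = {"a": {"id": "a", "updated": "2005-02-18T00:00:00Z", "v": 1}}
batch = [
    {"id": "a", "updated": "2005-02-19T00:00:00Z", "v": 2},
    {"id": "a", "updated": "2005-02-19T06:00:00Z", "v": 3},
    {"id": "b", "updated": "2005-02-19T01:00:00Z", "v": 1},
]
merge_batch(store, batch)
```

Whether that first pass is cheap or costly in practice depends on how a given reader is layered, which is exactly what is in dispute here.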

Graham


Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Henry Story

On 19 Feb 2005, at 16:46, Graham wrote:
On 19 Feb 2005, at 11:23 am, Henry Story wrote:
Let me make my point even clearer. If something is fundamentally 
incompatible,
then it should be *dead-easy* to prove or reveal this incompatibility.
i) Syndication documents shouldn't ever contain multiple versions of 
the same entry*.
ii) Archive documents apparently need to be able to contain multiple 
versions of the same entry.

* for the simple reason that it makes them an order of magnitude 
harder to process and display correctly (and often impossible to 
display correctly, since it won't always be clear which is the latest 
version).
I don't accept that it makes it an order of magnitude harder to process 
these
documents, or if it is an order of magnitude harder, it's an order of 
magnitude
larger than an infinitesimal amount, which is still an infinitesimal 
amount.
I am writing such a tool, so I think I have some grasp on the subject.

But accepting for the sake of argument that you are right, you need 
to compare the difficulty of writing a feed reader with the difficulty of 
writing a feed
itself. Not allowing duplicate versions of an entry in a feed just 
pushes the
complexity of writing the feed from the feed reader to the feed writer:
now the feed writer has to contain the logic to make sure that no 
duplicates
appear in the feed.  Instead of the feed writer just being able to 
paste the
new entry to the end of the feed, it has to parse the whole feed 
document
and make sure it contains no duplicates.
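
The trade-off between the two kinds of feed writer can be sketched as below. This is a simplified illustration only: a feed is modelled as a plain list of dicts, and the function names and the "id" and "rev" keys are hypothetical, not taken from any publishing tool.

```python
def append_entry(feed_entries, entry):
    """Writer that allows repeated ids: just add the new instance."""
    feed_entries.append(entry)
    return feed_entries


def replace_entry(feed_entries, entry):
    """Writer that forbids repeated ids: scan every existing entry and
    drop any earlier instance with the same id before adding the new one."""
    kept = [e for e in feed_entries if e["id"] != entry["id"]]
    kept.append(entry)
    return kept


feed = [{"id": "x", "rev": 1}, {"id": "y", "rev": 1}]
appended = append_entry(list(feed), {"id": "x", "rev": 2})
replaced = replace_entry(list(feed), {"id": "x", "rev": 2})
```

The first writer never needs to read what is already in the feed; the second has to inspect every existing entry on each update.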

Since I can see very good reasons to make life easier for the feed 
writer, in the
same way as one has tried to keep html simple for the common html 
writer, I
think your argument may in fact turn out to be a good supporting 
argument for
allowing multiple versions of an entry in the same feed document.

Your wittering on about conceptual models doesn't make you better than 
us.
I never pretended it does make me better.
I have been exploring tools such as rdf, as I believe that they can 
bring a
lot of clarity to debates such as this one. Just as engineers don't 
hesitate
to use mathematics to help them in their tasks, so I think using logical
analysis should help us here. I hope that as I understand these tools 
better
I will be able to explain the insights these disciplines bring in 
plainer
English.

In the mean time I have a lot of respect for Tim Berners-Lee, and
I try my best to understand the direction he is going in, the tools he
is developing and the insights these lead to.
Henry Story
http://bblfish.net/
Graham



Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Graham
On 20 Feb 2005, at 1:27 am, Eric Scheid wrote:
hmmm ... looking back in the archives I see you were opposed to
atom:modified, you couldn't see any use case where you would want the 
entry
instances to clearly indicate which is more recent. Hashes won't help 
you
here.
Yes, if you want multiple versions you need atom:modified. I oppose 
both.

A paradigm that fails completely once a reader starts traversing
@rel=prev
Not if the url in the prev is properly thought through; ie instead of 
asking for page 2 the uri query asks for entries before n, where n 
is the oldest entry number in the page before.
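
The difference between the two styles of "prev" query can be sketched as follows. This is an illustration under a loud assumption: entries are represented by increasing serial numbers, which Atom itself does not guarantee (entries only carry ids). The function names and the page size are hypothetical.

```python
def page_by_number(entries, page, size):
    """Offset-style paging: page 1 is the newest `size` entries."""
    ordered = sorted(entries, reverse=True)
    return ordered[(page - 1) * size : page * size]


def page_before(entries, before, size):
    """Cursor-style paging: the newest `size` entries numbered below `before`."""
    ordered = sorted((n for n in entries if n < before), reverse=True)
    return ordered[:size]


entries = [1, 2, 3, 4, 5, 6]
first_page = page_by_number(entries, 1, 3)

# A new entry is published before the reader follows the "prev" link.
entries.append(7)

second_by_number = page_by_number(entries, 2, 3)
second_by_cursor = page_before(entries, min(first_page), 3)
```

With page numbers, entry 4 shows up on both pages after the new entry arrives; with the "before n" query the second page starts exactly where the first one ended.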

Anyway, rel=prev doesn't exist last time I checked.
or they have a planet aggregator in their subscriptions which
has fallen behind due to ping lags.
Are there really aggregators naïve enough to take an entry with the 
same id from one feed and paste over the last retrieved entry from 
another? There are far more problems with that before you start 
worrying about what is the latest entry.

the newest version is something which should be publisher 
controlled, not
left to the variable circumstances of protocol happenstance and
idiosyncratic personal subscription lists.
or picking randomly, as you suggested not 2 emails ago.
OK, let's look at feed readers that don't then [etc]
This is where Eric dictates how other people's feed readers should work 
to fit the flaws in his preferred proposition.

[1] do you know of any publishing software which currently emits feeds 
with
multiple instances of entries? I can't think of any.
None. That's why it should be explicitly barred, since no software is 
expecting it.

Graham


RE: PaceRepeatIdInDocument solution

2005-02-19 Thread Bob Wyman

Graham wrote:
 [1] do you know of any publishing software which currently emits
 feeds with multiple instances of entries? I can't think of any.
 None. That's why it should be explicitly barred, since no software
 is expecting it.
PubSub regularly produces feeds with multiple instances of the same
atom:id. No one has ever complained about this to us.
Given that history shows that publishing repeated ids has never
bothered anyone enough to cause them to complain, we should permit this
benign practice to continue. 
It is particularly important to avoid prohibiting this benign
practice since it is so important to generators of aggregated feeds.
Aggregated feed generators are supposed to maintain atom:id unchanged when
they copy entries into an aggregate feed. However, the Atom format doesn't
provide rigorous guarantees that atom:id's will be unique across feeds.
Thus, aggregated feed publishers are left with the choice of 1) Trusting
feed publishers or 2) Assigning new atom:id's to all entries published. The
first option will inevitably result in repeated ids and the second results
in massive amounts of work, difficulties in duplicate detection, violation
of the maintain atom:id rule, etc.
Forbidding repeated ids causes damage. History shows, however, that
allowing repeated ids is benign.

bob wyman




Re: PaceRepeatIdInDocument solution

2005-02-19 Thread Eric Scheid

On 20/2/05 1:47 PM, Graham [EMAIL PROTECTED] wrote:

 On 20 Feb 2005, at 1:27 am, Eric Scheid wrote:
 
 hmmm ... looking back in the archives I see you were opposed to
 atom:modified, you couldn't see any use case where you would want the entry
 instances to clearly indicate which is more recent. Hashes won't help you
 here.
 
 Yes, if you want multiple versions you need atom:modified. I oppose both.
 
atom:modified also helps in distinguishing multiple instances found in
separate feed documents.

You oppose atom:modified, and yet you insist on kludging a hack for
identifying which of two entries is the most recent. A hack which isn't even
mentioned in the spec, so gawd help software developers all arriving at the
same hacky solution to the problem.

You opposed it because you couldn't foresee any use case for it, and now you
have a use case for it but you say that that use case should be banned
because you opposed atom:modified.

I forget: is this the circular reasoning logical fallacy, or the begging the
question fallacy?

 A paradigm that fails completely once a reader starts traversing @rel=prev
 
 Not if the url in the prev is properly thought through; ie instead of asking
 for page 2 the uri query asks for entries before n, where n is the oldest
 entry number in the page before.

Where are these special semantics codified into a specification?

Also, define "oldest". Is this the one with the oldest atom:updated, even
though you earlier (and rightly) dissed that because it was "that the author
considered significant"? Or is "oldest" defined by atom:published, which as
you might recall is an *optional* element for atom:entry.

Also, define "number" in "entry number". Entries are not numbered, they have
id's, and while it's often easy to use an incrementing serial that is not
always the case. 

 Anyway, rel=prev doesn't exist last time I checked.

This does: @rel="http://www.example.org/atom/link-rels#prev", and that's a
valid value for the @rel attribute. There is also no language in the spec
that prevents someone registering "prev" in the Registry of Link Relations.

http://atompub.org/2005/01/27/draft-ietf-atompub-format-05.html#rfc.section.9.1

So you might as well assume it does exist.

 or they have a planet aggregator in their subscriptions which has fallen
 behind due to ping lags.
 
 Are there really aggregators naïve enough to take an entry with the same id
 from one feed and paste over the last retrieved entry from another?
 
Naïve or smart? I subscribe to one feed which is the top headlines for that
site, and I also subscribe to all headlines for one category at that site.
The naïve thing to do there would be to not conflate entries with
identical id's.

Another use case: I subscribe to a feed from the publisher's website, but
later he sets up a link at feedburner.com or similar. The naïve thing would
be to assume that all the entries from feedburner.com are completely
different from those retrieved from example.com, despite having the same
id's.

 There are far more problems with that before you start worrying about what is
 the latest entry.
 
You forget: I determine what entries I subscribe to, as you do for yourself.
If a bad actor starts screwing with id's then I can also unsubscribe.

So, leaving aside hand waving scare-mongering statements like "far more
problems", just what problems are there, Graham?

 the newest version is something which should be publisher controlled, not
 left to the variable circumstances of protocol happenstance and idiosyncratic
 personal subscription lists.
 
 or picking randomly, as you suggested not 2 emails ago.
 
Glad you agree with me there.

We wouldn't need to pick randomly if we had atom:modified.

 OK, let's look at feed readers that don't then [etc]
 
 This is where Eric dictates how other people's feed readers should work to fit
 the flaws in his preferred proposition.
 
A gross sophistry on your part. If you don't want to argue the merits and
prefer ad hominem attacks, then there really isn't much point continuing.

 [1] do you know of any publishing software which currently emits feeds with
 multiple instances of entries? I can't think of any.
 
 None. That's why it should be explicitly barred, since no software is
 expecting it.

Nonsense. That's like arguing that http agents should only support those
mime-types which were already defined oh so many years ago. No software
currently exists that can possibly be expecting application/foo, but that
doesn't mean application/foo is an illegal mime-type.

e.