Re: URI canonicalization

2005-02-01 Thread Bill de hÓra
Roy T. Fielding wrote: Over-specification is just too fun. So that would mean I am required by Atom format to treat two different entries with the id http://tbray.org/uid/1000 as the same entry, even when I received the first one from tbray.org and the second from mymonkeysbutt.net? Oh yeah,

Re: URI canonicalization

2005-02-01 Thread Danny Ayers
On Tue, 01 Feb 2005 10:07:52 +, Bill de hÓra wrote: Otherwise, this thread sounds like resources and representations all over with the caveat that entry representations are being sourced from multiple origin servers. It was suggested ages ago that the use of a different

Re: URI canonicalization

2005-02-01 Thread Danny Ayers
Nearly forgot - +1 to including some kind of explanatory note on comparisons, Martin's version looks better than the current text -- http://dannyayers.com

Re: URI canonicalization

2005-02-01 Thread Graham
On 1 Feb 2005, at 5:27 am, Roy T. Fielding wrote: Identifiers are not subject to simplification -- they are either equivalent or not. We can add all of the implementation requirements we like to prevent software from detecting false negatives, but that doesn't change the fact that equivalent

Re: URI canonicalization

2005-02-01 Thread Tim Bray
On Jan 31, 2005, at 10:16 PM, Roy T. Fielding wrote: Over-specification is just too fun. So that would mean I am required by Atom format to treat two different entries with the id http://tbray.org/uid/1000 as the same entry, even when I received the first one from tbray.org and the second

Re: URI canonicalization

2005-02-01 Thread Sam Ruby
Tim Bray wrote: On Jan 31, 2005, at 10:16 PM, Roy T. Fielding wrote: Over-specification is just too fun. So that would mean I am required by Atom format to treat two different entries with the id http://tbray.org/uid/1000 as the same entry, even when I received the first one from tbray.org

Re: URI canonicalization

2005-02-01 Thread Antone Roundy
On Monday, January 31, 2005, at 10:57 PM, Roy T. Fielding wrote: There is no reason to require any particular comparison algorithm. One application is going to compare them the same way every time. Two different applications may reach different conclusions about two equivalent identifiers, but

Re: URI canonicalization

2005-02-01 Thread Roy T. Fielding
On Feb 1, 2005, at 4:48 AM, Sam Ruby wrote: Roy T. Fielding wrote: There is no reason to require any particular comparison algorithm. One application is going to compare them the same way every time. Two different applications may reach different conclusions about two equivalent identifiers, but

Re: URI canonicalization

2005-02-01 Thread Roy T. Fielding
On Feb 1, 2005, at 7:46 AM, Tim Bray wrote: On Jan 31, 2005, at 10:16 PM, Roy T. Fielding wrote: Over-specification is just too fun. So that would mean I am required by Atom format to treat two different entries with the id http://tbray.org/uid/1000 as the same entry, even when I received the

Re: URI canonicalization

2005-02-01 Thread Tim Bray
On Feb 1, 2005, at 4:28 PM, Roy T. Fielding wrote: Anyone who subscribes to aggregations (for example, I subscribe to the planetsun.org aggregated feed), is used to seeing the same entry over and over and over again. This problem is only going to get worse. With Atom's ID semantics and

Re: URI canonicalization

2005-02-01 Thread Graham
On 2 Feb 2005, at 12:52 am, Roy T. Fielding wrote: There is no need to explain what different ids means -- any two URIs that are different identifiers will never compare as equivalent, regardless of the comparison algorithm used. Pardon? If I use case sensitive ids (eg base64 style

Re: URI canonicalization

2005-02-01 Thread Roy T. Fielding
On Feb 1, 2005, at 5:12 PM, Graham wrote: On 2 Feb 2005, at 12:52 am, Roy T. Fielding wrote: There is no need to explain what different ids means -- any two URIs that are different identifiers will never compare as equivalent, regardless of the comparison algorithm used. Pardon? If I use case

Re: URI canonicalization

2005-02-01 Thread Eric Scheid
On 2/2/05 11:52 AM, Roy T. Fielding wrote: any two URIs that are different identifiers will never compare as equivalent, regardless of the comparison algorithm used. what about false negatives though? e.

Re: URI canonicalization

2005-01-31 Thread Antone Roundy
On Sunday, January 30, 2005, at 05:43 PM, Robert Sayre wrote: How about Make sure your id is unique from a character-by-character perspective, but also unique in the face of scheme-specific comparisons. That is, don't lean on scheme-specific comparisons to match URIs, but they don't have to be

Re: IRI - URI canonicalization

2005-01-31 Thread DJWS
I am not sure this is relevant but all this is supporting IRI? jfc At 13:24 31/01/2005, Bjoern Hoehrmann wrote: * Robert Sayre wrote: Suppose your user is subscribed to a feed containing 1000 entries. One day, the host name is no longer capitalized. Are you going to put 1000 new, duplicate

Re: URI canonicalization

2005-01-31 Thread Sam Ruby
Bjoern Hoehrmann wrote: * Robert Sayre wrote: Suppose your user is subscribed to a feed containing 1000 entries. One day, the host name is no longer capitalized. Are you going to put 1000 new, duplicate entries in front of the user? It seems the Working Group is split on the requirements for
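
The scenario Robert Sayre describes can be made concrete with a short Java sketch. It is an illustration under stated assumptions only: the class `IdDedupSketch`, the method `countNewEntries`, and the example.org ids are hypothetical and come from neither the thread nor the Atom draft. It models an aggregator that remembers entry ids as plain strings, so a host name that changes case makes every entry look new.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not taken from the thread or the Atom draft.
class IdDedupSketch {

    // Adds every fetched id to `seen` and returns how many were not
    // already present. HashSet<String> relies on String.equals(), i.e. a
    // character-by-character, case-sensitive comparison.
    static int countNewEntries(Set<String> seen, List<String> fetched) {
        int fresh = 0;
        for (String id : fetched) {
            if (seen.add(id)) {
                fresh++;
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();
        // Made-up ids: the same two entries before and after the
        // publisher stops capitalizing the host name.
        List<String> before = Arrays.asList(
                "http://Example.org/entry/1", "http://Example.org/entry/2");
        List<String> after = Arrays.asList(
                "http://example.org/entry/1", "http://example.org/entry/2");

        System.out.println("first fetch, new entries: " + countNewEntries(seen, before));
        System.out.println("second fetch, new entries: " + countNewEntries(seen, after));
    }
}
```

With plain string comparison both fetches report two new entries, which is the duplicate problem raised in the question. Whether a processor should go beyond string comparison is exactly what the thread disputes.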

Re: URI canonicalization

2005-01-31 Thread Martin Duerst
I have just looked at the text in question in -05.txt, and read through the discussion. I'll give my comments here, but they are not specifically on this mail. First, for me, the goal of having reproducible id comparison is most important; this is the basic requirement. Second, given that there

Re: URI canonicalization

2005-01-31 Thread Tim Bray
On Jan 31, 2005, at 8:20 PM, Graham wrote: This makes it clear that we are talking about "here is how you do it", rather than "here's one way to do it". We might be treading on toes making that assertion. Yes, but it's not only correct, it's good advice, so we should put it in. 4) Add a

Re: URI canonicalization

2005-01-31 Thread Robert Sayre
Robert Sayre wrote: 4) Add a sentence saying something like "Feeds or Entries are identical if their IDs compare identical." Seems obvious, but isn't stated anywhere. No. Feeds/entries with the same id are different versions or instances of a common ancestor. They are not the same. Martin

Re: URI canonicalization

2005-01-31 Thread Roy T. Fielding
On Jan 31, 2005, at 7:10 PM, Martin Duerst wrote: 5) Add a note saying something like Comparison functions provided by many URI classes/implementations make additional assumptions about equality that are not true for Identity Constructs. Atom processors therefore should use simple

Re: URI canonicalization

2005-01-31 Thread Roy T. Fielding
There is no reason to require any particular comparison algorithm. One application is going to compare them the same way every time. Two different applications may reach different conclusions about two equivalent identifiers, but nobody cares because AT WORST the result is a bit of inefficient use

Re: URI canonicalization

2005-01-31 Thread Roy T. Fielding
On Jan 31, 2005, at 8:40 PM, Tim Bray wrote: Graham's right, the word identical is wrong, because in fact you will commonly encounter two instances of the same entry which aren't identical (e.g. the one in your cache and the one you just fetched). I suggest Software MUST treat any two entries

URI canonicalization

2005-01-30 Thread Graham
This controversial text is still in: Because of the risk of confusion between URIs that would be equivalent if dereferenced, the following normalization strategy is strongly encouraged when generating Identity constructs: o Provide the scheme in lowercase characters. o Provide the
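
As a rough illustration of the first bullet in the quoted text (providing the scheme in lowercase when generating an id), here is a minimal Java sketch. The class `IdSchemeSketch` and the method `lowercaseScheme` are hypothetical names, and the sketch assumes the id is an absolute URI, so that everything before the first colon is the scheme. The remaining bullets are truncated in this listing and are deliberately not reproduced.

```java
import java.util.Locale;

// Hypothetical helper; covers only the "scheme in lowercase" bullet.
class IdSchemeSketch {

    // Lowercases everything before the first ':' and leaves the rest of
    // the id untouched. Assumes an absolute URI; an id without a colon is
    // returned unchanged.
    static String lowercaseScheme(String id) {
        int colon = id.indexOf(':');
        if (colon < 0) {
            return id;
        }
        return id.substring(0, colon).toLowerCase(Locale.ROOT) + id.substring(colon);
    }

    public static void main(String[] args) {
        // Made-up ids for illustration.
        System.out.println(lowercaseScheme("HTTP://example.org/2005/01/entry"));
        System.out.println(lowercaseScheme("URN:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"));
    }
}
```

The helper is meant for the publishing side only: it changes how an id is written when it is first generated and does not compare ids.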

Re: URI canonicalization

2005-01-30 Thread Graham
I suppose I should offer an alternative solution. Two scenarios were given to justify canonicalization: 1) A publisher accidentally uses a different, though very similar, URI for their id. They then apply the canonicalization rules and the error is erased. This will only work if they remember

Re: URI canonicalization

2005-01-30 Thread Tim Bray
On Jan 30, 2005, at 9:50 AM, Graham wrote: 2) An intermediary automatically c14nizes all URIs it processes. If URIs come pre-c14nized from the publisher, this won't do any damage. This is valid, but the problem is that these intermediaries are currently imaginary. I may be moving toward

Re: URI canonicalization

2005-01-30 Thread Robert Sayre
Tim Bray wrote: On Jan 30, 2005, at 9:50 AM, Graham wrote: 2) An intermediary automatically c14nizes all URIs it processes. If URIs come pre-c14nized from the publisher, this won't do any damage. This is valid, but the problem is that these intermediaries are currently imaginary. I may be

Re: URI canonicalization

2005-01-30 Thread Anne van Kesteren
Robert Sayre wrote: How about this: The only comparison method Atom Processors MUST support is character-by-character comparison [RFC3986]. Atom Processors MAY perform additional scheme-specific comparisons. If you do this: http://Example.org/thing http://example.org/thing You cannot

Re: URI canonicalization

2005-01-30 Thread Julian Reschke
Robert Sayre wrote: Tim Bray wrote: On Jan 30, 2005, at 9:50 AM, Graham wrote: 2) An intermediary automatically c14nizes all URIs it processes. If URIs come pre-c14nized from the publisher, this won't do any damage. This is valid, but the problem is that these intermediaries are currently

Re: URI canonicalization

2005-01-30 Thread Sam Ruby
Paraphrasing Tim [1] I'm definitely -1 on losing 3.5.1, the canonicalization warning is a hard-won compromise and seems to cause no-one any pain. We discussed this at extreme length, and no new arguments have been brought forward. Rough consensus does not mean absolute consensus. - Sam Ruby

Re: URI canonicalization

2005-01-30 Thread Robert Sayre
Julian Reschke wrote: Robert Sayre wrote: Um, the spec doesn't say you can. If the comparison is done with URI.equals(), it will be positive. If it is done with String.equals(), it will be negative. That text is a reflection of reality.
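
The difference between the two calls is easy to reproduce with `java.net.URI` from the standard library. The following is a minimal sketch: the class `IdCompareSketch` and its method names are hypothetical, and the two ids are a pair that differs only in the capitalization of the host name.

```java
import java.net.URI;

// Illustration only; the class and method names are hypothetical.
class IdCompareSketch {

    // Character-by-character, case-sensitive comparison of the raw ids.
    static boolean sameByString(String a, String b) {
        return a.equals(b);
    }

    // Comparison via java.net.URI.equals(), which ignores case in the
    // scheme and host. URI.create throws IllegalArgumentException if
    // either string is not a syntactically valid URI.
    static boolean sameByUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        String a = "http://Example.org/thing";
        String b = "http://example.org/thing";
        System.out.println("String.equals: " + sameByString(a, b)); // false
        System.out.println("URI.equals:    " + sameByUri(a, b));    // true
    }
}
```

Two processors that each pick a different one of these methods would disagree about whether the entries are the same, which is the disagreement this part of the thread is about.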

Re: URI canonicalization

2005-01-30 Thread Graham
On 30 Jan 2005, at 7:06 pm, Sam Ruby wrote: Paraphrasing Tim [1] I'm definitely -1 on losing 3.5.1, the canonicalization warning is a hard-won compromise and seems to cause no-one any pain. We discussed this at extreme length, and no new arguments have been brought forward. Rough consensus

Re: URI canonicalization

2005-01-30 Thread Robert Sayre
Eric Scheid wrote: On 31/1/05 6:17 AM, Robert Sayre wrote: Instances of Identity constructs can be compared to determine whether an entry or feed is the same as one seen before. Processors MUST compare Identity constructs on a character-by-character basis in a case-sensitive

Re: URI canonicalization

2005-01-30 Thread Eric Scheid
On 31/1/05 10:16 AM, Robert Sayre wrote: I agree with you in principle, but I find the current text unrealistic. It's just fodder for stupid arguments. what? and Atom Processors MAY perform additional scheme-specific comparisons won't lead to stupid arguments? Here's one

Re: URI canonicalization

2005-01-30 Thread Robert Sayre
Eric Scheid wrote: On 31/1/05 10:16 AM, Robert Sayre wrote: I agree with you in principle, but I find the current text unrealistic. It's just fodder for stupid arguments. what? and Atom Processors MAY perform additional scheme-specific comparisons won't lead to stupid

Re: URI canonicalization

2005-01-30 Thread Graham
On 30 Jan 2005, at 11:43 pm, Eric Scheid wrote: what? and Atom Processors MAY perform additional scheme-specific comparisons won't lead to stupid arguments? Yeah, that's a horrible loose end to leave hanging. The spec should (nay, MUST) mandate a method of comparing Identity Constructs which will

Re: URI canonicalization

2005-01-30 Thread Robert Sayre
Graham wrote: On 31 Jan 2005, at 12:16 am, Robert Sayre wrote: Graham wrote: Yeah, that's a horrible loose end to leave hanging. No, it isn't. URI comparison is not our problem, and what our spec says about it doesn't matter a bit. Yes it is times one million. Ha ha I win et cetera Defining

Re: URI canonicalization

2005-01-30 Thread Asbjørn Ulsberg
On Sun, 30 Jan 2005 14:06:49 -0500, Sam Ruby wrote: We discussed this at extreme length, and no new arguments have been brought forward. Rough consensus does not mean absolute consensus. Thank you. We've discussed this too much already. Please let's leave this horse; it's

Re: URI canonicalization

2005-01-30 Thread Robert Sayre
Graham wrote: On 31 Jan 2005, at 12:43 am, Robert Sayre wrote: How about Make sure your id is unique from a character-by-character perspective, but also unique in the face of scheme-specific comparisons. That is, don't lean on scheme-specific comparisons to match URIs, but they don't have to be

Re: URI canonicalization

2005-01-30 Thread Joe Gregorio
On Sun, 30 Jan 2005 14:06:49 -0500, Sam Ruby wrote: Paraphrasing Tim [1] I'm definitely -1 on losing 3.5.1, the canonicalization warning is a hard-won compromise and seems to cause no-one any pain. We discussed this at extreme length, and no new arguments have been