Re: Paging, Feed History, etc.

2006-06-08 Thread Thomas Broyer


2006/6/8, James Holderness [EMAIL PROTECTED]:


Mark Nottingham wrote:
 Are you talking about using ETag HTTP response headers, If-Match request
 headers, and 304 Not Modified response status codes? That's a gross
 misapplication of those mechanisms if so, and this will break
 intermediaries along the path.

For the first page I'm talking about an Etag (or Last-Modified) HTTP
response header and If-None-Match (or If-Modified-Since) request headers for
the retrievals a month later.


What you described is RFC3229 w/ feeds [1], but you failed to include
the new request and response headers and the specific status code,
which are necessary because you're changing the behaviour of
If-None-Match and 304 (Not Modified) as defined in HTTP/1.1.
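
For illustration, the client side of that exchange would look roughly like
this (a rough sketch in Python using the requests library; the feed URL and
the two process_* helpers are made up, the A-IM/IM headers and the 226
status code are the ones described in [1]):

    import requests

    FEED_URL = "http://example.org/feed.atom"   # hypothetical feed

    # First fetch: remember the validator the server hands back.
    first = requests.get(FEED_URL)
    etag = first.headers.get("ETag")

    # Later fetch: a plain HTTP/1.1 conditional GET can only yield
    # 200 (full feed) or 304 (nothing at all); to ask for a delta the
    # client also has to advertise the "feed" instance-manipulation.
    later = requests.get(FEED_URL, headers={
        "If-None-Match": etag,
        "A-IM": "feed",
    })

    if later.status_code == 226:      # IM Used: only new/changed entries
        process_delta(later.content)
    elif later.status_code == 304:    # nothing has changed at all
        pass
    else:                             # 200: server ignored A-IM, full feed
        process_full_feed(later.content)

A server that doesn't implement it simply never sends 226, so such a client
degrades gracefully to ordinary conditional GETs.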


For page two onwards the state information (date, query and page number)
comes from the link urls returned by the first page.


That means you need to keep entry revisions as well, so that if an
entry is updated while a client is navigating the paged result set, it
is sent the old revision (corresponding to the date parameter).
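
Roughly, the server would have to keep something like a per-entry revision
history (a sketch; all the names are invented):

    from bisect import bisect_right

    # entry id -> list of (revision_timestamp, entry_content), sorted by timestamp
    revisions = {}

    def revision_as_of(entry_id, snapshot_time):
        """Return the latest revision of an entry that is not newer than the
        snapshot date carried in the paging URLs (None if it didn't exist yet)."""
        history = revisions[entry_id]
        timestamps = [ts for ts, _ in history]
        i = bisect_right(timestamps, snapshot_time)
        if i == 0:
            return None
        return history[i - 1][1]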


 Even if it's cast as a query parameter in the URI (for example), it
 requires query support on the server side, a concept of discovered time
 (as you point out), and places constraints on the ordering of the feed.

The ordering is not necessarily important. As long as the server can filter
out entries that don't match a specific time criteria it can return those
entries in any order.


Yes, ordering is not important. If ranking is necessary, then use the
Feed Rank extension (but that means that potentially a great number of
entries will be sent back as modified in 226 (IM Used) responses
just because their ranking has changed)


 Are you proposing this instead of the mechanism currently described  in
 FH? Alongside it?

What I'm proposing would work with the FH as currently specified as long as
the client supported ETag or Last-Modified as well. For me that means no
change at all.


You're trying to change HTTP/1.1 behaviour wrt the If-None-Match
request-header field and the 304 (Not Modified) status code, so you
need to implement RFC3229 w/ feeds (which means dealing with some new
headers and a new status code).

As I already said, I highly suggest not using paging for 226 (IM Used)
responses and rather fall back to standard GET in case there are too
many changes (i.e. behaving the same way as servers that don't support
RFC3229 w/ feeds).
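
On the server side that amounts to something like this (a sketch; the
threshold and the build_feed helper are invented):

    def respond(changed_entries, all_current_entries, max_delta=50):
        """Serve a delta only when it fits in one response; otherwise behave
        exactly like a server that doesn't know about RFC3229 w/ feeds."""
        if len(changed_entries) <= max_delta:
            return 226, build_feed(changed_entries)      # IM Used, no paging
        # Too many changes: plain 200 with the normal feed, as if A-IM
        # had never been sent.
        return 200, build_feed(all_current_entries)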

My main concern is that RFC3229 w/ feeds is being deployed more and
more widely and is still not even an I-D (or I missed something).
Maybe FH could be the place to spec it, as another optimization
algorithm…

[1] http://bobwyman.pubsub.com/main/2004/09/using_rfc3229_w.html

--
Thomas Broyer



Re: Paging, Feed History, etc.

2006-06-08 Thread James Holderness


Thomas Broyer wrote:

What you described is RFC3229 w/ feeds [1], but you failed to include
the new request and response headers and the specific status code,
which are necessary because you're changing the behaviour of
If-None-Match and 304 (Not Modified) as defined in HTTP/1.1.


Yep. Sorry, forgot to mention that.


That means you need to keep entry revisions as well, so that if an
entry is updated while a client is navigating the paged result set, it
is sent the old revision (corresponding to the date parameter).


Why? If an entry has been revised either don't send it (they'll get it the 
next time they refresh), or send it anyway (they'll just get it again the 
next time they refresh).

Is that such a big deal? Or am I missing something?


Yes, ordering is not important. If ranking is necessary, then use the
Feed Rank extension (but that means that potentially a great number of
entries will be sent back as modified in 226 (IM Used) responses
just because their ranking has changed)


I would have thought IM only applied to the first page. All subsequent pages 
have a specific URL that includes the query, page and time. You're not 
sending back a partial result in that case.
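
To make that concrete, the links in page one would point at URLs of roughly
this shape (the parameter names are only an example):

    from urllib.parse import urlencode

    # All the state lives in the URL, so pages two onwards need no ETag tricks.
    params = {"q": "atom", "page": 2, "before": "2006-06-08T00:00:00Z"}
    next_href = "http://search.example.com/feed?" + urlencode(params)
    # http://search.example.com/feed?q=atom&page=2&before=2006-06-08T00%3A00%3A00Z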


What I'm proposing would work with the FH as currently specified as long as
the client supported ETag or Last-Modified as well. For me that means no
change at all.


You're trying to change HTTP/1.1 behaviour wrt the If-None-Match
request-header field and the 304 (Not Modified) status code, so you
need to implement RFC3229 w/ feeds (which means dealing with some new
headers and a new status code).


No change at all for *me*. As in my client. I already support FH. I already 
support Etags. I already support 3229.



As I already said, I highly suggest not using paging for 226 (IM Used)
responses and rather fall back to standard GET in case there are too
many changes (i.e. behaving the same way as servers that don't support
RFC3229 w/ feeds).


I don't get why this is a problem, but if you don't like it don't use it.

All I'm saying is, if you're a search engine and you want to create 
subscribable paged results, this is a method that you can use right now, and 
it will work with at least one existing FH capable client (I suspect others 
too). The other proposal on the table is to change all your link names. 
Arguably a much better proposal than what I'm offering - it certainly seems 
to have got a lot of +1s - but it will work with precisely no one.


Regards
James



Re: Paging, Feed History, etc.

2006-06-08 Thread Thomas Broyer


2006/6/8, James Holderness [EMAIL PROTECTED]:


Thomas Broyer wrote:
 That means you need to keep entry revisions as well, so that if an
 entry is updated while a client is navigating the paged result set, it
 is sent the old revision (corresponding to the date parameter).

Why? If an entry has been revised either don't send it (they'll get it the
next time they refresh), or send it anyway (they'll just get it again the
next time they refresh).
Is that such a big deal? Or am I missing something?


Sorry, I thought you wanted search engines to produce snapshots...

(side note: but in this case, is there a need to pass a date
parameter to the following pages? And if pages are kind of live, isn't
there a risk of data loss? What I mean is: this is the Web, so you'll end
up doing a separate request for each page, each just returning a different
chunk of the result set. If an entry changes between the request for the
first page and the retrieval of a following page, that change might move
it somewhere else in the result set, since the ordering of entries is
based on updated time stamps, discovery date, ranks or whatever else, so
your chunks would be different than if the entry hadn't changed, and an
entry that has not yet been retrieved might end up in an already
retrieved chunk by page number, hence the client missing an entry. I
think this is Mark's concern: this might be an acceptable behaviour in
some cases but not all.)
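
A toy illustration of the missed-entry case (just a sketch: four made-up
entries, two per page, newest first):

    PAGE_SIZE = 2

    def page(entries, number):
        """entries are sorted newest first; pages are numbered from 1."""
        start = (number - 1) * PAGE_SIZE
        return entries[start:start + PAGE_SIZE]

    # Result set when the client asks for page 1.
    before = ["A", "B", "C", "D"]
    got = set(page(before, 1))       # client receives A and B

    # Entry C is updated before the client asks for page 2: it jumps to the top.
    after = ["C", "A", "B", "D"]
    got |= set(page(after, 2))       # client receives B and D

    # C has moved into a chunk the client already fetched, so it is never seen.
    assert "C" not in got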


 You're trying to change HTTP/1.1 behaviour wrt the If-None-Match
 request-header field and the 304 (Not Modified) status code, so you
 need to implement RFC3229 w/ feeds (which means dealing with some new
 headers and a new status code).

No change at all for *me*. As in my client. I already support FH. I already
support Etags. I already support 3229.


OK, I thought you only supported Etags, as defined by HTTP/1.1 for
efficient caching and bandwidth saving.


 As I already said, I highly suggest not using paging for 226 (IM Used)
 responses and rather fall back to standard GET in case there are too
 many changes (i.e. behaving the same way as servers that don't support
 RFC3229 w/ feeds).

I don't get why this is a problem, but if you don't like it don't use it.


Yep, sorry, this is not a problem.


All I'm saying is, if you're a search engine and you want to create
subscribable paged results, this is a method that you can use right now, and
it will work with at least one existing FH capable client (I suspect others
too).


So we agree ;-)

Could you read my recent mails in this thread and confirm that it's the case?


The other proposal on the table is to change all your link names.
Arguably a much better proposal than what I'm offering - it certainly seems
to have got a lot of +1s - but it will work with precisely no one.


So there are now two -1s, aren't there? ;-)

--
Thomas Broyer



Re: Copyright, licensing, and feeds

2006-06-08 Thread A. Pagaltzis

* Karl Dubost [EMAIL PROTECTED] [2006-06-08 04:30]:
 Which will not remove abuse :)

Well, will anything short of not publishing your content?

I think the point of such an effort is to make life easier for
third parties who want to respect your wishes, not to make it
harder for third parties who are intent on violating them.

Regards,
-- 
Aristotle Pagaltzis // http://plasmasturm.org/



Re: when should two entries have the same id?

2006-06-08 Thread Elliotte Harold


James M Snell wrote:


That's not quite accurate.  Two entries with the same atom:id may appear
within the same atom:feed only if they have different atom:updated
elements.  The spec is silent on whether or not two entries existing in
*separate documents* may have identical atom:id and atom:updated values.



They're ids, not guids. Certainly I would expect that there'll be some 
accidental conflicts. For instance one site might number its posts 
post1, post2, post3,...; and a different, unrelated site might do the same.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/



Re: when should two entries have the same id?

2006-06-08 Thread Henry Story



On 8 Jun 2006, at 14:44, Elliotte Harold wrote:



James M Snell wrote:

That's not quite accurate. Two entries with the same atom:id may appear
within the same atom:feed only if they have different atom:updated
elements. The spec is silent on whether or not two entries existing in
*separate documents* may have identical atom:id and atom:updated values.


They're ids, not guids. Certainly I would expect that there'll be some
accidental conflicts. For instance one site might number its posts
post1, post2, post3,...; and a different, unrelated site might do the same.


No, they are guids. The datatype for an id is an IRI, which is a
generalisation of a URI. IRIs are constructed in such a way that it
should be easy to construct universally unique ones without ever
having name clashes. If name clashes there are, this will either be
due to incompetence or to malevolence.
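
For example, the tag: URI scheme (RFC 4151) makes this almost automatic,
because the owner's domain name and a date are part of the id (just a
sketch; the domain and path are invented):

    def make_atom_id(domain, tag_date, local_part):
        """Mint a globally unique atom:id as a tag: URI (RFC 4151).
        Uniqueness follows from owning the domain on the given date,
        plus a path that is unique within the site."""
        return "tag:%s,%s:%s" % (domain, tag_date, local_part)

    print(make_atom_id("example.org", "2006-06-08", "/posts/post1"))
    # tag:example.org,2006-06-08:/posts/post1

Two unrelated sites can both call their posts post1, post2, ... and still
never clash, because the domain part differs.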


Henry







Re: when should two entries have the same id?

2006-06-08 Thread Julian Reschke


Elliotte Harold wrote:


James M Snell wrote:


That's not quite accurate.  Two entries with the same atom:id may appear
within the same atom:feed only if they have different atom:updated
elements.  The spec is silent on whether or not two entries existing in
*separate documents* may have identical atom:id and atom:updated values.



They're ids, not guids. Certainly I would expect that there'll be some 
accidental conflicts. For instance one site might number its posts 
post1, post2, post3,...; and a different, unrelated site might do the same.


Sorry? That would be a bug. They *are* supposed to be globally unique.

See: http://greenbytes.de/tech/webdav/rfc4287.html#rfc.section.4.2.6.

Best regards, Julian



Re: Copyright, licensing, and feeds

2006-06-08 Thread M. David Peterson
Very well stated, Aristotle!

On 6/8/06, A. Pagaltzis [EMAIL PROTECTED] wrote:

* Karl Dubost [EMAIL PROTECTED] [2006-06-08 04:30]:
 Which will not remove abuse :)

Well, will anything short of not publishing your content?

I think the point of such an effort is to make life easier for
third parties who want to respect your wishes, not to make it
harder for third parties who are intent on violating them.

Regards,
--
Aristotle Pagaltzis // http://plasmasturm.org/

--
M:D / M. David Peterson
http://www.xsltblog.com/


RFC3229 w/ feeds [was: Paging, Feed History, etc.]

2006-06-08 Thread Mark Nottingham



On 2006/06/07, at 11:40 PM, Thomas Broyer wrote:


My main concern is that RFC3229 w/ feeds is being deployed more and
more widely and is still not even an I-D (or I missed something).


I have that concern as well.

I am also concerned that RFC3229 is an extension of HTTP, but some
implementers are acting as if it changes the semantics of already-defined
parts of HTTP. For example, a delta must be a subset of the current
representation that is returned to a GET; if you GET the feed, it has to
return all of the entries that you could retrieve by using deltas.
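
Put another way, a conforming server should always pass a check along these
lines (a sketch; the entry ids are made up):

    def is_valid_delta(delta_entry_ids, full_get_entry_ids):
        """Every entry a delta can return must also be present in the
        representation that a plain GET returns right now."""
        return set(delta_entry_ids) <= set(full_get_entry_ids)

    # A feed that only returns the last n entries to a plain GET, but lets a
    # delta reach arbitrarily old entries, fails this check (and breaks HTTP).
    assert is_valid_delta({"e42", "e43"}, {"e40", "e41", "e42", "e43"})
    assert not is_valid_delta({"e01", "e43"}, {"e40", "e41", "e42", "e43"})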


I have a feeling that many people are treating it as a dynamic query  
mechanism that's capable of retrieving any entry that's ever been in  
the feed, while still only returning the last n entries to a plain  
GET. If so, they're breaking HTTP, breaking delta, and should use  
something else.


Is this the case, or am I (happily) mistaken?

--
Mark Nottingham http://www.mnot.net/



Re: Paging, Feed History, etc.

2006-06-08 Thread James Holderness


Thomas Broyer wrote:
Could you read my recent mails in this thread and confirm that it's the 
case?


I'm sorry, but I can no longer participate in this discussion. I hope 
everything works out ok.


Regards
James



Re: Copyright, licensing, and feeds

2006-06-08 Thread Karl Dubost



Le 06-06-08 à 19:40, A. Pagaltzis a écrit :

* Karl Dubost [EMAIL PROTECTED] [2006-06-08 04:30]:

Which will not remove abuse :)


Well, will anything short of not publishing your content?

I think the point of such an effort is to make life easier for
third parties who want to respect your wishes, not to make it
harder for third parties who are intent on violating them.


Agreed. And that's why my message (which was really badly written -
fatigue) was separating the issues. It's a very important issue, and I
really believe a clear spec, framework, or let's say technical
solution, would improve the field. Definitely.


I would love to see that happen as soon as possible. It's a mix of
social and technical issues. Finding interoperable solutions would
help to soften the social issues and frustrations.


so again, +1 a thousand times ;)




--
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager, QA Activity Lead
  QA Weblog - http://www.w3.org/QA/
 *** Be Strict To Be Cool ***