Re: Polling Sucks! (was RE: Atom feed synchronization)
At 10:06 PM -0400 6/17/05, Sam Ruby wrote:
> P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> talking about here? Is there likely to be?

Wearing my co-chair hat: IETF WG mailing lists are normally used for creating specs that are listed in the charter. They are also used for general discussion on the topic of the WG, even if that general discussion is not covered in the charter. The latter is perfectly proper if it does not interfere with the former.

This WG is now in a waiting state on the format draft, and is actively discussing the protocol draft on a different mailing list ([EMAIL PROTECTED]). Thus, discussion of things related to the use of the Atom format is fine on this list as long as it doesn't get "out of hand", for some very vague definition of "out of hand".

Discussions that might lead to individual (non-WG) submissions of Internet Drafts are expressly encouraged at this time. That is not to say "let's start adding a bunch of needless extensions and provisions", but certainly "I see a need and I think I might propose a solution" is a Very Good Thing for this list.

--Paul Hoffman, Director
--Internet Mail Consortium
Re: Polling Sucks! (was RE: Atom feed synchronization)
Joe Gregorio wrote:
> On 6/17/05, Sam Ruby <[EMAIL PROTECTED]> wrote:
> > P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> > talking about here? Is there likely to be?
>
> Were you expecting [atom-syntax] to vanish in a puff of smoke
> once we have a couple RFCs under our belt? Given the technology,
> and the participants, I would expect [atom-syntax] to have the
> longevity of [xml-dev].

Silly me. And here I thought it fertile ground for a protocol discussion. Carry on, then.

- Sam Ruby
Re: Polling Sucks! (was RE: Atom feed synchronization)
On 6/18/05, Joe Gregorio <[EMAIL PROTECTED]> wrote:
> On 6/17/05, Sam Ruby <[EMAIL PROTECTED]> wrote:
> > P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> > talking about here? Is there likely to be?
>
> Were you expecting [atom-syntax] to vanish in a puff of smoke
> once we have a couple RFCs under our belt? Given the technology,
> and the participants, I would expect [atom-syntax] to have the
> longevity of [xml-dev].

Nooo
Re: Polling Sucks! (was RE: Atom feed synchronization)
On 6/17/05, Sam Ruby <[EMAIL PROTECTED]> wrote:
> P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> talking about here? Is there likely to be?

Were you expecting [atom-syntax] to vanish in a puff of smoke once we have a couple RFCs under our belt? Given the technology, and the participants, I would expect [atom-syntax] to have the longevity of [xml-dev].

-joe
--
Joe Gregorio http://bitworking.org
Re: first request for an atom extension: Re: Polling Sucks! (was RE: Atom feed synchronization)
Henry Story wrote:
> [...] Something like:
>
> <link rel="http://.../next" href="http://bblfish.net/blog/archive/2005-05-10.atom">
>
> would be really useful.

Henry, Mark Nottingham did something on this a while back; try digging through the archives.

cheers
Bill
Re: Polling Sucks! (was RE: Atom feed synchronization)
Eric Scheid wrote:
> how does Atom over XMPP help in this scenario:
> 1) wake up
> 2) scratch myself, stagger around in morning fog
> 3) turn on computer, launch feed reader
> 4) wonder what changes happened during the night

This is not the thread you're looking for - go back to bed!

cheers
Bill
Re: Polling Sucks! (was RE: Atom feed synchronization)
Bob Wyman wrote:
> Joe Gregorio wrote:
> > The one thing missing from the analysis is the overhead, and
> > practicality, of switching protocols (HTTP to XMPP).
>
> I'm not aware of anything that might be called "overhead." What our
> clients do is, upon startup, connect to XMPP and request the list of Atom
> files that they are monitoring. They then immediately fetch those files to
> establish their start-of-session state. From that point on, they only
> listen to XMPP since anything that would be written to the Atom files is
> also written to XMPP. HTTP is only used on start-up. It's a pretty clean
> process.

I'm guessing Joe is talking about network administration. There's no shortage of places that won't even let you use SSH or POP, never mind XMPP. It's port 80 via the proxy or nothing. This is an observation of the current state of affairs; please don't confuse it with an advocacy of it.

cheers
Bill
Re: Polling Sucks! (was RE: Atom feed synchronization)
Sam Ruby wrote:
> P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> talking about here? Is there likely to be?

Because James Snell asked a question?..

But, more seriously: I intend to write an Internet draft for RFC3229+feed and hope that I'll be able to get the working group to consider it. Given the implementation history, we certainly meet the IETF tradition of having more than three independent implementations as well as considerable experience in field use. Also, the "Atom over XMPP" Internet Draft is something that I think the Working Group should consider once the issues related to the syntax and protocol specs are dealt with.

In any case, I think it is traditional for IETF mailing lists to provide a forum for discussion of potential use of the protocols that they define in addition to providing a forum for the work of defining the language of the specifications themselves. It is only by developing a common understanding of the various use cases that we can understand how the future work, if any, of the working group should be defined.

bob wyman
Re: Polling Sucks! (was RE: Atom feed synchronization)
James M Snell wrote:
> If I understand Bob's solution correctly, it goes something like:
> 1) wake up
> 2) scratch whatever you need to scratch
> 3) turn on computer, launch feed reader
> 4) feed reader does some RFC3229+feed magic to catch up on what
>    happened during the night
> 5) feed reader opens an XMPP connection to receive the active stream
>    of new entries

This is precisely what I was describing and it is what we implement in the PubSub Sidebar clients. This hybrid combination gives you the best of both worlds. The result is the lowest possible bandwidth consumption as well as the lowest latency in delivering content to clients.

The Push+Pull approach is particularly well suited to the kind of high volume application that James Snell describes -- particularly if the server has a large number of readers. While I've previously pointed out the benefit to the network (efficient utilization of bandwidth) and to clients (low latency), it is important to point out that the Push model offers real benefits for the server as well.

In extremely high volume applications, it is important that the server be able to control and smooth load. Server based load control is most easily accomplished with a Push system. In a Pull based system, load is almost totally dependent on client-driven scheduling and thus load tends to be very bursty. Bursty load is the worst possible thing to have in a network-based system. In Push based systems, the server is able to eliminate load bursts by spreading delivery of entries over time -- without worrying about the need to service bursty client requests within the window of their request time-out limits.

Even though there are all sorts of advantages to using Push-based and hybrid Push+Pull systems, the reality is that only a tiny percentage of all the millions of servers that support Atom feeds will have sufficient traffic or readership to benefit from these methods. As Joe Gregorio suggests in his recent note: "99.99% of all syndication is done via HTTP" and this will probably remain the case in terms of a raw census of servers.

However, it is also clear we are seeing significant growth in the use of feed aggregators like PubSub, FeedBurner and the other blog search and monitoring services. Also, we are seeing an increase in the use of feed-readers on mobile devices which require that feeds be consolidated and fed through proxies in order to reduce the amount of polling and other processing done by those mobile devices. As the use of these services increases, it will make sense for client developers to implement client-based support for Push+Pull and thus provide to their users the benefits of reduced bandwidth, reduced session management, and reduced latency. Broad client-based support makes sense even if similarly broad server-based support does not.

bob wyman
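
The load-smoothing argument in this message (a Push server chooses when each reader is served, while Pull load follows whatever schedule the clients happen to pick) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the 1000 readers, the cap of 50 deliveries per second, and the worst-case premise that all pollers arrive in the same second. It is not how PubSub or any real server schedules delivery.

```python
from collections import Counter


def schedule_push(subscribers, max_per_second):
    """Assign each subscriber a delivery second so no second exceeds the cap."""
    if max_per_second < 1:
        raise ValueError("max_per_second must be at least 1")
    return {sub: index // max_per_second for index, sub in enumerate(subscribers)}


def peak_load(schedule):
    """Largest number of deliveries that fall in any single second."""
    return max(Counter(schedule.values()).values(), default=0)


# Hypothetical population of readers.
subscribers = [f"reader-{n}" for n in range(1000)]

# Pull, worst case (assumed): every reader polls in the same second.
pull_schedule = {sub: 0 for sub in subscribers}

# Push: the server decides when each reader receives the new entry.
push_schedule = schedule_push(subscribers, max_per_second=50)

print("pull peak:", peak_load(pull_schedule))
print("push peak:", peak_load(push_schedule))
print("push finishes after second:", max(push_schedule.values()))
```

The trade-off shown is the one described above: the push server caps its peak at the price of the last reader receiving the entry some seconds later.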
first request for an atom extension: Re: Polling Sucks! (was RE: Atom feed synchronization)
This is a good venue. I think XMPP and polling can be explored. But for the needs of BlogEd [1], on which I am working, and for my personal needs, I would really like us to introduce an extension to the link concept, to provide a pointer to the next page in a historically ordered sequence of feed documents.

For many people who have dumb internet connections with very minimal servers, the XMPP solution requires a lot more technology than we have available or want to be bothered with. Something like:

<link rel="http://.../next" href="http://bblfish.net/blog/archive/2005-05-10.atom">

would be really useful. It requires only a working Apache on the server side. On the client side it is really simple to follow. The client just needs to have access to the base feed URL, and can follow these links through all the change history of the feed if he wishes. It would allow me to:
- keep a remote backup of my blog
- synchronize it between two editors
- let clients and aggregators get a complete historical view of the feed.

And it comes at really no cost, since all it requires is for us to mint a new "next" URL.

So how does one go about extending Atom? This was meant to be a feature of it, and especially of the link concept.

Henry Story

[1] https://bloged.dev.java.net/
[2] http://bblfish.net/blog/

On 18 Jun 2005, at 06:27, James M Snell wrote:
> Sam asked
> > P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> > talking about here? Is there likely to be?
>
> I launched this discussion here for three reasons:
> 1. Everyone who cares about it is probably already here
> 2. Main discussion about the syntax is pretty much complete so there is
>    no real risk of derailing anything
> 3. If there was no already accepted solution to the problem, this would
>    be a logical place to begin hunting for and discussing the solution
>
> That said, however, is there a better venue that you could suggest?
>
> Capping out the conversation a bit, Bob Wyman's RFC3229+feed proposal,
> once written up into an Internet-Draft, will provide the solution that
> I'm searching for (e.g. the ability to catch up on what has changed in a
> feed over a given period of time). The XMPP Push model would likely not
> be implemented in the case I'm considering but I couldn't rule it out
> completely. I believe it is Bob's intention to draft up the RFC3229+feed
> and pitch it to this group for discussion.
>
> Sam Ruby wrote:
> > Joe Gregorio wrote:
> > > On 6/17/05, Bob Wyman <[EMAIL PROTECTED]> wrote:
> > > > Joe Gregorio wrote:
> > > > > The one thing missing from the analysis is the overhead, and
> > > > > practicality, of switching protocols (HTTP to XMPP).
> > > > I'm not aware of anything that might be called "overhead."
> > >
> > > I was referring to switching both the client and server from running
> > > HTTP to running XMPP. That may not be practical or even possible for
> > > some people. Yes, I understand that you run this right now. Yes, I
> > > understand that you run a business doing this right now. Yes, I agree
> > > that your solution is one way to solve the problem. Do you agree that
> > > 99.99% of all syndication is done via HTTP today and also offering an
> > > HTTP based solution would be of value?
> >
> > Joe, I'd be careful with how you structure this argument. It could be
> > applied in a different context, for example: Do you agree that 99.99%
> > of all syndication is done via HTTP GET and POST today and offering a
> > solution based only on these two verbs would be of value?
> >
> > One can go down this path and cater to the least common denominator
> > always, or one can say that perhaps MIDP 1.0 phones are not
> > particularly well adapted to perform complex editing tasks beyond
> > simple GET and POST. Perhaps HTTP is suited to a wide, but not
> > universal, range of applications dealing with relatively coarse and
> > relatively infrequently updated content; and XMPP is well suited to a
> > different -- always on, firehose -- set of applications, with a wide
> > overlap between the two. And perhaps they could be combined. I could
> > see a future where there was a "feedmesh" backbone with nodes
> > exchanging data via XMPP, serving content out to the rest of the
> > universe via HTTP.
> >
> > - Sam Ruby
> >
> > P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> > talking about here? Is there likely to be?
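
The "next" link extension Henry Story asks for in this message could be consumed by a client along the following lines. This is only a sketch under stated assumptions: the rel value in the proposal is elided ("http://.../next"), so `NEXT_REL` below is a made-up placeholder, and the namespace is the one used by the final Atom format, which may differ from the draft under discussion in this thread. `find_next_page` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the final Atom format; drafts may have used another.
ATOM_NS = "http://www.w3.org/2005/Atom"

# Assumption: placeholder rel value; the proposal elides the real one.
NEXT_REL = "http://example.org/rel/next"


def find_next_page(feed_xml, next_rel=NEXT_REL):
    """Return the href of the feed-level link whose rel is next_rel, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall(f"{{{ATOM_NS}}}link"):
        if link.get("rel") == next_rel:
            return link.get("href")
    return None


sample = f"""<feed xmlns="{ATOM_NS}">
  <title>Example feed</title>
  <link rel="{NEXT_REL}" href="http://bblfish.net/blog/archive/2005-05-10.atom"/>
</feed>"""

print(find_next_page(sample))
```

A client would call this on each fetched document and stop when it returns None or when the entries on the page are older than its last visit.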
Re: Polling Sucks! (was RE: Atom feed synchronization)
That's what I believe Bob's RFC3229+Feed proposal addresses. If I understand Bob's solution correctly, it goes something like:
1) wake up
2) scratch whatever you need to scratch
3) turn on computer, launch feed reader
4) feed reader does some RFC3229+feed magic to catch up on what happened during the night
5) feed reader opens an XMPP connection to receive the active stream of new entries

Eric Scheid wrote:
> On 18/6/05 6:57 AM, "Bob Wyman" <[EMAIL PROTECTED]> wrote:
> > Let's keep Atom as it is now -- without the "first" and "next" tags
> > and encourage folk who need to keep up with high volume streams to use
> > Atom over XMPP. Lowered bandwidth utilization, reduced latency and
> > simplicity are good things.
>
> how does Atom over XMPP help in this scenario:
> 1) wake up
> 2) scratch myself, stagger around in morning fog
> 3) turn on computer, launch feed reader
> 4) wonder what changes happened during the night
>
> e.
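
For step 4 above, a rough sketch of what the "RFC3229+feed magic" might look like on the wire. RFC 3229 defines the `A-IM` request header and the `226 IM Used` response status; the instance-manipulation name `feed` is taken from the proposal's name, and its exact semantics are an assumption here since the Internet-Draft had not yet been written at the time of this thread. The helper names are hypothetical and no request is actually sent.

```python
def delta_request_headers(last_etag):
    """Headers for a conditional GET that also accepts a feed delta.

    last_etag is the ETag the client stored from its previous fetch.
    """
    return {
        "If-None-Match": last_etag,  # standard HTTP conditional request
        "A-IM": "feed",              # RFC 3229: accepted instance manipulations
    }


def interpret_status(status):
    """Classify the server's answer to a delta-capable request."""
    if status == 226:   # IM Used: body holds only entries newer than last_etag
        return "delta"
    if status == 304:   # Not Modified: nothing changed overnight
        return "unchanged"
    if status == 200:   # server ignored A-IM and sent the whole feed
        return "full"
    return "error"


print(delta_request_headers('"abc123"'))
print(interpret_status(226))
```

A server that does not understand `A-IM` simply answers 200 or 304, so a client written this way still works against plain HTTP servers.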
Re: Polling Sucks! (was RE: Atom feed synchronization)
Sam asked
> P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> talking about here? Is there likely to be?

I launched this discussion here for three reasons:
1. Everyone who cares about it is probably already here
2. Main discussion about the syntax is pretty much complete so there is no real risk of derailing anything
3. If there was no already accepted solution to the problem, this would be a logical place to begin hunting for and discussing the solution

That said, however, is there a better venue that you could suggest?

Capping out the conversation a bit, Bob Wyman's RFC3229+feed proposal, once written up into an Internet-Draft, will provide the solution that I'm searching for (e.g. the ability to catch up on what has changed in a feed over a given period of time). The XMPP Push model would likely not be implemented in the case I'm considering but I couldn't rule it out completely. I believe it is Bob's intention to draft up the RFC3229+feed and pitch it to this group for discussion.

Sam Ruby wrote:
> Joe Gregorio wrote:
> > On 6/17/05, Bob Wyman <[EMAIL PROTECTED]> wrote:
> > > Joe Gregorio wrote:
> > > > The one thing missing from the analysis is the overhead, and
> > > > practicality, of switching protocols (HTTP to XMPP).
> > > I'm not aware of anything that might be called "overhead."
> >
> > I was referring to switching both the client and server from running
> > HTTP to running XMPP. That may not be practical or even possible for
> > some people. Yes, I understand that you run this right now. Yes, I
> > understand that you run a business doing this right now. Yes, I agree
> > that your solution is one way to solve the problem. Do you agree that
> > 99.99% of all syndication is done via HTTP today and also offering an
> > HTTP based solution would be of value?
>
> Joe, I'd be careful with how you structure this argument. It could be
> applied in a different context, for example: Do you agree that 99.99% of
> all syndication is done via HTTP GET and POST today and offering a
> solution based only on these two verbs would be of value?
>
> One can go down this path and cater to the least common denominator
> always, or one can say that perhaps MIDP 1.0 phones are not particularly
> well adapted to perform complex editing tasks beyond simple GET and POST.
> Perhaps HTTP is suited to a wide, but not universal, range of
> applications dealing with relatively coarse and relatively infrequently
> updated content; and XMPP is well suited to a different -- always on,
> firehose -- set of applications, with a wide overlap between the two. And
> perhaps they could be combined. I could see a future where there was a
> "feedmesh" backbone with nodes exchanging data via XMPP, serving content
> out to the rest of the universe via HTTP.
>
> - Sam Ruby
>
> P.S. Why is this on atom-syntax? Is there a concrete proposal we are
> talking about here? Is there likely to be?
Re: Polling Sucks! (was RE: Atom feed synchronization)
Joe Gregorio wrote:
> On 6/17/05, Bob Wyman <[EMAIL PROTECTED]> wrote:
> > Joe Gregorio wrote:
> > > The one thing missing from the analysis is the overhead, and
> > > practicality, of switching protocols (HTTP to XMPP).
> > I'm not aware of anything that might be called "overhead."
>
> I was referring to switching both the client and server from running
> HTTP to running XMPP. That may not be practical or even possible for some
> people. Yes, I understand that you run this right now. Yes, I understand
> that you run a business doing this right now. Yes, I agree that your
> solution is one way to solve the problem. Do you agree that 99.99% of all
> syndication is done via HTTP today and also offering an HTTP based
> solution would be of value?

Joe, I'd be careful with how you structure this argument. It could be applied in a different context, for example: Do you agree that 99.99% of all syndication is done via HTTP GET and POST today and offering a solution based only on these two verbs would be of value?

One can go down this path and cater to the least common denominator always, or one can say that perhaps MIDP 1.0 phones are not particularly well adapted to perform complex editing tasks beyond simple GET and POST. Perhaps HTTP is suited to a wide, but not universal, range of applications dealing with relatively coarse and relatively infrequently updated content; and XMPP is well suited to a different -- always on, firehose -- set of applications, with a wide overlap between the two. And perhaps they could be combined. I could see a future where there was a "feedmesh" backbone with nodes exchanging data via XMPP, serving content out to the rest of the universe via HTTP.

- Sam Ruby

P.S. Why is this on atom-syntax? Is there a concrete proposal we are talking about here? Is there likely to be?
Re: Polling Sucks! (was RE: Atom feed synchronization)
On 18/6/05 6:57 AM, "Bob Wyman" <[EMAIL PROTECTED]> wrote:
> Let's keep Atom as it is now -- without the "first" and "next" tags
> and encourage folk who need to keep up with high volume streams to use Atom
> over XMPP. Lowered bandwidth utilization, reduced latency and simplicity are
> good things.

how does Atom over XMPP help in this scenario:
1) wake up
2) scratch myself, stagger around in morning fog
3) turn on computer, launch feed reader
4) wonder what changes happened during the night

e.
Re: Polling Sucks! (was RE: Atom feed synchronization)
On 6/17/05, Bob Wyman <[EMAIL PROTECTED]> wrote:
> Joe Gregorio wrote:
> > The one thing missing from the analysis is the overhead, and
> > practicality, of switching protocols (HTTP to XMPP).
> I'm not aware of anything that might be called "overhead."

I was referring to switching both the client and server from running HTTP to running XMPP. That may not be practical or even possible for some people. Yes, I understand that you run this right now. Yes, I understand that you run a business doing this right now. Yes, I agree that your solution is one way to solve the problem. Do you agree that 99.99% of all syndication is done via HTTP today and also offering an HTTP based solution would be of value?

-joe
--
Joe Gregorio http://bitworking.org
RE: Polling Sucks! (was RE: Atom feed synchronization)
Joe Gregorio wrote:
> The one thing missing from the analysis is the overhead, and
> practicality, of switching protocols (HTTP to XMPP).

I'm not aware of anything that might be called "overhead." What our clients do is, upon startup, connect to XMPP and request the list of Atom files that they are monitoring. They then immediately fetch those files to establish their start-of-session state. From that point on, they only listen to XMPP since anything that would be written to the Atom files is also written to XMPP. HTTP is only used on start-up. It's a pretty clean process.

> Let's keep Atom as it is now and explain to folks who need to keep up
> with high volume streams the two options they have, either streaming over
> XMPP or "next" links.

Where are these "next" links defined? I don't see them in the Atom Internet Draft. The word "next" doesn't even appear in the ID... If they aren't there, how can you call them "Atom as it is now"? I thought Henry Story was proposing these as extensions.

bob wyman
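
The start-up sequence described in this message (ask the push channel which feeds are monitored, fetch each once over HTTP, then listen only to the push channel) might be sketched as below. `FakePushChannel`, `run_session` and `fake_fetch` are hypothetical stand-ins invented for this illustration; a real client would use an actual XMPP library and real HTTP requests, and nothing here reflects the PubSub client's code.

```python
class FakePushChannel:
    """Stand-in for an XMPP session (assumption: no real XMPP library used)."""

    def __init__(self, monitored_feeds, pending_entries):
        self._feeds = list(monitored_feeds)
        self._pending = list(pending_entries)

    def monitored_feeds(self):
        """Feeds this client asked the service to watch."""
        return list(self._feeds)

    def receive(self):
        """Yield (entry_id, updated) pairs as the service pushes them."""
        yield from self._pending


def run_session(channel, fetch_feed):
    """Establish start-of-session state over HTTP, then consume pushes only."""
    seen = {}
    # Start-up: the only point at which HTTP is used.
    for url in channel.monitored_feeds():
        for entry_id, updated in fetch_feed(url):
            seen[entry_id] = updated
    # Steady state: everything else arrives over the push channel.
    fresh = []
    for entry_id, updated in channel.receive():
        if seen.get(entry_id) != updated:
            seen[entry_id] = updated
            fresh.append((entry_id, updated))
    return seen, fresh


fetch_log = []


def fake_fetch(url):
    """Pretend HTTP fetch that records which URLs were requested."""
    fetch_log.append(url)
    return [("entry-1", "2005-06-17T10:00:00Z")]


channel = FakePushChannel(
    monitored_feeds=["http://example.org/a.atom"],
    pending_entries=[
        ("entry-1", "2005-06-17T10:00:00Z"),  # already known from the fetch
        ("entry-2", "2005-06-18T08:00:00Z"),  # new, arrives by push
    ],
)
state, fresh = run_session(channel, fake_fetch)
print(fresh, fetch_log)
```

Each feed is fetched exactly once, which is the sense in which the message says HTTP is only used on start-up.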
Re: Polling Sucks! (was RE: Atom feed synchronization)
On 6/17/05, Bob Wyman <[EMAIL PROTECTED]> wrote:
> Let's keep Atom as it is now -- without the "first" and "next" tags
> and encourage folk who need to keep up with high volume streams to use Atom
> over XMPP.

-1

Let's keep Atom as it is now and explain to folks who need to keep up with high volume streams the two options they have, either streaming over XMPP or "next" links.

> Lowered bandwidth utilization, reduced latency and simplicity are
> good things.

The one thing missing from the analysis is the overhead, and practicality, of switching protocols (HTTP to XMPP).

Thanks,
-joe
--
Joe Gregorio http://bitworking.org
RE: Polling Sucks! (was RE: Atom feed synchronization)
Antone Roundy wrote:
> XMPP:
> 5. If the feed had entries that were old and not updated, go to step 7
> 6. If the feed has a "first" or "next" or whatever link, go to step 1
>    using that link
> 7. Open a socket
> 8. Send "login" XML stanza

I am assuming that if you are pushing entries via Atom over XMPP, you would only push new and updated entries. Thus, a client shouldn't need to check for "old and not updated" entries. Also, I'm assuming that since you are pushing entries, you wouldn't be inserting "first" or "next" links that needed to be followed. The client would get all of its entries from the XMPP stream.

> XMPP could achieve parity in getting feed changes that occurred while
> offline, at the expense of implementation complexity parity, by
> polling the feed once upon startup.

My assumption is that any well-built XMPP feed reader will, in fact, also be able to read Atom files via HTTP. This is what we do at PubSub and Gush does the same. I think Bill's app also does this.

The original question can, I think, be summarized as: "How does one best keep up with a high-volume Atom publisher?" My point was that the "first" and "next" links don't make things any easier. They just force the client to do a great deal of work to discover what the server already knows -- which entries have been updated. The "first" and "next" links approach just makes the process of working with feed files more complex as well as more bandwidth intensive. XMPP support is a much better solution for keeping up with changes while connected.

Let's keep Atom as it is now -- without the "first" and "next" tags and encourage folk who need to keep up with high volume streams to use Atom over XMPP. Lowered bandwidth utilization, reduced latency and simplicity are good things.

bob wyman
Re: Polling Sucks! (was RE: Atom feed synchronization)
On Friday, June 17, 2005, at 12:32 PM, Bob Wyman wrote:
> This is *not* simpler than taking a push feed using Atom over XMPP. For a
> push feed, all you do is:
> 1. Open a socket
> 2. Send a "login" XML Stanza
> 3. Process the stanzas as they arrive.
> ...
> For your solution, you need to:
> 1. Poll the feed to get a pointer to the "first link". (each poll will
>    cost you a TCP/IP connection).
> 2. If you got a new "first link" then go to step 5
> 3. Wait some period of time (the polling interval)
> 4. GoTo Step 1
> 5. Open a new TCP/IP socket to get the next link
> 6. Form and send an HTTP request for the next entry
> 7. Catch the response from the server
> 8. Parse the response to determine if its time stamp is something you've
>    already seen.
> 9. If you haven't seen the current entry before, then go to step 5
> 10. Go to step 1 to start over.

Not to get into a big argument (each method has its advantages depending on circumstances), but allow me to revise the above a little. The following assumes applications that attempt to keep you up-to-date on changes to the feed that occurred while you were offline:

XMPP:
1. Open a socket
2. Request and get the feed
3. Parse the XML
4. Process the entries (Determine whether each is new/updated or not--if so, do the appropriate thing)
5. If the feed had entries that were old and not updated, go to step 7
6. If the feed has a "first" or "next" or whatever link, go to step 1 using that link
7. Open a socket
8. Send "login" XML stanza
9. Wait for a stanza (sending keep-alive packets periodically), and when it arrives...
10. Parse the XML
11. Process it (Determine whether the entry is new/updated or not and do the appropriate thing)
12. Go to step 9

Polling:
1. Open a socket
2. Request and get the feed
3. Parse the XML
4. Process the entries (Determine whether the entry is new/updated or not and do the appropriate thing)
5. If the feed had entries that were old and not updated, go to step 7
6. If the feed has a "first" or "next" or whatever link, go to step 1 using that link
7. Wait some period of time
8. Go to step 1

The XMPP app will need to contain a superset of the polling app's code. My assessment of which method wins on various issues:

Latency: XMPP
Implementation complexity: Polling
Bandwidth consumption: XMPP
Resource consumption between polls or pushes: Polling
Getting all feed changes while online: XMPP if you're trying to archive the feed, otherwise no difference
Getting feed changes that occurred while offline: no difference

If we're not concerned about ensuring that we get all changes, the story is different:

XMPP:
1. Open a socket
2. Send "login" XML stanza
3. Wait for a stanza (sending keep-alive packets periodically), and when it arrives...
4. Parse the XML
5. Process it (Determine whether the entry is new/updated or not and do the appropriate thing)
6. Go to step 3

Polling:
1. Open a socket
2. Request and get the feed
3. Parse the XML
4. Process the entries (Determine whether the entry is new/updated or not and do the appropriate thing)
5. Wait some period of time
6. Go to step 1

My assessment:

Latency: XMPP
Implementation complexity: similar
Bandwidth consumption: XMPP
Resource consumption between polls or pushes: Polling
Getting all feed changes while online: XMPP
Getting feed changes that occurred while offline: Polling

XMPP could achieve parity in getting feed changes that occurred while offline, at the expense of implementation complexity parity, by polling the feed once upon startup.
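
Steps 1 to 6 of the catch-up variant (shared by both the XMPP and the polling lists above) can be sketched as a single loop. This is an illustration under assumptions: pages are modelled as an in-memory dict rather than fetched over HTTP, entries are `(id, updated)` pairs whose timestamps share one sortable string format, and `catch_up` and `pages` are hypothetical names.

```python
def catch_up(first_url, fetch_page, last_seen):
    """Follow "next" links until a page holds an entry no newer than last_seen.

    fetch_page(url) must return (entries, next_url); next_url is None on the
    last page. Returns the new or updated entries, newest first.
    """
    new_entries = []
    visited = set()
    url = first_url
    while url is not None and url not in visited:  # guard against link loops
        visited.add(url)
        entries, next_url = fetch_page(url)              # steps 1-3
        fresh = [e for e in entries if e[1] > last_seen]  # step 4
        new_entries.extend(fresh)
        if len(fresh) < len(entries):                    # step 5: reached old entries
            break
        url = next_url                                   # step 6
    return new_entries


# Hypothetical three-page feed history, newest page first.
pages = {
    "p1": ([("e5", "2005-06-17"), ("e4", "2005-06-16")], "p2"),
    "p2": ([("e3", "2005-06-15"), ("e2", "2005-06-14")], "p3"),
    "p3": ([("e1", "2005-06-13")], None),
}
requested = []


def fetch_page(url):
    """Record the request, then return the stored page."""
    requested.append(url)
    return pages[url]


print(catch_up("p1", fetch_page, last_seen="2005-06-14"))
print(requested)
```

In this run the client stops after the second page, which is the point of step 5: the number of requests grows with how long the client was offline, not with the length of the feed's history.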
Polling Sucks! (was RE: Atom feed synchronization)
Henry Story wrote:
> The best solution is just to add a link type to the atom syntax:
> a link to the previous feed document that points to the next bunch of
> entries. IE. do what web sites do. If you can't find your answer on
> the first page, go look at the next page.
> How do you know when to stop? If the pages are ordered
> chronologically, the client will know to stop when he has come to a page
> with entries with update times before the date he last looked.

This is *not* simpler than taking a push feed using Atom over XMPP. For a push feed, all you do is:
1. Open a socket
2. Send a "login" XML Stanza
3. Process the stanzas as they arrive.

For your solution, you need to:
1. Poll the feed to get a pointer to the "first link". (each poll will cost you a TCP/IP connection).
2. If you got a new "first link" then go to step 5
3. Wait some period of time (the polling interval)
4. GoTo Step 1
5. Open a new TCP/IP socket to get the next link
6. Form and send an HTTP request for the next entry
7. Catch the response from the server
8. Parse the response to determine if its time stamp is something you've already seen.
9. If you haven't seen the current entry before, then go to step 5
10. Go to step 1 to start over.

(Note: I've eliminated and compressed a few steps to avoid more typing... An actual implementation would be more complex than I describe above.)

Your solution is more complex and generates much more network traffic (i.e. because of polling the feed, repeatedly opening new TCP/IP connections with all the traditional "slow start" overhead, and requesting each "next link"). Additionally, you end up with increased latency since the age of any entry you discover will be, on average, half your polling interval plus some latency introduced by link following. (Yes, you could rely on continuous connections and thus remove the overhead of creating so many TCP/IP connections, however, at that point, you might as well have a continuous push socket open...)

The push solution conserves network bandwidth, delivers data with much less latency and is simpler to implement.

Polling sucks! (that was a pun...)

bob wyman
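
The latency claim in this message (an entry found by polling is, on average, about half a polling interval old when discovered) can be checked with a small calculation. The sketch assumes entries are published at evenly spread times and ignores the extra delay from following links; the function name, the 60-second interval and the arrival pattern are all made up for the illustration.

```python
def mean_discovery_delay(arrival_times, poll_interval):
    """Average wait between an entry's publication and the next poll.

    Polls are assumed to happen at multiples of poll_interval. An entry
    published exactly at a poll instant is counted as missing that poll.
    """
    delays = []
    for published in arrival_times:
        next_poll = (published // poll_interval + 1) * poll_interval
        delays.append(next_poll - published)
    return sum(delays) / len(delays)


# Assumption: one entry every second, offset by half a second, for ten minutes.
arrivals = [second + 0.5 for second in range(600)]

print(mean_discovery_delay(arrivals, poll_interval=60))
```

With a push channel the corresponding delay is only the delivery time, which is why latency goes to XMPP in the comparisons elsewhere in this thread.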