On Fri Apr 30 18:28:35 2010, Bob Wyman wrote:
Dave Cridland <[email protected]> wrote:
> you rely on the clients maintaining state.
It would be nice if we could, in fact, rely on clients to maintain
state and to do so without error (this is hard to do in XMPP since
there is no guarantee of message delivery...). However, while relying
on stateful clients is an attractive idea, it is likely that in many
cases we simply can't make this assumption. Beyond the limitations of
the XMPP protocol itself (i.e. no delivery guarantee), we also need to
recognize that we're seeing significant growth in the use of clients
that don't have a great deal of long-term storage capacity -- i.e.
mobile phones, tablets, etc. Thus, in any case, "last mile" delivery
of messages will need to support delivery of all data that might be
needed by the client in order to do whatever it is that it does with
new messages. (Yes, we can reduce the clients to simple display
devices for state maintained on intermediary servers, but this is
not, I think, ideal.)
OK, I have to point out that until I replaced it, the average
smartphone actually had more storage than my only *somewhat* aged and
cheap laptop - I bought it for IETF-63 in Paris, back in 2005. My
n800 - a 2007 tablet device - easily ranged up to 16G storage when it
was launched. Really, I don't buy the argument that these things have
significantly limited storage for these purposes.
Moreover, there's no need to rely on error-free, complete, storage,
as long as clients are able to recover from failure, and if such
protocols are well-designed, they'll form a generalized capability
for clients to acquire sync with new feeds efficiently. This is a
very well-understood problem, after all - Mark Crispin was
synchronizing message feeds without transferring all the data some
thirty years ago, on diskless devices and over 2600 baud.
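To make the "recover from failure" point concrete, here is a minimal Python sketch of incremental sync. It is only loosely in the spirit of what mail clients do, and it is not IMAP or any XMPP extension: the `Feed` and `Client` classes, their method names, and the integer item ids are all assumptions made up for illustration.

```python
class Feed:
    """Toy server-side feed with monotonically increasing item ids."""

    def __init__(self):
        self.items = []          # list of (item_id, payload)

    def publish(self, payload):
        self.items.append((len(self.items) + 1, payload))

    def since(self, last_id):
        # Only items the client has not yet seen are transferred.
        return [(i, p) for i, p in self.items if i > last_id]


class Client:
    def __init__(self):
        self.last_id = 0         # the only state needed to resume a sync
        self.seen = {}

    def sync(self, feed):
        fresh = feed.since(self.last_id)
        for item_id, payload in fresh:
            self.seen[item_id] = payload
            self.last_id = item_id
        return len(fresh)        # number of items transferred

    def lose_state(self):
        # Simulate lost storage: the next sync falls back to a full fetch.
        self.last_id = 0
        self.seen.clear()


feed = Feed()
for text in ["one", "two", "three"]:
    feed.publish(text)

client = Client()
print(client.sync(feed))   # 3
feed.publish("four")
print(client.sync(feed))   # 1
client.lose_state()
print(client.sync(feed))   # 4
```

Losing state costs one full transfer and nothing else, so the client's storage only has to be good enough to make that rare.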
Finally, XMPP reliability (or otherwise) is also a well-understood
and examined problem, which is why we're seeing the beginnings of
XEP-0198 deployment, which does indeed provide reliability and
stanza-level retransmission. Yet even without this, people have been
content to use XMPP for really quite serious and critical
communications anyway, so I'm deeply unconvinced that this is a
practical problem for the vast majority at this point in time.
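As a rough illustration of the XEP-0198 idea: the sender keeps every stanza until the peer acknowledges it with a count of stanzas handled, and resends whatever is left after a resumed session. The Python sketch below models only that bookkeeping. The `AckQueue` class and its methods are hypothetical names, not any real library's API, and it ignores the wire format and the counter wrap-around the XEP defines.

```python
from collections import deque


class AckQueue:
    """Sender-side bookkeeping in the style of XEP-0198 (illustrative only)."""

    def __init__(self):
        self.unacked = deque()   # stanzas sent but not yet acknowledged
        self.acked_count = 0     # last handled count ("h") reported by the peer

    def send(self, stanza):
        # Keep a copy until the peer confirms it has handled the stanza.
        self.unacked.append(stanza)

    def on_ack(self, h):
        # The peer reports the total number of stanzas it has handled so far.
        newly_acked = h - self.acked_count
        if newly_acked < 0 or newly_acked > len(self.unacked):
            raise ValueError("ack count out of range")
        for _ in range(newly_acked):
            self.unacked.popleft()
        self.acked_count = h

    def to_retransmit(self):
        # After a resumed session, everything still queued is sent again.
        return list(self.unacked)


q = AckQueue()
for s in ["msg1", "msg2", "msg3"]:
    q.send(s)
q.on_ack(2)                 # peer has handled msg1 and msg2
print(q.to_retransmit())    # ['msg3']
```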
And all this really doesn't help explain why you want to deliver data
that's not only redundant, but utterly ignored by the client. We're
not talking about small amounts, here - we're talking about messages
taking up 4k, and that becomes really significant over the mobile
devices you're talking about, since it's dramatically over the MTU
for each and every message. I have a strong suspicion - albeit not
one backed up by data - that this will increase battery usage on
mobile devices.
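For a sense of scale, here is the back-of-envelope arithmetic in Python. The figures are assumptions, not measurements: a 1500-byte MTU and 40 bytes of IPv4 plus TCP headers with no options. Real mobile links may differ, and TLS or stream compression would change the payload size.

```python
import math

MTU = 1500            # assumed link MTU in bytes (typical Ethernet value)
HEADERS = 40          # assumed IPv4 (20) + TCP (20) header bytes, no options
MESSAGE = 4 * 1024    # the "4k" message size from the discussion


def segments_needed(payload, mtu=MTU, headers=HEADERS):
    """Number of TCP segments needed to carry `payload` bytes."""
    return math.ceil(payload / (mtu - headers))


print(segments_needed(MESSAGE))   # 3
print(segments_needed(1000))      # 1
```

Under these assumptions a 4k message needs three segments where a trimmed one would fit in a single segment.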
A reliance on clients' maintaining state would also seem to assume
that a reasonably high percentage of the traffic shares
message-independent "static" information with messages received
earlier and thus that cache-hit rates are reasonably high. Client
maintenance of state is most useful when all messages have the same
originator. It is least useful when every message has a unique sender.
Today, most applications relevant to this discussion only support
"topic-based" publish/subscribe. Thus, they implement what we tend to
call "follow" -- messages will be received from some whitelisted set
of publishers. However, in the future, I'm fairly confident that
we'll see an increase in the number of systems that support
"content-based" publish and subscribe. Thus, we'll see messages being
delivered because of their content, not simply because of their
author. This sort of thing will be very much like the "Track"
function that originally influenced, in part, Twitter's adoption of
"Atom over XMPP". In the "Track" use case (when you might subscribe
to all messages containing the keyword "XMPP"), you'll often get
messages from senders that you've never seen before or will never see
again. Thus, you'll often find that cache hit rates are lower than
you'd like even though you may dedicate a great deal of resource to
maintaining that cache.
All of which is true, yet the Atom feed doesn't contain the avatar
image, which is likely to be the only thing the device will care
about - it is only included by reference.
The image itself currently has to be fetched over HTTP. This is
quite sensible - persistent and one-time URLs allow this data to be
fetched once only, and the usage of a distinct domain allows for a
more efficient distribution architecture.
So this kind of fetching on demand and caching is actually going to
be happening anyway, for one of the two pieces of data the client's
likely to be displaying to the user - the other being the message
itself, of course.
That aside, in a subject- or content-based system, you're still
likely to end up seeing a lot of self-similarity between the
authorships of messages, if only because people often do go through
bursts of talking about a particular topic, and moreover talk in
relatively compact sets of participants. We humans call this a
conversation.
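The difference between those two traffic shapes is easy to put numbers on. The Python sketch below measures the hit rate of a small LRU cache of author data against two made-up streams: a three-way conversation, and a run of senders never seen before. The author names, the cache capacity, and the `hit_rate` helper are all illustrative assumptions, not data from any real feed.

```python
from collections import OrderedDict


def hit_rate(authors, capacity):
    """Fraction of messages whose author is already in an LRU cache."""
    cache = OrderedDict()
    hits = 0
    for author in authors:
        if author in cache:
            hits += 1
            cache.move_to_end(author)      # mark as most recently used
        else:
            cache[author] = True           # simulate fetching author data
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(authors)


# A "conversation": three participants taking turns over twelve messages.
conversation = ["anna", "ben", "cara"] * 4
# The worst case: every message comes from a sender never seen before.
strangers = [f"user{i}" for i in range(12)]

print(hit_rate(conversation, capacity=8))  # 0.75
print(hit_rate(strangers, capacity=8))     # 0.0
```

Real traffic will sit somewhere between the two, which is why the hit rate is worth measuring rather than assuming.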
So, we see that, at least, limitations in the XMPP protocol, resource
limitations on the clients, and a move towards cache-inefficient
content-based routing all tend to argue against an assumption that we
can rely on clients to maintain state...
You might, but "we" don't see that at all. Maybe it's because I'm
more focussed on practice than theory.
You've failed to show that the reliability issue in XMPP is
significant, or insurmountable.
You've utterly failed to convince me that a mobile device is as puny
as you claim.
And while I agree that content-based routing will be *less*
cache-efficient, it seems to me that the same strategies will simply
be less efficient, rather than, as you imply, a net loss.
Dave.
--
Dave Cridland - mailto:[email protected] - xmpp:[email protected]
- acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
- http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade