Synchronization and offline clients

James M Snell Fri, 07 Jul 2006 15:13:34 -0700

The importance of supporting offline clients is becoming more and more
obvious.  However, there are serious deficiencies in the current APP
collection listing model that make it difficult, if not impossible, for
an offline client to reliably synchronize with a server.  We see some of
these deficiencies in the recent re-discussion of app:modified and in
the discussion that surrounded Mark Nottingham's Feed History
specification.  That said, I think the current draft safely meets an
80-20 mark with respect to what a lot of clients are going to want.
Therefore, I am currently working on an extension spec that describes
an offline synchronization model for APP.  I should have an initial
draft ready in a few days, but first I would like to open a discussion
about the synchronization model.


The model would work essentially by associating a sync URI with a
collection in addition to its regular Feed URI.  When I do an
unparameterized HTTP GET on the sync URI, I will receive an Atom Feed
representing the *complete* set of members in the collection.  The feed
would not be paged and would have an ETag and Last-Modified associated
with it.  And yes, I'm aware that the response could, potentially, be
massive.  Entries in the feed would contain a modified element that
would identify the time when the entry was last modified in any way.
Every entry in the feed would represent exactly one member in the
collection.

The sync URI could be parameterized to limit the set of entries.  The
parameters would be a date range reminiscent of the old list templates
mechanism.

An HTTP GET to the URI,

  http://example.org/collection/sync?start=2006-06

would return an Atom feed with just the entries that have been modified
since the timestamp specified.  Similarly, an HTTP GET to

  http://example.org/collection/sync?start=2006-06&end=2006-07

would return an Atom feed with just the entries that were modified
within the time range specified.
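
To make the parameterization concrete, here is a minimal Python sketch
of how a client might build these URIs.  The helper name sync_url is
hypothetical, and the exact syntax of the "start" and "end" values is
still open, so the strings are passed through unchanged.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sync_url(sync_uri, start=None, end=None):
    """Return sync_uri with optional start/end query parameters added.

    Hypothetical helper: the parameter names come from the examples in
    this note; the value syntax is left to the caller.
    """
    parts = urlsplit(sync_uri)
    # Preserve any query parameters the sync URI already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    if start is not None:
        query.append(("start", start))
    if end is not None:
        query.append(("end", end))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    base = "http://example.org/collection/sync"
    print(sync_url(base))                                  # full sync
    print(sync_url(base, start="2006-06"))                 # modified since
    print(sync_url(base, start="2006-06", end="2006-07"))  # date range
```

Calling it with no parameters returns the sync URI untouched, which
corresponds to the unparameterized GET for the complete collection.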

Ongoing synchronization would work as follows:

1. The first time I sync with the server, I perform an unparameterized
GET on the sync URI to retrieve everything in the collection.  I cache
the result, making note of the ETag and Last-Modified header values, as
well as the Date header in the HTTP response.

2. The next time I sync with the server, I could choose between two
options:
  a. Complete resync.  I would perform a conditional GET on the sync
     URI using the ETag and Last-Modified.  If anything has changed,
     the response would contain the complete set of entries for the
     entire collection; otherwise I would get a 304 Not Modified.
  b. Partial resync. I would perform a parameterized GET on the sync URI
     passing in the Date of the previous sync response as the value of
     the "start" query string parameter, returning only the entries that
     have been modified since the moment I last sync'd. (* or I could
     use feed delta encoding... keep reading for discussion)
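
The steps above can be sketched roughly as follows.  This is only an
illustration of the client-side bookkeeping, not a proposed API: the
names SyncState and plan_sync_request are hypothetical, the sync URI is
assumed to carry no query string of its own, and I am assuming the
"start" value would be a UTC timestamp derived from the HTTP Date
header.  No network I/O is performed; the function only decides which
URI and headers to send.

```python
from dataclasses import dataclass
from datetime import timezone
from email.utils import parsedate_to_datetime
from urllib.parse import urlencode


@dataclass
class SyncState:
    """Header values remembered from the previous sync response (step 1)."""
    etag: str
    last_modified: str
    date: str  # value of the HTTP Date header


def http_date_to_timestamp(http_date):
    """Convert an HTTP-date to a UTC timestamp (assumed 'start' syntax)."""
    dt = parsedate_to_datetime(http_date)
    if dt.tzinfo is None:
        # Treat a zone-less date as UTC rather than local time.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def plan_sync_request(sync_uri, state=None, partial=False):
    """Return (uri, headers) for the next sync request."""
    if state is None:
        # Step 1: first sync, unparameterized GET for everything.
        return sync_uri, {}
    if partial:
        # Step 2b: only entries modified since the previous sync.
        start = http_date_to_timestamp(state.date)
        return sync_uri + "?" + urlencode({"start": start}), {}
    # Step 2a: conditional GET for the complete collection.
    headers = {
        "If-None-Match": state.etag,
        "If-Modified-Since": state.last_modified,
    }
    return sync_uri, headers
```

A client would pass the returned URI and headers to whatever HTTP
library it uses, then store a fresh SyncState from the response.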

The synchronization spec would introduce two format extensions.

1. x:modified - Atom Date Construct indicating the timestamp of the last
                modification of any kind.
2. x:deleted-entry - A tombstone element that would replace any entry
                     that has been removed from the feed.


  GET /collection/sync?start=2006-06-01 HTTP/1.1
    ...

  HTTP/1.1 200 OK
  Date: ...
  Content-Type: application/atom+xml
  Content-Length: nnnn

  <?xml version="1.0"?>
  <feed xmlns="http://www.w3.org/2005/Atom">
    ...
    <x:deleted-entry>
      <id>tag:example.org,2006:/2</id>
      <x:modified>2006-06-05T12:12:12Z</x:modified>
    </x:deleted-entry>
    <entry>
      <id>tag:example.org,2006:/1</id>
      <x:modified>2006-06-04T12:12:12Z</x:modified>
      ...
    </entry>
    ...
  </feed>
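
As a rough sketch of how a client might apply such a response to its
local copy, consider the Python below.  The namespace URI bound to the
"x" prefix is purely a placeholder (no namespace has been chosen yet),
apply_sync_feed is a hypothetical name, and the local store is reduced
to a dict mapping entry id to last-modified timestamp.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
SYNC_NS = "http://example.org/ns/sync"  # placeholder, not a real namespace


def apply_sync_feed(feed_xml, store):
    """Apply entries and tombstones from a sync feed to a local store.

    store maps entry id -> x:modified timestamp (a stand-in for a real
    local cache of full entries).
    """
    root = ET.fromstring(feed_xml)
    for child in root:
        entry_id = child.findtext("{%s}id" % ATOM_NS)
        if child.tag == "{%s}deleted-entry" % SYNC_NS:
            # Tombstone: the member was removed on the server.
            store.pop(entry_id, None)
        elif child.tag == "{%s}entry" % ATOM_NS:
            # New or updated member: record its modification time.
            store[entry_id] = child.findtext("{%s}modified" % SYNC_NS)
    return store


SAMPLE = """\
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:x="http://example.org/ns/sync">
  <x:deleted-entry>
    <id>tag:example.org,2006:/2</id>
    <x:modified>2006-06-05T12:12:12Z</x:modified>
  </x:deleted-entry>
  <entry>
    <id>tag:example.org,2006:/1</id>
    <x:modified>2006-06-04T12:12:12Z</x:modified>
  </entry>
</feed>
"""

if __name__ == "__main__":
    local = {"tag:example.org,2006:/2": "2006-05-01T00:00:00Z"}
    print(apply_sync_feed(SAMPLE, local))
```

In this sketch the tombstone simply removes the local copy; a real
client would also need a policy for entries it has edited offline.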

Note that in order to avoid concurrent update and sliding window issues,
synchronization feeds MUST NOT be paged.

As an alternative to date-based filtering of feeds using query strings,
I am also considering creating a standardizable definition of feed delta
encoding (Bob Wyman has often talked about creating an I-D for delta
encoding but no spec has actually emerged).

So, for instance, for the partial resync (option 2b above), rather than
passing in a start querystring parameter, I would take the Last-Modified
date of the previous sync and set A-IM: feed to indicate to the server
that, if the collection has been modified in any way, return only the
deltas.
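
For illustration, the delta-encoding variant might look like this on
the client side.  This is a sketch under assumptions: I am assuming the
Last-Modified value is sent back as If-Modified-Since, that "feed" is
the instance-manipulation token, and that the server signals a delta
with 226 IM Used as in RFC 3229.  The function names are hypothetical.

```python
from http import HTTPStatus


def delta_request_headers(last_modified):
    """Headers asking the server for only the changes since last_modified."""
    return {
        "If-Modified-Since": last_modified,
        "A-IM": "feed",  # assumed instance-manipulation token
    }


def classify_response(status):
    """Interpret the status code of a delta-encoded sync request."""
    if status == HTTPStatus.IM_USED:        # 226: body holds only the deltas
        return "delta"
    if status == HTTPStatus.NOT_MODIFIED:   # 304: nothing has changed
        return "unchanged"
    if status == HTTPStatus.OK:             # 200: server ignored A-IM
        return "full"
    return "error"
```

A server that does not understand A-IM would simply answer 200 with the
full feed, so the client still ends up with a correct view.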

I'm happy with either approach.

The sync feed would be linked to by the regular collection feed using a
"sync" link relation.

  <link rel="sync" href="collection/sync" type="application/atom+xml" />
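
A client could discover the sync URI along these lines.  This is a
minimal sketch: find_sync_uri is a hypothetical name, the relative href
is resolved against the URI the feed was fetched from, and xml:base is
ignored for brevity.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"


def find_sync_uri(feed_xml, feed_url):
    """Return the absolute sync URI advertised by a collection feed.

    Returns None when the feed carries no rel="sync" link.
    """
    root = ET.fromstring(feed_xml)
    for link in root.findall(ATOM + "link"):
        if link.get("rel") == "sync":
            # Resolve a relative href against the feed's own URI.
            return urljoin(feed_url, link.get("href", ""))
    return None


FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="sync" href="collection/sync" type="application/atom+xml" />
</feed>
"""

if __name__ == "__main__":
    print(find_sync_uri(FEED, "http://example.org/collection"))
```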

To wrap up I want to emphasize that I am NOT proposing that this be
discussed in the APP spec.  What I want is a separate draft.

Thoughts? Gripes?

- James
