Re: Improving the latency of Sync

Ryan Kelly Mon, 16 May 2016 19:00:40 -0700

On 17/05/2016 03:54, Edouard Oger wrote:

Okay I'll try to list the solutions that we have here in order to send a
tab to a device.


* We can either keep the new tab record in the clients collection and
improve the latency of it: (1)

S1: After syncing the collection, the client sends a request R to the
FxA server to notify other devices with a push notification. (see
http://i.imgur.com/RvhVJKK.png)
S2: Similar to S1, except that the request R is made by the Sync Server
after a PUT on /collections
S3: Similar to S1, except that the Sync Server asks the FxA server for
the device list and sends the push notifications itself.
S4: Similar to S1, except that the client gets a device list from FxA
and sends the notifications itself.

* Or make a push-only version (needs push TTL > 0): (2)
S5: The client sends a request to FxA which makes a push notification
(see http://i.imgur.com/Z7XSnKH.png)
S6: The client gets a device list from FxA and sends the notifications
itself.
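As a rough illustration of how S1 and S5 differ, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the `FxAServer.notify` method, the payload shapes, and the `displayURI` command name are hypothetical stand-ins, not the real FxA or Sync APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A device registered on the account (hypothetical model)."""
    device_id: str
    inbox: list = field(default_factory=list)  # push messages received


class FxAServer:
    """Stand-in for the FxA server; it only relays push notifications."""

    def __init__(self, devices):
        self.devices = {d.device_id: d for d in devices}

    def notify(self, sender_id, payload, to=None):
        # Hypothetical handler for a request like R: push `payload` to the
        # chosen devices, or to every device except the sender.
        targets = to if to is not None else [
            d for d in self.devices if d != sender_id
        ]
        for device_id in targets:
            self.devices[device_id].inbox.append(payload)
        return len(targets)


class SyncServer:
    """Stand-in for the Sync storage server: collection -> list of records."""

    def __init__(self):
        self.collections = {}

    def put(self, collection, record):
        self.collections.setdefault(collection, []).append(record)


def send_tab_s1(sync, fxa, sender_id, target_id, url):
    # S1: write the command to the clients collection, then ask FxA to
    # wake the target so that it syncs and reads the record.
    sync.put("clients", {"to": target_id, "command": "displayURI", "uri": url})
    return fxa.notify(sender_id, {"collections": ["clients"]}, to=[target_id])


def send_tab_s5(fxa, sender_id, target_id, url):
    # S5: push-only. The command travels in the push payload and nothing
    # is written to Sync, which is why a push TTL > 0 is needed.
    payload = {"command": "displayURI", "uri": url}
    return fxa.notify(sender_id, payload, to=[target_id])
```

In S1 the push message only says "go and sync", so the record in the clients collection remains the source of truth. In S5 the push message is the only copy of the command.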

I would lean towards S1 as the simplest path forward, followed by S5 if
we want to do something specific to send-tab-to-device.

Edouard, if you'd like to move forward with either of those options, I
invite you to pitch a concrete API that we should add to fxa-auth-server
to support it, along the lines of the "POST /v1/account/devices/notify"
suggestion from earlier in the thread.



  Cheers,

    Ryan

On Tue, May 10, 2016, at 07:55 PM, Mark Hammond wrote:

On 10/05/2016 10:35 AM, Richard Newman wrote:

I think it's worth separating this into at least two different
problems.


Sync — a shared whiteboard object store — is currently (ab)used to
achieve two and a half different things:

1. Real shared data like passwords. All clients read and write the
same records.

2. One-way data like open tabs. One client writes their record, and
all other clients read it.

3. Kinda-one way command channel data via the clients record: all
clients sometimes write any record, but one client owns it. Sending a
tab is a particular kind of command.

Implementing one-shot one-recipient commands via the Sync clients
engine is really, really horrible. The client code is complicated, it
doesn't scale, and timeliness is only one of its problems.


Making (3) more timely via push notifications is feasible, because by
definition only one client needs to be pinged. As a Send Tab user, I
would find it valuable. But it doesn't get us any closer to a good
command channel, and it's a special case solution that doesn't address
(2) or (1). So we should be clear that this is continuing to invest in
something that we really want to replace.


I think we need to be realistic here - there are no plans to replace the
sync store in general for (1) - maybe that's a conversation we should
have, but it would take an eternal optimist to suggest it would be ready
in a reasonable time frame and it should block other improvements.

So I don't see any reason not to improve (3) - which I assume is the
"half" in your "two and a half different things"?

Making (2) more timely is very feasible and potentially valuable,
because there is less chance of write contention — only one client
writes each record. But see the extensive cost worries in Bug 1222594
<https://bugzilla.mozilla.org/show_bug.cgi?id=1222594>: the design of
the Sync tabs format is deeply flawed, and adding push to the mix only
makes those flaws more apparent.


Note that Edouard is really concerned here with "send tab to device"
rather than the list of Synced tabs. This is a different problem from the
one we have with the tab list, as explained in bug 1222594 - in that bug we
haven't yet *written* the new tab list at the time we need it. Send Tab
to Device is different - the record has been written and we are trying
to reduce the latency of the record being *read* by the recipient.

(Obviously I'm not suggesting that we shouldn't work out how to fix (2)
though)

Making (1) — the other collections — more timely is potentially very
valuable, but simply pinging clients is going to result in race
problems, because many of Sync's bugs are provoked when more than one
client syncs at a time. You might be able to work around this by being
very careful to define what each push notification means, but pretty
soon you'll be knee-deep in distributed voting protocols…


I've a couple of observations about this:

1) As you say, we already have this issue, and given the number of
syncers, we will be hitting it regularly. I agree we don't want to make
it worse, but there's already work under way that should make this
better (eg, batched uploads). It's unreasonable for us to say "we can't
possibly sync more often as it will trigger existing sync bugs" - we
need to work out how to mitigate those existing bugs.

2) A push notification that says "another device just finished writing
records" need not trigger these existing bugs anyway, especially if the
clients responded by only processing incoming records, leaving the
processing of outgoing records to the same schedule as now. That alone
would solve the "send tab to device" latency issue.
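A minimal sketch of that incoming-only idea follows. The `Engine` class and both function names are hypothetical and do not mirror the real Sync client code; the server is modelled as a plain dict mapping collection names to records.

```python
class Engine:
    """Minimal stand-in for a Sync engine with separate in/out phases."""

    def __init__(self, name):
        self.name = name
        self.applied = []   # incoming records applied locally
        self.pending = []   # local changes waiting for upload
        self.uploaded = []  # records written to the server

    def process_incoming(self, server_records):
        new = [r for r in server_records if r not in self.applied]
        self.applied.extend(new)
        return len(new)

    def process_outgoing(self):
        self.uploaded.extend(self.pending)
        self.pending.clear()


def on_push_notification(engines, server):
    # Push-triggered path: download only. Outgoing records stay queued
    # until the next regularly scheduled sync.
    return {e.name: e.process_incoming(server.get(e.name, [])) for e in engines}


def scheduled_sync(engines, server):
    # Regular path, unchanged: download, then upload.
    for e in engines:
        e.process_incoming(server.get(e.name, []))
        e.process_outgoing()
```

Because the push-triggered path never writes, two devices woken at the same moment cannot race each other on upload in this model.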

When sending a tab they would /directly/ push the command, not
involving Sync at all, unless the other device wasn't push-aware.
This bootstraps us out of the existing Sync object formats, so we
could build a strictly better system.


I think this is a great idea, and IIUC would work the same even if FxA
was the mediator as Ryan suggested. I was going to suggest that we try
and come up with a scheme where the push message is the canonical
conduit for these messages and the clients engine a fallback that can be
removed once proven - but it seems our push story isn't yet capable of
doing this sanely anywhere other than desktop <-> desktop - which is far
less interesting than desktop <-> mobile IMO.

* On Android, the Java code rather than Gecko would probably need to be
responsible for seeing the push notifications for these "commands" - but
that would probably require us to implement aes-gcm-128 in that Java
code to decrypt the payload (Fennec already has that in Gecko, but that
probably doesn't help us here). Generically implementing commands
without a payload seems tricky - but even that would seem to require
non-trivial Fennec changes IIUC.

* For iOS the story is similar - possibly even worse, I'm not sure - but
they don't even have a gecko fallback.

(Kit did mention that it might be possible to use our own
content-encoding scheme and use native Sync encryption, but that's still
quite an unknown and might not be worthwhile - but worth keeping in mind)

Ryan's proposal (POST /v1/account/devices/notify HTTP/1.1) neatly steps
around this to some degree - it means the devices don't need push
support to *send* the notification, only to receive them - which is
still an incremental improvement.
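The "push first, clients engine as fallback" scheme discussed above could be sketched roughly as below. The `push_capable` flag, the `notify` callable, and the return labels are all hypothetical; `notify` stands in for a server-mediated call such as the proposed request to FxA.

```python
def send_command(command, target, notify, clients_queue):
    """Deliver `command` to `target`, preferring push.

    `target` is a dict with a hypothetical "push_capable" flag. `notify`
    stands in for a server-mediated call and returns True when the push
    was accepted.
    """
    if target.get("push_capable") and notify(target["id"], command):
        return "push"
    # Fallback: queue the command for the clients collection, where the
    # target will find it on its next regular sync.
    clients_queue.append((target["id"], command))
    return "clients-engine"
```

Note that the sender needs no push support of its own in this sketch; only the receiving device's capability decides which path is taken.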

That client-driven solution also generalizes itself to addressing (2)
("hey laptop, upload your open tabs!" "hey phone, I uploaded my
tabs!") and is no worse at solving (1) than any other approach.

Agreed - although I don't see why we can't do the "hey phone, I uploaded
my tabs!" 1/2 of that first. That would also solve some "synced tabs"
latency on desktop - now we force a sync as the UI is being shown, but
the user still needs to wait to see the current list - it would almost
certainly already be up-to-date in this scheme.

Aside: A full solution to this might look more like "hey phone, here are
my tabs, in this very message", but that still doesn't solve the problem
outlined in bug 1222594 - the laptop went to sleep before uploading the
new list of tabs; the phone can ask for the tabs all it wants but they
aren't going to arrive.

Ryan:

(As Richard points out, you may not *want* all the clients to sync all
their data in a push-driven way, because it might trigger all sorts of
edge-cases and data loss.  If you restricted to syncing the clients
collection it would probably be fine.)


As above, I don't see why we couldn't just do incoming records when we
see that notification (and that notification would only be sent when a
device uploads). Is there a reason I'm missing?

(If the notification was sent only on upload, one tweak I'd suggest to
Ryan's idea is that the list of engines that wrote records be part of
the message)
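To make that tweak concrete, here is a small sketch of a notification that names the engines that wrote records, and of a receiver that syncs only those. The JSON shape and the field names `version` and `collections` are assumptions for illustration, not an agreed message format.

```python
import json


def build_notification(engines_written):
    # Hypothetical message format: a JSON object listing the collections
    # that the uploading device just wrote to.
    return json.dumps(
        {"version": 1, "collections": sorted(set(engines_written))}
    )


def engines_to_sync(message, local_engines):
    """Return the locally enabled engines named in the notification."""
    changed = json.loads(message).get("collections", [])
    return [name for name in local_engines if name in changed]
```

A receiver that only has the clients and tabs engines enabled would then ignore a notification that mentions only bookmarks.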

Cheers,

Mark

_______________________________________________
Sync-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/sync-dev
