Re: [alto] Incremental updates transport using SSE?

Wendy Roome Mon, 20 Oct 2014 10:14:12 -0700

Xiao,

#1: That case is certainly possible. So we should say that when a server
updates a network map, the server SHOULD send the update on all SSE streams
at the same time as that version becomes available via the network map
resource and the dependent cost map resources. If a client gets network map
and cost map updates via two different SSE streams, it is possible the
client will receive the cost map updates before the network map updates. The
client can detect this by the tag mismatch. If this happens, the client MUST
buffer the cost map updates and apply them after receiving the network map
update. If the corresponding network map update does not arrive in a timely
fashion (whatever that means), the client MUST close both streams, discard
both maps, and reestablish both streams. That may not be the most efficient way
to resynchronize, but it is the simplest and most reliable.
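
To make the buffering rule in #1 concrete, here is a minimal Python sketch of
the client side. It is only an illustration under assumptions: the MapCache
class, its method names, and the 30-second timeout are hypothetical, and it
assumes every cost map update carries the network map tag it depends on.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MapCache:
    """Hypothetical client state: one network map plus one dependent cost map."""
    timeout: float = 30.0                  # assumed meaning of "timely", in seconds
    network_tag: Optional[str] = None      # tag of the network map we hold
    # Updates applied so far, in order (a real client would merge them into the map).
    applied: List[str] = field(default_factory=list)
    # Buffered cost map updates: (dependent tag, update, arrival time).
    pending: List[Tuple[str, str, float]] = field(default_factory=list)

    def on_network_map(self, tag: str) -> None:
        """Install a new network map, then apply buffered updates that match it."""
        self.network_tag = tag
        waiting = []
        for dep_tag, update, arrived in self.pending:
            if dep_tag == tag:
                self.applied.append(update)
            else:
                waiting.append((dep_tag, update, arrived))
        self.pending = waiting

    def on_cost_map(self, dep_tag: str, update: str, now: float) -> None:
        """Apply the update if its tag matches; otherwise buffer it."""
        if dep_tag == self.network_tag:
            self.applied.append(update)
        else:
            self.pending.append((dep_tag, update, now))

    def needs_resync(self, now: float) -> bool:
        """True if any buffered update has waited longer than the timeout."""
        return any(now - arrived > self.timeout for _, _, arrived in self.pending)
```

When needs_resync() returns True, the client would close both streams,
discard both maps, and reconnect, as described above.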


Of course, avoiding bizarre cases like that is why a server should normally
use the same stream for network map and cost map updates. :-)

#2: To handle that issue, we should say that if several different update
stream resources offer the same update event type, then events of that type
must be identical on all streams (modulo transmission delays). A client may
use whichever stream it prefers. In particular, a client may pick the
stream that offers the desired combination of update events.
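
For #2, a client could pick a stream by comparing the events it wants against
the "events" capability of each update stream in the IRD. A rough Python
sketch, assuming the draft IRD layout quoted later in this thread (the
function name streams_offering is made up for illustration):

```python
def streams_offering(ird, wanted):
    """Return ids of update stream resources whose events cover `wanted`.

    `ird` maps resource ids to IRD entries, and `wanted` is a list of
    {"media-type": ..., "resource-id": ...} objects, both shaped like the
    draft examples in this thread (an assumption, not a final format).
    """
    want = {(e["media-type"], e["resource-id"]) for e in wanted}
    matches = []
    for res_id, entry in ird.items():
        if entry.get("media-type") != "text/event-stream":
            continue  # not an update stream resource
        events = entry.get("capabilities", {}).get("events", [])
        offered = {(e["media-type"], e["resource-id"]) for e in events}
        if want <= offered:  # the stream offers every event the client wants
            matches.append(res_id)
    return matches
```

Any id in the result is acceptable, since events of the same type must be
identical on all streams.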

- Wendy Roome 

From:  Xiao SHI <[email protected]>
Date:  Mon, October 20, 2014 at 11:22
To:  Wendy Roome <[email protected]>
Cc:  Xiao SHI <[email protected]>, "Y. Richard Yang" <[email protected]>, IETF
ALTO <[email protected]>
Subject:  Re: [alto] Incremental updates transport using SSE?

Hi Wendy,

#1: Yes, that will solve the problem I was thinking of.

A follow up on this: what if say the client gets a cost map which depends on
a version of the network map that the client has not received? Should the
client simply discard the cost map or its patch? or temporarily save the
cost map and wait? Or does this question enter the realm of client/server
implementation/optimization and is no longer the protocol's concern?

Anyhow, I agree that the resource-id should do.

#2: Sounds good. The reason I thought of this is that a network map could be
in multiple update streams. Should/could there be a preference for which
stream the client gets the network map updates from? Or could the client
simply find *any* update stream which contains the resource update it wants?

#3: That's exactly what I meant. Just double checking.

Thanks,
Xiao


On Mon, Oct 20, 2014 at 10:14 AM, Wendy Roome <[email protected]>
wrote:
> Xiao,
> 
> #1: I don't think that is a good idea. The tag is mutable -- it changes every
> update. I believe the event type should be immutable over updates. E.g., note
> that the IRD has resource ids, but not tags. I believe the scenario you
> described is easily avoidable: when the client gets a new network map event,
> the client updates the cached network map, including updating the tag, and
> automatically discards any cached cost map(s) that depended on the old tag.
> And in any case, before using a cost map, the client should always verify that
> the network map tag in the cost map matches the tag of the map used to lookup
> the address.
> 
> Or did I not understand your scenario properly?
> 
> #2: That seems to be an unnecessary optimization and complication. It's a
> complication because it adds an additional constraint (that uri must exist,
> the resource must be an event stream, etc). It's unnecessary because let's be
> realistic, an IRD will have at most 20 or 30 entries. Searching them for an
> update stream is not a challenge. Particularly if you remember a client does
> that once and then keeps the stream open for hours.
> 
> Also, if you do think that is necessary, the map resource should give the
> *resource id* of the associated update resource, not the uri. Only a
> resource's IRD entry should give the uri. Think of that as equivalent to
> normalizing a relational DB.
> 
> Or perhaps there could be a capability (or an attribute) that says this
> resource does have an update stream, so the client knows whether to search the
> IRD for it.
> 
> #3: I assumed that for a filtered update stream, the client would use a POST
> request to tell the server which events the client wanted, and the server
> would filter out (or just not send) the other events. Is that what you meant?
> 
> - Wendy Roome
> 
> From:  Xiao SHI <[email protected]>
> Date:  Fri, October 17, 2014 at 20:13
> 
> To:  Wendy Roome <[email protected]>
> Cc:  Xiao SHI <[email protected]>, "Y. Richard Yang" <[email protected]>, IETF
> ALTO <[email protected]>
> Subject:  Re: [alto] Incremental updates transport using SSE?
> 
> Hi Wendy,
> 
> I think this is a very good and efficient design. Just a few comments:
> 
> 1. In the event field of the SSE, you included the resource-id but not the tag
> along with it. I agree that the tag seems unnecessary because each event is
> essentially an update of the map, and should be assigned a new tag if the
> update were not incremental. However, consider the following scenario: if, for
> whatever reason, the client only subscribed to the network map incr updates
> (or the server only provided the network map stream), and the client is
> requesting cost maps from the server via regular mechanism (i.e. always
> getting a full new map which contains the dependent-vtag), could there be
> issues when the version of the network map is not synchronized with the
> dependent vtag that the client is about to use? For example, if I got the cost
> map which depends on the network map a few incremental updates ago, but I have
> already processed those updates, there could potentially be PID mismatches and
> what not, right?
> 
> Shall we simply go ahead and add the tag into the event field? As for cost
> maps, that would be the dependent-vtag. We can also put it in the id field.
> 
> 2. In the IRD, would it make sense to include the incremental update uri in
> the entries for regular resources? For example, a client would probably find
> it useful to know that the server is providing "my-default-network-map," and
> its incr update stream can be found in
> "http://example.com/network-map-updates."
> 
> So the IRD would look like this:
>   "network-map": {
>     ...,
>     "update-uri":"http://somewhere.com/network-map-updates"
>   },
>   "routingcost-map": {
>     ...,
>     "update-uri":"http://somewhere.com/network-map-updates"
>   },
>   "hopcount-map": {
>     ...,
>     "update-uri":"http://somewhere.com/network-map-updates"
>   },
>
>   "network-map-updates": {
>     "uri": "http://somewhere.com/network-map-updates",
>     "media-type": "text/event-stream",
>     "uses": ["network-map", "routingcost-map", "hopcount-map"],
>     "accepts": "application/alto-updatestreamfilter+json",
>     "capabilities": {
>       "events": [
>         {"media-type": "application/alto-networkmap+json",
>          "resource-id": "network-map"},
>         {"media-type": "application/alto-costmap+json",
>          "resource-id": "routingcost-map"},
>         {"media-type": "application/merge-patch+json",
>          "resource-id": "routingcost-map"},
>         {"media-type": "application/alto-costmap+json",
>          "resource-id": "hopcount-map"},
>         {"media-type": "application/merge-patch+json",
>          "resource-id": "hopcount-map"}
>       ]
>     }
>   }
> 
> This saves the trouble for the client to search through the "uses" field in
> each stream resource. Also, when there are multiple streams for incremental
> updates for one map (e.g. one stream for networkmap A + costmap B, one stream
> for networkmap A + costmap C and D), the server could load-balance by
> providing the default update-uri for networkmap A in its IRD entry.
> 
> 3. To reduce overall traffic, I assume for one particular event stream, the
> event filtering would be done on the server side, right?
> 
> What do you think?
> 
> Cheers,
> Xiao
> 
> On Fri, Oct 17, 2014 at 1:35 PM, Wendy Roome <[email protected]>
> wrote:
>> Xiao,
>> 
>> My first take was that update stream resources should be 1-1 with the
>> underlying full map resources. Then I realized that cost maps depend on
>> network maps, and clients & servers should coordinate changes to the two. The
>> easiest way to accomplish that was to use the same stream for both network
>> map and cost map updates, with the event type distinguishing network map
>> updates from cost map updates.
>> 
>> But your comment reminded me that network maps are 1-n with cost maps. Eg, a
>> network map can have two cost maps, one for routingcost and the other for
>> hopcount.
>> 
>> So I suggest that an update stream resource can provide updates for an
>> arbitrary set of resources. The server picks the set. Each update event has a
>> media-type (basically, full-map vs merge-patch), and the resource-id of the
>> map it updates. Because SSE only allows three fields (event, id and data), I
>> suggest we encode the media-type,resource-id pair in the event type, as a
>> JSON object with those field names, as in:
>> 
>>     event: {"media-type":"application/alto-networkmap+json","resource-id":"network-map"}
>> 
>> Here's a detailed example of an IRD with an update stream service that gives
>> updates for the network map and the associated routingcost and hopcount maps.
>> It offers merge-patch incremental updates for the cost maps, but only full
>> updates for network maps:
>> 
>>   "network-map": {...},
>>   "routingcost-map": {...},
>>   "hopcount-map": {...},
>>   
>>   "network-map-updates": {
>>     "uri": "http://somewhere.com/network-map-updates",
>>     "media-type": "text/event-stream",
>>     "uses": ["network-map", "routingcost-map", "hopcount-map"],
>>     "accepts": "application/alto-updatestreamfilter+json",
>>     "capabilities": {
>>       "events": [
>>         {"media-type": "application/alto-networkmap+json",
>>          "resource-id": "network-map"},
>>         {"media-type": "application/alto-costmap+json",
>>          "resource-id": "routingcost-map"},
>>         {"media-type": "application/merge-patch+json",
>>          "resource-id": "routingcost-map"},
>>         {"media-type": "application/alto-costmap+json",
>>          "resource-id": "hopcount-map"},
>>         {"media-type": "application/merge-patch+json",
>>          "resource-id": "hopcount-map"}
>>       ]
>>     }
>>   }
>> 
>> This is a POST request. The client sends
>> application/alto-updatestreamfilter+json data (a new MIME type) to select the
>> events it wishes to get. Here is how a client would request updates for the
>> network and routingcost maps, but not the hopcount map:
>> 
>>   POST /network-map-updates HTTP/1.1
>>   Host: somewhere.com
>>   Content-Length: ###
>>   Content-Type: application/alto-updatestreamfilter+json
>>   Accept: text/event-stream,application/alto-error+json
>>   
>>   {"events": [
>>      {"media-type": "application/alto-networkmap+json",
>>       "resource-id": "network-map"},
>>      {"media-type": "application/alto-costmap+json",
>>       "resource-id": "routingcost-map"},
>>      {"media-type": "application/merge-patch+json",
>>       "resource-id": "routingcost-map"}
>>   ]}
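
A client could assemble that request roughly as follows. This is a Python
sketch only: build_filter_request is a hypothetical helper, and the media
types are the ones proposed in this message, not registered ones.

```python
import json

# Media type proposed in this thread for the filter body (an assumption).
FILTER_TYPE = "application/alto-updatestreamfilter+json"


def build_filter_request(host, path, events):
    """Return (method, path, headers, body) for an update stream request.

    `events` is a list of {"media-type": ..., "resource-id": ...} objects
    naming the update events the client wants to receive.
    """
    body = json.dumps({"events": events}).encode("utf-8")
    headers = {
        "Host": host,
        "Content-Type": FILTER_TYPE,
        "Content-Length": str(len(body)),
        "Accept": "text/event-stream,application/alto-error+json",
    }
    return "POST", path, headers, body
```

The four values can be handed to http.client.HTTPConnection.request(method,
url, body=body, headers=headers) from the standard library.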
>> 
>> The server would respond with something like this.
>> 
>>   HTTP/1.1 200 OK
>>   Content-Type: text/event-stream
>>   
>>   : Full network map, sent immediately.
>>   event: {"media-type":"application/alto-networkmap+json","resource-id":"network-map"}
>>   data: { ... full network map ... }
>> 
>>   : Full routingcost map, sent immediately.
>>   event: {"media-type":"application/alto-costmap+json","resource-id":"routingcost-map"}
>>   data: { ... full routingcost cost map ... }
>>      
>>   : Incremental routingcost map update, sent when enough costs have changed.
>>   event: {"media-type":"application/merge-patch+json","resource-id":"routingcost-map"}
>>   data: { ... merge-patch for routingcost cost map ... }
>> 
>> Note that this omits the ids. They aren't really necessary. But if the
>> underlying SSE library expects them, the server can always use sequence
>> numbers 1, 2, 3, etc.
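
On the client, a stream like the one above could be decoded along these
lines. This Python sketch assumes the event field holds the JSON object
proposed here; parse_events is a hypothetical name, and the field handling
follows the usual SSE rules (comment lines start with ':', a blank line ends
an event, multiple data lines are joined with newlines).

```python
import json


def parse_events(lines):
    """Yield (media_type, resource_id, data) for each complete event.

    `lines` is any iterable of text lines from a text/event-stream body.
    Events without an "event" field or without data are skipped, and ids
    are ignored, matching the example in this message.
    """
    event_type, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if event_type is not None and data:
                kind = json.loads(event_type)
                yield kind["media-type"], kind["resource-id"], "\n".join(data)
            event_type, data = None, []
        elif line.startswith(":"):
            continue  # comment line
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # SSE strips one leading space
            if name == "event":
                event_type = value
            elif name == "data":
                data.append(value)
```

The caller can then dispatch on the media type: replace the cached map for a
full-map event, or apply the patch for a merge-patch event.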
>> 
>> How do people feel about this approach?
>> 
>> - Wendy Roome
>> 
>> From:  Xiao SHI <[email protected]>
>> Date:  Fri, October 17, 2014 at 10:10
>> To:  Wendy Roome <[email protected]>
>> Cc:  Xiao SHI <[email protected]>, "Y. Richard Yang" <[email protected]>, IETF
>> ALTO <[email protected]>
>> 
>> Subject:  Re: [alto] Incremental updates transport using SSE?
>> 
>> Hi Wendy,
>> 
>> I completely agree with you. I was thinking maybe there's some clever way of
>> using the SSE event-id, but the questionable saving is simply not worth the
>> logging and synchronization trouble.
>> 
>> 1. One thing that follows this choice is to use one single SSE stream (open
>> connection) for each resource. I briefly considered having one open
>> connection and send updates for all resources on the server via that stream.
>> The advantage of having only one connection open is smaller overhead, and the
>> disadvantage is that it would require a more complicated ID and version
>> tagging in event name/id maybe. What do you think?
>> 
>> 2. If the mapping between SSE stream and underlying map is one-to-one, would
>> it make sense for the resourceID of a stream to be the resourceID of the
>> underlying map and ".update" concatenated together? This way, when the client
>> actually requests and initiates the stream, the client would be aware of the
>> resourceID of the underlying map.
>> 
>> 3. Following that, we should define id field in the events. Richard proposed
>> using a tag as the id if it's a full network map, and newtag_oldtag if it's a
>> patch. The vtag for network map is ResourceID+tag tuple, if the client knows
>> about the resourceID, could we only use the tag? And should the cost maps
>> simply use dependent-vtags? Since the id field is a string, what format do we
>> want the tags in? A few options are 1) json with all whitespaces removed; 2)
>> an even more compact patterned string.
>> 
>> e.g. for network map patch, it could be:
>> {"old-tag":{"resource-id":"my-default-network-map","tag":"3ee2cb7e8d63d9fab71
>> b9b34cbf764436315542e"},"new-tag":{"resource-id":"my-default-network-map","ta
>> g":"c0ce023b8678a7b9ec00324673b98e54656d1f6d"}}
>> 
>> or (since it must be the same resource):
>> {"resource-id":"my-default-network-map","old-tag":"3ee2cb7e8d63d9fab71b9b34cb
>> f764436315542e","new-tag":"c0ce023b8678a7b9ec00324673b98e54656d1f6d"}
>> 
>> or (if the client knows the resource-id for the underlying map of the stream,
>> just use a delimiter):
>> 3ee2cb7e8d63d9fab71b9b34cbf764436315542e_c0ce023b8678a7b9ec00324673b98e54656d
>> 1f6d
>> 
>> What do you think?
>> 
>> Best,
>> Xiao
>> 
>> On Fri, Oct 17, 2014 at 9:21 AM, Wendy Roome <[email protected]>
>> wrote:
>>> Xiao,
>>> 
>>> That is a very good point. I hoped someone would ask about that.
>>> 
>>> It is a tradeoff. The disadvantage of requiring the server to send full
>>> map(s) at the start of each stream is that if the connection drops, we have
>>> to send the full map over again. But the advantage is that incremental
>>> updates are only within the context of this particular stream. We don't need
>>> to keep track of where we were.
>>> 
>>> To see the issue, suppose we wanted to avoid re-sending the full map when
>>> the client reestablishes the connection. Then the server must assign a
>>> unique id to every incremental update event. When the client reconnects, the
>>> client gives the server the last known event id, and the server sends any
>>> events the client missed.
>>> 
>>> BTW, SSE does define event ids. Each event has an id, and when connecting,
>>> the client gives a Last-Event-Id header with the last id it received (if
>>> any).
>>> 
>>> So what is the problem with that? First, the server MUST assign a unique id
>>> to every incremental update, and they must be the same for all clients. That
>>> is, if the server has several incremental-update clients, the server must
>>> "clock" their update event streams at the same rate. Second, the server must
>>> keep a buffer of old update events, in case a client needs to reconnect. And
>>> third, because a reconnecting client may send an old (or bogus)
>>> Last-Event-Id, the server may have to send the full map(s) anyway.
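
To illustrate what the event-id approach would require of a server, here is a
small Python sketch of a replay buffer. The ReplayBuffer class and its
capacity parameter are hypothetical; ids are plain sequence numbers shared by
all clients, and a None result stands for "send the full map(s) again".

```python
from collections import deque


class ReplayBuffer:
    """Hypothetical server-side buffer of recent incremental update events."""

    def __init__(self, capacity):
        self.events = deque(maxlen=capacity)  # (event id, payload); oldest dropped
        self.next_id = 1

    def publish(self, payload):
        """Record an update and return the id assigned to it."""
        event_id = self.next_id
        self.next_id += 1
        self.events.append((event_id, payload))
        return event_id

    def replay(self, last_event_id):
        """Return the payloads a reconnecting client missed, or None."""
        try:
            last = int(last_event_id)
        except (TypeError, ValueError):
            return None  # missing or bogus id
        if last < 0 or last >= self.next_id:
            return None  # id was never issued by this server
        oldest = self.events[0][0] if self.events else self.next_id
        if last + 1 < oldest:
            return None  # needed events already dropped from the buffer
        return [payload for event_id, payload in self.events if event_id > last]
```

Even in this small form, the server keeps shared state across connections and
still needs the full-map fallback.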
>>> 
>>> If we do not use event ids, then it is much simpler. Incremental updates are
>>> in the context of this stream. Period. No persistent state that must survive
>>> the end of the connection. And if desired, the server can send updates at
>>> different rates to different clients.
>>> 
>>> So I think it boils down to how often we expect connections to drop
>>> unintentionally. If that is common, then, yes, we need to define event ids
>>> to avoid resending the full maps. But if we can assume accidental drops are
>>> rare, then I think it is simpler just to send the full map(s) for every new
>>> connection.
>>> 
>>> - Wendy Roome
>>> 
>>> From:  Xiao SHI <[email protected]>
>>> Date:  Thu, October 16, 2014 at 18:22
>>> To:  Wendy Roome <[email protected]>
>>> Cc:  "Y. Richard Yang" <[email protected]>, IETF ALTO <[email protected]>
>>> Subject:  Re: [alto] Incremental updates transport using SSE?
>>> 
>>> Hi Wendy,
>>> 
>>> If a client accidentally dropped the connection and hopes to re-connect and
>>> receive the merge-patch events, would it make sense for the server not to
>>> send the first full-map event? It will save quite some effort that way.
>>> 
>>> Best,
>>> Xiao
>>> 
>> 
>

_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto
