Jan, Wendy's arguments seem very well thought-out to me. Why not define a specific solution for incremental updates of network and cost maps (i.e., the stuff we have in the base protocol) and then, each time we specify an extension, either define a specific update mechanism for it or specify that JSON Patch is to be used for it?!
Sebastian

On Thu, Jul 10, 2014 at 02:26:08PM +0000, Jan Seedorf wrote:
> Hi Wendy,
>
> What about future, new ALTO services (e.g. as proposed in
> http://tools.ietf.org/html/draft-seedorf-cdni-request-routing-alto-07)?
>
> I am not a fan of JSON patch, but a solution for incremental updates based on
> JSON patch should be much more future-proof with respect to new, future ALTO
> services that convey JSON objects other than network/cost maps, right?
>
> - Jan
>
> > -----Original Message-----
> > From: alto [mailto:[email protected]] On Behalf Of Wendy Roome
> > Sent: Wednesday, July 09, 2014 9:15 PM
> > To: IETF ALTO
> > Subject: Re: [alto] JSON Patch vs. custom representation for incremental
> > updates
> >
> > Here's why I think we need a representation for incremental updates that's
> > tailored to the ALTO data model, rather than using the general JSON Patch
> > representation.
> >
> > As I understand it, JSON is a standardized way for a computer to create a
> > serialized, machine-independent representation of a data structure, send
> > that serialization over a stream to another computer, and have the other
> > computer recreate that data structure. This is a simplification, of
> > course, but I believe that's the goal.
> >
> > JSON Patch is a standard way to represent the changes to a data structure,
> > ship them to another computer, and have a JSON Patch library on the other
> > computer automatically update the remote data structure, with little
> > additional work for either computer.
> >
> > That's a wonderful goal. Unfortunately that has three problems when we
> > apply it to ALTO: (1) JSON does not have data representations that
> > directly correspond to the ALTO data structures, so JSON cannot capture
> > the semantics of the ALTO data. (2) As a result, JSON Patch is an
> > inefficient representation of the legal changes. (3) For the clients who
> > need incremental update, that inefficiency is a deal breaker.
> >
> > Let's take the last first.
> > What clients need incremental update? Clients
> > who keep full cost and network maps. But what clients would do that? After
> > all, clients care about costs between endpoints. Clients don't really care
> > about PIDs. PIDs are just an abstraction to make the space of endpoints
> > more manageable. For most ALTO clients, the Endpoint Cost Service (ECS) is
> > exactly what they want, and they'd much rather use that than go through the
> > hassle of downloading the maps, searching them, and keeping them
> > up-to-date.
> >
> > So why would a client use full maps? Because the client needs to look up
> > costs very quickly, and cannot tolerate the delay of querying the ALTO
> > Server. For example, a P2P tracker must select, out of 5,000 peers, the 50
> > with the lowest cost to a given peer. And a tracker might do that 10 times
> > a second.
> >
> > As for the second point, incremental update is only necessary for large
> > maps. If a map only has 25 PIDs, why bother? Just download a new version.
> > What do I mean by "large"? A Network Map with 5,000 PIDs, 250,000
> > prefixes, and up to 25,000,000 cost points.
> >
> > Yes, that seems huge. Will anyone ever build that large an ALTO server? I
> > don't know. But I think a lot of us remember when the IPv4 address space
> > seemed infinite. Or when a 100 meg disk was big.
> >
> > Now consider point 1: JSON does not do a good job of representing the ALTO
> > data. Take Cost Maps. A Cost Map is a square sparse matrix of numbers
> > indexed by strings. JSON has no such data structure, so in JSON we
> > represent that as a lookup table of lookup tables of costs. But that
> > consumes a lot more space than necessary. Furthermore, at least for most
> > cost metrics, the values are low precision (do you really think that a
> > routingcost of 49.99999 is any better than a cost of 50?), and the string
> > indexes -- the PID names -- don't change very often.
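A minimal sketch of the mismatch described above, assuming a toy three-PID cost map: the JSON form is a lookup table of lookup tables keyed by PID name, while a lookup-heavy client might prefer PID names mapped to 0..N-1 and costs held in one flat single-precision array. The map contents and the helper names `build_dense` and `lookup` are hypothetical, and a dense array stands in for a sparse one to keep the example short.

```python
from array import array

# Hypothetical cost map as decoded from JSON:
# a lookup table of lookup tables, keyed by PID name.
json_cost_map = {
    "PID1": {"PID1": 1, "PID2": 5, "PID3": 10},
    "PID2": {"PID1": 5, "PID2": 1},
    "PID3": {"PID1": 10, "PID3": 1},
}


def build_dense(cost_map):
    """Map PID names to 0..N-1 and copy costs into a flat float array."""
    names = sorted(set(cost_map) | {d for row in cost_map.values() for d in row})
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    # Typecode "f" is a C float (typically 4 bytes per entry).
    # NaN marks a (source, destination) pair with no cost given.
    costs = array("f", [float("nan")]) * (n * n)
    for src, row in cost_map.items():
        for dst, cost in row.items():
            costs[index[src] * n + index[dst]] = float(cost)
    return index, n, costs


def lookup(index, n, costs, src, dst):
    """Return the cost from src to dst (NaN if the map has no entry)."""
    return costs[index[src] * n + index[dst]]


index, n, costs = build_dense(json_cost_map)
print(lookup(index, n, costs, "PID1", "PID3"))  # 10.0
```

At the scale mentioned in the message, 5,000 x 5,000 = 25,000,000 entries, a 4-byte float per entry comes to roughly 100 MB, against roughly 200 MB for 8-byte doubles.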
> >
> > So if a client needs to handle a 5,000 x 5,000 Cost Map, and look up costs
> > in microseconds, the client converts the PID names to numbers from 0 to
> > N-1, so it can use a sparse numerically indexed array, and it stores the
> > costs as single-precision floats, not double-precision, to save 100 megs of
> > RAM.
> >
> > The mismatch is even worse for Network Maps. A Network Map is a lookup
> > table from PID names to sets of prefixes. JSON has lookup tables, but
> > doesn't have sets, so we represent the sets by arrays. But this confounds
> > JSON Patch, because order matters in arrays. Furthermore, the JSON
> > representation does not capture the semantics that a prefix can only be in
> > one PID. So if the server moves 1.2.3.4 from PID2 to PID1, JSON Patch
> > would need the following update commands:
> >
> >     add 1.2.3.4 at index 17 in the array for PID1
> >     delete index 6 from the array for PID2
> >
> > But if we know the real semantics of ALTO Network Maps, we can represent
> > that update as:
> >
> >     add 1.2.3.4 to PID1
> >
> > The delete from PID2 is implicit.
> >
> > Here's the bottom line: Clients who need incremental update will NOT store
> > data in a format that looks like the JSON data model. Such a client will read
> > the JSON data, convert it into a totally different form, and then discard
> > the original JSON. If we use JSON Patch to represent deltas, a client
> > would NEVER be able to use a standard JSON library to automatically apply
> > the patches. Instead, the client would need custom code that understands
> > every possible JSON Patch update command, and figures out how to apply
> > them to the client's representation of the data. And the client may be
> > forced to use a suboptimal data structure to allow that (e.g., store
> > prefixes as arrays rather than sets).
> >
> > This does not simplify anything; it just makes more work for the client.
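The Network Map example above can be sketched as follows, assuming a simplified map shape (PID name to array of prefixes) and made-up prefixes. The two JSON Patch operations use RFC 6902 syntax; `apply_array_patch` is a hypothetical applier for just those two operations, not a full implementation, and `apply_semantic_update` is one possible reading of the "add prefix to PID" command, not a format defined by ALTO.

```python
# Hypothetical network map as decoded from JSON: PID name -> array of prefixes.
network_map = {
    "PID1": ["10.0.0.0/8", "192.0.2.0/24"],
    "PID2": ["198.51.100.0/24", "1.2.3.4/32"],
}

# JSON Patch (RFC 6902) form of moving 1.2.3.4/32 into PID1: two operations,
# and the remove has to name the array index of the prefix in PID2.
json_patch = [
    {"op": "remove", "path": "/PID2/1"},
    {"op": "add", "path": "/PID1/-", "value": "1.2.3.4/32"},
]


def invert(nm):
    """Client-side form: prefix -> PID, so each prefix has exactly one PID."""
    return {prefix: pid for pid, prefixes in nm.items() for prefix in prefixes}


def apply_semantic_update(prefix_to_pid, additions):
    """Apply 'add prefix to PID'; removal from the old PID is implicit."""
    for pid, prefixes in additions.items():
        for prefix in prefixes:
            prefix_to_pid[prefix] = pid


def apply_array_patch(nm, ops):
    """Apply only the 'add' and 'remove' array operations used in this sketch."""
    for op in ops:
        pid, idx = op["path"].lstrip("/").split("/")
        if op["op"] == "remove":
            del nm[pid][int(idx)]
        elif op["op"] == "add":
            if idx == "-":  # "-" means append to the end of the array
                nm[pid].append(op["value"])
            else:
                nm[pid].insert(int(idx), op["value"])


# Semantic update on the client's own representation: one command.
prefix_to_pid = invert(network_map)
apply_semantic_update(prefix_to_pid, {"PID1": ["1.2.3.4/32"]})
print(prefix_to_pid["1.2.3.4/32"])  # PID1

# JSON Patch on the array representation reaches the same state.
apply_array_patch(network_map, json_patch)
print(invert(network_map) == prefix_to_pid)  # True
```

Note that the JSON Patch version only works if the client still holds the arrays in the server's order, which is the constraint on client data structures that the message objects to.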
> >
> > - Wendy Roome
> >
> >
> > _______________________________________________
> > alto mailing list
> > [email protected]
> > https://www.ietf.org/mailman/listinfo/alto
