Re: [alto] Httpdir early review of draft-ietf-alto-new-transport-07

kaigao Mon, 03 Apr 2023 18:54:27 -0700

Hi Martin and all,

Thanks a lot for these wonderful comments! We are sorry for the late response 
but it took us a while to digest the comments and discuss internally. The 
responses will arrive in multiple emails, focusing on different aspects.


Before going into the technical and editorial updates, I would like to provide
some context to make the document easier to follow (hopefully), because
complexity is definitely not something we want in the IETF.

The ultimate goal is to enable more efficient delivery of ALTO information to
clients. Besides generic mechanisms (e.g., upgrading the HTTP version), our
design aims to leverage some characteristics of ALTO traffic, as summarized
below (C1-C4):

C1: ALTO resources are more likely to evolve incrementally (e.g., network map 
changes triggered by network maintenance, reconfiguration or failure).
C2: Clients continuously fetch or monitor ALTO resources.
C3: Some clients may make customized queries, but many clients request the same
ALTO resources.
C4: Clients are heterogeneous, i.e., they request ALTO information with 
different timing, frequency, transport objective, etc.

Given these characteristics, the objectives/requirements of the new transport
document are:

1. to support incremental updates mixed with “snapshots” (based on C1 & C2),
2. to enable cacheable updates to leverage web caches for (potentially) faster 
and more efficient distribution (based on C3),
3. to enable customized update selection and scheduling for more flexible and 
efficient state synchronization (based on C3 & C4, as different clients may
have different local state),
4. to allow more resource-efficient server implementations but with flexibility 
to enhance performance: no need to store multiple copies of the same data, no 
need to store the complete update history, but must not violate reliability, 
i.e., any client joining at any time with any local state should be able to 
compute the latest available state (based on C3 & C4, as well as server
heterogeneity),
5. to reduce latency for applications with higher demand for reactivity, e.g., 
an early I-D by Tencent introduces the case of distributing real-time base 
station results through ALTO (based on C4),
6. and to maintain backward compatibility (not based on any particular
characteristic).

Make sense so far? Then let’s see how RFC 8895 measures up against these
requirements and where it falls short:

1. RFC 8895 already supports a mixed transport of incremental updates and 
snapshots.
2. In RFC 8895, however, the updates are “unnamed”, so they cannot be shared
across different clients using purely the ALTO protocol & extensions.
3. In RFC 8895, updates are scheduled purely by the server, and clients have no
control over, or flexibility in, that scheduling.
4. In RFC 8895, a server may share the internal storage for queries to the same 
ALTO resource(s). However, to guarantee correctness, a server must maintain the 
state of each client, and also store the history from the version held by the 
oldest client. If a server is short on resources, it may need to disrupt the 
service for clients with older versions to reduce storage overhead.
5. RFC 8895 offers server push capability using Server-Sent Events over
websocket. However, this server-driven push is precisely why RFC 8895 fails to
address the previous requirements.

Then what about the new transport mechanism specified in this document?

1. This document introduces a graph structure to describe the evolution of an
ALTO resource. One can think of the graph as a git tree of only one branch: 
each node is a version and each edge is a patch. Note that a snapshot is a 
patch to the initial (empty) state.

2. With each state versioned, the snapshots and incremental updates have names
and are therefore cacheable. ALTO servers and clients can then benefit from the
common HTTP web cache infrastructure that is widely deployed in the Internet
today.

3. The graph structure only gives the “metadata” describing what patches are
available to transition from one version to another. Each client can then
determine whether to synchronize, which state to synchronize to, what the best
way to synchronize is, etc., based on its own local state and configuration.

4. As the scheduling of updates is now determined by clients, an ALTO server
has full control over, and flexibility in, its storage and the updates it makes
available, as long as there exists at least one available path from the initial
state (i.e., a new client) to the most recent version. Note that an existing
client can always fall back to acting as a new client by discarding its own
local state.

5. This is probably the only piece where we need to be HTTP-version-specific.
The transport mechanism is mainly designed for scalability and flexibility; it
introduces one round of HTTP “RTT” to fetch the “metadata”. That is not a big
problem for applications/networks that are less sensitive to changes, but it is
not good for applications demanding fast reactions to network changes. Pushing
is preferred in the latter scenario. HTTP/1.x does not natively support server
push, so we use long polling: a client sends a request for the next update
based on the naming convention; then, once there is an update, the server can
send the result. This is not the best option, as the client needs to make
another request to receive the next update, which may happen very soon after
the first one, but it’s probably the best we can do for HTTP/1.x now AFAWK. For
HTTP/2 and /3, we do expect to leverage server push and would like to hear
opinions and have further discussions with HTTP experts like you. Note that
this is not expected to be a mandatory functionality for any ALTO
server/resource, as it requires keeping track of client states and may not
scale well on the Internet. However, we do see use cases, such as in the 5G
edge network, where this functionality can be highly useful.
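
To make points 1, 3 and 4 a bit more concrete, here is a minimal sketch of how
a client might plan its synchronization over such an update graph. It is only
an illustration and is not taken from the draft: the names EDGES,
cheapest_path and plan_sync, the patch sizes, and the choice of "total bytes
to download" as the cost are all assumptions made for this example.

```python
import heapq

# Hypothetical update graph: EDGES[(i, j)] is the size in bytes of the patch
# that takes version i to version j. Version 0 is the initial (empty) state,
# so an edge (0, j) is a full snapshot of version j. All numbers are made up.
EDGES = {
    (0, 101): 5000,   # snapshot of version 101
    (101, 102): 200,  # incremental updates
    (102, 103): 150,
    (103, 104): 300,
    (0, 103): 5200,   # a newer snapshot
}


def cheapest_path(edges, start, target):
    """Dijkstra over the update graph.

    Returns (total_cost, [(i, j), ...]) or None if target is unreachable.
    """
    adjacency = {}
    for (i, j), size in edges.items():
        adjacency.setdefault(i, []).append((j, size))

    heap = [(0, start, [])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, size in adjacency.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (cost + size, nxt, path + [(node, nxt)]))
    return None


def plan_sync(edges, local_version, target):
    """Pick the cheaper of: patching from the local version, or starting over.

    Starting over models a client discarding its local state and behaving
    like a new client (version 0).
    """
    candidates = [
        plan
        for plan in (
            cheapest_path(edges, local_version, target),
            cheapest_path(edges, 0, target),
        )
        if plan is not None
    ]
    return min(candidates, key=lambda plan: plan[0]) if candidates else None


def server_invariant_holds(edges, latest):
    """The condition from point 4: a new client can reach the latest version."""
    return cheapest_path(edges, 0, latest) is not None
```

With this graph, a client holding version 101 would fetch the three
incremental patches up to 104, while a client holding a version the server no
longer describes (say 99) would fall back to the snapshot (0, 103) followed by
the patch (103, 104). The server is free to drop any edge as long as
server_invariant_holds stays true for the latest version.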

Based on the design, the document then includes specifications for:

- the ALTO service that provides the new functionality (TIPS service or simply
TIPS), including the creation/deletion/... (Sections 3 & 4)
- the TIPS view (as the root of the update information related to the requested
resource) (most of Section 5)
- fetching updates with the update graph, including the URL pattern, data
format, URL pattern for the updates, and reliability requirements/invariants
(Sections 5.5 & 6)
- receiving updates pushed by servers, including the URL pattern for the
receiver set and how to subscribe (Section 7)

We hope that this email conveys the high-level ideas of the new transport 
mechanism and makes the document easier to follow. While we are having this 
conversation to reach consensus on the design, we the authors are doing the 
wordsmithing for the detailed comments and will follow up on the mailing list 
soon.

Looking forward to your feedback!

Best,
Kai

> -----Original Messages-----
> From: "Martin Thomson via Datatracker" <[email protected]>
> Sent Time: 2023-03-23 10:24:10 (Thursday)
> To: [email protected]
> Cc: [email protected], [email protected]
> Subject: [alto] Httpdir early review of draft-ietf-alto-new-transport-07
>
> Reviewer: Martin Thomson
> Review result: Not Ready
>
> I'm going to level-set here from the outset.  I have not given this document as
> thorough a review as it might need to be sure, but only because I was unable to
> understand it in the hour or so that I spent.  That's clearly not enough time
> for something this complex, so adjust the finding of "Not Ready" accordingly.
&gt; 
> # Introduction
>
> This document aims to describe an ALTO-specific means of providing clients with
> updates to the transport and network information.  It expands on previous
> efforts that use SSE, which are generic, but maybe don't take advantage of the
> newer HTTP server push feature.
>
> ## This is ALTO
>
> I like that this doesn't shy away from making its design very specific to the
> application.  Lots of people get grandiose ideas that their design is wonderful
> and generally applicable and try to build something very generic, in the process
> losing touch with both the needs of their application.  They might convince
> (maybe deluding) themselves that others will take on their wonderful ideas and
> apply them to completely different applications.
>
> This document suffers from no such delusions, which is great because this work
> is really very difficult to understand for an outsider.  I was involved
> peripherally with the initial ALTO HTTP design, so have some familiarity with
> its goals and structure, but I found this document very difficult to process.
> Maybe more time would help, but I really can't justify spending that time.
>
> So I want to focus on the really high-level stuff.
&gt; 
> ## High Level
>
> The first of my issues makes me wonder if this has been implemented at all.  And
> as I went through this, I found myself asking that same question again multiple
> times.  Has it?
>
> Finally, I feel obligated to point out that expending effort on HTTP server push
> is perhaps unwise given its relative adoption and success.  That is, it has not
> really succeeded in browsers and is being removed.  Much of the benefit that a
> protocol like ALTO gains is from using common infrastructure, design paradigms,
> tooling, and so forth.  ALTO is a small user of HTTP and so benefits from the
> millenia of effort put in to improve HTTP by larger users. Those benefits don't
> extend to server push, so there is a real risk of this work becoming hard or
> impossible to deploy.
&gt; 
> ## Caveat, Reprise
>
> To soften this a little, it is entirely possible that some of my criticism is
> rooted in not understanding the details well enough.  This is a tower built on
> top of a tower build on top of a tower build on top of a protocol that I know
> reasonably well, but it is a long way from where I stand to the top of that
> topmost tower.  And it is fair to say that a good review of this document (what
> I an not claiming to provide) would demand that I gain familiarity with the
> entire stack.  However, I think that there are several aspects of this document
> that could do with some dedicated editorial effort in order to improve this sort
> of accessibility.  I've highlighted a few, but I almost certainly missed others
> because my focus was on HTTP usage primarily (for instance, I did not consider
> whether the security considerations were reasonable or even approximately
> comprehensive).
>
>
> # Issues
>
> ## Server Push Usage
>
> Section 7.1 says "A client can add itself explicitly to the receiver set or add
> itself to the receiver set when requesting the TIPS view."  It describes two
> methods for doing this, but neither indicates which request will remain open so
> that the client can receive push promises.
>
> HTTP server push requires that the server send pushes alongside an outstanding
> request, but aside from discussion of streams in Section 7.3.1, I can't work out
> how the client would do that.  Section 2.4 also fails to make this clear.
>
> Consequently, I cannot convince myself that the primary feature of this document
> will work.
&gt; 
> ## Use of Undefined and Poorly Defined Terms
>
> I'm raising this to the level of a serious issue because this draft is made
> extraordinarily difficult to understand as a result of this.  Take Section 2.3,
> which introduces some terms.  That same section then includes completely
> different terms in Figure 1; terms that turn out to be critical concepts.
>
> I'll also note, though you might treat this as a separate issue, that while the
> use of template-like URLs as a convention is a powerful explanatory tool, the
> draft doesn't make this clear enough.  The use of i and j for instance, are
> introduced in an example, which is easy to skip, only to find that the rest of
> the document critically depends on understanding what those mean.
&gt; 
&gt; 
> ## DELETE, but not
>
> Section 4.4 describes a use for HTTP's DELETE verb that is novel to say the
> least.  If the goal is to use DELETE to remove something and that something is a
> client's membership in a group (a receiver set here), then you should provide
> each client with a URL of their own to delete.  Whether you provide a resource
> for the collection (which might be useful for adding clients) or not is up to
> you, but this approach is not consistent with how HTTP is expected to operate
> and will result in surprises.
>
> Of course, the use of one request (here, the DELETE) to stop server push that
> might be happening on another request, is also not how server push is expected
> to work.
&gt; 
&gt; 
> ## Connections and Clients
>
> I can't pin this one down, but there seems to be some sort of assumption that
> there is a 1:1 correspondence between connections and clients.  That is not how
> HTTP works.  In HTTP, every request stands on its own.  Though there might be
> linkages between requests, those linkages should not affect how HTTP itself
> operates, including server push.  (You might detect a common theme here.)
>
> As I noted, I'm not completely certain about raising this issue because of a
> lack of clarity about how the protocol is supposed to operate.
&gt; 
&gt; 
> ## Specification by Example
>
> I found that this document leaned a little too heavily on examples, to the point
> that it sometimes does not concretely specify expected behaviour at all. The
> content of examples were used to show the general shape of what is being
> considered.  As I noted before, examples can be a powerful explanatory tool, but
> it means that the true interoperability requirements are not always directly
> written.  Implementers need to infer normative requirements in some cases.
>
> See the use of /<tips-view-uri>/... everywhere it appears, Section 2.1.1
> (schema, where the figures are critical to understanding; I also found Figure 2
> very hard to understand, so I had to ignore it, even if it still seems crucial),
> Figure 4, the figures in Section 2.4, Section 8.3.
>
> Probably the biggest hole here is j+1 in examples.  I couldn't find a statement
> anywhere that says that increments need to strictly increment by 1 each time
> (separately, why 101 and not just 1?).  There seems to be an assumption about
> that, but the directory resources all seem to indicate that the server is
> responsible for numbering increments.  Fixing that seems important.
&gt; 
> ## Complicated
>
> This design is very complex.  Some of the details in the document probably do
> not need to be specified in the level of detail provided.  Section 5 for
> instance describes resources that provide clients with data about other
> resources, which isn't really consistent with HTTP principles, but they probably
> aren't necessary either.
>
> As long as servers adhere to the invariants (S5.5), clients can ask for
> incremental resources based on what they know and either get a snapshot or
> increments based on what the server is willing to serve them and what they are
> willing to process.  The design here requires additional round trips to gather
> information about the information the client really wants when it could probably
> ask for a resource, use etags to indicate what it has, use Accept to indicate
> what it can handle, and the server could then work out what best to serve up.
>
> A specific problem that this design creates is strong coupling.  The client
> needs to know about URI structure in order to use the information in the TIPS
> view.
>
> Similar comments might apply to managing the set of clients that might want
> server push (though again, see the first issue).
&gt; 
> # Nits
>
> These are just the ones that really jumped out.
>
> I suggest checking for typos, I saw many.
>
> Please use proper section references when linking.  Links in the form
> `<xref target="RFC1234" section="4.5"/>` will ensure that your HTML is properly
> generated.
>
> Please submit a bug report for whatever is going on with the caption on Figure
> 2.  It looks like you have a cross reference in there that xml2rfc is mangling.
>
> "long pull" is not a thing (Section 2.1)
>
> "Connection: Closed" is not a thing (Section 4.2)
>
> Please use real lists (1) especially when the content of the item is long and
> (2) because it makes (a) reading and (b) citing the document easier.
>
> I can't work out why "next-edge" (Section 5.2) is null when server push is
> disabled.  Why would a client need to know that only when push is enabled?  If
> push is enabled, won't the client be pushed the next increment?  On the other
> hand, if push is disabled, won't the client need to know what to request next?
&gt; 
&gt; 
> ## Section 5.5 is Complicated
>
> Section 5.5 says:
>
> > Continuity: ns -> ne, anything in between ns and ne also exists (implies ni ->
> > ni + 1 patch exists), where ns is start-seq and ne is end-seq
>
> This section might be reduced to saying:
>
> > A server needs to ensure that any resource state that it makes available MUST
> > be reachable by clients, either directly via a snapshot (that is, relative to 0)
> > or indirectly by requesting an earlier snapshot and a contiguous set of
> > incremental updates.
>
>
>
> _______________________________________________
> alto mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/alto
_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto