Benjamin Kaduk has entered the following ballot position for draft-ietf-alto-path-vector-22: Discuss
When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ for more information about how to handle DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-alto-path-vector/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

The IANA Considerations section seems incomplete. Looking over the registries at https://www.iana.org/assignments/alto-protocol/alto-protocol.xhtml and comparing against the mechanisms defined in this document, it seems that we need to register the "ane-path" Cost Metric.

More worryingly, there is no registry on that page in which the "array" cost mode could be registered, and it seems that using any value other than "numerical" or "ordinal" would violate a "MUST" in §10.5 of RFC 7285. This seems to present some procedural difficulties, especially now that this document is targeting Experimental status rather than Proposed Standard (which, to be clear, I think was the right thing to do).

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Thanks for fleshing out the security considerations section substantially in recent revisions, and thanks to Sam Weiler for the multiple secdir reviews. While I agree with Sam that it would be nice to be able to list examples of non-DRM technical measures to protect the confidentiality of sensitive path vector information, I can't actually think of any that would be worth listing, myself. So we may have to proceed with the current text (unless you have further ideas, of course).

It looks like the VersionTag.tag value of "d827f484cb66ce6df6b5077cb8562b0a" is used in a few different examples. While being associated with different VersionTag.ResourceID values is sufficient to distinguish the uses from each other, it seems like these examples might be more enlightening if distinct VersionTag.tag values were used for the distinct resources.

Section 1

   Predicting such information can be very complex without the help of ISPs [BOXOPT]. [...]

I'm not entirely sure which part(s) of [BOXOPT] are being referenced here. Their scheme seems to involve producing a privacy-preserving scheme for resource allocation that does involve exchanges between client and network, just not ones that reveal sensitive information. Did I miss a part where they compare against a scenario where the network/ISP does not provide input? (I only skimmed the paper.)

Section 4.1

   * The ALTO server must expose abstract information on the properties of the ANEs used by "eh1 -> eh2" and "eh1 -> eh4", e.g., the available bandwidth between "eh1 - sw1", "sw1 - sw5", "sw5 - sw7", "sw5 - sw6", "sw6 - sw7", "sw7 - sw2", "sw7 - sw4", "sw2 - eh2", "sw4 - eh4" in Case 1.

Does it actually need to expose exactly the available bandwidth between all those listed pairs of entities? I would have thought that some of the details could be abstracted away.

Section 4.2.1

   For example, assume hosts "a", "b", "c" are in site 1 and hosts "d", "e", "f" are in site 2, and there

In Figure 5, I see something that looks like an entry for [d] in the "Site 1" part, and an entry for [c] in the "Site 2" part. I'm not sure if that's just an attempt to indicate the directionality of the core backbone or something else.

Section 6.4

   Note that these property types do not depend on any information resource. As such, their ResourceID part must be empty.

Does this mean that the '.' is absent as well?

Section 6.5.2

   Note that this cost mode only requires the cost value to be a JSON array of JSONValue.
   However, an ALTO server that enables this extension MUST return a JSON array of ANEName (Section 6.1) when the cost metric is "ane-path".

If we're going to require that the cost mode "array" only be used with an array of ANEName, then would it make more sense to call the cost mode "anearray", leaving the generic "array" for a more generic behavior?

Section 6.6

   DOMAIN-NAME: DOMAIN-NAME has the same format as dot-atom-text specified in Section 3.2.3 of [RFC5322]. It must be the domain name of the ALTO server.

(somewhat editorial) is there always exactly one domain name of the ALTO server (vs. more than one)?

Section 7.2.4

   object {
     [EntityPropertyName ane-property-names<0..*>;]
   } PVFilteredCostMapCapabilities : FilteredCostMapCapabilities;

   with fields:

Up in §7.2.3 we didn't repeat any of the fields from the base type we inherited from. Here we do, but (apparently) only because we have more to say about them, e.g., new restrictions on the cost-type-names field to include the Path Vector cost type. Do we want to mention that some fields are repeated because we make their definition more specific for the PVFilteredCostMapCapabilities usage?

Also, since we're repeating most of the FilteredCostMapCapabilities fields, is it worth also defining max-cost-types for completeness?

Section 7.2.6

   The "Content-Type" header of the response MUST be "multipart/related" as defined by [RFC2387] with the following parameters:

This could be read as saying that all three parameters are mandatory, but the actual description for "start" includes the phrase "if present", implying that it is optional. Some more clarity would be helpful (especially relating to whether "boundary" is optional or mandatory, which RFC 2387 itself does not actually clarify directly).

   * The Path Vector part MUST include "Content-ID" and "Content-Type" [...] RESOURCE-ID in the "Content-ID" of the Path Vector part.
   The "meta" field MUST also include the "dependent-vtags" field, whose value is a single-element array to indicate the version tag of the network map used, where the network map is specified in the "uses" attribute of the multipart Filtered Cost Map resource in IRD.

Just to confirm, there would not be a need to include in this "dependent-vtags" field any dependent resources relating to persistent ANEs?

   Vector part MUST be included in the "dependent-vtags". If "persistent-entity-id" is requested, the version tags of the dependent resources that MAY expose the entities in the response MUST also be included.

This seems a surprising use of the normative MAY, to me.

   HTTP/1.1 200 OK
   Content-Length: 821
   Content-Type: multipart/related; boundary=example-1;

I'm having a hard time reproducing this Content-Length value. Could you double-check it?

Section 7.3.3

   POST /ecs/pv HTTP/1.1
   Host: alto.example.com
   Accept: multipart/related;type=application/alto-endpointcost+json, application/alto-error+json
   Content-Length: 222

I'm getting a length of 226 or 227 (depending on newline at end of file); please confirm that 222 is correct.

Section 7.3.6

   boundary: The boundary parameter is as defined in [RFC2387].

As I alluded to above, the boundary parameter is actually defined in RFC 2045; the only appearance in RFC 2387 is in two examples.

   The body of the Path Vector part MUST be a JSON object with the same format as defined in Section 11.5.1.6 of [RFC7285] when the "cost-type" field is present in the input parameters and MUST be a JSON object with the same format as defined in Section 4.1.3 of [RFC8189] if the "multi-cost-types" field is present. The JSON

I think §4.2.3 of RFC 8189 is somewhat more relevant than §4.1.3, here.

   The body of the Unified Property Map part MUST be a JSON object with the same format as defined in Section 4.6 of [I-D.ietf-alto-unified-props-new]. [...]

Is §4.6 the right reference here? I don't see much defining a JSON format in that section or subsections.

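For the Content-Length checks requested in Sections 7.2.6 and 7.3.3, the following is a minimal sketch of how a body's octet count can be recomputed under different line-ending assumptions. The helper name `content_length` and the sample body are hypothetical and are not taken from the draft; the actual example bodies would need to be substituted to reproduce the values discussed above.

```python
def content_length(body: str, newline: str = "\r\n",
                   trailing_newline: bool = False) -> int:
    """Return the number of octets the body occupies with the given line ending."""
    # Normalize whatever line endings the pasted text has, then rejoin.
    wire = newline.join(body.splitlines())
    if trailing_newline:
        wire += newline
    # Content-Length counts octets, not characters, so encode first.
    return len(wire.encode("utf-8"))


# Hypothetical three-line JSON body (an assumption for illustration only).
sample = '{\n  "cost-type": {}\n}'

for nl_name, nl in (("LF", "\n"), ("CRLF", "\r\n")):
    for trailing in (False, True):
        print(nl_name, "trailing newline:", trailing,
              "->", content_length(sample, nl, trailing))
```

A difference of exactly one between two candidate lengths usually points to a trailing LF; a difference equal to the number of lines suggests LF versus CRLF line endings.
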
   Vector part MUST be included in the "dependent-vtags". If "persistent-entity-id" is requested, the version tags of the dependent resources that MAY expose the entities in the response MUST also be included.

As above, this is an unusual use of the normative "MAY".

   HTTP/1.1 200 OK
   Content-Length: 810
   Content-Type: multipart/related; boundary=example-1; type=application/alto-endpointcost+json

Continuing the theme, please check this Content-Length as well.

Section 8.4

   As mentioned in Section 6.5.1, an advanced ALTO server may obfuscate the response in order to preserve its own privacy or conform to its own policies. [...]

Is §6.5.1 the correct reference? The word "obfuscate" does not appear therein that I can see.

Section 8.5

Is there anything to say about updates needing to be paired in the same way/for the same reasons we have to use multipart/related to get a consistent picture of the path vector cost map and its associated ANE property map? Or perhaps in §9.3 instead?

Section 8.6

   The second part is the same as in Section 8.4

It seems only analogous, not "the same as", to me -- this example uses aggregated ANEs but §8.3 used the full topology of Figure 10.

   "endpoint-cost-map": {
     "ipv4:192.0.2.34": {
       "ipv4:192.0.2.2": [[ "NET3", "AGGR1" ], 1],
       "ipv4:192.0.2.50": [[ "NET3", "AGGR2" ], 1]
     },
     "ipv6:2001:db8::3:1": {
       "ipv6:2001:db8::4:1": [[ "NET3", "AGGR2" ], 1]

Is it really plausible to use the same routing cost of 1 for all three paths?

Section 11

Streaming updates of max-reservable-bandwidth seems to provide basically an equivalent information stream as to what paths have been reserved (and their bandwidth).
That information might be differently sensitive than the primary network information we're exposing with the path-vector methodology, so we should probably mention this "information leakage" channel and give some guidance about what server behaviors might mitigate the leakage (e.g., batching updates, though I suspect that the policy for doing so in a way that minimizes information leakage will be about as hard a problem to solve as padding policies are in general).

I'm a little surprised that we didn't mention anything about persistent ANEs here (which would be a great way to contrast with the obfuscation that ephemeral ANEs provide).

MIME parsers have historically been a recurring source of security-relevant bugs in other contexts. Perhaps that's sufficiently well known to not need restating here, though.

   For risk type (3), an ALTO server MUST use dynamic mappings from ephemeral ANE names to underlying physical entities. Thus, ANE names contained in the Path Vector responses to different clients or even for different request from the same client SHOULD point to different physical entities. [...]

The guidance of "SHOULD point to different physical entities" doesn't seem quite right. If the ANE abstraction actually attempted to maximize the number of distinct physical entities represented, that seems like it would make graph reconstruction easier, rather than harder. Perhaps it is better to give guidance about noncorrelation over time of the ANE name/physical element mapping, or even guidance to just use randomized ANE names.

Section 13.1

I don't think RFC 4271 needs to be classified as normative; we seem to only reference it as an analogy for the Path Vector/AS Path.

Section 13.2

   [BOXOPT] Xiang, Q., Yu, H., Aspnes, J., Le, F., Kong, L., and Y.R. Yang, "Optimizing in the dark: Learning an optimal solution through a simple request interface", Proceedings of the AAAI Conference on Artificial Intelligence 33, 1674-1681 , 2019.

It looks like this is https://doi.org/10.1609/aaai.v33i01.33011674 ; if so, including the DOI link would be very helpful for readers. That's just the one I happened to go pull up; DOIs for the other papers (if available) should be included, too.

NITS

Section 1

   Map that contains the properties requested for these ANEs. To enforce consistency and improve server scalability, this document uses the "multipart/related" message defined in [RFC2387] to return the two maps in a single response.

I think it's more typical to say "content type" than "message" in this context.

Section 3

   performance of traffic. An ANE can be a physical device such as a router, a link or an interface, or an aggregation of devices such as a subnetwork, or a data center.

I think we do not want a comma before "or a data center", since the data center is just another example of an aggregation of devices.

Section 4.1

   performance. The capacity region information for those flows will benefit the scheduling. However, Cost Maps as defined in [RFC7285] can not reveal such information.

I'm not sure I know what "capacity region information" is; did we mean "region capacity information" (or maybe "Knowledge of the relevant capacity regions for those flows")?

   With the ALTO Cost Map, the cost between PID1 and PID2 and between PID1 and PID4 will be 100 Mbps. The client can get a capacity region

I'd suggest "will both be".

Section 4.2.1

   With the Path Vector extension, a site can reveal the bottlenecks inside its own network with necessary information (such as link capacities) to the ALTO client, instead of providing the full topology and routing information. [...]

I'd suggest adding "or no bottleneck information at all".

Section 4.2.2

   in various documents (e.g., [SEREDGE] and [MOWIE]).
   Internet Service Providers may deploy multiple layers of CDN caches, or more generally service edges, with different latency and available resources including number of CPU cores, memory in Gigabytes (G), and storage measured in Terabytes (T).

The units are probably not relevant for the abstract scenario, and would only become relevant when we start introducing Figure 6 as a specific instantiation of the multi-layer model.

Section 5.1.3

   Specifically, the available properties for a given resource are announced in the Information Resource Directory as a new capability called "ane-property-names". The selected properties are specified in a filter called "ane-property-names" in the request body, and the response includes and only includes the selected properties for the ANEs in the response.

Going from the first to second sentence, we switch from using the string "ane-property-names" to refer to the available properties announced in the IRD, to using it to refer to the properties that a client supplies in a path vector query for use in filtering the response results. To help the reader make this transition smoothly, I suggest rephrasing the transition, perhaps to something like "The properties selected by a client as being of interest are specified in the subsequent Path Vector queries using the filter called 'ane-property-names'."

Section 5.3

   1. Since ANEs may be constructed on demand, and potentially based on the requested properties (See Section 5.1 for more details). If

Incomplete sentence.

Section 6.2.4

   multipart response. Meanwhile, for persistent ANEs whose entity domain name has the format of "PROPMAP.ane" where PROPMAP is the name of a Property Map resource, PROPMAP is the defining resource of these

Is it better to say "name" or "ResourceID"?

_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto
