Hi all,

Here are my comments, some of which overlap with what others have said:
* The name of the draft says "rfc6810-bis", but the XML <rfc> tag
doesn't have an obsoletes="6810" attribute. And I don't think it
should -- Section 7 has a normative reference to RFC6810 when
discussing downgrades to version 0, which isn't specified in this
document. So perhaps the title and abstract should be worded to
make it clear that this is not a replacement for RFC6810, but
rather a new version of the protocol specified in RFC6810. (Or
maybe this document should be worded as an update to RFC6810?)
(Also mentioned in <http://article.gmane.org/gmane.ietf.sidr/6871>.)
* The protocol is mostly query-response lockstep, but there are no
timeouts. If the cache is taking unreasonably long to respond to a
query, what should the router do? How long is unreasonably long?
If timeouts are added, should the router reset its timeout timer
for each response PDU (Cache Response, payload, and End of Data),
or only after it receives the End of Data PDU?
* Should the cache time out the router if the router doesn't send a
Query soon after connecting?
* Notify/Query race: What is supposed to happen if the router sees a
Serial Notify right after it sends a Serial Query or Reset Query?
This could happen if the two are sent at the same time -- the
messages will cross paths and the router might think that the
Serial Notify is an erroneous response to the query, and that the
subsequent Cache Response came out of the blue.
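One possible resolution, sketched in Python below, is for the router to treat any Serial Notify that arrives before the Cache Response as having crossed the query on the wire. This is only my assumption about a reasonable behavior, not something the draft says, and the function name and the use of PDU names as strings are made up for illustration.

```python
from typing import List, Tuple

def absorb_crossed_notifies(received: List[str]) -> Tuple[int, List[str]]:
    """Split PDUs received after a query into (crossed notifies, response).

    Leading Serial Notify PDUs are assumed to have crossed the query on
    the wire, so they are set aside instead of being treated as an
    erroneous response. This policy is an assumption, not draft text.
    """
    count = 0
    while count < len(received) and received[count] == "Serial Notify":
        count += 1
    return count, received[count:]
```

With this policy the router would still see the Cache Response as the first PDU of the actual response, and could decide afterwards whether the set-aside notify warrants another Serial Query.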
* The name "Session ID" is misleading. Section 2 clearly defines it,
but unless you pay attention to the definition it's easy to assume
that "session" refers to the transport session with the peer. I
would prefer a different name such as "Cache Instance ID", though
that name may be insufficient when you consider the protocol
upgrade problem brought up by David in
<http://article.gmane.org/gmane.ietf.sidr/6896>. Maybe something
like "Data Series ID"?
* In Section 5.1 (fields) under "Session ID", what is the definition
of "completely drop the session"? Do you mean send a fatal error
PDU, do a transport-layer disconnect, and let the router reconnect
(possibly to a more preferred cache)? Or do you mean send a Cache
Reset (cache->router) or Reset Query (router->cache) and continue
the existing transport session? Or is either reaction acceptable?
* What is the definition of "payload PDU", mentioned in Sections 5.3,
5.5, 8.1, 8.2, and 8.3? (I assume it means IPv4 Prefix, IPv6
Prefix, and Router Key, but it should be explicitly stated.)
* Suppose an IPv4 Prefix was announced in serial 5 and withdrawn in
serial 6, and a router does a Serial Query against serial 4. Is
it OK if the cache elides the announce/withdraw pair? MUST it? If
the cache doesn't elide the pair, it seems like the cache MUST send the payload PDUs in
serial number order, and the router MUST process the payload PDUs in
serial number order (which implies that the transport MUST provide
in-order delivery of the PDUs because the router has no idea which
PDUs correspond to which serial number).
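To make the elision question concrete, here is a minimal Python sketch of how a cache might collapse its change log into net changes for a Serial Query. The log layout, the `net_changes` name, and the tuple encoding are my own assumptions for illustration; the draft specifies none of this.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: record is (asn, prefix, max_len); a change is
# (serial, flag, record) with flag 1 for announce and 0 for withdraw.
Record = Tuple[int, str, int]
Change = Tuple[int, int, Record]

def net_changes(log: List[Change], since: int) -> Dict[Record, int]:
    """Collapse all changes with serial > since into one flag per record.

    Assumes the router's state at `since` is consistent with the log, so
    the first change to a record after `since` reveals whether the router
    already had it (first change is a withdraw) or not (an announce).
    """
    first: Dict[Record, int] = {}
    last: Dict[Record, int] = {}
    for serial, flag, record in sorted(log, key=lambda c: c[0]):
        if serial <= since:
            continue
        first.setdefault(record, flag)
        last[record] = flag
    # A net change exists only when the first and last flags agree:
    # announce...announce means absent -> present, withdraw...withdraw
    # means present -> absent. Mixed pairs cancel out and are elided.
    return {rec: flag for rec, flag in last.items() if first[rec] == flag}
```

If the cache sends only net changes like this, the order of the payload PDUs no longer matters to the router, which is why I think the elision rule and the ordering requirement need to be specified together.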
* Section 5.1 (fields) says that the serial number is the serial
number of the cache, but Section 5.3 (Serial Query) talks about
serial numbers as if they are properties of a PDU. Perhaps 5.3
should be worded like:
      The router sends a Serial Query to ask the cache for the
      announcements and withdrawals that have occurred since the
      Serial Number in the Serial Query.
Section 5.5 (Cache Response) has similarly problematic wording.
* The two sentences in 5.3 (Serial Query) paragraph 2 seem to
contradict each other in the case where there are no (net?)
changes: The first sentence suggests that the cache sends a Cache
Response (maybe followed by something?), while the second suggests
that it only sends an End of Data (no Cache Response). I think the
intention is for the cache to send a Cache Response immediately
followed by an End of Data. Is that correct?
* I don't think the set of valid responses to a Query (Reset or
Serial) is clearly specified. I think the intention is for these
to be the only valid responses:
  - Reset Query:
      * Cache Response followed by 0 or more payload PDUs followed
        by End of Data
      * Error Report
  - Serial Query:
      * Cache Response followed by 0 or more payload PDUs followed
        by End of Data
      * Error Report
      * Cache Reset
Is this correct?
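The list above can be written down as a small checker. This is only a Python sketch of my reading of the intent, not of anything the draft states: PDU types are represented by their names as strings, `valid_response` is a made-up name, and the `PAYLOAD` set encodes my assumption about what "payload PDU" means.

```python
from typing import List

# Assumed definition of "payload PDU"; the draft does not state it.
PAYLOAD = {"IPv4 Prefix", "IPv6 Prefix", "Router Key"}

def valid_response(query: str, pdus: List[str]) -> bool:
    """Return True if `pdus` is a complete, valid response to `query`."""
    if query not in ("Reset Query", "Serial Query"):
        raise ValueError(f"unknown query type: {query}")
    if pdus == ["Error Report"]:
        return True
    # Only a Serial Query may be answered with a Cache Reset.
    if query == "Serial Query" and pdus == ["Cache Reset"]:
        return True
    # Cache Response, zero or more payload PDUs, then End of Data.
    return (
        len(pdus) >= 2
        and pdus[0] == "Cache Response"
        and pdus[-1] == "End of Data"
        and all(p in PAYLOAD for p in pdus[1:-1])
    )
```

Note that this deliberately rejects a Serial Notify anywhere in the response, so it does not address the Notify/Query race.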
* Is there a particular reason for omitting a payload PDU count field
from the Cache Response PDU? If one were present, the router could
pre-allocate an appropriate amount of memory to handle the payload
PDUs (and perform additional sanity checks).
I guess a PDU count field would prevent an implementation from
opportunistically sending additional PDUs if there happened to be a
serial number bump in the middle of a Cache Response.
(Instead, the cache would have to follow the End of Data PDU with a
Serial Notify, which is almost as good.)
* Section 5.6 (IPv4 Prefix) mentions duplicates, but are redundant
entries OK? Examples:
  - {65536,192.0.2.0/24-26} and {65536,192.0.2.0/26-26} (the latter
    is redundant)
  - {65536,192.0.2.0/24-26} and {65536,192.0.2.0/24-25} (the latter
    is redundant)
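For clarity, this is what I mean by "redundant", as a small Python sketch. The `(asn, prefix, max_len)` tuple encoding and the `is_redundant` name are my own for illustration; the check uses the standard `ipaddress` module.

```python
import ipaddress
from typing import Tuple

# Hypothetical encoding: (asn, prefix, max_len),
# e.g. (65536, "192.0.2.0/24", 26) for {65536,192.0.2.0/24-26}.
Entry = Tuple[int, str, int]

def is_redundant(entry: Entry, other: Entry) -> bool:
    """True if everything `entry` authorizes is also authorized by `other`."""
    asn, prefix, max_len = entry
    o_asn, o_prefix, o_max_len = other
    net = ipaddress.ip_network(prefix)
    o_net = ipaddress.ip_network(o_prefix)
    return (
        asn == o_asn
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so check first.
        and net.version == o_net.version
        and net.subnet_of(o_net)
        and max_len <= o_max_len
    )
```

By this definition an exact duplicate is also redundant, so the question is really whether the duplicate rule in Section 5.6 is meant to extend to the covered-but-not-identical case.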
* The fixed-length SKI field doesn't permit algorithm changes. Note
that there has been some discussion about using SHA-256 for the SKI
and AKI fields for the RFC6487(bis) profile (I'm guessing that's
probably not going to happen, but still...).
(Also mentioned in <http://article.gmane.org/gmane.ietf.sidr/6869>.)
* Section 5.11 (Error Report) says that Error Reports are only sent
as responses to other PDUs. Why the restriction? This prevents a
side from raising a timeout error, and it prevents the cache from
raising an internal error if a problem is detected when it's time
to send a Serial Notify.
* If error reports are only sent as responses to other PDUs, how is
it possible for an Error Report to not be associated with the PDU
to which it is responding? (Section 5.11 paragraph 4)
* For version negotiation, what is supposed to happen if the router
starts with a PDU with version > 1? There is an Unsupported
Protocol Version error type, but nothing requires that to be sent.
* Suppose a router connects and issues a v0 Query. If the cache
doesn't support protocol v0, Section 7 says it MUST either
downgrade or disconnect. Can it issue an Error Report before
disconnecting? I would prefer a requirement that the cache MUST issue an
Unsupported Protocol Version Error Report before disconnecting.
* The second-to-last paragraph of Section 10 talks about the router
deleting a cache's data when the router has been unable to refresh
from that cache for twice the polling period (by default). Why not have the
time to delete equal the Expire Interval as specified in Section 6?
Thanks,
Richard
