Re: [P2PSIP] AD evaluation: draft-ietf-p2psip-diagnostics-17

Songhaibin (A) Thu, 20 Aug 2015 20:23:14 -0700

Hi Alissa,

Thank you for the comments. Please find my explanations in line with [haibin].

>Thanks. I did have one further thought for Section 8. It strikes me that the 
>combination of diagnostic elements that this spec makes available could easily 
>be used to fingerprint a peer in an overlay where peers may otherwise be 
>attempting to remain anonymous. Although defenses against such fingerprinting 
>are probably hard to come by (short of disallowing access to the diagnostics 
>via the config file), the threat is worth noting at least.

[haibin]Actually we imply your concern with the sentence "It should also be 
noted that attackers can use diagnostics to analyze overlay information to 
attack certain key peers." But we could elaborate more in this section. 

From: Alissa Cooper [mailto:[email protected]] 
Sent: Wednesday, August 19, 2015 12:04 AM
To: David Bryan
Cc: p2psip; [email protected]
Subject: Re: [P2PSIP] AD evaluation: draft-ietf-p2psip-diagnostics-17

On Aug 17, 2015, at 8:54 AM, David Bryan <[email protected]> wrote:

Thanks for the comments. I'll circle back with the team and will revise.

Thanks. I did have one further thought for Section 8. It strikes me that the 
combination of diagnostic elements that this spec makes available could easily 
be used to fingerprint a peer in an overlay where peers may otherwise be 
attempting to remain anonymous. Although defenses against such fingerprinting 
are probably hard to come by (short of disallowing access to the diagnostics 
via the config file), the threat is worth noting at least.

I have to check, but I actually think the first few versions of this (when it 
was individual) *may* be pre-5378, but will double check.

Ah ok.

[haibin] It is true that the first version of the individual draft is pre-5378.

Alissa

On Fri, Aug 14, 2015 at 6:33 PM, Alissa Cooper <[email protected]> wrote:
I have reviewed this document in preparation for IETF last call. Before 
proceeding to last call, I have some comments and questions that I'd like to 
discuss. I've also included a list of nits that should be resolved together 
with any last call comments.

== Substantive comments and questions ==

= General 

I think this document should be listed as updating RFC 6940 in the document 
header.

The document contains a disclaimer for pre-RFC5378 work, which I assume is in 
error. If using xml2rfc to generate the text, using the most up-to-date version 
of xml2rfc to create the next version should fix this.

= Section 4.3:

"There have been proposals that RouteQuery and a series of Fetch
   requests can be used to replace the PathTrack mechanism, but in the
   presence of churn such an operation would not, strictly speaking,
   provide identical results, as the path may change between RouteQuery
   and Fetch operations. (although obviously the path could change
   between steps of PathTrack as well)."

This text is confusing. It doesn't explain why PathTrack is being defined 
rather than peers using RouteQuery plus Fetch. Furthermore, the document does 
not explain how peers are supposed to interpret the results of the PathTrack 
method in the event that the route does change in between steps. This seems 
like a substantial omission.

[haibin] This is a good question. As the route change is a problem (with little 
probability), we would like to retrieve diagnostics info in an efficient way, 
so one simple message is preferred. And you are right, it does not change the 
truth that route can change  in between steps. We need re-wording the section. 
The implication of route change is that, the result of the diagnostics cannot 
be trusted with one hundred percent, which should be noted to the users that 
perform the diagnostics.

In general, this document provides very little information about how peers are 
expected to interpret and use the diagnostic information. I understand that 
peers can do whatever they want with it, but in several places the fact that 
this is not explained makes the reasoning behind the protocol design decisions 
opaque. I would suggest providing a little more context, at the very least by 
moving the examples into the body of the draft somewhere. 

[haibin] We can add some more explanations with diagnostics info 
interpretation. 

= Section 4.4:

I'm a little confused about the values given for the error codes. Do you want 
IANA to chose these specific values? Or are they just there as placeholders? 
The customary way of doing placeholders would be as follows:

Code Value         Error Code Name
      [TBD1]               Underlay Destination Unreachable
      [TBD2]               Underlay Time exceeded
      [TBD3]               Message Expired
      [TBD4]               Upstream Misrouting
      [TBD5]               Loop detected
      [TBD6]               TTL hops exceeded

These placeholders would then get used throughout the document.

[haibin] they are place holders. 

This section also says:

"In addition, this document introduces several types of error information in 
the error_info field in the case of Code 0x65. ... Here are some examples for 
the error info. ... The error_info field values of the Code 0x66 to 0x70 are to 
be application specific and defined by the particular overlay."

I'm not sure what is meant by all of this text. As defined in RFC 6940, 
error_info is an optional implementation-specific byte string. If you are 
intending to require implementations to include specific text in the error_info 
for the first error code, then you need to explain under what conditions the 
specific text needs to be included for each string. And the text about the 
other codes seems redundant with how error_info is already defined in RFC 6940.

[haibin] if a peer receives a diagnostic request found that it cannot reach its 
next hop due to underlay problem, it needs to explain with error_info what 
happened. I do not know what you mean by redundant with RFC 6940. They are 
different errors.

= Section 5.2:

"ext_length : the length of the returned DiagnosticInfo information
      in bytes.  If the value is greater than or equal to 1, then some
      extended diagnostic information is requested."

This is defining the diagnostic response - why would the extension be 
requesting something? And if it is possible to use an extension in that way, 
the expected behavior from the initiator when it receives such an extension 
needs to be described, or at least it needs to be noted that a diagnostic 
response could provoke a request back to the initiator.

[haibin] it is possible to reword it as "....then some extended diagnostic 
information is responded."

Also, in the struct the diagnostic info is called "diagnostic_info_contents" 
but in the text it is called "diagnostic_information." These should be the same 
I think.

[haibin] Yes. It will be corrected.

= Section 5.3:

I'm concerned about the way STATUS_INFO is defined. First, why are there 16 
different levels of congestion status? What are they supposed to signify? If 
there is no standardized way of measuring congestion defined, how is a node 
supposed to interpret STATUS_INFO received from different peers? That is, 
couldn't one peer's level 7 actually mean it is less congested than another 
peer's level 3, and won't that make the information somewhat meaningless to the 
initiator? Maybe this would be clearer if you could explain what you expect an 
initiator to do with the information about 16 different levels of congestion.

[haibin] Your concern makes sense. Actually it is intended to leave it for 
overlay implementers or a future standard to define the congestion level 
algorithms . This draft just defines the format and reserves the bits for their 
use. As the section says " This document does not provide a specific method for 
congestion, leaving this decision to each node.  One possible option for a node 
would be to take its CPU/memory/bandwidth usage percentage in the past 600 
seconds and normalize the highest value to the range [0x00, 0x0F].  A future 
draft may define an objective measure or specific algorithm for this." I think 
we should change the node decision to overlay decision. And an overlay 
implementer may choose to not use all that 16 values.

SOFTWARE_VERSION doesn't really seem like diagnostic information. While all of 
the other fields may be constantly changing and therefore it would make sense 
to inquire about them in a diagnostic fashion, SOFTWARE_VERSION seems more like 
data that nodes could register or exchange when they join the overlay. Why is 
it included as a diagnostic?

[haibin] Diagnostics usually take place when there are failures or abnormal 
behaviors. Software version incompatibility might be a cause.

Also, doesn't the SOFTWARE_VERSION type need a length or a specified delimiter 
that indicates the end of the string? Otherwise, how is the recipient supposed 
to know when to stop parsing it?

[haibin] Will add it.

The definitions of EWMA_BYTES_SENT and EWMA_BYTES_RCVD seem problematic.

sent = alpha x sent_present + (1 - alpha) x sent
rcvd = alpha x rcvd_present + (1 - alpha) x rcvd

As written these equations are not right because sent/rcvd appear on both 
sides. It would be clearer to use last_sent and last_rcvd or some such on the 
right-hand side of these equations. But this begs some bigger questions:

- Does this place a requirement on all nodes implementing this specification to 
have to calculate these values every 5 seconds?
- How are the values calculated the first time?
- How was the value of 5 seconds chosen?

[Haibin] This is incorporated from the early RELOAD document. Maybe some author 
of the RELOAD protocol can give a better explanation.

Finally, I have trouble understanding why only some of the types are subject to 
access control as described in Section 7. It seems to me that in some 
situations all of the fields defined here could be considered sensitive by 
certain nodes or on certain networks. What is the justification for only making 
some of them subject to access control?

[haibin] Actually it says different level of access control can be made. For 
example, diagnostic information A can be accessed by node 1 and 2, but 
diagnostic information B can only be accessed by node 2. We can add some 
explanations as it seems not clear to readers.

= Section 6.1:

"The destination field MUST be set to the desired destination, which MAY be 
either a NodeId or ResourceID but SHOULD NOT be the broadcast NodeID."

What is the expected behavior if a Ping or a PathTrack does get sent to the 
broadcast NodeID? Or should these requirements really be MUST NOTs?

[haibin] I guess they should be MUST NOTs. 

= Section 6.4:

"However, for a single hop
   measurement, the traditional measurement methods MUST be used instead
   of the overlay layer diagnostics methods."

What is meant by the traditional measurement methods?

[haibin] It mainly means IP layer ping.

= Section 9:

Please only use either RFC-XXXX or RFC-AAAA but not both as placeholders.

[haibin] RFC-AAAA is RFC 6940. RFC-XXXX is the placeholder. They will be 
corrected.

Please also use [TBDX] for values you expect to be allocated by IANA.

[haibin] ok.

== Nits ==

= General

The document seems to repeatedly explain that it is doing things "as specified 
in RFC6940." I don't think this necessary in many cases. It's worth doing a 
pass through and removing the places where it's obvious that this spec is 
building on RFC 6940.

[haibin] ok.

= Section 4.1:

OLD
The extensions strictly follow RELOAD
   specification on the messages routing, transport, NAT traversal etc.

NEW
The extensions strictly follow how RELOAD specifies message routing, transport, 
NAT traversal, and other RELOAD protocol features.

= Section 4.2.1:

s/as specified in Table 4 and specified in the RELOAD/as specified in Table 4/

s/new requests of the the/new requests of the/

= Section 4.3:

OLD
We define a simple PathTrack method for retrieving diagnostic
   information iteratively.  The mechanism defined in this document
   follows the RELOAD specification, the new request and response
   message use the message format specified in RELOAD messages.  Please
   refer to the RELOAD [RFC6940] for details of the protocol.

   The operation of this request is shown below in Figure 2.

NEW
We define a simple PathTrack method for retrieving diagnostic
   information iteratively.  The operation of this request is shown below in 
Figure 2.

= Section 4.3.1.2:

s/after that/after receiving a PathTrackAns where the next_hop node ID equals 
the responding node ID/

= Section 4.3.2.1:

s/PathTrack Response/PathTrack response/

= Section 4.4:

s/code. and we define/code. We define/

= Section 5:

s/Path_track methods/PathTrack methods/

= Section 5.1:

s/Note that is NOT/Note that it is not/

= Section 5.2:

s/Note that is NOT/Note that it is not/

s/consists one or more/consists of one or more/

= Section 6.1:

s/`MAY/MAY/

= Section 6.2:

s/return the response the initiator node/return the response to the initiator 
node/

s/The peer SHOULD/The intermediate peer SHOULD/

[haibin] Thanks again.

BR,
-Haibin Song

_______________________________________________
P2PSIP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/p2psip

_______________________________________________
P2PSIP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/p2psip

Re: [P2PSIP] AD evaluation: draft-ietf-p2psip-diagnostics-17

Reply via email to