Hi,
We have recently implemented chapter 7 and we have another list of
questions.
The questions are grouped by the protocol message being sent.
For convenience, we have also included the questions about chapter 7
that we sent in our Last Call comments. These issues are marked with
a star: '[issue ###]*'.
1. Store
======================================================================
Before describing the issues, we cite the checklist from section
7.4.1.1. It is the same list in exactly the same order, except that we
have numbered it. The two additional items, [*] and [**], appear in
section 7.4.1.1 after the description of the StoreReq structure (the
last paragraph on page 101).
[*] If the replica number is zero, then the peer MUST check that it is
responsible for the resource and, if not, reject the request.
[**] If the replica number is nonzero, then the peer MUST check that it
expects to be a replica for the resource and that the request sender is
consistent with being the responsible node (i.e., that the receiving
peer does not know of a better node) and, if not, reject the request.
1. The Kind-ID is known and supported.
2. The signatures over each individual data element (if any) are valid.
If this check fails, the request MUST be rejected with an
Error_Forbidden error.
3. Each element is signed by a credential which is authorized to write
this Kind at this Resource-ID. If this check fails, the request MUST be
rejected with an Error_Forbidden error.
4. For original (non-replica) stores, the StoreReq is signed by a
credential which is authorized to write this Kind at this Resource-ID.
If this check fails, the request MUST be rejected with an
Error_Forbidden error.
5. For replica stores, the StoreReq is signed by a Node-ID which is a
plausible node to either have originally stored the value or in the
replica set. What this means is overlay specific, but in the case of the
Chord based DHT defined in this specification, replica StoreReqs MUST
come from nodes which are either in the known replica set for a given
resource or which are closer than some node in the replica set. If this
check fails, the request MUST be rejected with an Error_Forbidden error.
6. For original (non-replica) stores, the peer MUST check that if the
generation counter is non-zero, it equals the current value of the
generation counter for this Kind. This feature allows the generation
counter to be used in a way similar to the HTTP Etag feature.
7. For replica Stores, the peer MUST set the generation counter to
match the generation counter in the message, and MUST NOT check the
generation counter against the current value. Replica Stores MUST NOT
use a generation counter of 0.
8. The storage time values are greater than that of any value which
would be replaced by this Store.
9. The size and number of the stored values is consistent with the
limits specified in the overlay configuration.
10. If the data is signed with identity_type set to "none" and/or
SignatureAndHashAlgorithm values set to {0, 0} ("anonymous" and "none"),
the StoreReq MUST be rejected with an Error_forbidden error. Only
synthesized data returned by the storage can use these values (see
Section 7.4.2.2)
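To make the issue references below easier to follow, here is a minimal
Python sketch of how we read the subset of checks for which this mail
names an error. All names (StoreContext, first_failure, the boolean
fields) are hypothetical and not taken from the draft. The order shown
is only one possibility (see [minor issue #1.0.1]), and the error for
[*] and [**] is left unspecified on purpose.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Placeholder: the draft only says "reject" for checks [*] and [**].
UNSPECIFIED = "unspecified (see issues #1.1.1 and #1.2.2)"


@dataclass
class StoreContext:
    """Hypothetical summary of what the receiver knows about a StoreReq."""
    replica_number: int = 0
    is_responsible: bool = True            # input to check [*]
    is_expected_replica: bool = True       # input to check [**]
    kind_known: bool = True                # check 1
    element_signatures_valid: bool = True  # check 2
    elements_authorized: bool = True       # check 3
    request_authorized: bool = True        # check 4 (original stores only)
    sender_plausible: bool = True          # check 5 (replica stores only)
    anonymous_signature: bool = False      # check 10


def first_failure(ctx: StoreContext) -> Optional[Tuple[str, str]]:
    """Return (check label, error name) for the first failing check, or None."""
    is_replica = ctx.replica_number != 0
    checks = [
        ("*", not is_replica and not ctx.is_responsible, UNSPECIFIED),
        ("**", is_replica and not ctx.is_expected_replica, UNSPECIFIED),
        ("1", not ctx.kind_known, "Error_Unknown_Kind"),
        ("2", not ctx.element_signatures_valid, "Error_Forbidden"),
        ("3", not ctx.elements_authorized, "Error_Forbidden"),
        ("4", not is_replica and not ctx.request_authorized, "Error_Forbidden"),
        ("5", is_replica and not ctx.sender_plausible, "Error_Forbidden"),
        ("10", ctx.anonymous_signature, "Error_Forbidden"),
    ]
    for label, failed, error in checks:
        if failed:
            return (label, error)
    return None
```

Checks 6 to 9 are left out of the sketch because they depend on the
stored state rather than on the request alone.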
RELOAD nodes send store requests in 4 situations:
Case 1. A node wishes to store its data in the overlay.
Case 2. A node that has received a store request sends store
requests to save replicas immediately after receiving the original request.
Case 3. The resource's replica set has changed. The node has to save new
replicas on new neighboring nodes.
Case 4. The node is no longer responsible for the resource. It has to
store the data on the new responsible node.
(We use the term Node rather than peer because the address of a peer is
called NodeId.)
[minor issue #1.0.1]
Is the order of checks important or can they be checked in any order?
[minor issue #1.0.2]
Is it an error to send a StoreReq without any StoreKindData, or a
StoreKindData without any StoredData inside?
Case 1.1. Node A stores its data on Node B, which is responsible for
ResourceId R.
--------------------------------------------------------------------
[issue #1.1.1]
Firstly, B has to perform check [*] - is B responsible for R - and
reject the request if not.
It is unclear, though, what "reject" means exactly. From our point of
view B should send Error_Forbidden to A, but this leads to [issue #1.5.1].
[minor issue #1.1.2]
What if A sends StoreKindData with generation_counter = 5 and B doesn't
have any value for this kind already stored (e.g. A stores data for this
kind for the first time)?
Would it be correct to accept the request, set 5 as initial
generation_counter for this kind and increase it to 6?
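The two possible answers can be sketched as follows. This is only an
illustration of the question: the function name, the `lenient` switch
and the assumption that an accepted store increments the counter by one
are ours, not the draft's.

```python
from typing import Optional


def accept_generation(requested: int, current: Optional[int],
                      lenient: bool) -> Optional[int]:
    """Return the counter value after the store, or None if rejected.

    `current` is None when nothing is stored yet for this kind.
    """
    if current is None:
        # The case asked about above: no value stored for this kind yet.
        if requested == 0 or lenient:
            return requested + 1   # assumption: accepted stores add one
        return None
    # Check 6: a non-zero counter must equal the current value.
    if requested != 0 and requested != current:
        return None
    return current + 1
```

With `lenient=True` a first store carrying generation_counter = 5 is
accepted and leaves the counter at 6. With `lenient=False` it is
rejected.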
[issue #1.1.3]*
We still don't understand the exact meaning of the following text at
the beginning of section 7 under "storage_time":
>data may be stored in a single transaction, rather than querying
>for the value of a counter before the actual store.
Which counter is meant here?
>If a node attempting to store new data in response to a user
>request (rather than as an overlay maintenance operation such as
>occurs when healing the overlay from a partition) is rejected with
>an Error_Data_Too_Old error, the node MAY elect to perform its
>store using a storage_time that increments the value used with the
>previous store.
We don't understand what "using a storage_time that increments the
value used with the previous store" actually means. Is it assumed here
that the requesting node already has the storage_time of the previous
store available, or must it send a StatReq first?
In the former case it could simply check whether localtime > storage_time
before sending the request. By what amount should the value be
incremented?
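One possible reading, written as a short Python sketch. It assumes the
node remembers the storage_time of its own previous store and treats
storage_time as a plain integer. The function name and the `step`
parameter are hypothetical, and the size of the increment is exactly
what we are asking about.

```python
def retry_storage_time(previous_storage_time: int, local_time: int,
                       step: int = 1) -> int:
    """Pick a storage_time for the retry after Error_Data_Too_Old.

    Uses the local clock when it is already ahead of the previous store,
    otherwise bumps the previous value by `step` (an assumed amount).
    """
    return max(local_time, previous_storage_time + step)
```

If the node does not know the previous storage_time, it would first
have to learn it somehow (e.g. via StatReq), which is the second half
of the question.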
Case 1.2. Node B sends store requests to save replicas on nodes C and D
immediately after receiving a store request from Node A.
-----------------------------------------------------------------------
[issue #1.2.1]*
What is the exact time sequence of the following events?
1. A sends StoreReq to B.
2. B sends StoreAns to A, listing C and D as replicas (in
StoreKindResponse).
3. B sends StoreReq to C; B sends StoreReq to D.
4. C sends StoreAns to B; D sends StoreAns to B.
and, with reference to the definition of replicas in 7.4.1.2:
1. A sends StatReq to C; A sends StatReq to D.
Does the time sequence in section 10.4 apply to any topology plugin, or
is it CHORD-RELOAD specific? Section 7.4.1.2 defines replicas as "the
list of peers at which the data was/will be replicated".
How should A react if storing replicas fails on C and/or D? Should it
retry storing the data? Or should it inform the usage and let it handle
this case?
[issue #1.2.2]
In this case, the replica number in the StoreReq is not 0 and the
replica store can fail due to check [**]. What does "reject the request"
mean in this case, and how should the node that sent the StoreReq react
to this failure?
D can fail such a request if some node D* has recently joined the
overlay near D and B doesn't know about D*. If it is critical to save
the required number of replicas, B should request the routing table of
D, update its own routing table and store the replica on D*. If it is
not critical, then B can wait for the next update from its neighbors
and send a store request as described in section 10.7.3.
[minor issue #1.2.3]
Is it possible to define different replication strategies for different
kinds?
If not, why does a node have to send the list of replicas for each kind
rather than for the entire request? (There is the "replicas" field in
StoreKindResponse.)
[minor issue #1.2.4]
What happens if B sends a StoreReq with a generation_counter of 0?
Should C and D reject it? What error should they send?
Case 1.3. R's replica set has changed; B has to save a new replica on
the new node D* instead of D.
---------------------------------------------------------------------
[issue #1.3.1]
We still don't understand how CHORD-RELOAD should assign replica_numbers
in this case. Should it be the next replica number, 3, because it is the
3rd replica store that was sent, or should it be 2, because D had
replica number 2?
We still believe that assigning replica_numbers sequentially will lead
to much confusion and that it is better to just use replica_number "1"
(or "42") for all replicas in CHORD-RELOAD.
Case 1.4. Node A is no longer responsible for R. It has to migrate data
to Node A*.
-----------------------------------------------------------------------
[issue #1.4.1]
What is the value of replica_number in this case? Is it "0", because A*
is responsible for R, or is it some other replica number (see
[issue #1.3.1])?
If we use a replica_number of 0, then check [4] will fail, because A is
not authorized to write at R.
If we use a replica_number that is non-zero, then A* must copy the
generation_counter of A, and then the first paragraph on page 104
doesn't make much sense.
1.5. Common issues
----------------------------------------------------------------------
[issue #1.5.1]
Error_Forbidden is sent by node B if any of checks [2], [3], [4], [5]
or [10] fails, and checks [*] and [**] are also good candidates for
sending this error. However, no particular format is defined for
Error_Forbidden. This means that the error message is human-readable
but not machine-readable. In turn, the node that receives this error
cannot react to it, because it doesn't know exactly why its request was
rejected.
As we understand it:
Check [2] may fail if node A sends a new replica store request and the
certificate of one of the data elements has expired. Since nodes are not
required to synchronize clocks, the certificate may still be valid from
A's point of view. In this case A needs some meaningful error message to
react to this error.
The reaction to checks [3] and [4] must be usage-specific.
The reaction to check [5] should be topology-plugin specific.
And finally, a failure of check [10] is a programming error at the
sender, and it can only be logged and reported.
Checks [*] and [**] can fail because nodes have joined or left. These
failures can be corrected if the sender is notified properly.
[minor issue #1.5.2]
Section 7:
>Any attempt to store a data value with a storage time before that of a
>value already stored at this location MUST generate a Error_Data_Too_Old
>error.
Section 7.4.1.1:
>The storage time values are greater than that of any value which would
>be replaced by this Store.
So is it before (accept if new_storage_time >= old_storage_time) or
greater (new_storage_time > old_storage_time)?
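The difference between the two texts only shows up when the two
storage times are equal. A small Python sketch of both readings (the
function names are ours):

```python
def too_old_per_section_7(new_storage_time: int, old_storage_time: int) -> bool:
    """Section 7: Error_Data_Too_Old only if the new time is *before* the old."""
    return new_storage_time < old_storage_time


def acceptable_per_7_4_1_1(new_storage_time: int, old_storage_time: int) -> bool:
    """Section 7.4.1.1: the new time must be *greater than* the old."""
    return new_storage_time > old_storage_time
```

For new_storage_time == old_storage_time the first function reports no
error, while the second one does not accept the store. The draft does
not say which error, if any, applies in that case.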
[issue #1.5.3]
Section 7.3.4:
>In the NODE-MULTIPLE policy, a given value MUST be written (or
>overwritten) if and only if the signer’s certificate contains a Node-ID
>such that H(Node-ID || i).
Should it be obvious that this 'i' is represented as a 32-bit unsigned
integer in network byte order?
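To show why the encoding of 'i' matters, here is a minimal Python sketch
of the computation under our guess. Everything beyond "H(Node-ID || i)"
is an assumption made for illustration: SHA-1 as the overlay hash, a
16-byte Node-ID, truncation of the digest to the first 16 bytes, and of
course the 32-bit network-byte-order encoding itself.

```python
import hashlib
import struct


def encode_index(i: int) -> bytes:
    """Encode i as a 32-bit unsigned integer in network byte order (our guess)."""
    return struct.pack("!I", i)


def node_multiple_resource_id(node_id: bytes, i: int,
                              id_length: int = 16) -> bytes:
    """Compute H(Node-ID || i), truncated to id_length bytes.

    SHA-1 and the truncation are assumptions, not taken from section 7.3.4.
    """
    digest = hashlib.sha1(node_id + encode_index(i)).digest()
    return digest[:id_length]
```

Two implementations that encode 'i' differently (e.g. as a single byte
or in little-endian order) would compute different Resource-IDs for the
same Node-ID and would reject each other's stores.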
[minor issue #1.5.3]
Sections 7.3.2 and 7.3.3:
>In the NODE-MATCH policy, a given value MUST be written (or
>overwritten) if and only if the signer’s certificate has a specified
>Node-ID which hashes (using the hash function for the overlay) to the
>Resource-ID for the resource and that Node-ID is the one indicated in
>the SignerIdentity value cert_hash.
We think it should be defined as "and that Node-ID is the one that has
signed the StoredValue (or StoreReq), as indicated by SignerIdentity".
First, we think cert_hash_node_id is also a valid choice. Second,
SignerIdentity types can be extended.
[minor issue #1.5.4]
If the error is called Error_Generation_Counter_Too_Low, what should
an implementation do in the opposite case, when the counter is too high?
[minor issue #1.5.5]*
What is the proper reaction to a StoreReq that contains two or more
StoreKindDatas for the same KindId?
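Detecting this situation is simple; the open question is only what to
do afterwards. A minimal Python sketch (the function name is ours, and
Kind-IDs are treated as plain integers):

```python
from collections import Counter
from typing import Iterable, List


def duplicate_kinds(kind_ids: Iterable[int]) -> List[int]:
    """Return the Kind-IDs that occur more than once in one StoreReq."""
    counts = Counter(kind_ids)
    return sorted(kind for kind, n in counts.items() if n > 1)
```

A receiver could reject the whole request when this list is non-empty,
or process the StoreKindDatas in order. The draft does not say which.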
2. Fetch
======================================================================
[issue #2.1]
Does the node have to check whether it is responsible for this
resource, or whether it is in the replica set for this resource?
[issue #2.2]
What is sent if there is no data stored for the requested resource?
An empty FetchAns, a FetchAns with a FetchKindResponse for each kind and
empty values in it, or Error_Not_Found?
What is sent if there is data for the requested resource, but no data
for the requested kind?
What generation_counter should be sent in FetchKindResponse in this
case? '0' seems to be a good choice. Can we just send a FetchKindResponse
with a generation_counter of 0 and no values, or should we generate all
requested values?
[minor issue #2.3]
Are the storage time and lifetime of synthesized values also set to 0?
3. Stat
=======================================================================
[issue #3.1]
Do all the considerations in Fetch also apply here? In particular:
1. Should the node check that it is responsible for the resource or
that it is in the replica set for it? (Please consider [issue #1.2.1]
here.)
2. What should be sent if the requested resource_id is not found? What
happens if there is no data for this kind at this resource?
3. Does the implementation have to process generation_counter as in
FetchReq, i.e. not send MetaData if the generation counter in the
request matches the stored one?
4. Does the storage have to synthesize values for missing array indexes
and nonexistent dictionary keys as in FetchReq?
4. Find
=======================================================================
[issue #4.1]
What exactly does "closest to R" mean, and how are 1+R_n and
nearest(1+R_n) defined? Does the TopologyPlugin have to provide a valid
order relation or metric for ResourceIds?
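For a ring-shaped identifier space, one candidate definition is
clockwise distance modulo the size of the ring. The Python sketch below
shows that reading only. The 128-bit ring size, the direction of the
distance and the function names are our assumptions; a symmetric
distance would be an equally plausible reading of "closest".

```python
from typing import List, Optional

RING_SIZE = 2 ** 128  # assumption: 128-bit identifiers


def clockwise_distance(a: int, b: int) -> int:
    """Distance from a to b when walking clockwise around the ring."""
    return (b - a) % RING_SIZE


def nearest(target: int, stored_ids: List[int]) -> Optional[int]:
    """Return the stored Resource-ID at or clockwise after `target`."""
    if not stored_ids:
        return None
    return min(stored_ids, key=lambda r: clockwise_distance(target, r))
```

Under this reading, iterating with nearest(1 + R_n) walks the stored
Resource-IDs in ring order and wraps around at the end of the
identifier space. Other topology plugins would need their own metric.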
[issue #4.2]
Would it make sense to define wildcard resource_id to find any resource
that has the requested kind on this node?
[issue #4.3]
It is unclear what a resource_id of '0' is. Is it a resource_id with
zero length on the wire, or a resource_id whose value is a plausible 0
for the current topology (e.g. the same value as the invalid NodeId)?
[issue #4.4]
Should "kind_id is not known" be interpreted as "kind_id is not defined
in overlay-config" or as "there is no resource which stores this kind on
this node"? In the former case, a resource_id of '0' is overloaded: it
means either an unknown kind or a kind for which there is no entry on
this node. Please also consider [issue #6.4].
5. Defining new Kinds
========================================================================
[issue #5.1]
Where exactly does the textual form appear?
In what units is max-size defined?
[issue #5.2]
Section 7.4.4 says that some kinds cannot be used with Find. Is there
a config entry for this?
6. Common issues
========================================================================
[issue #6.1]*
For each request in the data storage protocol, the receiving node should
check whether the requested kind is present in the configuration file.
But the requirement is different for each protocol message:
StoreReq: send back Error_Unknown_Kind. The error message must contain
all kinds the receiver didn't recognize, and the sender MUST generate a
config-update after receiving this message.
FetchReq: "Implementations SHOULD reject requests corresponding to
unknown Kinds unless specifically configured otherwise." (As always,
what does "reject" mean?)
StatReq: no hint is given on how to behave.
FindReq: "If a Kind-ID is not known, then the corresponding Resource-ID
MUST be 0."
[issue #6.2]
Would you consider moving the requirement about signature computation on
arrays from 7.4.2.2 to 7.1, or at least referencing it in 7.1? In its
current place it is really easy to overlook.
[minor issue #6.3]* (this is in common issues because value is included
in signature)
If some node wishes to delete its data, it should send StoredDataValue with
exists = False
value = {} (0 length)
What must a receiving node do if a non-empty value is given anyway?
Should it ignore the value, or reject it?
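The two options can be sketched in Python as follows. The dataclass is
a simplified stand-in for StoredDataValue with only the two fields
mentioned above, and the `strict` switch is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoredDataValue:
    """Simplified stand-in: only the two fields discussed above."""
    exists: bool
    value: bytes


def delete_is_acceptable(v: StoredDataValue, strict: bool) -> bool:
    """Return True if the value passes under the chosen policy."""
    if v.exists:
        return True            # not a delete, nothing to check here
    if len(v.value) == 0:
        return True            # well-formed delete
    # exists = False but a value is present: reject only in strict mode.
    # In lenient mode the receiver must still keep the received bytes,
    # because the value is included in the signature.
    return not strict
```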
[issue #6.4]
ProbeReq/Ans defines one of the ProbeInformationTypes, num_resources, as:
>indicates that the peer should Respond with the number of resources
>currently being stored by the peer.
It is unclear what exactly "currently being stored" means in relation
to non-existing values. In particular, say Node A has stored some value
at ResourceId R with a lifetime of 5 hours, and removes it after 1 hour
(as defined in 7.4.1.3). That means R will contain one StoredData with
StoredDataValue::exists set to false and a remaining lifetime of at
least 4 hours. If this value is the only value R contains, does R count
as "currently being stored"?
Should this resource be returned by Find?
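The two possible counting rules, as a minimal Python sketch. The store
layout (a mapping from Resource-ID to the list of "exists" flags of the
values kept there) and the function name are assumptions made for the
example.

```python
from typing import Dict, List


def num_resources(store: Dict[int, List[bool]], count_tombstones: bool) -> int:
    """Count resources; `store` maps a Resource-ID to its values' exists flags."""
    if count_tombstones:
        # Any resource holding at least one entry counts, even if every
        # entry is a non-existing value.
        return sum(1 for flags in store.values() if flags)
    # Only resources with at least one existing value count.
    return sum(1 for flags in store.values() if any(flags))
```

For a store where R holds only a removed value and another resource
holds one live value, the first rule answers 2 and the second answers 1.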
--- ---
Regards,
Roland and Polina
_______________________________________________
P2PSIP mailing list
https://www.ietf.org/mailman/listinfo/p2psip