Hi, I'd like to offer some feedback and comments on draft 05 by way of review.

I've been following this draft for a while. My perspective is that of someone potentially needing to specify a VDS in compliance with the requirements of [4.3.1](https://www.ietf.org/archive/id/draft-ietf-cose-merkle-tree-proofs-05.html#name-registration-requirements).

My primary question while reading the draft has been: does this specification enable different VDSs to surface their particular benefits whilst at the same time promoting interoperability?

The tl;dr of this feedback is yes, it does. Much gratitude!

Were this draft to proceed to WG last call, it would, in my opinion, meet its stated goals of extensibility and concision. As an implementor of a transparency service that uses an alternate VDS, I am actively keen for this draft to go forward.

For additional context, the VDS in question is Merkle based, and commonly known as a "Merkle Mountain Range" (MMR). This has some similarities to the CT Merkle trees referenced in section 5, while providing for different trade-offs regarding replicability, prunability, monitoring and "proof" freshness. For the purposes of this review, I think these differences only matter insofar as they offer a fair lens for assessment, especially on the extensibility points.

Some questions on scope:

Is it the intent of the authors to specifically exclude constructions based on things other than hashes? Colloquially, "trees of math primitives"?

Do the authors have an opinion, or advice, on the range of applicable hashes, or do they specifically want to leave that out of scope? Pedersen and Poseidon have come up from time to time in implementations I have studied.

# Re 4. And, in particular, the two extension points

Structures and proofs are the defined extension points. I found it straightforward to map the proof formats. It is likely that, for the case of MMRs, I could re-use the structures exactly as is, but this may not surface the full advantages. It is likely an MMR based VDS could usefully extend or replace those structures.
It is also likely the case that for specific tree sizes (perfect powers of two), the verification algorithms of both could be used interchangeably.

There is enough alignment that I can see this draft leading to greater re-use in implementations, and a "lower tax" on maintainers as a result. The effort of writing specs is similarly assisted by this commonality, even if there is no attempt at co-ordination or re-use.

In the case where it is extended, it seems a consumer of the proof format could re-use the enc/dec handling of the common format, but forgo access to extended features.

## Minor critical points

### Don't see how to express re-use of parts of another spec.

4.3 Usage: "When the receipts parameter is present, the associated verifiable data structure and verifiable data structure proofs MUST match entries present in the registries established in" seems to expressly prevent "spec" reuse.

I am not sure I can express in the encoding "You can unpack as per RFC9162_SHA256, then you must verify in this way". I can believe it is simpler not to try to cross the streams in this way, and that doing so could provide its own foot-guns.

### The form of the registry entries

The case of CT is interesting because RFC 9162 specifically enables hash algorithm agility. The registry here appears to force an assignment for each instantiation of CT, e.g. RFC9162_SHA3_256 would require a new entry, and by implication *any* VDS that offers the same agility would follow this model.

Is that what is expected? I can see it could be a nightmare to try to "align mechanisms for agility" across independent drafts. If registration of the instances is a "low tax" thing, then this seems like the right choice for sure.

### Registration requirements

This is more of a question: is it the case that, rather than overly specifying the requirements of VDS specs in this draft, the authors are relying on the general processes of draft adoption to ensure quality?
For example, there is no specific guidance on, or mention of, test vectors, acceptable forms for algorithm description and so on. Section 5 essentially frames RFC 9162 as a "template". I think that is fine, but some vehicle for communicating expectations here could help streamline the process.

# Re 5 The example VDS spec based on RFC 9162

I read this section with the understanding that it is the part of the spec that represents how the extension points would be satisfied in other drafts.

## 5.2 Inclusion Proof

I was surprised at the use of 'int', especially for the leaf identifier. Does that mean CBOR major type 1 (negative integers) is permitted? Has there been specific controversy one way or another versus unsigned int, major type 0?

"index of leaf in tree": that wording can be confusing, because it is legitimate to talk about the leaves on their own as an array, versus all of the nodes including interiors.

E.g.,

          6
      2       5
    0   1   3   4

I presume 'leaf-index' is any of the elements [0, 1, 3, 4].

But it could mean ALL of the elements, [0, 1, 2, 3, 4, 5, 6], if inclusion of an interior node is permitted.

And it could also mean [0, 1, 2, 3], where the leaf nodes [0, 1, 3, 4] are accessed in series.

I appreciate this may be an act of syntactic pedantry, but depending on context it can matter a lot. Some art in the draft might go a long way.

"; path from leaf to current merkle root"

This is definitely a concept specific to the CT style of binary merkle trees. There, addition always creates a new root, and receipts must be proven against the root so produced. But this is not generally the case. It is a good example of why the receipt structure needs to be an extension point, even when the domain is constrained to binary merkle trees.
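
To make the 'leaf-index' readings above concrete, here is a minimal sketch in Python. It assumes the seven-node example is numbered in post-order, as is typical for an MMR; the function name `node_index_of_leaf` is hypothetical and not taken from the draft or from RFC 9162.

```python
def node_index_of_leaf(leaf_ordinal: int) -> int:
    """Map the n-th leaf (0-based) to its node index under post-order numbering.

    Assumption: nodes are numbered in the order they are added, with each
    interior node numbered immediately after its right child.
    """
    if leaf_ordinal < 0:
        raise ValueError("leaf ordinal must be non-negative")
    # A structure holding n leaves contains 2n - popcount(n) nodes in total,
    # so the next leaf to be added lands at exactly that node index.
    return 2 * leaf_ordinal - bin(leaf_ordinal).count("1")


# The "leaves as an array" reading: ordinals 0..3.
leaf_ordinals = [0, 1, 2, 3]

# The "index among all nodes" reading for the same four leaves.
node_indices = [node_index_of_leaf(i) for i in leaf_ordinals]

print(leaf_ordinals)  # [0, 1, 2, 3]
print(node_indices)   # [0, 1, 3, 4]
```

The same four leaves are identified by two different sets of integers, so a verifier that assumes one convention while the issuer used the other would silently check the wrong position for every leaf after the second.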

## 5.2.1

The strong encouragement "Profiles of proof signatures are encouraged to make additional protected header parameters mandatory, to ensure that claims are processed with their intended semantics", and the guidance on how to accomplish that, are well placed and helpful.

The requirement to make the root detached is an excellent example of the utility of detached payloads.

It is tempting to make "re-construction of the proof anchor" globally mandatory. But it is probably much clearer why this is "good" when it is placed in the context of a specific VDS.

# 7.2. Validity Period

I think there might actually be a problem here.

Do you mean simply the exp, nbf and so on in terms of the signature over the receipt?

Or do you mean how long the receipt holder can count on the receipt being verifiable against the log?

I think using exp, nbf and so on for the first case is clear and useful.

I don't think time based exp, nbf are suitable for proof validity periods for all VDSs. In the specific case of some MMR constructions, proof validity is effectively permanent. The property to convey is then how much, and how often, it is worth checking in on the log for consistency of inclusion for specific items of interest.

If this is simply about the signature, then it is fine, and the details I mention can be part of the VDS spec.

# Concerns on the general model

Only one. There seems to be a baked-in assumption that a single attestation covers the inclusion of a single element in the verifiable set, with the implication that every single "proof of inclusion" requires a distinct signature.

I think the draft permits a VDS where this is not true. Is this so, and is it intended?

Considering a merkle tree as an accumulator, where multiple proofs can be verified against the same accumulator state, an attestation on the accumulator would naturally be signed by the log operator.
In the case of an MMR, the accumulator state is just the "current peak list", and many set elements can have inclusion proven against a single accumulator state.

Especially where the log data is widely replicated, the individual proof paths need not be signed by the operator. Either they lead to the accumulator or they do not. The interpretation of that fact seems to be an up-stack matter. In that sort of situation, parties other than the transparency service may in fact want to independently attest to proof paths.

I don't think there is anything here that in any way "blocks" progress. But I'd appreciate feedback from the authors on intent.

# Notable absences

It seems common that "logs get frozen", and new logs get started. Reference to this process is made in RFC 9162. This draft doesn't appear to comment on how receipts can or should convey a relationship to a particular "epoch" or "period" of log operation.

This draft seems relatively silent on what consumers of receipts should expect. It seems to constrain itself to mechanical interoperability. I think this is good, as the rest seems like an "up stack" concern.

I think that means that COSE Receipts is strictly about the shape and "printer ink" for the receipts, and not so much about the interpretation and model for use. Is that fair?

Hope this is all helpful and constructive. Thanks again for all the work that has gone into this draft.

Robin

_______________________________________________
COSE mailing list -- [email protected]
To unsubscribe send an email to [email protected]
