I'm currently deep in the weeds of metadata resolution and validation, given input issuer and resource identifiers. As with many things OAuth these days, this is being driven by the needs of MCP, but also applies to any other "dynamic" protocols that make use of discovery.
In implementing this, I've identified some cases that are unclear in the specifications. These cases revolve around identifiers that end in "/".

Starting with authorization server metadata, section 3.1 of RFC 8414 states:

> If the issuer identifier value contains a path component, any
> terminating "/" MUST be removed before inserting "/.well-known/" and
> the well-known URI suffix between the host component and the path
> component.

Let's say I have an input issuer identifier of "https://example.com/issuer1/". Because this path component has a terminating "/", I would expect the metadata request to be (having removed the terminating "/" from the path component):

```
GET /.well-known/oauth-authorization-server/issuer1 HTTP/1.1
Host: example.com
```

Note that this is the same request that would be made if the identifier _did not_ have a terminating "/".

Now, in order to pass validation required by section 3.3 of RFC 8414, the metadata needs to respond with an issuer identifier that _preserves_ the terminating "/".

```
{
  "issuer": "https://example.com/issuer1/"
}
```

However, the metadata request was "lossy" by removing this slash, and creates ambiguity in situations where the input _did not_ have a terminating "/".

The same holds for an issuer identifier that has a sole "/" path component. "https://example.com/" results in a request to:

```
GET /.well-known/oauth-authorization-server HTTP/1.1
Host: example.com
```

Now, turning attention to protected resource metadata. Section 3.1 of RFC 9728 states:

> If the resource identifier value contains a path or query component, any
> terminating slash (/) following the host component MUST be removed before
> inserting /.well-known/ and the well-known URI path suffix between the host
> component and the path and/or query components.

NOTE: This is semantically different from RFC 8414, as it removes the terminating "/" _following the host component_, as opposed to the terminating slash of the path component (assuming I'm interpreting the specification correctly).

Let's say I have an input resource identifier of "https://example.com/resource1/". Because this path component has a terminating "/", I would expect the metadata request to be (having removed the leading "/" but _preserving_ the terminating "/"):

```
GET /.well-known/oauth-protected-resource/resource1/ HTTP/1.1
Host: example.com
```

From this perspective, the request is non-lossy, and allows the metadata response to be crafted in a way that is unambiguous with respect to the input identifier.

However, ambiguity creeps in again when we have a resource identifier with _solely_ a terminating "/", such as "https://example.com/". In such a case, I would expect the metadata request to be (after having removed the terminating slash following the host component):

```
GET /.well-known/oauth-protected-resource HTTP/1.1
Host: example.com
```

Once again, to pass validation required by section 3.3 of RFC 9728, the metadata needs to respond with:

```
{
  "resource": "https://example.com/"
}
```

But the request is now lossy, and the server does not know whether the input identifier contained a trailing slash or not.

These cases often arise when the input is user-entered and servers accept paths where trailing slashes are optional, responding identically to either request. This is particularly common when serving from "root" resources, as users often enter URLs with bare hostnames.

I'm posting here to initiate a discussion and gather consensus on what the expected behavior is (or to be informed that I am reading the specifications wrong). After consensus, I'd like to have a clear set of test cases that validate the expected behavior in all these scenarios, in order to address the lack of such detail in the specifications.

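To make the two readings above concrete, here is a minimal Python sketch of both transformations. It encodes only the interpretation described in this message, not a settled reading of either RFC. The function names are hypothetical, and appending the query component unchanged after the path is an assumption based on the RFC 9728 wording quoted above.

```python
from urllib.parse import urlsplit


def as_metadata_url(issuer: str) -> str:
    """RFC 8414 section 3.1, as read above: strip the terminating "/" of the path."""
    u = urlsplit(issuer)
    path = u.path[:-1] if u.path.endswith("/") else u.path
    return f"{u.scheme}://{u.netloc}/.well-known/oauth-authorization-server{path}"


def resource_metadata_url(resource: str) -> str:
    """RFC 9728 section 3.1, as read above: drop only a path that is solely "/"."""
    u = urlsplit(resource)
    path = "" if u.path == "/" else u.path
    # Assumption: the query component, if any, is carried over unchanged.
    query = f"?{u.query}" if u.query else ""
    return f"{u.scheme}://{u.netloc}/.well-known/oauth-protected-resource{path}{query}"


if __name__ == "__main__":
    # Print the request URL each input identifier maps to under both readings.
    for identifier in (
        "https://example.com",
        "https://example.com/",
        "https://example.com/issuer1",
        "https://example.com/issuer1/",
    ):
        print(identifier)
        print("  8414:", as_metadata_url(identifier))
        print("  9728:", resource_metadata_url(identifier))
```

Under this sketch, the RFC 8414 reading collapses each pair of identifiers that differ only by a trailing "/" onto one request, while the RFC 9728 reading collapses only the bare-host pair. Those collisions are exactly the inputs for which the server cannot tell which identifier to echo back for validation.
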
I'm seeing SDKs pop up with varying behavior, and given the security-critical nature of these validations, I think it's important to resolve this to ensure both compatibility and security among implementations.

I have some thoughts on how the wording could be modified to make the algorithm and expected results more clear, but I'll wait on that until others have had a chance to comment.

It's also unfortunate that the two specifications seem to define different algorithms (again, assuming I am interpreting them correctly). It would be preferable if they were identical, but that ship has likely sailed.

Thanks!

Jared Hanson
Co-Founder
ja...@keycard.ai
www <https://www.keycard.sh> | linkedin <https://www.linkedin.com/in/jaredhanson> | github <https://github.com/jaredhanson>
_______________________________________________
OAuth mailing list -- oauth@ietf.org
To unsubscribe send an email to oauth-le...@ietf.org