I'm currently deep in the weeds of metadata resolution and validation,
given input issuer and resource identifiers.  As with many things OAuth
these days, this is being driven by the needs of MCP, but also applies to
any other "dynamic" protocols that make use of discovery.

In implementing this, I've identified some cases that are unclear in the
specifications.   These cases revolve around identifiers that end in "/".

Starting with authorization server metadata, section 3.1 of RFC 8414 states:

> If the issuer identifier value contains a path component, any
> terminating "/" MUST be removed before inserting "/.well-known/" and
> the well-known URI suffix between the host component and the path
> component.

Let's say I have an input issuer identifier of "https://example.com/issuer1/".
Because this path component has a terminating "/", I would expect the
metadata request to be (having removed the terminating "/" from the path
component):

```
GET /.well-known/oauth-authorization-server/issuer1 HTTP/1.1
Host: example.com
```

Note that this is the same request that would be made if the identifier
_did not_ have a terminating "/".
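As a minimal sketch of that construction, the following Python shows how both forms of the identifier collapse to the same metadata URL. The function name `as_metadata_url` is hypothetical, and stripping exactly one trailing slash is an assumption about what "any terminating '/'" means.

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN_SUFFIX = "oauth-authorization-server"

def as_metadata_url(issuer: str) -> str:
    """Build the RFC 8414 metadata URL for an issuer (illustrative only)."""
    parts = urlsplit(issuer)
    # Assumption: remove a single terminating "/" from the path component.
    path = parts.path[:-1] if parts.path.endswith("/") else parts.path
    # Insert "/.well-known/<suffix>" between the host and the path.
    return urlunsplit(
        (parts.scheme, parts.netloc, f"/.well-known/{WELL_KNOWN_SUFFIX}{path}", "", "")
    )

with_slash = as_metadata_url("https://example.com/issuer1/")
without_slash = as_metadata_url("https://example.com/issuer1")
print(with_slash == without_slash)  # True: the trailing slash is lost
```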

Now, in order to pass validation required by section 3.3 of RFC 8414, the
metadata needs to respond with an issuer identifier that _preserves_ the
terminating "/".

```
{
  "issuer": "https://example.com/issuer1/"
}
```

However, the metadata request was "lossy" by removing this slash, and
creates ambiguity in situations where the input _did not_ have a
terminating "/".
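To make the ambiguity concrete, here is a small sketch of the validation step, assuming "identical" in section 3.3 means an exact string comparison with no normalization. The helper name `issuer_matches` is hypothetical. A server that answers the single shared request with one fixed document can satisfy only one of the two input forms.

```python
def issuer_matches(input_issuer: str, metadata: dict) -> bool:
    """Exact string comparison of the returned issuer against the input.

    Assumption: no normalization is applied before comparing.
    """
    return metadata.get("issuer") == input_issuer

# The one document served at /.well-known/oauth-authorization-server/issuer1
metadata = {"issuer": "https://example.com/issuer1/"}

print(issuer_matches("https://example.com/issuer1/", metadata))  # True
print(issuer_matches("https://example.com/issuer1", metadata))   # False
```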

The same holds for an issuer identifier that has a sole "/" path
component.  "https://example.com/" results in a request to:

```
GET /.well-known/oauth-authorization-server HTTP/1.1
Host: example.com
```

Now, turning attention to protected resource metadata.  Section 3.1 of RFC
9728 states:

> If the resource identifier value contains a path or query component, any
> terminating slash (/) following the host component MUST be removed before
> inserting /.well-known/ and the well-known URI path suffix between the
> host component and the path and/or query components.

NOTE: This is semantically different from RFC 8414, as it removes the
terminating "/" _following the host component_, as opposed to the
terminating slash of the path component (assuming I'm interpreting the
specification correctly).

Let's say I have an input resource identifier of
"https://example.com/resource1/".  Because this path component has a
terminating "/", I would expect the metadata request to be (having removed
the leading "/" but _preserving_ the terminating "/"):

```
GET /.well-known/oauth-protected-resource/resource1/ HTTP/1.1
Host: example.com
```

From this perspective, the request is non-lossy, and allows the metadata
response to be crafted in a way that is unambiguous with respect to the
input identifier.
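The reading above can be sketched in Python as follows. This is only one interpretation of RFC 9728 section 3.1, not a confirmed algorithm: the function name `protected_resource_metadata_url` is hypothetical, and the code assumes that only a slash directly following the host (a bare "/" path) is removed, while a terminating "/" deeper in the path is kept.

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN = "/.well-known/oauth-protected-resource"

def protected_resource_metadata_url(resource: str) -> str:
    """Build the RFC 9728 metadata URL under the reading described above."""
    parts = urlsplit(resource)
    # Assumption: drop the path only when it is the bare "/" after the host;
    # any other path, including its terminating "/", is preserved as-is.
    path = "" if parts.path == "/" else parts.path
    # Any query component is carried over after the inserted well-known segment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN + path, parts.query, ""))

print(protected_resource_metadata_url("https://example.com/resource1/"))
print(protected_resource_metadata_url("https://example.com/resource1"))
```

Under this reading the two inputs produce different request paths, so the server can tell them apart.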

However, ambiguity creeps in again when we have a resource identifier with
_solely_ a terminating "/", such as "https://example.com/".  In such a
case, I would expect the metadata request to be (after having removed the
terminating slash following the host component):

```
GET /.well-known/oauth-protected-resource HTTP/1.1
Host: example.com
```

Once again, to pass validation required by section 3.3 of RFC 9728, the
metadata needs to respond with:

```
{
  "resource": "https://example.com/"
}
```

But the request is now lossy, and the server does not know whether the
input identifier contained a trailing slash or not.

These cases often arise when the input is user-entered,
and servers accept paths where trailing slashes are optional - responding
identically to either request.   This is particularly common when serving
from "root" resources, as users often enter URLs with bare hostnames.

I'm posting here to initiate a discussion and gather consensus on what is
the expected behavior (or to be informed that I am reading the
specifications wrong).  After consensus, I'd like to have a clear set of
test cases that validate the expected behavior in all these scenarios, in
order to address the lack of such detail in the specifications.  I'm seeing
SDKs pop up with varying behavior, and given the security-critical nature
of these validations, I think it's important to resolve this to ensure both
compatibility and security among implementations.
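As a starting point for such test cases, the four requests described in this message could be captured as data and run against any implementation. The sketch below is illustrative: the expected paths reflect only the reading given in this message, not agreed behavior, and the names `CASES` and `mismatches` are hypothetical.

```python
# (specification, input identifier, request path expected under the reading above)
CASES = [
    ("RFC 8414", "https://example.com/issuer1/",
     "/.well-known/oauth-authorization-server/issuer1"),
    ("RFC 8414", "https://example.com/",
     "/.well-known/oauth-authorization-server"),
    ("RFC 9728", "https://example.com/resource1/",
     "/.well-known/oauth-protected-resource/resource1/"),
    ("RFC 9728", "https://example.com/",
     "/.well-known/oauth-protected-resource"),
]

def mismatches(build_path):
    """Return (spec, identifier, expected, got) for every case that differs.

    build_path is the implementation under test: a callable taking
    (spec, identifier) and returning the request path it would use.
    """
    failures = []
    for spec, identifier, expected in CASES:
        got = build_path(spec, identifier)
        if got != expected:
            failures.append((spec, identifier, expected, got))
    return failures
```

An SDK's URL builder can be wrapped in a small adapter and passed as `build_path`; an empty result means it agrees with this reading on all four cases.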

I have some thoughts on how the wording could be modified to make the
algorithm and expected results more clear, but I'll wait on that until
others have had a chance to comment.  It's also unfortunate that the two
specifications seem to define different algorithms (again, assuming I am
interpreting them correctly).  It would be preferable if they were
identical, but that ship has likely sailed.


Thanks!

Jared Hanson
Co-Founder
ja...@keycard.ai
www <https://www.keycard.sh> | linkedin
<https://www.linkedin.com/in/jaredhanson> | github
<https://github.com/jaredhanson>