Copilot commented on code in PR #9788:
URL: https://github.com/apache/gravitino/pull/9788#discussion_r2731533392
##########
docs/security/how-to-authenticate.md:
##########
@@ -113,6 +113,78 @@ GravitinoClient client = GravitinoClient.builder(uri)
.build();
```
+### Principal mapping
+
+Gravitino supports principal mapping to transform authenticated principals
(from OAuth or Kerberos) into user identities for authorization. By default,
Gravitino uses regex-based mapping.
+
+#### OAuth principal mapping
+
+For OAuth authentication, principals are extracted from JWT claims (configured
via `gravitino.authenticator.oauth.principalFields`). You can customize how
these principals are mapped:
+
+```text
+# Use default regex mapper that extracts everything (passes through unchanged)
+gravitino.authenticator.oauth.principalMapperType = regex
+gravitino.authenticator.oauth.principalMapperPattern = ^(.*)$
+
+# Extract username from email (e.g., [email protected] -> user)
+gravitino.authenticator.oauth.principalMapperType = regex
+gravitino.authenticator.oauth.principalMapperPattern = ([^@]+)@.*
Review Comment:
The configuration property name should be
`gravitino.authenticator.oauth.principalMapper.regex.pattern` (with `.regex.`
segment) to match the actual property defined in `OAuthConfig.java` line 160.
The same issue appears in line 131 and Kerberos examples.
```suggestion
gravitino.authenticator.oauth.principalMapper.regex.pattern = ^(.*)$
# Extract username from email (e.g., [email protected] -> user)
gravitino.authenticator.oauth.principalMapperType = regex
gravitino.authenticator.oauth.principalMapper.regex.pattern = ([^@]+)@.*
```
##########
server-common/src/main/java/org/apache/gravitino/server/authentication/KerberosConfig.java:
##########
@@ -42,4 +42,26 @@ public interface KerberosConfig {
.stringConf()
.checkValue(StringUtils::isNotBlank,
ConfigConstants.NOT_BLANK_ERROR_MSG)
.create();
+
+ ConfigEntry<String> PRINCIPAL_MAPPER_TYPE =
+ new ConfigBuilder(KERBEROS_CONFIG_PREFIX + "principalMapperType")
+ .doc(
+ "Type of principal mapper to use for Kerberos authentication. "
+ + "Built-in value: 'regex' (uses regex pattern to extract
username). "
+ + "Can also be a fully qualified class name implementing
PrincipalMapper for custom logic. "
+ + "Custom mappers can use KerberosPrincipal.parse() for
structured Kerberos principal parsing.")
+ .version(ConfigConstants.VERSION_1_2_0)
+ .stringConf()
+ .createWithDefault("regex");
+
+ ConfigEntry<String> PRINCIPAL_MAPPER_REGEX_PATTERN =
+ new ConfigBuilder(KERBEROS_CONFIG_PREFIX +
"principalMapper.regex.pattern")
+ .doc(
+ "Regex pattern to extract the username from the Kerberos
principal. "
+ + "Only used when principalMapperType is 'regex'. "
+ + "The pattern should contain at least one capturing group. "
+ + "Default pattern '([^@]+).*' extracts everything before
'@' (including instance if present).")
Review Comment:
The documentation describes the default pattern as extracting 'everything
before @', but the actual pattern `([^@]+).*` captures only the first
occurrence of characters before '@'. Consider clarifying that this includes the
primary and instance components (e.g., 'HTTP/host') when present, but stops at
the realm delimiter.
```suggestion
+ "Default pattern '([^@]+).*' captures the primary and
optional instance components of the Kerberos principal up to, but not
including, the realm delimiter '@' "
+ "(for example, 'HTTP/host' from
'HTTP/[email protected]').")
```
##########
server-common/src/main/java/org/apache/gravitino/server/authentication/JwksTokenValidator.java:
##########
@@ -132,7 +139,9 @@ public Principal validateToken(String token, String
serviceAudience) {
throw new UnauthorizedException("No valid principal found in token");
}
- return new UserPrincipal(principal);
+ // Use principal mapper to extract username
+ Principal mappedPrincipal = principalMapper.map(principal);
+ return mappedPrincipal;
Review Comment:
The intermediate variable `mappedPrincipal` is unnecessary. Consider
returning the result directly: `return principalMapper.map(principal);`
```suggestion
return principalMapper.map(principal);
```
##########
docs/security/how-to-authenticate.md:
##########
@@ -134,8 +206,12 @@ Gravitino server and Gravitino Iceberg REST server share
the same configuration
| `gravitino.authenticator.oauth.jwksUri` | JWKS URI for
server-side OAuth token validation. Required when using JWKS-based validation.
| (none)
| Yes if `tokenValidatorClass` is
`org.apache.gravitino.server.authentication.JwksTokenValidator` | 1.0.0
|
| `gravitino.authenticator.oauth.principalFields` | JWT claim field(s) to
use as principal identity. Comma-separated list for fallback in order (e.g.,
'preferred_username,email,sub').
| `sub` | No
| 1.0.0 |
| `gravitino.authenticator.oauth.tokenValidatorClass` | Fully qualified class
name of the OAuth token validator implementation. Use
`org.apache.gravitino.server.authentication.JwksTokenValidator` for JWKS-based
validation or
`org.apache.gravitino.server.authentication.StaticSignKeyValidator` for static
key validation. |
`org.apache.gravitino.server.authentication.StaticSignKeyValidator` | No
| 1.0.0 |
+| `gravitino.authenticator.oauth.principalMapperType` | Principal mapper type
for OAuth. Use 'regex' for regex-based mapping, or provide a fully qualified
class name implementing `org.apache.gravitino.auth.PrincipalMapper`.
| `regex` | No
| 1.2.0 |
+| `gravitino.authenticator.oauth.principalMapperPattern` | Regex pattern for
OAuth principal mapping. First capture group becomes the mapped principal. Only
used when principalMapperType is 'regex'.
| `^(.*)$` | No
| 1.2.0 |
Review Comment:
The configuration property name should be
`gravitino.authenticator.oauth.principalMapper.regex.pattern` (with `.regex.`
segment) to match the actual property defined in `OAuthConfig.java` line 160.
```suggestion
| `gravitino.authenticator.oauth.principalMapper.regex.pattern` | Regex
pattern for OAuth principal mapping. First capture group becomes the mapped
principal. Only used when principalMapperType is 'regex'.
| `^(.*)$`
| No
| 1.2.0 |
```
##########
server-common/src/main/java/org/apache/gravitino/server/authentication/KerberosAuthenticator.java:
##########
@@ -170,23 +177,14 @@ private Principal retrievePrincipalFromToken(String
serverPrincipal, byte[] clie
throw new UnauthorizedException("GssContext isn't established",
challenge);
}
- // Usually principal names are in the form 'user/instance@REALM' or
'user@REALM'.
- List<String> principalComponents =
- Splitter.on('@').splitToList(gssContext.getSrcName().toString());
- if (principalComponents.size() != 2) {
- throw new UnauthorizedException("Principal has wrong format",
AuthConstants.NEGOTIATE);
- }
+ // Extract principal from GSS context and map it using configured mapper
+ String principalString = gssContext.getSrcName().toString();
- String user = principalComponents.get(0);
- // TODO: We will have KerberosUserPrincipal in the future.
- // We can put more information of Kerberos to the KerberosUserPrincipal
- // For example, we can put the token into the KerberosUserPrincipal,
- // We can return the token to the client in the AuthenticationFilter. It
will be convenient
- // for client to establish the security context. Hadoop uses the cookie
to store the token.
- // For now, we don't store it in the cookie. I can have a simple
implementation. first.
- // It's also not required for the protocol.
- // https://datatracker.ietf.org/doc/html/rfc4559
- return new UserPrincipal(user);
+ // Use principal mapper to extract the principal
+ // This allows for flexible mapping strategies (regex, kerberos-specific
parsing, etc.)
+ // The mapper will handle validation and extraction based on its
configured type
+ Principal mappedPrincipal = principalMapper.map(principalString);
+ return mappedPrincipal;
Review Comment:
The intermediate variable `mappedPrincipal` is unnecessary. Consider
returning the result directly: `return principalMapper.map(principalString);`
```suggestion
return principalMapper.map(principalString);
```
##########
docs/security/how-to-authenticate.md:
##########
@@ -134,8 +206,12 @@ Gravitino server and Gravitino Iceberg REST server share
the same configuration
| `gravitino.authenticator.oauth.jwksUri` | JWKS URI for
server-side OAuth token validation. Required when using JWKS-based validation.
| (none)
| Yes if `tokenValidatorClass` is
`org.apache.gravitino.server.authentication.JwksTokenValidator` | 1.0.0
|
| `gravitino.authenticator.oauth.principalFields` | JWT claim field(s) to
use as principal identity. Comma-separated list for fallback in order (e.g.,
'preferred_username,email,sub').
| `sub` | No
| 1.0.0 |
| `gravitino.authenticator.oauth.tokenValidatorClass` | Fully qualified class
name of the OAuth token validator implementation. Use
`org.apache.gravitino.server.authentication.JwksTokenValidator` for JWKS-based
validation or
`org.apache.gravitino.server.authentication.StaticSignKeyValidator` for static
key validation. |
`org.apache.gravitino.server.authentication.StaticSignKeyValidator` | No
| 1.0.0 |
+| `gravitino.authenticator.oauth.principalMapperType` | Principal mapper type
for OAuth. Use 'regex' for regex-based mapping, or provide a fully qualified
class name implementing `org.apache.gravitino.auth.PrincipalMapper`.
| `regex` | No
| 1.2.0 |
+| `gravitino.authenticator.oauth.principalMapperPattern` | Regex pattern for
OAuth principal mapping. First capture group becomes the mapped principal. Only
used when principalMapperType is 'regex'.
| `^(.*)$` | No
| 1.2.0 |
| `gravitino.authenticator.kerberos.principal` | Indicates the Kerberos
principal to be used for HTTP endpoint. Principal should start with `HTTP/`.
| (none) | Yes if
use `kerberos` as the authenticator
| 0.4.0 |
| `gravitino.authenticator.kerberos.keytab` | Location of the keytab
file with the credentials for the principal.
| (none) | Yes if
use `kerberos` as the authenticator
| 0.4.0 |
+| `gravitino.authenticator.kerberos.principalMapperType` | Principal mapper
type for Kerberos. Use 'regex' for regex-based mapping, or provide a fully
qualified class name implementing `org.apache.gravitino.auth.PrincipalMapper`.
| `regex`
| No
| 1.2.0 |
+| `gravitino.authenticator.kerberos.principalMapperPattern` | Regex pattern
for Kerberos principal mapping. First capture group becomes the mapped
principal. Only used when principalMapperType is 'regex'.
| `([^@]+).*`
| No
| 1.2.0 |
Review Comment:
The configuration property name should be
`gravitino.authenticator.kerberos.principalMapper.regex.pattern` (with
`.regex.` segment) to match the actual property defined in
`KerberosConfig.java` line 58.
```suggestion
| `gravitino.authenticator.kerberos.principalMapper.regex.pattern` | Regex
pattern for Kerberos principal mapping. First capture group becomes the mapped
principal. Only used when principalMapperType is 'regex'.
| `([^@]+).*`
| No
| 1.2.0 |
```
##########
server-common/src/main/java/org/apache/gravitino/server/authentication/StaticSignKeyValidator.java:
##########
@@ -97,7 +104,10 @@ public Principal validateToken(String token, String
serviceAudience) {
throw new UnauthorizedException(
"Audiences in token is not in expected format: %s",
audienceObject);
}
- return new UserPrincipal(jwt.getBody().getSubject());
+
+ // Use principal mapper to extract username
+ Principal mappedPrincipal =
principalMapper.map(jwt.getBody().getSubject());
+ return mappedPrincipal;
} catch (ExpiredJwtException
| UnsupportedJwtException
Review Comment:
The intermediate variable `mappedPrincipal` is unnecessary. Consider
returning the result directly: `return
principalMapper.map(jwt.getBody().getSubject());`
```suggestion
return principalMapper.map(jwt.getBody().getSubject());
} catch (ExpiredJwtException
| UnsupportedJwtException
| UnsupportedJwtException
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]