Copilot commented on code in PR #9912:
URL: https://github.com/apache/gravitino/pull/9912#discussion_r2774618801
##########
core/src/main/java/org/apache/gravitino/EntityStore.java:
##########
@@ -165,6 +167,35 @@ <E extends Entity & HasIdentifier> E update(
<E extends Entity & HasIdentifier> E get(NameIdentifier ident, EntityType
entityType, Class<E> e)
throws NoSuchEntityException, IOException;
+ /**
+ * Batch get the entity from the underlying storage.
+ *
+ * @param idents the unique identifier of the entity
+ * @param entityType the general type of the entity
+ * @param clazz the entity class instance
+ * @param <E> the class of entity
+ * @return the entity retrieved from the underlying storage
+ * @throws NoSuchEntityException if the entity does not exist
+ */
+ <E extends Entity & HasIdentifier> List<E> batchGet(
+ List<NameIdentifier> idents, EntityType entityType, Class<E> clazz);
Review Comment:
The new abstract batchGet(List<...>) method (a) documents throwing
NoSuchEntityException but does not declare any checked exceptions, and (b) also
omits IOException unlike the existing get(...) API. If
NoSuchEntityException/IOException are checked in this codebase, implementers
will be forced to change behavior or won't compile. Consider declaring `throws
NoSuchEntityException, IOException` (and updating the default array overload
accordingly), or explicitly making the contract return partial results without
throwing (and update the JavaDoc to match).
##########
core/src/main/java/org/apache/gravitino/auth/RegexPrincipalMapper.java:
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.auth;
+
+import java.security.Principal;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+import org.apache.gravitino.UserPrincipal;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Regex-based principal mapper that extracts username using regex patterns
with capturing groups.
+ *
+ * <p>This implementation is thread-safe as Pattern.matcher() creates
thread-local Matcher
+ * instances.
+ */
+public class RegexPrincipalMapper implements PrincipalMapper {
+
+ private static final Logger LOG =
LoggerFactory.getLogger(RegexPrincipalMapper.class);
+
+ private final Pattern pattern;
+
+ /**
+ * Creates a new regex principal mapper.
+ *
+ * @param patternStr the regex pattern with a capturing group (required)
+ * @throws IllegalArgumentException if the pattern string has invalid regex
syntax
+ */
+ public RegexPrincipalMapper(String patternStr) {
+ if (patternStr == null || patternStr.isEmpty()) {
+ throw new IllegalArgumentException("Pattern string cannot be null or
empty");
+ }
+ this.pattern = Pattern.compile(patternStr);
+ LOG.info("Initialized RegexPrincipalMapper with pattern: {}", patternStr);
+ }
+
+ /**
+ * Maps a principal string to a Principal using the configured regex pattern.
+ *
+ * @param principal the principal string to map
+ * @return a Principal containing the extracted username from the first
capturing group, or the
+ * original principal if no match
+ * @throws IllegalArgumentException if pattern matching fails
+ */
+ @Override
+ public Principal map(String principal) {
+ if (principal == null) {
+ return null;
+ }
+
+ try {
+ Matcher matcher = pattern.matcher(principal);
+
+ // If pattern matches and has at least one capturing group, return the
first group
+ if (matcher.find() && matcher.groupCount() >= 1) {
+ String extracted = matcher.group(1);
+ // Return extracted value if it's not null and not empty, otherwise
original principal
+ String username = (extracted != null && !extracted.isEmpty()) ?
extracted : principal;
+ return new UserPrincipal(username);
+ }
+
+ // If pattern doesn't match or has no capturing groups, return original
principal
+ return new UserPrincipal(principal);
+
+ } catch (Exception e) {
+ String message =
+ String.format(
+ "Error applying regex pattern '%s' to principal '%s'",
pattern.pattern(), principal);
+ LOG.error("{}: {}", message, e.getMessage());
+ throw new IllegalArgumentException(message, e);
+ }
+ }
Review Comment:
PrincipalMapper.map() returns null when principal is null, which pushes
null-handling onto every caller and can lead to NPEs downstream; the interface
JavaDoc also states it throws IllegalArgumentException when mapping fails.
Consider throwing IllegalArgumentException for null/blank principal (similar to
KerberosPrincipalMapper). Also avoid `catch (Exception)` here; it’s better to
either not catch at all (most of these calls are safe), or catch specific
runtime exceptions and log with the throwable (e.g., `LOG.error(message, e)`)
so the stack trace is preserved.
##########
api/src/main/java/org/apache/gravitino/authorization/Privileges.java:
##########
@@ -1237,4 +1257,66 @@ public boolean canBindTo(MetadataObject.Type type) {
return type == MetadataObject.Type.METALAKE || type ==
MetadataObject.Type.JOB_TEMPLATE;
}
}
+
+ /** The privilege to create a view. */
+ public static class CreateView extends GenericPrivilege<CreateView> {
+ private static final CreateView ALLOW_INSTANCE =
+ new CreateView(Condition.ALLOW, Name.CREATE_VIEW);
+ private static final CreateView DENY_INSTANCE =
+ new CreateView(Condition.DENY, Name.CREATE_VIEW);
+
+ private CreateView(Condition condition, Name name) {
+ super(condition, name);
+ }
+
+ /**
+ * @return The instance with allow condition of the privilege.
+ */
+ public static CreateView allow() {
+ return ALLOW_INSTANCE;
+ }
+
+ /**
+ * @return The instance with deny condition of the privilege.
+ */
+ public static CreateView deny() {
+ return DENY_INSTANCE;
+ }
+
+ @Override
+ public boolean canBindTo(MetadataObject.Type type) {
+ return SCHEMA_SUPPORTED_TYPES.contains(type);
+ }
+ }
Review Comment:
CreateView.canBindTo() currently delegates to SCHEMA_SUPPORTED_TYPES, which
(per earlier constant definition) includes FILESET. That contradicts the newly
added tests expecting `createView.canBindTo(MetadataObject.Type.FILESET)` to be
false, and would incorrectly allow binding CREATE_VIEW to filesets. Please use
a dedicated supported-types set for CREATE_VIEW (e.g., {METALAKE, CATALOG,
SCHEMA}) rather than reusing SCHEMA_SUPPORTED_TYPES.
##########
catalogs/hadoop-common/src/main/java/org/apache/gravitino/catalog/hadoop/fs/HDFSFileSystemProvider.java:
##########
@@ -57,13 +60,28 @@ public String name() {
return BUILTIN_HDFS_FS_PROVIDER;
}
+ @Override
+ public String getFullAuthority(Path path, Map<String, String> conf) {
+ String authority = path.toUri().getAuthority();
+ String principal = conf.get(PRINCIPAL_KEY);
+ if (StringUtils.isNotBlank(principal)) {
+ principal = principal.replaceAll("@.*$", ""); // Remove realm if exists
+ authority = String.format("%s@%s", principal, authority);
+ }
+ String impersonationEnabled = conf.get(IMPERSONATION_ENABLE_KEY);
+ if (conf.containsKey(IMPERSONATION_ENABLE_KEY)) {
+ authority = String.format("%s?impersonation_enabled=%s", authority,
impersonationEnabled);
+ }
+ return authority;
+ }
Review Comment:
If the configuration map contains the IMPERSONATION_ENABLE_KEY with a null
value, this will append `impersonation_enabled=null` to the authority. Prefer
checking that the retrieved value is non-null/non-blank before appending, or
default it explicitly (e.g., 'true'/'false').
##########
common/src/main/java/org/apache/gravitino/dto/requests/FunctionUpdatesRequest.java:
##########
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.dto.requests;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import java.util.List;
+import lombok.AllArgsConstructor;
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.NoArgsConstructor;
+import lombok.ToString;
+import org.apache.gravitino.rest.RESTMessage;
+import org.apache.gravitino.rest.RESTRequest;
+
+/** Request to represent updates to a function. */
+@Getter
+@EqualsAndHashCode
+@NoArgsConstructor(force = true)
+@AllArgsConstructor
+@ToString
+public class FunctionUpdatesRequest implements RESTRequest {
+
+ @JsonProperty("updates")
+ private final List<FunctionUpdateRequest> updates;
+
+ @Override
+ public void validate() throws IllegalArgumentException {
+ if (updates == null) {
+ throw new IllegalArgumentException("Updates list cannot be null");
+ }
+ updates.forEach(RESTMessage::validate);
Review Comment:
validate() will throw a NullPointerException if the updates list contains a
null element because `RESTMessage::validate` will be invoked on null. Add an
explicit check that each update entry is non-null before validating it (and
throw IllegalArgumentException with a clear message).
```suggestion
for (int i = 0; i < updates.size(); i++) {
FunctionUpdateRequest update = updates.get(i);
if (update == null) {
throw new IllegalArgumentException(
"Update entry at index " + i + " cannot be null");
}
update.validate();
}
```
##########
clients/client-python/gravitino/dto/responses/metadata_object_list_response.py:
##########
@@ -0,0 +1,59 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+
+from dataclasses_json import config
+
+from gravitino.dto.metadata_object_dto import MetadataObjectDTO
+from gravitino.dto.responses.base_response import BaseResponse
+from gravitino.utils.precondition import Precondition
+
+
Review Comment:
This DTO uses `dataclasses_json.config(...)` metadata but is not decorated
with `@dataclass_json`, so JSON (de)serialization helpers (e.g.,
to_dict/from_dict) may not be available or may ignore field_name mappings
depending on how BaseResponse is implemented. Consider adding `@dataclass_json`
(as done in other response DTOs) or removing the config metadata if
serialization is intentionally not supported.
```suggestion
from dataclasses_json import config, dataclass_json
from gravitino.dto.metadata_object_dto import MetadataObjectDTO
from gravitino.dto.responses.base_response import BaseResponse
from gravitino.utils.precondition import Precondition
@dataclass_json
```
##########
core/src/main/java/org/apache/gravitino/GravitinoEnv.java:
##########
@@ -270,7 +274,16 @@ public FunctionDispatcher functionDispatcher() {
}
/**
- * Get the PartitionDispatcher associated with the Gravitino environment.
+ * Get the ViewDispatcher associated with the Gravitino environment.
+ *
+ * @return The ViewDispatcher instance.
+ */
+ public ViewDispatcher viewDispatcher() {
+ return viewDispatcher;
+ }
+
+ /**
+ * * Get the PartitionDispatcher associated with the Gravitino environment.
Review Comment:
There is an extra '*' in the JavaDoc (`* * Get...`) which looks like a typo.
Please remove the extra asterisk to maintain clean JavaDoc formatting.
```suggestion
* Get the PartitionDispatcher associated with the Gravitino environment.
```
##########
clients/client-python/gravitino/dto/responses/tag_response.py:
##########
@@ -0,0 +1,76 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+
+from dataclasses_json import config, dataclass_json
+
+from gravitino.dto.responses.base_response import BaseResponse
+from gravitino.dto.tag_dto import TagDTO
+from gravitino.utils.precondition import Precondition
+
+
+@dataclass_json
+@dataclass
+class TagNamesListResponse(BaseResponse):
+ """Represents a response for a Tag Names List request."""
+
+ _tags: list[str] = field(default_factory=list,
metadata=config(field_name="names"))
+
+ def tag_names(self) -> list[str]:
+ return self._tags
+
+ def validate(self) -> None:
+ Precondition.check_argument(
+ self._tags is not None, "Tag Names List response should have tags"
+ )
+
+ for tag_name in self._tags:
+ Precondition.check_string_not_empty(
+ tag_name, "Tag Names List response should have non-empty tag
names"
+ )
+
+
+@dataclass_json
+@dataclass
+class TagListResponse(BaseResponse):
+ """Represents a response for a Tag List request."""
+
+ _tags: list[TagDTO] = field(
+ default_factory=list, metadata=config(field_name="tags")
+ )
+
+ def tags(self) -> list[TagDTO]:
+ return self._tags
+
+
+@dataclass_json
+@dataclass
+class TagResponse(BaseResponse):
+ """Represents a response for a tag."""
+
+ _tag: TagDTO = field(default=None, metadata=config(field_name="tag"))
+
+ def tag(self) -> TagDTO:
+ return self._tag
+
+ def validate(self) -> None:
+ Precondition.check_argument(
+ self._tag is not None, "Tag response should have a tag"
+ )
Review Comment:
TagResponse.validate() does not call `super().validate()` (unlike other
response validators in the Python client) and does not validate the TagDTO
itself. Consider calling `super().validate()` first and then validating `_tag`
(e.g., `_tag.validate()` if available) to keep response validation consistent
and to catch invalid `code`/base fields.
##########
api/src/main/java/org/apache/gravitino/rel/indexes/Indexes.java:
##########
@@ -27,8 +27,11 @@ public class Indexes {
/** An empty array of indexes. */
public static final Index[] EMPTY_INDEXES = new Index[0];
- /** MySQL does not support setting the name of the primary key, so the
default name is used. */
- public static final String DEFAULT_MYSQL_PRIMARY_KEY_NAME = "PRIMARY";
+ /**
+ * Name of the default primary key. MySQL, ClickHouse, OceanBase and many
other databases supports
+ * setting the name of the primary key and use it as primary key name.
+ */
+ public static final String DEFAULT_PRIMARY_KEY_NAME = "PRIMARY";
Review Comment:
The JavaDoc is factually inconsistent: MySQL does not support customizing
the primary key constraint name (it is effectively 'PRIMARY'). Please adjust
the wording to reflect that this is the conventional/default PK name used
across engines, and call out MySQL’s restriction separately if needed.
##########
common/src/main/java/org/apache/gravitino/dto/requests/FunctionRegisterRequest.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.dto.requests;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Preconditions;
+import javax.annotation.Nullable;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.NoArgsConstructor;
+import lombok.ToString;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.gravitino.dto.function.FunctionDefinitionDTO;
+import org.apache.gravitino.function.FunctionType;
+import org.apache.gravitino.rest.RESTRequest;
+
+/** Represents a request to register a function. */
+@Getter
+@EqualsAndHashCode
+@ToString
+@Builder(setterPrefix = "with")
+@NoArgsConstructor
+@AllArgsConstructor
+public class FunctionRegisterRequest implements RESTRequest {
+
+ @JsonProperty("name")
+ private String name;
+
+ @JsonProperty("functionType")
+ private FunctionType functionType;
+
+ @JsonProperty("deterministic")
+ private boolean deterministic;
+
+ @Nullable
+ @JsonProperty("comment")
+ private String comment;
+
+ @JsonProperty("definitions")
+ private FunctionDefinitionDTO[] definitions;
+
+ /**
+ * Validates the request.
+ *
+ * @throws IllegalArgumentException if the request is invalid.
+ */
+ @Override
+ public void validate() throws IllegalArgumentException {
+ Preconditions.checkArgument(
+ StringUtils.isNotBlank(name), "\"name\" field is required and cannot
be empty");
+ Preconditions.checkArgument(functionType != null, "\"functionType\" field
is required");
+ Preconditions.checkArgument(
+ definitions != null && definitions.length > 0,
+ "\"definitions\" field is required and cannot be empty");
+
+ // Validate each definition has appropriate return type/columns based on
function type
+ for (FunctionDefinitionDTO definition : definitions) {
+ if (functionType == FunctionType.TABLE) {
+ Preconditions.checkArgument(
+ definition.getReturnColumns() != null &&
definition.getReturnColumns().length > 0,
+ "\"returnColumns\" is required in each definition for TABLE
function type");
+ } else if (functionType == FunctionType.SCALAR || functionType ==
FunctionType.AGGREGATE) {
+ Preconditions.checkArgument(
+ definition.getReturnType() != null,
+ "\"returnType\" is required in each definition for SCALAR or
AGGREGATE function type");
+ } else {
+ throw new IllegalArgumentException("Unsupported function type: " +
functionType);
+ }
+ }
Review Comment:
This loop assumes each `definition` entry is non-null; if a client sends
`definitions: [null]`, this will NPE instead of producing a useful validation
error. Add a precondition that each definition is non-null before accessing its
fields.
##########
api/src/main/java/org/apache/gravitino/authorization/Privileges.java:
##########
@@ -1237,4 +1257,66 @@ public boolean canBindTo(MetadataObject.Type type) {
return type == MetadataObject.Type.METALAKE || type ==
MetadataObject.Type.JOB_TEMPLATE;
}
}
+
+ /** The privilege to create a view. */
+ public static class CreateView extends GenericPrivilege<CreateView> {
+ private static final CreateView ALLOW_INSTANCE =
+ new CreateView(Condition.ALLOW, Name.CREATE_VIEW);
+ private static final CreateView DENY_INSTANCE =
+ new CreateView(Condition.DENY, Name.CREATE_VIEW);
+
+ private CreateView(Condition condition, Name name) {
+ super(condition, name);
+ }
+
Review Comment:
CreateView.canBindTo() currently delegates to SCHEMA_SUPPORTED_TYPES, which
(per earlier constant definition) includes FILESET. That contradicts the newly
added tests expecting `createView.canBindTo(MetadataObject.Type.FILESET)` to be
false, and would incorrectly allow binding CREATE_VIEW to filesets. Please use
a dedicated supported-types set for CREATE_VIEW (e.g., {METALAKE, CATALOG,
SCHEMA}) rather than reusing SCHEMA_SUPPORTED_TYPES.
##########
clients/client-python/gravitino/dto/responses/metadata_object_list_response.py:
##########
@@ -0,0 +1,59 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+
+from dataclasses_json import config
+
+from gravitino.dto.metadata_object_dto import MetadataObjectDTO
+from gravitino.dto.responses.base_response import BaseResponse
+from gravitino.utils.precondition import Precondition
+
+
Review Comment:
This DTO uses `dataclasses_json.config(...)` metadata but is not decorated
with `@dataclass_json`, so JSON (de)serialization helpers (e.g.,
to_dict/from_dict) may not be available or may ignore field_name mappings
depending on how BaseResponse is implemented. Consider adding `@dataclass_json`
(as done in other response DTOs) or removing the config metadata if
serialization is intentionally not supported.
```suggestion
from dataclasses_json import dataclass_json, config
from gravitino.dto.metadata_object_dto import MetadataObjectDTO
from gravitino.dto.responses.base_response import BaseResponse
from gravitino.utils.precondition import Precondition
@dataclass_json
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]