dimas-b commented on code in PR #3616: URL: https://github.com/apache/polaris/pull/3616#discussion_r2766321215
########## polaris-core/src/main/java/org/apache/polaris/core/persistence/metrics/MetricsQueryCriteria.java: ########## @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.polaris.core.persistence.metrics; + +import java.time.Instant; +import java.util.Optional; +import org.apache.polaris.immutables.PolarisImmutable; + +/** + * Query criteria for retrieving metrics reports. + * + * <p>This class defines the filter parameters for metrics queries. Pagination is handled separately + * via {@link org.apache.polaris.core.persistence.pagination.PageToken}, which is passed as a + * separate parameter to query methods. This separation of concerns allows: + * + * <ul> + * <li>Different backends to implement pagination in their optimal way + * <li>Cursor-based pagination that works with both RDBMS and NoSQL backends + * <li>Reuse of the existing Polaris pagination infrastructure + * </ul> + * + * <h3>Supported Query Patterns</h3> + * + * <table> + * <tr><th>Pattern</th><th>Fields Used</th><th>Index Required</th></tr> + * <tr><td>By Table + Time</td><td>catalogName, namespace, tableName, startTime, endTime</td><td>Yes (OSS)</td></tr> + * <tr><td>By Time Only</td><td>startTime, endTime</td><td>Partial (timestamp index)</td></tr> + * </table> + * + * <p>Additional query patterns (e.g., by trace ID) can be implemented by persistence backends using + * the {@link #metadata()} filter map. Client-provided correlation data should be stored in the + * metrics record's metadata map and can be filtered using the metadata criteria. + * + * <h3>Pagination</h3> + * + * <p>Pagination is handled via the {@link org.apache.polaris.core.persistence.pagination.PageToken} + * passed to query methods. The token contains: + * + * <ul> + * <li>{@code pageSize()} - Maximum number of results to return + * <li>{@code value()} - Optional cursor token (e.g., {@link ReportIdToken}) for continuation + * </ul> + * + * <p>Query results are returned as {@link org.apache.polaris.core.persistence.pagination.Page} + * which includes an encoded token for fetching the next page. + * + * @see org.apache.polaris.core.persistence.pagination.PageToken + * @see org.apache.polaris.core.persistence.pagination.Page + * @see ReportIdToken + */ +@PolarisImmutable +public interface MetricsQueryCriteria { + + // === Table Identification (optional) === + + /** Catalog name to filter by. */ + Optional<String> catalogName(); + + /** Namespace to filter by (dot-separated). */ + Optional<String> namespace(); + + /** Table name to filter by. */ + Optional<String> tableName(); + + // === Time Range === + + /** Start time for the query (inclusive). */ + Optional<Instant> startTime(); + + /** End time for the query (exclusive). */ + Optional<Instant> endTime(); + + // === Metadata Filtering === + + /** + * Metadata key-value pairs to filter by. + * + * <p>This enables filtering metrics by client-provided correlation data stored in the record's + * metadata map. For example, clients may include a trace ID in the metadata that can be queried + * later. + * + * <p>Note: Metadata filtering may require custom indexes depending on the persistence backend. + * The OSS codebase provides basic support, but performance optimizations may be needed for + * high-volume deployments. + */ + java.util.Map<String, String> metadata(); + + // === Factory Methods === + + /** + * Creates a new builder for MetricsQueryCriteria. + * + * @return a new builder instance + */ + static ImmutableMetricsQueryCriteria.Builder builder() { + return ImmutableMetricsQueryCriteria.builder(); + } + + /** + * Creates criteria for querying by table and time range. + * + * <p>Pagination is handled separately via the {@code PageToken} parameter to query methods. + * + * @param catalogName the catalog name + * @param namespace the namespace (dot-separated) + * @param tableName the table name + * @param startTime the start time (inclusive) + * @param endTime the end time (exclusive) + * @return the query criteria + */ + static MetricsQueryCriteria forTable( + String catalogName, String namespace, String tableName, Instant startTime, Instant endTime) { Review Comment: nit: Would it be worth returning a Builder with just the table info and adding time ranges later (at the call site)? ########## polaris-core/src/main/java/org/apache/polaris/core/persistence/metrics/MetricsRecordIdentity.java: ########## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.polaris.core.persistence.metrics; + +import java.time.Instant; +import java.util.Map; + +/** + * Base interface containing common identification fields shared by all metrics records. + * + * <p>This interface defines the common fields that identify the source of a metrics report, + * including the report ID, catalog ID, table location, timestamp, and metadata. + * + * <p>Both {@link ScanMetricsRecord} and {@link CommitMetricsRecord} extend this interface to + * inherit these common fields while adding their own specific metrics. + * + * <h3>Design Decisions</h3> + * + * <p><b>Namespace/TableName as primitives:</b> We use separate String fields for namespace and + * table name rather than Iceberg's {@code TableIdentifier} to avoid Iceberg dependencies in the + * Polaris SPI. The service layer can convert to/from {@code TableIdentifier} as needed. + * + * <p><b>Catalog ID only (no name):</b> We store only the catalog ID, not the catalog name. Catalog + * names can change over time (via rename operations), which would make querying historical metrics + * by name challenging. Queries should resolve catalog names to IDs using the current catalog state. + * + * <p><b>Realm ID:</b> Realm ID is intentionally not included in this interface. Multi-tenancy realm + * context should be obtained from the CDI-injected {@code RealmContext} at persistence time. This + * keeps catalog-specific code from needing to manage realm concerns. + */ +public interface MetricsRecordIdentity { + + /** + * Unique identifier for this report (UUID). + * + * <p>This ID is generated when the record is created and serves as the primary key for the + * metrics record in persistence storage. + */ + String reportId(); + + /** + * Internal catalog ID. + * + * <p>This matches the catalog entity ID in Polaris persistence, as defined by {@code + * PolarisEntityCore#getId()}. The catalog name is not stored since it can change over time; + * queries should resolve names to IDs using the current catalog state. + */ + long catalogId(); + + /** + * Namespace path as a dot-separated string (e.g., "db.schema"). + * + * <p>This is the namespace portion of the table identifier. Multi-level namespaces are + * represented with dots separating levels. + */ + String namespace(); + + /** + * Table name. + * + * <p>This is the table name portion of the table identifier, without the namespace prefix. + */ + String tableName(); Review Comment: Why not table ID (same as for the catalog ID) 🤔 Table names can change over time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
