dennishuo commented on code in PR #433:
URL: https://github.com/apache/polaris/pull/433#discussion_r1845280073


##########
polaris-service/src/main/java/org/apache/polaris/service/catalog/BasePolarisCatalog.java:
##########
@@ -1195,6 +1232,8 @@ private class BasePolarisTableOperations extends BaseMetastoreTableOperations {
     private final String fullTableName;
     private FileIO tableFileIO;
 
+    private ReentrantLock currentMetadataLock = new ReentrantLock();

Review Comment:
   Vestigial?



##########
polaris-service/src/main/java/org/apache/polaris/service/persistence/MetadataCacheManager.java:
##########
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.persistence;
+
+import java.nio.charset.StandardCharsets;
+import java.util.function.Supplier;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.TableMetadata;
+import org.apache.iceberg.TableMetadataParser;
+import org.apache.iceberg.catalog.TableIdentifier;
+import org.apache.polaris.core.PolarisCallContext;
+import org.apache.polaris.core.PolarisConfiguration;
+import org.apache.polaris.core.entity.PolarisEntity;
+import org.apache.polaris.core.entity.PolarisEntitySubType;
+import org.apache.polaris.core.entity.TableLikeEntity;
+import org.apache.polaris.core.persistence.PolarisMetaStoreManager;
+import org.apache.polaris.core.persistence.PolarisResolvedPathWrapper;
+import org.apache.polaris.core.persistence.resolver.PolarisResolutionManifestCatalogView;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class MetadataCacheManager {
+  private static final Logger LOGGER = LoggerFactory.getLogger(MetadataCacheManager.class);
+
+  /**
+   * Load the cached {@link Table} or fall back to `fallback` if one doesn't exist. If the metadata
+   * is not currently cached, it may be added to the cache.
+   */
+  public static TableMetadata loadTableMetadata(
+      TableIdentifier tableIdentifier,
+      long maxBytesToCache,
+      PolarisCallContext callContext,
+      PolarisMetaStoreManager metastoreManager,
+      PolarisResolutionManifestCatalogView resolvedEntityView,
+      Supplier<TableMetadata> fallback) {
+    LOGGER.debug(String.format("Loading cached metadata for %s", tableIdentifier));
+    PolarisResolvedPathWrapper resolvedEntities =
+        resolvedEntityView.getResolvedPath(tableIdentifier, PolarisEntitySubType.TABLE);
+    TableLikeEntity tableLikeEntity = TableLikeEntity.of(resolvedEntities.getRawLeafEntity());
+    boolean isCacheValid =
+        tableLikeEntity.getMetadataLocation().equals(tableLikeEntity.getMetadataCacheLocationKey());
+    if (isCacheValid) {
+      LOGGER.debug(String.format("Using cached metadata for %s", tableIdentifier));
+      return TableMetadataParser.fromJson(tableLikeEntity.getMetadataCacheContent());
+    } else {
+      TableMetadata metadata = fallback.get();
+      PolarisMetaStoreManager.EntityResult cacheResult =
+          cacheTableMetadata(
+              tableLikeEntity,
+              metadata,
+              maxBytesToCache,
+              callContext,
+              metastoreManager,
+              resolvedEntityView);
+      if (!cacheResult.isSuccess()) {
+        LOGGER.debug(String.format("Failed to cache metadata for %s", tableIdentifier));
+      }
+      return metadata;
+    }
+  }
+
+  /**
+   * Attempt to add table metadata to the cache
+   *
+   * @return The result of trying to cache the metadata
+   */
+  private static PolarisMetaStoreManager.EntityResult cacheTableMetadata(
+      TableLikeEntity tableLikeEntity,
+      TableMetadata metadata,
+      long maxBytesToCache,
+      PolarisCallContext callContext,
+      PolarisMetaStoreManager metaStoreManager,
+      PolarisResolutionManifestCatalogView resolvedEntityView) {
+    String json = TableMetadataParser.toJson(metadata);
+    // We should not reach this method in this case, but check just in case...
+    if (maxBytesToCache != PolarisConfiguration.METADATA_CACHE_MAX_BYTES_NO_CACHING) {
+      long sizeInBytes = json.getBytes(StandardCharsets.UTF_8).length;
+      if (sizeInBytes > maxBytesToCache) {
+        LOGGER.debug(
+            String.format(
+                "Will not cache metadata for %s; metadata is %d bytes and the limit is %d",
+                tableLikeEntity.getTableIdentifier(), sizeInBytes, maxBytesToCache));
+        return new PolarisMetaStoreManager.EntityResult(
+            PolarisMetaStoreManager.ReturnStatus.SUCCESS, null);
+      } else {
+        LOGGER.debug(
+            String.format("Caching metadata for %s", tableLikeEntity.getTableIdentifier()));
+        TableLikeEntity newTableLikeEntity =
+            new TableLikeEntity.Builder(tableLikeEntity)
+                .setMetadataContent(tableLikeEntity.getMetadataLocation(), json)

Review Comment:
   Looks like this is passing in `tableLikeEntity.getMetadataLocation()` 
instead of `metadata.metadataFileLocation()` anyway, so it's extraneous. 
Probably what we want to do here is check that 
`tableLikeEntity.getMetadataLocation().equals(metadata.metadataFileLocation())`; 
if they mismatch, it means another write concurrently updated the 
TableLikeEntity after our original table resolution.
   
   In that case it's probably still fine to return the stale JSON contents 
as-is and just skip touching the cache content here (normally we'd expect the 
write to have already pre-populated the cache content). We could return a 
status such as "concurrent modification" if we want the callsite to have a 
nice debug message.
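   
   A minimal sketch of the check described above, so the idea is concrete. 
Everything in it is hypothetical: `CacheWriteSketch`, `CacheWriteStatus`, and 
`decide` are illustrative names, not Polaris or Iceberg APIs, and plain 
strings stand in for `tableLikeEntity.getMetadataLocation()` and 
`metadata.metadataFileLocation()`.

```java
import java.util.Objects;

/** Hypothetical sketch only; none of these names exist in Polaris. */
final class CacheWriteSketch {

  /** Illustrative outcomes a cache-write attempt could report to the callsite. */
  enum CacheWriteStatus {
    CACHED,
    SKIPPED_TOO_LARGE,
    SKIPPED_CONCURRENT_MODIFICATION
  }

  /**
   * Decide whether the freshly loaded metadata may be written to the cache.
   *
   * @param entityMetadataLocation location recorded on the entity at resolution time
   * @param loadedMetadataLocation location of the metadata that was actually loaded
   * @param sizeInBytes size of the serialized metadata JSON
   * @param maxBytesToCache configured upper bound for cached content
   */
  static CacheWriteStatus decide(
      String entityMetadataLocation,
      String loadedMetadataLocation,
      long sizeInBytes,
      long maxBytesToCache) {
    // A mismatch means some other write moved the table after we resolved it,
    // so leave the cache content alone and just report what happened.
    if (!Objects.equals(entityMetadataLocation, loadedMetadataLocation)) {
      return CacheWriteStatus.SKIPPED_CONCURRENT_MODIFICATION;
    }
    if (sizeInBytes > maxBytesToCache) {
      return CacheWriteStatus.SKIPPED_TOO_LARGE;
    }
    return CacheWriteStatus.CACHED;
  }
}
```

   With a status like this the caller can still return the loaded metadata 
in every case and only vary the debug message.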



##########
polaris-core/src/main/java/org/apache/polaris/core/entity/TableLikeEntity.java:
##########
@@ -29,6 +29,14 @@ public class TableLikeEntity extends PolarisEntity {
   // of the internalProperties JSON file.
   public static final String METADATA_LOCATION_KEY = "metadata-location";
 
+  // For applicable types, this key on the "internalProperties" map will return the content of the
+  // metadata.json file located at `METADATA_CACHE_LOCATION_KEY`
+  private static final String METADATA_CACHE_CONTENT_KEY = "metadata-cache-content";
+
+  // For applicable types, this key on the "internalProperties" map will return the location of the
+  // `metadata.json` that is cached in METADATA_CACHE_CONTENT_KEY
+  private static final String METADATA_CACHE_LOCATION_KEY = "metadata-cache-location";

Review Comment:
   Since we're updating TableLikeEntity atomically, it shouldn't be possible 
for us to ever cache mismatched content vs the base `metadata-location` key. If 
we ever do find ourselves relying on a possible mismatch between 
`metadata-location` and `metadata-cache-location`, it's probably a crutch 
pointing at a deeper concurrency issue.



##########
polaris-service/src/main/java/org/apache/polaris/service/catalog/BasePolarisCatalog.java:
##########
@@ -1381,6 +1434,96 @@ public void doCommit(TableMetadata base, TableMetadata metadata) {
       }
     }
 
+    /**
+     * COPIED FROM {@link BaseMetastoreTableOperations} but without the requirement that base ==
+     * current()
+     *
+     * @param base table metadata on which changes were based
+     * @param metadata new table metadata with updates
+     */
+    @Override
+    public void commit(TableMetadata base, TableMetadata metadata) {
+      TableMetadata currentMetadata = current();
+
+      // if the metadata is already out of date, reject it
+      if (base == null) {
+        if (currentMetadata != null) {
+          // when current is non-null, the table exists. but when base is null, the commit is trying
+          // to create the table
+          throw new AlreadyExistsException("Table already exists: %s", tableName());
+        }
+      } else if (base.metadataFileLocation() != null
+          && !base.metadataFileLocation().equals(currentMetadata.metadataFileLocation())) {
+        throw new CommitFailedException("Cannot commit: stale table metadata");
+      } else if (base != currentMetadata) {
+        // This branch is different from BaseMetastoreTableOperations
+        LOGGER.debug(
+            "Base object differs from current metadata; proceeding because locations match");
+      } else if (base.metadataFileLocation().equals(metadata.metadataFileLocation())) {
+        // if the metadata is not changed, return early
+        LOGGER.info("Nothing to commit.");
+        return;
+      }
+
+      long start = System.currentTimeMillis();
+      doCommit(base, metadata);
+      deleteRemovedMetadataFiles(base, metadata);
+      requestRefresh();
+
+      LOGGER.info(
+          "Successfully committed to table {} in {} ms",
+          tableName(),
+          System.currentTimeMillis() - start);
+    }
+
+    /**
+     * COPIED FROM {@link BaseMetastoreTableOperations}

Review Comment:
   You mean copied from `CatalogUtil`? Please add comments here explaining why 
this is copied from there, what changed, etc.



##########
polaris-service/src/main/java/org/apache/polaris/service/persistence/MetadataCacheManager.java:
##########
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.polaris.service.persistence;
+
+import java.nio.charset.StandardCharsets;
+import java.util.function.Supplier;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.TableMetadata;
+import org.apache.iceberg.TableMetadataParser;
+import org.apache.iceberg.catalog.TableIdentifier;
+import org.apache.polaris.core.PolarisCallContext;
+import org.apache.polaris.core.PolarisConfiguration;
+import org.apache.polaris.core.entity.PolarisEntity;
+import org.apache.polaris.core.entity.PolarisEntitySubType;
+import org.apache.polaris.core.entity.TableLikeEntity;
+import org.apache.polaris.core.persistence.PolarisMetaStoreManager;
+import org.apache.polaris.core.persistence.PolarisResolvedPathWrapper;
+import org.apache.polaris.core.persistence.resolver.PolarisResolutionManifestCatalogView;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class MetadataCacheManager {
+  private static final Logger LOGGER = LoggerFactory.getLogger(MetadataCacheManager.class);
+
+  /**
+   * Load the cached {@link Table} or fall back to `fallback` if one doesn't exist. If the metadata
+   * is not currently cached, it may be added to the cache.
+   */
+  public static TableMetadata loadTableMetadata(
+      TableIdentifier tableIdentifier,
+      long maxBytesToCache,
+      PolarisCallContext callContext,
+      PolarisMetaStoreManager metastoreManager,
+      PolarisResolutionManifestCatalogView resolvedEntityView,
+      Supplier<TableMetadata> fallback) {
+    LOGGER.debug(String.format("Loading cached metadata for %s", tableIdentifier));
+    PolarisResolvedPathWrapper resolvedEntities =
+        resolvedEntityView.getResolvedPath(tableIdentifier, PolarisEntitySubType.TABLE);
+    TableLikeEntity tableLikeEntity = TableLikeEntity.of(resolvedEntities.getRawLeafEntity());
+    boolean isCacheValid =
+        tableLikeEntity.getMetadataLocation().equals(tableLikeEntity.getMetadataCacheLocationKey());

Review Comment:
   Can we just check for equality with `tableLikeEntity.getMetadataLocation()` 
at write time in `cacheTableMetadata`, so that here we only need to check 
whether `getMetadataCacheContent()` is null?
   
   The `updateEntityPropertiesIfNotChanged` call in `cacheTableMetadata` is 
already the atomic compare-and-swap, so if we had a race condition where the 
table was updated since we fetched the cache content, it should already 
correctly fail the addition of the cache content because `entityVersion` won't 
match.
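   
   To illustrate the versioned compare-and-swap argument, here is a small 
in-memory sketch. It is an assumption-laden stand-in, not the Polaris 
implementation: `VersionedEntityStoreSketch`, `Entity`, and this 
`updateIfNotChanged` signature are hypothetical, and a `synchronized` method 
substitutes for whatever atomicity the real metastore provides.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of an entity guarded by a version counter. */
final class VersionedEntityStoreSketch {

  /** Immutable snapshot of the entity: a version plus its properties. */
  static final class Entity {
    final int entityVersion;
    final Map<String, String> properties;

    Entity(int entityVersion, Map<String, String> properties) {
      this.entityVersion = entityVersion;
      this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
    }
  }

  private Entity current;

  VersionedEntityStoreSketch(Map<String, String> initialProperties) {
    this.current = new Entity(1, initialProperties);
  }

  synchronized Entity load() {
    return current;
  }

  /**
   * Replace the properties only if nobody has updated the entity since the
   * caller read it, i.e. the stored version still equals expectedVersion.
   *
   * @return true if the update was applied, false if the version had moved on
   */
  synchronized boolean updateIfNotChanged(int expectedVersion, Map<String, String> newProperties) {
    if (current.entityVersion != expectedVersion) {
      return false; // a concurrent write won; the caller's snapshot is stale
    }
    current = new Entity(expectedVersion + 1, newProperties);
    return true;
  }
}
```

   In this model a reader that resolved the entity at version 1 and then 
tries to add cache content after a writer has committed version 2 simply gets 
`false` back, and the writer's content stays in place.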



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
