kbendick commented on a change in pull request #3543:
URL: https://github.com/apache/iceberg/pull/3543#discussion_r766277098



##########
File path: 
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -383,14 +390,52 @@ public boolean dropNamespace(String[] namespace) throws NoSuchNamespaceException
 
   @Override
   public final void initialize(String name, CaseInsensitiveStringMap options) {
-    this.cacheEnabled = Boolean.parseBoolean(options.getOrDefault("cache-enabled", "true"));
-    Catalog catalog = buildIcebergCatalog(name, options);
+    this.cacheEnabled = PropertyUtil.propertyAsBoolean(options,
+        CatalogProperties.TABLE_CACHE_ENABLED, CatalogProperties.TABLE_CACHE_ENABLED_DEFAULT);
+
+    // If the user disabled caching and did not set cache.expiration-interval-ms, we'll set it
+    // to zero on their behalf. If they disabled caching but explicitly set a non-zero cache
+    // expiration interval, we will fail initialization as that's an invalid configuration.
+    long defaultCacheExpirationInterval =
+        cacheEnabled ? CatalogProperties.TABLE_CACHE_EXPIRATION_INTERVAL_MS_DEFAULT : 0L;
+
+    this.cacheExpirationIntervalMs = PropertyUtil.propertyAsLong(options,
+        CatalogProperties.TABLE_CACHE_EXPIRATION_INTERVAL_MS,
+        defaultCacheExpirationInterval);
+
+    // Normalize usage of -1 to 0 as we'll call Duration.ofMillis on this value.
+    if (cacheExpirationIntervalMs < 0) {
+      this.cacheExpirationIntervalMs = 0;
+    }
+
+    Preconditions.checkArgument(cacheEnabled || cacheExpirationIntervalMs <= 0L,
+        "The catalog's table cache expiration interval must be set to zero via the property %s if caching is disabled",
+        CatalogProperties.TABLE_CACHE_EXPIRATION_INTERVAL_MS);
+
+    // If the user didn't specify TABLE_CACHE_EXPIRATION_INTERVAL_MS, put the resolved value into
+    // the SparkConf and the options map in case it is read from there elsewhere (such as when
+    // cloning a Spark session and re-instantiating the catalog).
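
For context, a minimal sketch of the user-facing configuration this code resolves (the catalog name `demo`, the master, and the warehouse settings are made up for the example; the `cache-enabled` key is visible in the removed line above):

```java
import org.apache.spark.sql.SparkSession;

public class CacheConfigExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .master("local[*]")
        // Catalog name "demo" is hypothetical.
        .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.demo.type", "hadoop")
        .config("spark.sql.catalog.demo.warehouse", "/tmp/warehouse")
        // With caching disabled, initialize() defaults the expiration interval
        // to 0; explicitly setting a positive cache.expiration-interval-ms here
        // would trip the Preconditions check in the diff.
        .config("spark.sql.catalog.demo.cache-enabled", "false")
        .getOrCreate();
    spark.stop();
  }
}
```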

Review comment:
       The problem is that the catalog configuration, along with a number of other things, is still pulled from the `SparkSession` at later points, e.g. for table cleanup jobs.
   
   And the `options` map is passed through a lot more of the source code.
   
   So if an option is set on the user's behalf, the goal was to reflect the value we actually used in both their Spark config and the `options` map.
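   
   For illustration, a rough sketch of what reflecting the resolved value back could look like (the helper, its name, and the conf-key derivation are assumptions for this example, not the PR's actual code):
   
   ```java
   import org.apache.iceberg.CatalogProperties;
   import org.apache.spark.sql.SparkSession;
   
   class CatalogOptionReflection {
     // Hypothetical helper: after initialize() resolves the effective expiration
     // interval, write it back into the session conf so later code that re-reads
     // the catalog options (e.g. after cloning the Spark session and
     // re-instantiating the catalog) sees the value that was actually used.
     static void reflectExpirationInterval(String catalogName, long resolvedMs) {
       String confKey = "spark.sql.catalog." + catalogName + "."
           + CatalogProperties.TABLE_CACHE_EXPIRATION_INTERVAL_MS;
       SparkSession.active().conf().set(confKey, String.valueOf(resolvedMs));
     }
   }
   ```
   
   Note that `CaseInsensitiveStringMap` is read-only (its mutating methods throw `UnsupportedOperationException`), so reflecting the value into the `options` map would mean building a new map rather than mutating the one passed to `initialize()`.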




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
