ctubbsii commented on code in PR #55:
URL: 
https://github.com/apache/accumulo-classloaders/pull/55#discussion_r2748497160


##########
modules/local-caching-classloader/src/test/java/org/apache/accumulo/classloader/lcc/LocalCachingContextClassLoaderFactoryTest.java:
##########
@@ -168,11 +169,53 @@ public static void afterAll() throws Exception {
   public void beforeEach() throws Exception {
     baseCacheDir = tempDir.resolve("base");
     ConfigurationCopy acuConf = new ConfigurationCopy(
-        Map.of(CACHE_DIR_PROPERTY, 
baseCacheDir.toAbsolutePath().toUri().toURL().toExternalForm()));
+        Map.of(CACHE_DIR_PROPERTY, 
baseCacheDir.toAbsolutePath().toUri().toURL().toExternalForm(),
+            UPDATE_FAILURE_GRACE_PERIOD_MINS, "1", ALLOWED_URLS_PATTERN, 
".*"));
     FACTORY = new LocalCachingContextClassLoaderFactory();
     FACTORY.init(() -> new ConfigurationImpl(acuConf));
   }
 
+  @Test
+  public void testAllowedUrls() throws Exception {
+    // use a different factory than other tests; only allow file: URLs
+    ConfigurationCopy acuConf = new ConfigurationCopy(
+        Map.of(CACHE_DIR_PROPERTY, 
baseCacheDir.toAbsolutePath().toUri().toURL().toExternalForm(),
+            ALLOWED_URLS_PATTERN, "file:.*"));
+    var factory = new LocalCachingContextClassLoaderFactory();
+    factory.init(() -> new ConfigurationImpl(acuConf));
+
+    // case 1: all URLs pass (normal case, covered by other tests)
+
+    // case 2: context definition URL fails to match the pattern
+    var ex = assertThrows(ContextClassLoaderException.class,
+        () -> factory.getClassLoader(hdfsAllContext.toExternalForm()));
+    assertTrue(ex.getCause() instanceof IllegalArgumentException);
+    assertTrue(ex.getCause().getMessage().contains("Context definition 
location (hdfs:"));
+
+    // case 3a: context definition URL matches, but resource URL should fail 
to match the pattern,
+    // but it works anyway, because the resources were downloaded already by a 
different instance
+    // (in this case, by a less restrictive FACTORY instance) and no new 
connection is made
+    FACTORY.getClassLoader(hdfsAllContext.toExternalForm());
+    factory.getClassLoader(localAllContext.toExternalForm()); // same resources
+
+    // case 3b: context definition URL matches, but resource URL fails to 
match the pattern
+    // in this case, we use a new context definition, with a resource that 
doesn't exist locally
+    var newResources = new LinkedHashSet<Resource>();
+    var badUrl = "http://localhost/some/path";;
+    newResources.add(new Resource(new URL(badUrl), "MD5", BAD_MD5));
+    var context2 = new ContextDefinition(MONITOR_INTERVAL_SECS, newResources);
+    var disallowedContext = 
tempDir.resolve("context-with-disallowed-resource-url.json");
+    Files.writeString(disallowedContext, context2.toJson(), 
StandardOpenOption.CREATE,
+        StandardOpenOption.TRUNCATE_EXISTING);
+    ex = assertThrows(ContextClassLoaderException.class,
+        () -> 
factory.getClassLoader(disallowedContext.toUri().toURL().toExternalForm()));
+    assertTrue(ex.getCause() instanceof IllegalStateException);
+    assertTrue(ex.getCause().getCause() instanceof ExecutionException);
+    assertTrue(ex.getCause().getCause().getCause() instanceof 
IllegalArgumentException);
+    assertTrue(ex.getCause().getCause().getCause().getMessage().contains(
+        "Resource location (" + badUrl + ")"), 
ex.getCause().getCause().getCause()::getMessage);
+  }

Review Comment:
   For case 6, an empty string is a valid pattern. I don't know why a user 
would set that, but it's one of many special cases that is a valid pattern, but 
would never match on any URLs. However, I don't want to start adding checks for 
special cases, because if we go down that road, we're getting into pretty 
arbitrary territory, checking for some valid patterns, but not others, and no 
clear justification. I'd prefer not go start validating any valid regexes. If 
the user set it to an empty string, then my assumption is that they 
intentionally wanted the factory to not allow any URLs. Maybe this is a 
reasonable action, once all the resources are downloaded, monitor intervals are 
set to `Integer.MAX_VALUE`, and no new contexts are expected to be created? Or 
maybe they just want to pause network activity for a short time, to alleviate a 
network congestion issue? Whatever the reason, I don't think we need to behave 
differently for an empty string than we would if it were `[^:]*`, which is 
 just as valid and won't match on any URLs either).
   
   I can add the other cases, though.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to