jerryshao commented on code in PR #5521:
URL: https://github.com/apache/gravitino/pull/5521#discussion_r1837391125
##########
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##########
@@ -581,31 +587,85 @@ public Schema alterSchema(NameIdentifier ident, SchemaChange... changes)
@Override
public boolean dropSchema(NameIdentifier ident, boolean cascade) throws NonEmptySchemaException {
try {
+ Namespace filesetNs =
+ NamespaceUtil.ofFileset(
+ ident.namespace().level(0), // metalake name
+ ident.namespace().level(1), // catalog name
+ ident.name() // schema name
+ );
+
+ List<FilesetEntity> filesets =
+ store.list(filesetNs, FilesetEntity.class, Entity.EntityType.FILESET);
+ if (!filesets.isEmpty() && !cascade) {
+ throw new NonEmptySchemaException("Schema %s is not empty", ident);
+ }
+
+ // Delete all the managed filesets no matter whether the storage location is under the
+ // schema path or not.
+ // The reason why we delete the managed fileset's storage location one by one is because we
+ // may mis-delete the storage location of the external fileset if it happens to be under
+ // the schema path.
+ ClassLoader cl = Thread.currentThread().getContextClassLoader();
+ filesets
+ .parallelStream()
+ .filter(f -> f.filesetType() == Fileset.Type.MANAGED)
+ .forEach(
+ f -> {
+ ClassLoader oldCl = Thread.currentThread().getContextClassLoader();
+ try {
+ // parallelStream uses forkjoin thread pool, which has a different classloader
+ // than the catalog thread. We need to set the context classloader to the
+ // catalog's classloader to avoid classloading issues.
+ Thread.currentThread().setContextClassLoader(cl);
Review Comment:
Generally, a thread spawned from the catalog thread inherits its context
classloader, so no extra handling is needed; that covers about 90% of scenarios.
Edge cases like this one arise because the thread pool was created beforehand
(here, the fork-join pool behind parallelStream), so its threads carry a
different classloader. Such cases are rare and hard to handle in a unified way.
I'm inclined not to optimize this code for now. If we find more common usage
patterns, we can refactor this part then.
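For reference, a minimal sketch of what such a unified form might look like if it becomes worthwhile later. This is not code from the PR: `ContextClassLoaderDemo`, `withContextClassLoader`, and `observedLoaders` are hypothetical names, and the anonymous `ClassLoader` only stands in for the catalog's classloader. It uses the same set-then-restore pattern as the diff above, wrapped around each task submitted to the pre-created pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ContextClassLoaderDemo {

  // Hypothetical helper: wraps a task so it runs with `cl` as the context
  // classloader and restores the thread's previous classloader afterwards.
  static Runnable withContextClassLoader(ClassLoader cl, Runnable task) {
    return () -> {
      Thread current = Thread.currentThread();
      ClassLoader old = current.getContextClassLoader();
      current.setContextClassLoader(cl);
      try {
        task.run();
      } finally {
        // Pool threads are reused, so always put the old classloader back.
        current.setContextClassLoader(old);
      }
    };
  }

  // Records the context classloader each parallel task observes.
  static List<ClassLoader> observedLoaders(ClassLoader catalogCl, List<String> items) {
    ConcurrentLinkedQueue<ClassLoader> seen = new ConcurrentLinkedQueue<>();
    items
        .parallelStream()
        .forEach(
            item ->
                withContextClassLoader(
                        catalogCl,
                        () -> seen.add(Thread.currentThread().getContextClassLoader()))
                    .run());
    return new ArrayList<>(seen);
  }

  public static void main(String[] args) {
    // Stand-in for the catalog's isolated classloader (an assumption for this sketch).
    ClassLoader catalogCl =
        new ClassLoader(ContextClassLoaderDemo.class.getClassLoader()) {};
    List<ClassLoader> seen = observedLoaders(catalogCl, List.of("a", "b", "c", "d"));
    boolean allMatch = seen.stream().allMatch(cl -> cl == catalogCl);
    System.out.println("all tasks saw the catalog classloader: " + allMatch);
  }
}
```

The wrapper keeps the try/finally in one place, but it only pays off once several call sites need it, which matches the reasoning above for leaving the current code as is.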