Copilot commented on code in PR #9279:
URL: https://github.com/apache/gravitino/pull/9279#discussion_r2567471424
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -100,45 +109,48 @@ public Table createTable(
String location = properties.get(Table.PROPERTY_LOCATION);
Preconditions.checkArgument(
StringUtils.isNotBlank(location), "Table location must be specified");
- Map<String, String> storageProps =
LancePropertiesUtils.getLanceStorageOptions(properties);
+
+ // Extract creation mode from properties
+ CreationMode mode =
+ Optional.ofNullable(properties.get(LANCE_CREATION_MODE))
+ .map(CreationMode::valueOf)
+ .orElse(CreationMode.CREATE);
boolean register =
-
Optional.ofNullable(properties.get(LanceTableDelegator.PROPERTY_LANCE_TABLE_REGISTER))
+ Optional.ofNullable(properties.get(LANCE_TABLE_REGISTER))
.map(Boolean::parseBoolean)
.orElse(false);
- if (register) {
- // If this is a registration operation, just create the table metadata
without creating a new
- // dataset
- return super.createTable(
- ident, columns, comment, properties, partitions, distribution,
sortOrders, indexes);
+
+ // Handle EXIST_OK mode - check if table exists first
+ if (mode == CreationMode.EXIST_OK) {
+ try {
+ return super.loadTable(ident);
+ } catch (NoSuchTableException e) {
+ // Table doesn't exist, proceed with creation
+ }
}
- try (Dataset ignored =
- Dataset.create(
- new RootAllocator(),
- location,
- convertColumnsToArrowSchema(columns),
- new
WriteParams.Builder().withStorageOptions(storageProps).build())) {
- // Only create the table metadata in Gravitino after the Lance dataset
is successfully
- // created.
- return super.createTable(
- ident, columns, comment, properties, partitions, distribution,
sortOrders, indexes);
- } catch (NoSuchSchemaException e) {
- throw e;
- } catch (TableAlreadyExistsException e) {
- // If the table metadata already exists, but the underlying lance table
was just created
- // successfully, we need to clean up the created lance table to avoid
orphaned datasets.
- Dataset.drop(location,
LancePropertiesUtils.getLanceStorageOptions(properties));
- throw e;
- } catch (IllegalArgumentException e) {
- if (e.getMessage().contains("Dataset already exists")) {
- throw new TableAlreadyExistsException(
- e, "Lance dataset already exists at location %s", location);
+ // Handle OVERWRITE mode - drop existing table if present
+ if (mode == CreationMode.OVERWRITE) {
+ if (register) {
+ dropTable(ident);
+ } else {
+ purgeTable(ident);
Review Comment:
There's a potential race condition in OVERWRITE mode. Between dropping the
existing table (lines 136/138) and creating the new one (line 143), another
thread could create a table with the same name, causing `createTableInternal`
to throw `TableAlreadyExistsException`. Consider documenting this behavior or
adding retry logic to handle this edge case.
```suggestion
int maxRetries = 3;
int attempt = 0;
while (true) {
try {
if (register) {
dropTable(ident);
} else {
purgeTable(ident);
}
// Try to create the table
return createTableInternal(
ident,
columns,
comment,
properties,
partitions,
distribution,
sortOrders,
indexes,
register,
location);
} catch (TableAlreadyExistsException e) {
attempt++;
if (attempt >= maxRetries) {
LOG.error("Failed to overwrite table {} after {} attempts due to
concurrent creation.", ident, attempt);
throw e;
}
LOG.warn("Table {} was concurrently created during OVERWRITE.
Retrying (attempt {}/{})...", ident, attempt, maxRetries);
// Sleep briefly to avoid busy loop (optional, can be tuned)
try {
Thread.sleep(100L);
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
throw new RuntimeException("Interrupted during table overwrite
retry", ie);
}
}
```
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -100,45 +109,48 @@ public Table createTable(
String location = properties.get(Table.PROPERTY_LOCATION);
Preconditions.checkArgument(
StringUtils.isNotBlank(location), "Table location must be specified");
- Map<String, String> storageProps =
LancePropertiesUtils.getLanceStorageOptions(properties);
+
+ // Extract creation mode from properties
+ CreationMode mode =
+ Optional.ofNullable(properties.get(LANCE_CREATION_MODE))
+ .map(CreationMode::valueOf)
Review Comment:
The `CreationMode.valueOf()` call will throw a generic
`IllegalArgumentException` for invalid mode values. Consider wrapping this in a
try-catch block to provide a more informative error message that lists the
valid creation modes (CREATE, EXIST_OK, OVERWRITE) to help users understand
what values are acceptable.
```suggestion
.map(modeStr -> {
try {
return CreationMode.valueOf(modeStr);
} catch (IllegalArgumentException e) {
throw new IllegalArgumentException(
"Invalid creation mode '" + modeStr + "'. Valid modes
are: " +
Arrays.stream(CreationMode.values())
.map(Enum::name)
.collect(Collectors.joining(", ")));
}
})
```
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -100,45 +109,48 @@ public Table createTable(
String location = properties.get(Table.PROPERTY_LOCATION);
Preconditions.checkArgument(
StringUtils.isNotBlank(location), "Table location must be specified");
- Map<String, String> storageProps =
LancePropertiesUtils.getLanceStorageOptions(properties);
+
+ // Extract creation mode from properties
+ CreationMode mode =
+ Optional.ofNullable(properties.get(LANCE_CREATION_MODE))
+ .map(CreationMode::valueOf)
+ .orElse(CreationMode.CREATE);
boolean register =
-
Optional.ofNullable(properties.get(LanceTableDelegator.PROPERTY_LANCE_TABLE_REGISTER))
+ Optional.ofNullable(properties.get(LANCE_TABLE_REGISTER))
.map(Boolean::parseBoolean)
.orElse(false);
- if (register) {
- // If this is a registration operation, just create the table metadata
without creating a new
- // dataset
- return super.createTable(
- ident, columns, comment, properties, partitions, distribution,
sortOrders, indexes);
+
+ // Handle EXIST_OK mode - check if table exists first
+ if (mode == CreationMode.EXIST_OK) {
+ try {
+ return super.loadTable(ident);
+ } catch (NoSuchTableException e) {
+ // Table doesn't exist, proceed with creation
Review Comment:
There's a potential race condition in EXIST_OK mode. Between checking if the
table exists (line 127) and creating it (line 143), another thread could create
the table, causing `createTableInternal` to throw
`TableAlreadyExistsException`. Consider wrapping the entire EXIST_OK logic in a
try-catch block that handles `TableAlreadyExistsException` after the creation
attempt, or documenting this as expected behavior.
```suggestion
// Table doesn't exist, proceed with creation
try {
return createTableInternal(
ident,
columns,
comment,
properties,
partitions,
distribution,
sortOrders,
indexes,
register,
location);
} catch (TableAlreadyExistsException ex) {
// Another thread/process created the table in the meantime, load
and return it
return super.loadTable(ident);
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]