alamb commented on code in PR #6096:
URL: https://github.com/apache/arrow-datafusion/pull/6096#discussion_r1176689360
##########
datafusion/core/tests/sqllogictests/test_files/information_schema.slt:
##########
@@ -375,7 +375,7 @@ SHOW CREATE TABLE test.xyz
----
datafusion test xyz CREATE VIEW test.xyz AS SELECT * FROM abc
-statement error DataFusion error: This feature is not implemented: Only `DROP
TABLE/VIEW
+statement error DataFusion error: Execution error: Cannot drop schema test
because other tables depend on it: xyz
Review Comment:
👍
##########
datafusion/core/src/execution/context.rs:
##########
@@ -676,6 +701,53 @@ impl SessionContext {
Ok(false)
}
+ /// Attempts to find a schema and deregister it. Returns a tuple of the
schema and a
+ /// flag indicating whether dereg was performed (e.g if schema is found
but has tables
+ /// then `cascade` must be set)
+ fn find_and_deregister_schema<'a>(
Review Comment:
I wonder if different catalogs might want to implement the "is the schema
empty" check differently 🤔
If so, it might make sense to put logic for "is the schema empty so can we
drop it" in the `CatalogProvider`
Perhaps if `CatalogProvider` had a signature like:
```
fn deregister_schema(&self, name: &str, cascade: bool) ->
Result<Option<Arc<dyn SchemaProvider>>> {
```
I think this code would get significantly simpler as well as the
`debug_assert`s to reassert what had just been validated would be unnecessary.
That being said, we could also do such a change as a follow on PR
##########
datafusion/core/src/catalog/catalog.rs:
##########
@@ -124,6 +124,17 @@ pub trait CatalogProvider: Sync + Send {
"Registering new schemas is not supported".to_string(),
))
}
+
+ /// Remove schema from this catalog if it exists.
+ ///
+ /// By default returns a "Not Implemented" error
+ fn deregister_schema(&self, name: &str) -> Result<Option<Arc<dyn
SchemaProvider>>> {
+ // use variables to avoid unused variable warnings
Review Comment:
An alternative pattern is to name the variable with a prefix. Like`_`
```
fn deregister_schema(&self, _name: &str) -> Result<Option<Arc<dyn
SchemaProvider>>> {
```
##########
datafusion/core/src/execution/context.rs:
##########
@@ -2617,4 +2690,96 @@ mod tests {
.unwrap()
}
}
+
+ /// helper for the following drop schema tests
Review Comment:
I think we could use sqllogic tests
https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sqllogictests/
to write this more concisely
For example, perhaps we could extend:
https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sqllogictests/test_files/ddl.slt
##########
datafusion/common/src/table_reference.rs:
##########
@@ -450,3 +450,39 @@ mod tests {
);
}
}
+
+#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
Review Comment:
This is somewhat strange / non standard to have code below a `mod test` --
maybe it would be cleaner in its own module (e.g.
`datafusion/common/src/schema_reference.rs`) perhaps
##########
datafusion/core/tests/sql/create_drop.rs:
##########
@@ -73,3 +73,40 @@ async fn create_external_table_with_ddl() -> Result<()> {
Ok(())
}
+
+#[tokio::test]
Review Comment:
I recommend putting these tests in
https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sqllogictests/test_files/ddl.slt
rather than a rs test
##########
datafusion/core/src/execution/context.rs:
##########
@@ -447,6 +447,31 @@ impl SessionContext {
}
}
+ LogicalPlan::DropCatalogSchema(DropCatalogSchema {
+ name,
+ if_exists,
+ cascade,
+ ..
+ }) => {
+ let (dereg, schema) = self.find_and_deregister_schema(&name,
cascade)?;
+ match (if_exists, schema, dereg) {
+ // successful deregsiter
+ (_, _, true) => self.return_empty_dataframe(),
+ // schema found but failed to deregister
+ (_, Some(schema), false) =>
Err(DataFusionError::Execution(format!(
+ "Cannot drop schema {} because other tables depend on
it: {}",
+ name,
+ itertools::join(schema.table_names().iter(), ", ")
+ ))),
+ // no schema found
+ (false, None, false) =>
Err(DataFusionError::Execution(format!(
+ "Schema '{name}' doesn't exist."
+ ))),
+ // no schema found but dont return error
Review Comment:
The is like `DROP SCHEMA foo IF EXISTS`
This PR I think is consistent with what postgres does in this case
```
postgres=# drop schema if exists foo;
NOTICE: schema "foo" does not exist, skipping
DROP SCHEMA
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]