Github user xwu0226 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16626#discussion_r107049953
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -296,6 +311,51 @@ class SessionCatalog(
}
/**
+ * Alter the schema of a table identified by the provided table identifier. The new schema
+ * should still contain the existing bucket columns and partition columns used by the table. This
+ * method will also update any Spark SQL-related parameters stored as Hive table properties (such
+ * as the schema itself).
+ *
+ * @param identifier TableIdentifier
+ * @param newSchema Updated schema to be used for the table (must contain existing partition and
+ * bucket columns)
+ */
+ def alterTableSchema(
+ identifier: TableIdentifier,
+ newSchema: StructType): Unit = {
+ val db = formatDatabaseName(identifier.database.getOrElse(getCurrentDatabase))
+ val table = formatTableName(identifier.table)
+ val tableIdentifier = TableIdentifier(table, Some(db))
+ requireDbExists(db)
+ requireTableExists(tableIdentifier)
+ checkDuplication(newSchema)
+
+ val catalogTable = externalCatalog.getTable(db, table)
+ val oldSchema = catalogTable.schema
+
+ // not supporting dropping columns yet
+ val nonExistentColumnNames = oldSchema.map(_.name).filterNot(columnNameResolved(newSchema, _))
+ if (nonExistentColumnNames.nonEmpty) {
+ throw new AnalysisException(
+ s"""
+ |Some existing schema fields (${nonExistentColumnNames.mkString("[", ",", "]")}) are
+ |not present in the new schema. We don't support dropping columns yet.
+ """.stripMargin)
+ }
+
+ // make sure partition columns are at the end
+ val partitionSchema = catalogTable.partitionSchema
--- End diff --
@cloud-fan Thanks! My understanding is that the caller may pass in a new
schema in which the partition columns are not at the end. So I want to enforce
that ordering before passing the schema to
`externalCatalog.alterTableSchema`.
How about I change the contract of this function instead, so that the caller
must ensure the column ordering in `newSchema` before calling it?
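
To make the two options concrete, here is a minimal sketch of what "partition columns at the end" could mean in code. It is not the code from this PR: `Field` is a hypothetical stand-in for Spark's `StructField` so the snippet runs without Spark, `SchemaOrdering` and both helper names are made up for illustration, and the case-insensitive name comparison is an assumption about the resolver in use.

```scala
// Hypothetical stand-in for Spark's StructField, so the sketch runs without Spark.
final case class Field(name: String, dataType: String)

object SchemaOrdering {
  // Assumption: column names are compared case-insensitively.
  private def sameName(a: String, b: String): Boolean = a.equalsIgnoreCase(b)

  /** Returns `schema` with the partition columns moved to the end,
    * in the order given by `partitionColumnNames`. Data columns keep
    * their relative order.
    */
  def partitionColumnsLast(
      schema: Seq[Field],
      partitionColumnNames: Seq[String]): Seq[Field] = {
    val (partitionFields, dataFields) =
      schema.partition(f => partitionColumnNames.exists(sameName(_, f.name)))
    // Emit partition fields in the catalog's order, not the caller's order.
    val orderedPartitionFields =
      partitionColumnNames.flatMap(p => partitionFields.find(f => sameName(f.name, p)))
    dataFields ++ orderedPartitionFields
  }

  /** True when `schema` already ends with the partition columns, in order. */
  def endsWithPartitionColumns(
      schema: Seq[Field],
      partitionColumnNames: Seq[String]): Boolean = {
    val tail = schema.takeRight(partitionColumnNames.length).map(_.name)
    tail.length == partitionColumnNames.length &&
      tail.zip(partitionColumnNames).forall { case (a, b) => sameName(a, b) }
  }
}
```

With the first option, `alterTableSchema` would apply something like `partitionColumnsLast` itself before calling the external catalog. With the second, it would only check a condition like `endsWithPartitionColumns` and reject schemas that violate it, leaving the reordering to the caller.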