cloud-fan commented on code in PR #56111:
URL: https://github.com/apache/spark/pull/56111#discussion_r3346354523
##########
sql/core/src/main/scala/org/apache/spark/sql/classic/DataFrameWriter.scala:
##########
@@ -176,7 +176,21 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) extends sql.DataFram
val catalog = CatalogV2Util.getTableProviderCatalog(
supportsExtract, catalogManager, dsOptions)
- (catalog.loadTable(ident), Some(catalog), Some(ident))
+ try {
+ (catalog.loadTable(ident), Some(catalog), Some(ident))
+ } catch {
+ // SPARK-57068: align Overwrite-on-missing with V1 source behavior
Review Comment:
I'd reconsider aligning with V1 source behavior here. A
`SupportsCatalogOptions` source resolves a real catalog table, so `save()` is
DML on that table, not a path-based data write. On an existing table,
`mode("overwrite")` becomes `OverwriteByExpression(Literal(true))` just below
(equivalent to SQL `INSERT OVERWRITE`: truncate, then write), and
`mode("append")` becomes `AppendData` (equivalent to `INSERT INTO`).

SQL `INSERT OVERWRITE` and `INSERT INTO` both require an existing table and
throw `TABLE_OR_VIEW_NOT_FOUND` on a missing one; there is no auto-create on
insert. So under SQL table semantics both Overwrite-missing and Append-missing
should throw, which is the current (consistent) behavior. Creating the table
only on Overwrite-missing breaks that symmetry and mixes in the V1
pathless-source model.

The create-or-replace verb already exists via `saveAsTable`
(`ReplaceTableAsSelect(orCreate = true)`, i.e. `CREATE OR REPLACE TABLE AS
SELECT`). My suggestion is to leave `mode("overwrite").save()` throwing on a
missing catalog table and point users at `saveAsTable` for create-or-replace.
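
To make the symmetry argument concrete, here is a rough, self-contained model of the decision table described in this comment. It is a sketch only, not Spark's code: `CatalogWriteModel`, `Mode`, `Outcome`, `save`, and `saveAsTableOverwrite` are hypothetical names made up for illustration, and the outcomes are just labels for the plans and error named in this comment.

```scala
// Toy model of the write semantics discussed in this comment.
// Every name in this object is hypothetical; none of it is Spark API.
object CatalogWriteModel {
  sealed trait Mode
  case object Append extends Mode
  case object Overwrite extends Mode

  sealed trait Outcome
  case object AppendData extends Outcome              // like INSERT INTO
  case object OverwriteEverything extends Outcome     // like INSERT OVERWRITE
  case object CreateOrReplace extends Outcome         // like CREATE OR REPLACE TABLE AS SELECT
  case object TableOrViewNotFound extends Outcome     // the error on a missing table

  // save() against a catalog-resolved table: both modes are DML,
  // so a missing table is an error regardless of the mode.
  def save(mode: Mode, tableExists: Boolean): Outcome =
    (mode, tableExists) match {
      case (_, false)        => TableOrViewNotFound
      case (Append, true)    => AppendData
      case (Overwrite, true) => OverwriteEverything
    }

  // saveAsTable with overwrite: create-or-replace, so the result
  // does not depend on whether the table already exists.
  def saveAsTableOverwrite(tableExists: Boolean): Outcome = CreateOrReplace
}
```

In this model the two missing-table cases of `save` share a single branch, which is the symmetry the PR would break; create-or-replace stays confined to the `saveAsTable` path.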
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.