Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19623#discussion_r148243095
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java ---
@@ -75,8 +82,10 @@
   /**
    * Aborts this writing job because some data writers are failed to write the records and aborted,
    * or the Spark job fails with some unknown reasons, or {@link #commit(WriterCommitMessage[])}
-   * fails. If this method fails(throw exception), the underlying data source may have garbage that
-   * need to be cleaned manually, but these garbage should not be visible to data source readers.
+   * fails.
+   *
+   * If an exception was thrown, the underlying data source may have garbage that need to be
+   * cleaned manually, but these garbage should not be visible to data source readers.
--- End diff --
I doubt whether we should call `abort` if `commit` fails. If the job commit fails, the state of the destination is undefined, and it's hard for `abort` to deal with that.
I'd rather make the action fail if the job commit fails, and tell users that the state of the destination is undefined, as that's important information for users.
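To illustrate the proposal, here is a minimal sketch (not the actual Spark internals; `Writer`, `CommitMessage`, and `runJob` are hypothetical stand-ins for `DataSourceV2Writer` and `WriterCommitMessage`) of a driver-side flow where a failed job commit surfaces directly instead of triggering `abort`, since the destination state after a failed commit is undefined:

```java
// Hypothetical sketch of the proposed commit flow, NOT Spark's actual code:
// if commit() throws, we do not call abort() -- we fail the action and
// report that the destination state is undefined.
public class CommitFlowSketch {

    // Stand-in for WriterCommitMessage.
    interface CommitMessage {}

    // Stand-in for the commit/abort half of DataSourceV2Writer.
    interface Writer {
        void commit(CommitMessage[] messages);
        void abort(CommitMessage[] messages);
    }

    /** Returns a short description of the outcome, for demonstration. */
    static String runJob(Writer writer, CommitMessage[] messages) {
        try {
            writer.commit(messages);
            return "committed";
        } catch (RuntimeException e) {
            // Deliberately NOT calling writer.abort(messages) here: after a
            // failed job commit the destination state is undefined, so abort
            // has nothing well-defined to clean up. Surface the failure.
            return "failed: destination state undefined (" + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        Writer failing = new Writer() {
            public void commit(CommitMessage[] m) {
                throw new RuntimeException("commit error");
            }
            public void abort(CommitMessage[] m) {
                throw new AssertionError("abort must not be called after a failed commit");
            }
        };
        System.out.println(runJob(failing, new CommitMessage[0]));
    }
}
```

Running `main` prints the failure message rather than triggering `abort`, which would blow up with an `AssertionError` if the flow ever called it after a failed commit.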
---