GitHub user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22688
Ah, good point. I think the original design of the append operator assumes the
table already exists, so a schema should be provided. If we treat a file path as
a table, then append should fail in your case because the path does not exist,
and we should use CTAS instead. cc @rdblue for confirmation.
That said, the change here LGTM. We should only get the relation for append
mode.
Furthermore, I think in the future we can't simply proxy the old `SaveMode`
write APIs to the new write APIs, as the behavior can differ. For example,
currently we can write data to a nonexistent path with append mode for file
sources, but the append operator cannot.
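To make the difference concrete, here is a rough toy model of the two
behaviors. It is only a sketch: `Catalog`, `legacy_append`, `v2_append`, and
`v2_ctas` are hypothetical names made up for illustration, not Spark APIs, and
the "CTAS fails if the table already exists" rule is an assumption borrowed
from the usual SQL semantics.

```python
# Toy model only: every name here is hypothetical, none of this is Spark code.


class TableNotFound(Exception):
    """Raised when an append targets a table (path) that does not exist."""


class Catalog:
    """Maps a path, treated as a table name, to the rows stored there."""

    def __init__(self):
        self.tables = {}


def legacy_append(catalog, path, rows):
    # Old SaveMode append for file sources: a missing path is created silently.
    catalog.tables.setdefault(path, []).extend(rows)


def v2_append(catalog, path, rows):
    # Append operator: assumes the table already exists, otherwise fails.
    if path not in catalog.tables:
        raise TableNotFound(path)
    catalog.tables[path].extend(rows)


def v2_ctas(catalog, path, rows):
    # CTAS: creates the table from the data (assumed to fail if it exists).
    if path in catalog.tables:
        raise ValueError(f"table already exists: {path}")
    catalog.tables[path] = list(rows)
```

With this model, `legacy_append` on a fresh path succeeds, `v2_append` on the
same fresh path raises `TableNotFound`, and the write has to go through
`v2_ctas` first. That is why a plain one-to-one proxy from the old API would
change behavior.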
I'm not sure this should block 2.4. The data source v2 API is unstable, so
breaking changes are allowed, and we won't treat data source v2 bugs as
blockers. We should merge this PR to 2.4, but it's not serious enough to fail
an RC.