Hi devs,

I'd like to propose adding an explicit close() method to DataWriter as the
dedicated place for resource cleanup.

The rationale for the proposal comes from the lifecycle of DataWriter.
If the scaladoc of DataWriter is correct, the lifecycle of a DataWriter
instance ends at either commit() or abort(). That leads data source
implementors to feel they should put resource cleanup in both methods, but
abort() can be called when commit() fails, so they have to make sure they
don't clean up twice if the cleanup is not idempotent.
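
To make the hazard concrete, here is a minimal sketch in Java. SimpleWriter
is a hypothetical, stripped-down stand-in for DataWriter (not the real Spark
interface), and GuardedWriter and its release counter are invented purely for
illustration. It shows the kind of guard an implementor needs today when
cleanup has to run from both commit() and abort():

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for DataWriter; the real interface has more methods.
interface SimpleWriter {
    void commit() throws Exception;
    void abort();
}

// Writer that releases its resource from both commit() and abort(),
// which is what the current lifecycle pushes implementors towards.
class GuardedWriter implements SimpleWriter {
    // Counts how many times the underlying resource was actually released.
    final AtomicInteger releaseCount = new AtomicInteger();
    private boolean released = false;
    private final boolean failOnCommit;

    GuardedWriter(boolean failOnCommit) {
        this.failOnCommit = failOnCommit;
    }

    // Guard so that a non-idempotent release runs at most once.
    private void releaseOnce() {
        if (!released) {
            released = true;
            releaseCount.incrementAndGet();
        }
    }

    @Override
    public void commit() throws Exception {
        try {
            if (failOnCommit) {
                throw new Exception("simulated commit failure");
            }
        } finally {
            releaseOnce(); // cleanup on the commit path
        }
    }

    @Override
    public void abort() {
        releaseOnce(); // cleanup on the abort path; may follow a failed commit()
    }
}
```

Without the released flag, a failed commit() followed by abort() would
release the resource twice. With a separate close(), the flag and the
duplicated cleanup calls would no longer be needed.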

I've checked some callers to see whether they can apply "try-catch-finally"
to ensure close() is called at the end of the DataWriter lifecycle, and
it looks like they can, but I might be missing something.
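
The caller-side pattern I have in mind looks roughly like the sketch below.
ClosableWriter, RecordingWriter and WriteTask are hypothetical names used
only for illustration; they are not the actual Spark classes, and a real
caller would likely rethrow the failure rather than return a boolean:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DataWriter with the proposed close() added.
interface ClosableWriter {
    void commit() throws Exception;
    void abort();
    void close();
}

// Records every call so the resulting order can be inspected.
class RecordingWriter implements ClosableWriter {
    final List<String> calls = new ArrayList<>();
    private final boolean failOnCommit;

    RecordingWriter(boolean failOnCommit) {
        this.failOnCommit = failOnCommit;
    }

    @Override
    public void commit() throws Exception {
        calls.add("commit");
        if (failOnCommit) {
            throw new Exception("simulated commit failure");
        }
    }

    @Override
    public void abort() {
        calls.add("abort");
    }

    @Override
    public void close() {
        calls.add("close");
    }
}

class WriteTask {
    // Commit, abort on failure, and always close exactly once at the end.
    static boolean run(ClosableWriter writer) {
        try {
            writer.commit();
            return true;
        } catch (Exception e) {
            writer.abort();
            return false;
        } finally {
            writer.close();
        }
    }
}
```

With this shape, commit() and abort() only deal with the outcome of the
write, and close() is the single place where resources are released,
whichever path was taken.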

What do you think? It would be a backward incompatible change, but given
that the interface is marked as Evolving and we're already making backward
incompatible changes in Spark 3.0, I feel it may not matter.

Would love to hear your thoughts.

Thanks in advance,
Jungtaek Lim (HeartSaVioR)
