Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21948#discussion_r207725721
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriterFactory.java
---
@@ -33,7 +33,10 @@
public interface DataWriterFactory<T> extends Serializable {
/**
- * Returns a data writer to do the actual writing work.
+ * Returns a data writer to do the actual writing work. Note that, Spark will reuse the same data
+ * object instance when sending data to the data writer, for better performance. Data writers
+ * are responsible for defensive copies if necessary, e.g. copy the data before buffer it in a
+ * list.
--- End diff --
nit: the description of defensive copies in data writers could be moved to
`DataWriter` instead.
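
To illustrate why the Javadoc note matters, here is a minimal, self-contained sketch of what goes wrong when a writer buffers a reused object without copying it. `MutableRow`, `BufferingWriter`, and `DefensiveCopyDemo` are hypothetical stand-ins invented for this example. They are not Spark classes and do not mirror the `DataWriter` interface; they only assume that the caller overwrites one record instance between calls to `write`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mutable record that the caller reuses between writes.
final class MutableRow {
    int value;

    MutableRow(int value) { this.value = value; }

    // Returns an independent snapshot of the current contents.
    MutableRow copy() { return new MutableRow(value); }
}

// Hypothetical writer that buffers records; it keeps both variants side by side
// purely so the difference can be observed.
final class BufferingWriter {
    private final List<MutableRow> unsafeBuffer = new ArrayList<>();
    private final List<MutableRow> safeBuffer = new ArrayList<>();

    void write(MutableRow row) {
        unsafeBuffer.add(row);       // stores the shared reference: later mutations leak in
        safeBuffer.add(row.copy());  // defensive copy: the buffered value stays as written
    }

    List<Integer> unsafeValues() { return values(unsafeBuffer); }

    List<Integer> safeValues() { return values(safeBuffer); }

    private static List<Integer> values(List<MutableRow> rows) {
        List<Integer> out = new ArrayList<>();
        for (MutableRow r : rows) {
            out.add(r.value);
        }
        return out;
    }
}

final class DefensiveCopyDemo {
    public static void main(String[] args) {
        BufferingWriter writer = new BufferingWriter();
        MutableRow reused = new MutableRow(0);
        for (int i = 1; i <= 3; i++) {
            reused.value = i;      // the caller overwrites the same instance each time
            writer.write(reused);
        }
        System.out.println("without copy: " + writer.unsafeValues()); // [3, 3, 3]
        System.out.println("with copy:    " + writer.safeValues());   // [1, 2, 3]
    }
}
```

A writer that consumes each record immediately (for example, by serializing it to an output stream inside `write`) would not need the copy, which is presumably why the doc says "if necessary".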
---