[GitHub] [spark] cloud-fan commented on a diff in pull request #38192: [SPARK-40737][CONNECT] Add basic support for DataFrameWriter

GitBox Mon, 17 Oct 2022 07:07:23 -0700


cloud-fan commented on code in PR #38192:
URL: https://github.com/apache/spark/pull/38192#discussion_r997107932



##########
connector/connect/src/main/protobuf/spark/connect/commands.proto:
##########
@@ -62,3 +65,39 @@ message CreateScalarFunction {
     FUNCTION_LANGUAGE_SCALA = 3;
   }
 }
+
+// As writes are not directly handled during analysis and planning, they are modeled as commands.
+message WriteOperation {
+  // The output of the `input` relation will be persisted according to the options.
+  Relation input = 1;
+  // Format value according to the Spark documentation. Examples are: text, parquet, delta.
+  string source = 2;
+  // The destination of the write operation must be either a path or a table.
+  oneof save_type {
+    string path = 3;
+    string table_name = 4;
+  }
+  SaveMode mode = 5;

Review Comment:
   We added `DataFrameWriterV2` because we believe `SaveMode` is a bad design. 
It is confusing when writing to a table, because a single mode has to cover so 
many possible behaviors: create if not exists, create or replace, replace if 
exists, append if exists, overwrite data if exists, etc.
   
   Anyway, we need to support save mode in the proto definition to support the 
existing DataFrame API. If we want to support `DataFrameWriterV2` in the Spark 
Connect client, we should probably have a new proto definition without save mode.
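
   As a rough illustration only, a separate message without save mode might 
look something like the sketch below. Nothing in it is part of this PR: the 
message name `WriteOperationV2Sketch`, the `Action` enum, the package name, and 
all field numbers are hypothetical, and `Relation` is replaced by an empty 
stand-in so the fragment is self-contained. The enum values simply mirror the 
explicit `DataFrameWriterV2` actions (`create`, `replace`, `createOrReplace`, 
`append`, `overwritePartitions`). `overwrite(condition)` is left out because it 
would also need an expression field.

```protobuf
syntax = "proto3";

// Hypothetical package name, chosen only for this sketch.
package sketch;

// Empty stand-in for the real Relation message (assumption, so the sketch
// does not depend on other proto files).
message Relation {}

// Hypothetical V2-style write: the caller names one explicit action instead
// of passing a SaveMode whose meaning depends on whether the table exists.
message WriteOperationV2Sketch {
  // The output of the `input` relation will be written to the table.
  Relation input = 1;
  // Target table identifier, as passed to `writeTo(...)`.
  string table_name = 2;
  // Optional provider, as passed to `using(...)`.
  string source = 3;

  // One value per explicit DataFrameWriterV2 action.
  enum Action {
    ACTION_UNSPECIFIED = 0;
    ACTION_CREATE = 1;
    ACTION_REPLACE = 2;
    ACTION_CREATE_OR_REPLACE = 3;
    ACTION_APPEND = 4;
    ACTION_OVERWRITE_PARTITIONS = 5;
  }
  Action action = 4;
}
```

   Each action has a single meaning here, which avoids the ambiguity described 
above, where one `SaveMode` value has to cover several table states.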



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

