joellubi commented on code in PR #38385:
URL: https://github.com/apache/arrow/pull/38385#discussion_r1391200200


##########
format/FlightSql.proto:
##########
@@ -1779,9 +1796,61 @@ message CommandPreparedStatementUpdate {
 }
 
 /*
- * Returned from the RPC call DoPut when a CommandStatementUpdate
- * CommandPreparedStatementUpdate was in the request, containing
- * results from the update.
+ * Represents a bulk ingestion request. Used in the command member of 
FlightDescriptor
+ * for the the RPC call DoPut to cause the server load the contents of the 
stream's
+ * FlightData into the target destination.
+ */
+message CommandStatementIngest {
+  option (experimental) = true;
+
+  // Describes the behavior for loading bulk data.
+  enum IngestMode {
+    // Ingestion behavior unspecified.
+    INGEST_MODE_UNSPECIFIED = 0;
+    // Create the target table. Fail if the target table already exists.
+    INGEST_MODE_CREATE = 1;
+    // Append to an existing target table. Fail if the target table does not 
exist.
+    INGEST_MODE_APPEND = 2;
+    // Drop the target table if it exists. Then follow INGEST_MODE_CREATE 
behavior.
+    INGEST_MODE_REPLACE = 3;
+    // Create the target table if it does not exist. Then follow 
INGEST_MODE_APPEND behavior.
+    INGEST_MODE_CREATE_APPEND = 4;
+  }
+
+  // The ingestion behavior.
+  IngestMode mode = 1;
+  // The table to load data into.
+  string table = 2;
+  // The db_schema of the destination table to load data into. If unset, a 
backend-specific default may be used.
+  optional string schema = 3;
+  // The catalog of the destination table to load data into. If unset, a 
backend-specific default may be used.
+  optional string catalog = 4;
+  /*
+   * Store ingested data in a temporary table.
+   * With the exception of INGEST_MODE_APPEND, leaving temporary unspecified 
is equivalent to temporary=false.

Review Comment:
   My impression is that clients would expect they can just use a temp table 
without having to qualify it explicitly as long as the session is still open. 
In the context of your comment regarding how existing implementations handle 
this,
   
   > The ADBC drivers supporting this feature do this too. Basically, 
temporary=true is equivalent to setting schema=pg_temp, and conversely 
temporary=false is equivalent to setting schema=CURRENT_SCHEMA.
   
   The extension to handle the "unspecified" case could either be:
   1. > temporary unspecified is equivalent to setting schema=CURRENT_SCHEMA
   2. > temporary unspecified is equivalent to leaving schema unspecified
   
   An equivalency between "unspecified" and `false` results in case 1. In most 
DBs leaving the schema unspecified (case 2) defaults to the temporary table 
taking priority, which what the special semantics attempt to describe.
   
   Without the special handling of this case there could also be a mismatch 
with other commands such as `CommandStatementQuery` which don't have a 
`temporary` parameter. I think it would be most consistent for 
`temporary=unspecified` in ingest to match the semantics of other commands that 
do not specify `temporary`. The following "rough" mapping to SQL illustrates 
this:
   
   ```sql
   CREATE OR REPLACE TEMP TABLE foo; INSERT INTO TABLE foo; -- 
INGEST_MODE_REPLACE
   SELECT * FROM foo; -- Reading from temp table, but we don't need to specify 
that it's temporary.
   INSERT INTO TEMP TABLE foo; -- INGEST_MODE_APPEND; do we need "TEMP" if it's 
the same table?
   ```
   The use of `TEMP` in the third statement is required for case 1, but may be 
left out in case 2. IMO it would be confusing if `SELECT * FROM foo` and 
`INSERT INTO foo` actually referred to different tables when `TEMP` is not 
specified for the INSERT.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to