joellubi commented on issue #1107:
URL: https://github.com/apache/arrow-adbc/issues/1107#issuecomment-1757548252
Thanks for the feedback on this. I've started to put together some of these
changes in preparation for a pull request but have had a few further questions
come up.
First, here is the current draft of the proto definition I have:
```proto
/*
 * Represents a bulk ingestion request. Used in the command member of FlightDescriptor
 * for the RPC call DoPut to cause the server to load the contents of the stream's
 * FlightData into the target destination.
 */
message CommandStatementIngest {
  option (experimental) = true;

  // Describes the behavior for loading bulk data.
  enum IngestMode {
    // Ingestion behavior unspecified.
    INGEST_MODE_UNSPECIFIED = 0;
    // Create the target table. Fail if the target table already exists.
    INGEST_MODE_CREATE = 1;
    // Append to an existing target table. Fail if the target table does not exist.
    INGEST_MODE_APPEND = 2;
    // Drop the target table if it exists. Then follow INGEST_MODE_CREATE behavior.
    INGEST_MODE_REPLACE = 3;
    // Create the target table if it does not exist. Then follow INGEST_MODE_APPEND behavior.
    INGEST_MODE_CREATE_APPEND = 4;
  }

  // The ingestion behavior.
  IngestMode mode = 1;
  // The table to load data into.
  string target_table = 2;
  // The db_schema of the target_table to load data into. If unset, ... (TODO)
  optional string target_schema = 3;
  // The catalog of the target_table to load data into. If unset, ... (TODO)
  optional string target_catalog = 4;
  // Use a temporary table for target_table.
  optional bool temporary = 5;
  // Backend-specific options.
  map<string, string> options = 1000;
}
```
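To make the intended semantics of the four modes concrete, here is a rough sketch in Python against an in-memory dict standing in for a backend. The `ingest` function and the `tables` dict are hypothetical and only for illustration; rejecting `INGEST_MODE_UNSPECIFIED` is my assumption, since the draft does not say what a server should do with it.

```python
from enum import IntEnum


class IngestMode(IntEnum):
    # Values mirror the draft IngestMode enum.
    UNSPECIFIED = 0
    CREATE = 1
    APPEND = 2
    REPLACE = 3
    CREATE_APPEND = 4


def ingest(tables, mode, target_table, rows):
    """Load rows into a hypothetical in-memory backend (dict of name -> list).

    Returns the number of rows in the target table after loading.
    """
    exists = target_table in tables
    if mode == IngestMode.CREATE:
        # Create the table; fail if it already exists.
        if exists:
            raise ValueError(f"table {target_table!r} already exists")
        tables[target_table] = list(rows)
    elif mode == IngestMode.APPEND:
        # Append to an existing table; fail if it does not exist.
        if not exists:
            raise ValueError(f"table {target_table!r} does not exist")
        tables[target_table].extend(rows)
    elif mode == IngestMode.REPLACE:
        # Drop the table if present, then behave like CREATE.
        tables.pop(target_table, None)
        tables[target_table] = list(rows)
    elif mode == IngestMode.CREATE_APPEND:
        # Create the table if missing, then behave like APPEND.
        tables.setdefault(target_table, []).extend(rows)
    else:
        # Assumption: an unspecified mode is treated as an error.
        raise ValueError("ingest mode must be specified")
    return len(tables[target_table])
```

With this sketch, `REPLACE` and `CREATE_APPEND` never fail on table existence, while `CREATE` and `APPEND` each fail in exactly one case.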
A few open questions:
1. How should the behavior of an unset `target_schema` or `target_catalog`
be described? I suppose it would have to use the backend-specific default, if
one exists. Would that be a reasonable specification of the behavior?
2. @alamb had a comment about including `optional bytes transaction_id` as a
field. I think this makes sense but it's not clear to me how a server should
answer `SqlInfo` requests about `SQL_TRANSACTIONS_SUPPORTED` or
`FLIGHT_SQL_SERVER_TRANSACTION` if it supports transactions for queries but not
necessarily bulk ingestion, which may often be the case. We could add
more options to `FLIGHT_SQL_SERVER_TRANSACTION` to capture the distinction, but
perhaps it might be simpler to leave it out of the spec and wait until
transaction support for bulk ingestion is explicitly needed in the spec before
making those extensions.
3. A more general development/contributing question: Are there specific tool
versions that should be used when generating files during development? For
example, there are several newer versions of `protoc-gen-go` than the one the go
pb files are currently generated with (`v1.28.1`), and I was wondering if it
would be an issue to bump these versions.
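
For question 1, the "fall back to the backend-specific default, if one exists" behavior could look roughly like the sketch below. This is only an illustration of the proposal, not a spec: `resolve_target`, `default_schema`, and `default_catalog` are hypothetical names, and the sketch ignores identifier quoting entirely.

```python
def resolve_target(table, schema=None, catalog=None,
                   default_schema=None, default_catalog=None):
    """Build a qualified table name, using backend defaults for unset parts.

    A part is omitted when it is unset and the backend has no default for it.
    """
    # An explicitly set value always wins over the backend default.
    resolved_catalog = catalog if catalog is not None else default_catalog
    resolved_schema = schema if schema is not None else default_schema
    parts = [resolved_catalog, resolved_schema, table]
    # Assumption: parts are joined with "." and no quoting is applied.
    return ".".join(p for p in parts if p is not None)
```

Under this reading, a request that leaves both fields unset against a backend with no defaults resolves to just the bare `target_table`.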