vicennial commented on code in PR #40147:
URL: https://github.com/apache/spark/pull/40147#discussion_r1117931152
##########
connector/connect/common/src/main/protobuf/spark/connect/base.proto:
##########
@@ -183,6 +183,87 @@ message ExecutePlanResponse {
}
}
+// Request to transfer client-local artifacts.
+message AddArtifactsRequest {
+
+ // Definition of an Artifact.
+ message Artifact {
+ // The name of the artifact is expected in the form of a "Relative Path" that is made up of a
+ // sequence of directories and the final file element.
+ // Examples of "Relative Path"s: "jars/test.jar", "classes/xyz.class", "abc.xyz", "a/b/X.jar".
+ // The server is expected to maintain the hierarchy of files as defined by their name. (i.e
+ // The relative path of the file on the server's filesystem will be the same as the name of
+ // the provided artifact)
+ string name = 1;
+ // Raw data.
+ bytes data = 2;
+ // CRC to allow server to verify integrity of the artifact.
+ int64 crc = 3;
Review Comment:
Yes, that makes sense, but I'm wondering how we should proceed with the
naming, since the name `Artifact` should technically represent any artifact
regardless of size.
A proposal for the naming:
Name the smallest unit of data as `ArtifactChunk`
```
message ArtifactChunk {
  bytes data = 1;
  int64 crc = 2;
}
```
Use the name `SingleArtifact` (or perhaps `CompactArtifact`) instead of
`Artifact`
```
message SingleArtifact {
  string name = 1;
  ArtifactChunk data = 2;
}
```
Rename `BeginChunkedArtifact` to `ChunkedArtifact`
```
message ChunkedArtifact {
  string name = 1;
  int64 total_bytes = 2;
  int64 num_chunks = 3;
  ArtifactChunk initial_chunk = 4;
}
```
Now we can define an "Artifact" (verbally) as either a {`SingleArtifact`} or
{`ChunkedArtifact` + its appended `ArtifactChunk`s}. WDYT?
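
To make the proposed split concrete, here is a rough Python sketch of how a client might turn a payload into either a `SingleArtifact` or a `ChunkedArtifact` followed by its `ArtifactChunk`s. The dataclasses only mirror the message shapes proposed above. The helper names `make_chunk` and `to_messages`, the 32 KiB default chunk size, the use of CRC32 via `zlib.crc32`, and the reading that `num_chunks` counts the initial chunk are all assumptions for illustration, not anything decided in this PR.

```python
import zlib
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ArtifactChunk:
    data: bytes
    crc: int  # assumed here to be CRC32 of `data`


@dataclass
class SingleArtifact:
    name: str
    data: ArtifactChunk


@dataclass
class ChunkedArtifact:
    name: str
    total_bytes: int
    num_chunks: int  # assumed to include the initial chunk
    initial_chunk: ArtifactChunk


Message = Union[SingleArtifact, ChunkedArtifact, ArtifactChunk]


def make_chunk(data: bytes) -> ArtifactChunk:
    """Hypothetical helper: wrap raw bytes together with their CRC32."""
    return ArtifactChunk(data=data, crc=zlib.crc32(data))


def to_messages(name: str, payload: bytes, chunk_size: int = 32 * 1024) -> List[Message]:
    """Hypothetical helper: pick the single or chunked form based on size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Small payloads fit in one message.
    if len(payload) <= chunk_size:
        return [SingleArtifact(name=name, data=make_chunk(payload))]
    # Larger payloads: a header carrying the first chunk, then the rest.
    pieces = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
    header = ChunkedArtifact(
        name=name,
        total_bytes=len(payload),
        num_chunks=len(pieces),
        initial_chunk=make_chunk(pieces[0]),
    )
    return [header] + [make_chunk(piece) for piece in pieces[1:]]
```

With this shape, a receiver could in principle verify each chunk's CRC as it arrives and use `total_bytes` and `num_chunks` to detect a truncated upload, although how the server should handle that is outside this sketch.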
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]