Hello All,
I expected FileStagingOptions#setFilesToStage, available through PortablePipelineOptions
<https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java#L28>,
to be the way to customize which artifacts are staged and resolved in a portable
pipeline. However, it looks like PortableRunner
<https://github.com/apache/beam/blob/master/runners/portability/java/src/main/java/org/apache/beam/runners/portability/PortableRunner.java#L129>
does not add the preconfigured files to `filesToStageBuilder`, which is what ends up in the
final options used to prepare the job. Is this the expected behavior, or is it a bug?

In addition, do we support specifying a URL in
PortablePipelineOptions#filesToStage so that ArtifactRetrievalService can
retrieve artifacts from a remote address, instead of the default of fetching them
from the JobServer, which in turn got them from the SDK client? I am asking
because I noticed the following:

public static InputStream getArtifact(RunnerApi.ArtifactInformation artifact)
    throws IOException {
  switch (artifact.getTypeUrn()) {
    case FILE_ARTIFACT_URN:
      RunnerApi.ArtifactFilePayload payload =
          RunnerApi.ArtifactFilePayload.parseFrom(artifact.getTypePayload());
      return Channels.newInputStream(
          FileSystems.open(
              FileSystems.matchNewResource(payload.getPath(), false /* is directory */)));
    case EMBEDDED_ARTIFACT_URN:
      return RunnerApi.EmbeddedFilePayload.parseFrom(artifact.getTypePayload())
          .getData()
          .newInput();
    default:
      throw new UnsupportedOperationException(
          "Unexpected artifact type: " + artifact.getTypeUrn());
  }
}

This suggests that only file and embedded artifacts are supported at the moment.
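
To make the question concrete, a URL branch could look roughly like the sketch below. This is not Beam code and does not use any Beam API: the class name, the URN strings, and the idea of passing the payload as raw bytes are all placeholders made up for illustration, and a real version would presumably parse the corresponding RunnerApi payload instead. It only shows the same switch-on-URN dispatch with one extra case that opens a stream from a URL.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical stand-in for the dispatch in getArtifact; not part of Beam.
class ArtifactDispatchSketch {
  // Placeholder URNs invented for this sketch, not the real Beam URNs.
  static final String FILE_ARTIFACT_URN = "sketch:artifact:type:file";
  static final String EMBEDDED_ARTIFACT_URN = "sketch:artifact:type:embedded";
  static final String URL_ARTIFACT_URN = "sketch:artifact:type:url";

  // Opens an artifact from its type URN and a payload given as raw bytes.
  // For the file and URL types the payload is assumed to be a UTF-8 string.
  static InputStream getArtifact(String typeUrn, byte[] payload) throws IOException {
    switch (typeUrn) {
      case FILE_ARTIFACT_URN:
        // Payload is a local file path.
        return Files.newInputStream(Paths.get(new String(payload, StandardCharsets.UTF_8)));
      case EMBEDDED_ARTIFACT_URN:
        // Payload is the artifact content itself.
        return new ByteArrayInputStream(payload);
      case URL_ARTIFACT_URN:
        // Payload is an absolute URL; the stream is opened directly from it.
        return URI.create(new String(payload, StandardCharsets.UTF_8)).toURL().openStream();
      default:
        throw new UnsupportedOperationException("Unexpected artifact type: " + typeUrn);
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] payload = "embedded bytes".getBytes(StandardCharsets.UTF_8);
    try (InputStream in = getArtifact(EMBEDDED_ARTIFACT_URN, payload)) {
      System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
    }
  }
}
```

With something like this, the retrieval side would fetch the artifact from the remote address itself rather than relying on a copy uploaded by the SDK client.
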
Best,
Ke