[
https://issues.apache.org/jira/browse/BEAM-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475954#comment-17475954
]
Luke Cwik edited comment on BEAM-13215 at 1/14/22, 5:44 AM:
------------------------------------------------------------
We should really be working on making the credentials provider convertible to
and from a JSON representation, as is done for AWS. The AwsModule provides the
code Jackson uses to convert the object to and from JSON. We could do the same
for a GCP credentials provider so that it can be transferred as a pipeline
option, allowing other languages and other services to act on behalf of those
credentials.
Code pointers:
https://github.com/apache/beam/blob/872455570ae7f3e2e35360bccf93b503ae9fdb5c/sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/options/AwsOptions.java#L84
https://github.com/apache/beam/blob/872455570ae7f3e2e35360bccf93b503ae9fdb5c/sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/options/AwsModule.java#L73
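As a rough illustration of the round trip described above, here is a minimal Python sketch of a credentials provider that can be turned into JSON and rebuilt from it, so it could travel as a string-valued pipeline option. The class names, the `@type` tag, the `key_file` field, and both helper functions are hypothetical; none of them are Beam or Jackson APIs, and which fields a real GCP provider would need to carry is left open.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ServiceAccountKeyProvider:
    """Hypothetical provider that points at a service account key file."""
    key_file: str


@dataclass(frozen=True)
class DefaultProvider:
    """Hypothetical provider that defers to Application Default Credentials."""


# Registry from type tag to provider class. This plays the role that a
# serializer/deserializer module plays on the Java side (an assumption made
# for illustration, not a description of AwsModule internals).
_PROVIDERS = {
    cls.__name__: cls for cls in (ServiceAccountKeyProvider, DefaultProvider)
}


def provider_to_json(provider) -> str:
    """Serialize a provider as a JSON object tagged with its type name."""
    payload = {"@type": type(provider).__name__, **asdict(provider)}
    return json.dumps(payload, sort_keys=True)


def provider_from_json(text: str):
    """Rebuild a provider from the JSON produced by provider_to_json."""
    payload = json.loads(text)
    cls = _PROVIDERS[payload.pop("@type")]
    return cls(**payload)
```

The type tag is what lets a receiver (another SDK or a service) pick the right provider class without sharing Python or Java objects; only the JSON string crosses the boundary.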
> Portable OSS runners do not support GCP credentials for GCP IOs.
> ----------------------------------------------------------------
>
> Key: BEAM-13215
> URL: https://issues.apache.org/jira/browse/BEAM-13215
> Project: Beam
> Issue Type: Bug
> Components: io-go-gcp, io-java-gcp, io-py-gcp, java-fn-execution
> Reporter: Daniel Oliveira
> Priority: P2
>
> When a pipeline that uses a GCP IO is run on a portable runner with Docker
> as the SDK Harness environment, the SDK Harness does not have the user's GCP
> credentials available and the pipeline fails.
> There are apparently [pipeline options for setting
> credentials|https://github.com/apache/beam/blob/v2.33.0/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcpOptions.java#L170],
> but as far as I can tell they are either meant only for non-portable
> pipelines, or only for the Dataflow runner.
> The tricky part of implementing this is that credentials for GCP are not
> straightforward, and having them available for something like the Application
> Default Credentials API involves copying over multiple files or environment
> variables. The following article provides a lot of context for the
> difficulties involved:
> [https://medium.com/datamindedbe/application-default-credentials-477879e31cb5]
> Possible solutions (note that these are mostly untested):
> # Perform some volume-mounting when calling the "docker run" command to
> mount directories containing credentials. Preferably this can be set via some
> sort of pipeline option. (This could potentially also be used to provide
> directories for docker containers to write output files to with TextIO or
> FileIO.) See the article above for an example.
> ** This solution may not work with runners on remote endpoints though. The
> directory mounted must be on the same machine as the docker container to work
> properly, which may not be possible in some cases with remote runners.
> # Require custom containers with appropriate credentials provided. This is
> more robust than the solution above, but less user-friendly, and would
> require a good amount of documentation to be available.
> ** This could be combined with the solution above, and might
> be a good way of supporting GCP credentials on remote runners. Custom
> containers can store any valid credentials of the user's choice (for example,
> service account credentials for a production service) and then be run on any
> machine.
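
The volume-mounting option from the description could look roughly like the following Python sketch, which only builds the argument list for "docker run" and does not execute it. The function name, the container directory, and the image name in the usage line are hypothetical. The sketch assumes a single credentials file and relies on the GOOGLE_APPLICATION_CREDENTIALS environment variable, which Application Default Credentials consults. Like the option it illustrates, it is untested against a real runner.

```python
import os
from typing import List


def docker_run_args(image: str, host_credentials_file: str,
                    container_dir: str = "/gcp-credentials") -> List[str]:
    """Build a "docker run" argv that mounts one credentials file read-only."""
    host_path = os.path.abspath(host_credentials_file)
    # Keep the file name, but place it under a fixed directory in the container.
    container_path = f"{container_dir}/{os.path.basename(host_path)}"
    return [
        "docker", "run",
        # Bind-mount the host file into the container, read-only.
        "-v", f"{host_path}:{container_path}:ro",
        # Point Application Default Credentials at the mounted file.
        "-e", f"GOOGLE_APPLICATION_CREDENTIALS={container_path}",
        image,
    ]


# Usage with a hypothetical image name:
args = docker_run_args("example/sdk-harness:latest", "/home/user/key.json")
```

As the description notes, this only helps when the file exists on the machine that starts the container, so it does not cover runners on remote endpoints.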