[
https://issues.apache.org/jira/browse/BEAM-14014?focusedWorklogId=768996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768996
]
ASF GitHub Bot logged work on BEAM-14014:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 11/May/22 11:01
Start Date: 11/May/22 11:01
Worklog Time Spent: 10m
Work Description: tvalentyn commented on code in PR #17244:
URL: https://github.com/apache/beam/pull/17244#discussion_r869700306
##########
sdks/python/apache_beam/examples/wordcount_it_test.py:
##########
@@ -47,6 +47,26 @@ class WordCountIT(unittest.TestCase):
def test_wordcount_it(self):
self._run_wordcount_it(wordcount.run)
+ @pytest.mark.it_postcommit
+ @pytest.mark.sickbay_direct
+ @pytest.mark.sickbay_spark
+ @pytest.mark.sickbay_flink
+ def test_wordcount_impersonation_it(self):
+ """Tests impersonation on dataflow."""
+ ACOUNT_TO_IMPERSONATE =
'[email protected]'
Review Comment:
nit: account
##########
sdks/python/apache_beam/examples/wordcount_it_test.py:
##########
@@ -47,6 +47,26 @@ class WordCountIT(unittest.TestCase):
def test_wordcount_it(self):
self._run_wordcount_it(wordcount.run)
+ @pytest.mark.it_postcommit
+ @pytest.mark.sickbay_direct
+ @pytest.mark.sickbay_spark
+ @pytest.mark.sickbay_flink
+ def test_wordcount_impersonation_it(self):
+ """Tests impersonation on dataflow."""
Review Comment:
Add Kenn's comment from the Java change explaining this setup.
##########
sdks/python/apache_beam/io/gcp/gcsfilesystem.py:
##########
@@ -38,6 +40,15 @@ class GCSFileSystem(FileSystem):
CHUNK_SIZE = gcsio.MAX_BATCH_OPERATION_SIZE # Chuck size in batch operations
GCS_PREFIX = 'gs://'
+ def __init__(self, pipeline_options):
+ super().__init__(pipeline_options)
+ if isinstance(pipeline_options, PipelineOptions):
+ gcs_options = pipeline_options.view_as(GoogleCloudOptions)
+ impersonate_service_account = gcs_options.impersonate_service_account
Review Comment:
+1 to Kenn's comment. This looks strange to me as well. Can we make pipeline
options a parameter of get_service_credentials() and adjust credential
generation and usages of this method accordingly? Given that:
- credential is a global singleton
- pipeline options are also global,
you could, clear the gcs_options.impersonate_service_account within the code
that creates the credential singleton.
##########
sdks/python/apache_beam/examples/wordcount_it_test.py:
##########
@@ -47,6 +47,26 @@ class WordCountIT(unittest.TestCase):
def test_wordcount_it(self):
self._run_wordcount_it(wordcount.run)
+ @pytest.mark.it_postcommit
+ @pytest.mark.sickbay_direct
+ @pytest.mark.sickbay_spark
+ @pytest.mark.sickbay_flink
+ def test_wordcount_impersonation_it(self):
+ """Tests impersonation on dataflow."""
+ ACOUNT_TO_IMPERSONATE = \
Review Comment:
Let's use implicit concatenation via ('literal1' 'literal2' ) or explicit
via + instead of string continuation tokens \
##########
sdks/python/apache_beam/internal/gcp/auth.py:
##########
@@ -125,30 +139,22 @@ def get_service_credentials(cls):
"socket default timeout is %s seconds.", socket.getdefaulttimeout())
cls._credentials = cls._get_service_credentials()
+ cls._add_impersonation_credentials()
cls._credentials_init = True
return cls._credentials
@staticmethod
- def _get_service_credentials():
+ def _get_service_credentials(cls):
if not _GOOGLE_AUTH_AVAILABLE:
_LOGGER.warning(
'Unable to find default credentials because the google-auth library '
'is not available. Install the gcp extra (apache_beam[gcp]) to use '
'Google default credentials. Connecting anonymously.')
return None
- client_scopes = [
- 'https://www.googleapis.com/auth/bigquery',
Review Comment:
Was a jira filed?
Issue Time Tracking
-------------------
Worklog Id: (was: 768996)
Time Spent: 4.5h (was: 4h 20m)
> Support impersonation credentials in Dataflow runner.
> -----------------------------------------------------
>
> Key: BEAM-14014
> URL: https://issues.apache.org/jira/browse/BEAM-14014
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Valentyn Tymofieiev
> Assignee: Ryan Thompson
> Priority: P2
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)