[
https://issues.apache.org/jira/browse/BEAM-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883456#comment-16883456
]
Udi Meiri commented on BEAM-2264:
---------------------------------
In https://issues.apache.org/jira/browse/BEAM-3990 we found out that the
storage client might not be thread-safe.
Perhaps, though, we can reuse the credentials instead of recreating them every
time.
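
A rough sketch of what reusing the credentials could look like, assuming the
credential object itself is safe to share even if the storage client is not.
`CachedCredentials`, `factory`, and `is_expired` are hypothetical names for
illustration; they are not part of Beam or oauth2client, and the fake factory
below only stands in for whatever call actually obtains the credential.

```python
import threading


class CachedCredentials:
    """Hypothetical helper, not part of Beam or oauth2client.

    Calls `factory` once and hands the same credential object to every
    caller until `is_expired` reports that a refresh is needed.
    """

    def __init__(self, factory, is_expired=lambda cred: False):
        self._factory = factory
        self._is_expired = is_expired
        self._lock = threading.Lock()
        self._cred = None

    def get(self):
        # The lock ensures only one thread creates (or refreshes) the credential.
        with self._lock:
            if self._cred is None or self._is_expired(self._cred):
                self._cred = self._factory()
            return self._cred


# Stand-in for the real credential lookup; records how often it is called.
calls = []


def fake_factory():
    calls.append(1)
    return object()


cache = CachedCredentials(fake_factory)

# Several threads ask for credentials, as separate GCS calls might.
threads = [threading.Thread(target=cache.get) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("factory calls:", len(calls))
```

Each thread could then build its own storage client from `cache.get()`, so the
client is never shared across threads while the credential is created only once.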
> Re-use credential instead of generating a new one on each GCS call
> ------------------------------------------------------------------
>
> Key: BEAM-2264
> URL: https://issues.apache.org/jira/browse/BEAM-2264
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Reporter: Luke Cwik
> Priority: Minor
> Time Spent: 50m
> Remaining Estimate: 0h
>
> We should cache the credential used within a Pipeline and re-use it instead
> of generating a new one on each GCS call. When executing (against 2.0.0 RC2):
> {code}
> python -m apache_beam.examples.wordcount --input "gs://dataflow-samples/shakespeare/*" --output local_counts
> {code}
> Note that we seemingly generate a new access token on each call instead of
> only when a refresh is required.
> {code}
> super(GcsIO, cls).__new__(cls, storage_client))
> INFO:root:Starting the size estimation of the input
> INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
> INFO:oauth2client.client:Refreshing access_token
> INFO:root:Finished the size estimation of the input at 1 files. Estimation took 0.286200046539 seconds
> INFO:root:Running pipeline with DirectRunner.
> INFO:root:Starting the size estimation of the input
> INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
> INFO:oauth2client.client:Refreshing access_token
> INFO:root:Finished the size estimation of the input at 43 files. Estimation took 0.205624818802 seconds
> INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
> INFO:oauth2client.client:Refreshing access_token
> INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
> INFO:oauth2client.client:Refreshing access_token
> INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
> INFO:oauth2client.client:Refreshing access_token
> INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
> INFO:oauth2client.client:Refreshing access_token
> INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
> ... many more times ...
> {code}