matthiasa4 opened a new issue, #29527:
URL: https://github.com/apache/beam/issues/29527
### What happened?
I have a very simple Beam Python pipeline that reads from Pub/Sub, applies a
transform, and writes to BigQuery. The transform includes a call to the
Translate API, which is causing trouble:
```
class TranslateMessage(beam.DoFn):
    def __init__(self, project_id):
        self.project_id = project_id

    def setup(self):
        self.translate_client = translate.TranslationServiceClient()

    def process(self, element):
        # location = "global"
        # parent = f"projects/{self.project_id}/locations/{location}"
        # language_detected = self.translate_client.detect_language(
        #     content=element.text,
        #     parent=parent).languages[0]
        # logging.info("Translating message: " + element.text)
        yield Log(element.timestamp,
                  element.text,
                  element.user_id,
                  'en',
                  1
                  )
```
works, but
```
class TranslateMessage(beam.DoFn):
    def __init__(self, project_id):
        self.project_id = project_id

    def setup(self):
        self.translate_client = translate.TranslationServiceClient()

    def process(self, element):
        location = "global"
        parent = f"projects/{self.project_id}/locations/{location}"
        language_detected = self.translate_client.detect_language(
            content=element.text,
            parent=parent).languages[0]
        logging.info("Translating message: " + element.text)
        yield Log(element.timestamp,
                  element.text,
                  element.user_id,
                  language_detected.language_code,
                  language_detected.confidence
                  )
```
causes the whole pipeline to grind to a halt. It starts up, but Pub/Sub
messages are never consumed, and none of the stages adds any elements to its
input or output collections. No logs are written and no errors are raised.
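
To check whether the `detect_language` call itself is what blocks, the call could be wrapped in a timeout so that a hang shows up as a logged error instead of a silent stall. The following is only a minimal sketch using the standard library, not a diagnosis of the underlying issue; `call_with_timeout` and `slow_call` are hypothetical names and are not part of Beam or the Translate client.

```python
import concurrent.futures
import logging
import time


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a worker thread.

    Raises concurrent.futures.TimeoutError if no result arrives
    within timeout_s seconds. (Hypothetical helper for debugging.)
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        logging.error("Call to %r did not return within %.1f s", fn, timeout_s)
        raise
    finally:
        # Do not block on a stuck worker thread. The thread is abandoned,
        # not killed, so this is a debugging aid rather than a real fix.
        executor.shutdown(wait=False)


def slow_call():
    """Stand-in for a blocking remote call (assumption, for illustration)."""
    time.sleep(0.3)
    return "late"
```

Inside `process`, the call would then read `call_with_timeout(self.translate_client.detect_language, 30, content=element.text, parent=parent)`, where the 30-second limit is an arbitrary choice. If the timeout fires, the error message should at least appear in the worker logs.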
Dependencies:
```
apache-beam[gcp]==2.52.0
google-cloud-translate==2.0.1
```
I have been able to work around the issue for now by using
`apache-beam[gcp]==2.44.0` and `--experiments=disable_runner_v2_until_2023`, but
I would like to know what the underlying issue is and how I can upgrade to
Runner V2.
### Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
### Issue Components
- [X] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [X] Component: Google Cloud Dataflow Runner