tvalentyn commented on code in PR #26220:
URL: https://github.com/apache/beam/pull/26220#discussion_r1175294168
##########
sdks/python/apache_beam/runners/worker/sdk_worker_main.py:
##########
@@ -173,32 +173,47 @@ def create_harness(environment, dry_run=False):
return fn_log_handler, sdk_harness, sdk_pipeline_options
+def _start_profiler(gcp_profiler_service_name, gcp_profiler_service_version):
+ try:
+ import googlecloudprofiler
+ if gcp_profiler_service_version:
+ googlecloudprofiler.start(
+ service=gcp_profiler_service_name,
+ service_version=gcp_profiler_service_version,
+ verbose=1)
+ _LOGGER.info('Turning on Google Cloud Profiler.')
+ else:
+ raise RuntimeError('Unable to find the job id from envvar.')
+ except Exception as e: # pylint: disable=broad-except
+ _LOGGER.warning(
+ 'Unable to start google cloud profiler due to error: %s. For how to '
+ 'enable Cloud Profiler with Dataflow see '
+ 'https://cloud.google.com/dataflow/docs/guides/profiling-a-pipeline.'
+ 'For troubleshooting tips with Cloud Profiler see '
+ 'https://cloud.google.com/profiler/docs/troubleshooting.' % e)
+
+
+def _get_gcp_profiler_name_if_enabled(sdk_pipeline_options):
Review Comment:
1. Noting that prior version of the code verified whether
os.environ["JOB_NAME"] , os.environ["JOB_ID"] variables were defined. Did you
intentionally drop these checks?
2. After taking a look `_get_gcp_profiler_name_if_enabled` it looks like we
could merge this logic into
`GoogleCloudOptions.get_cloud_profiler_service_name()` by adding something like:
```
def get_cloud_profiler_service_name(self):
# ...
# ...
# existing code
experiments = self.view_as(DebugOptions).experiments or []
... a few more checks
...
return None
```
##########
CHANGES.md:
##########
@@ -64,6 +64,8 @@
## New Features / Improvements
+* X feature added (Java/Python)
([#X](https://github.com/apache/beam/issues/X)).
Review Comment:
drop
##########
sdks/python/apache_beam/runners/worker/sdk_worker_main.py:
##########
@@ -173,32 +173,47 @@ def create_harness(environment, dry_run=False):
return fn_log_handler, sdk_harness, sdk_pipeline_options
+def _start_profiler(gcp_profiler_service_name, gcp_profiler_service_version):
+ try:
+ import googlecloudprofiler
+ if gcp_profiler_service_version:
+ googlecloudprofiler.start(
+ service=gcp_profiler_service_name,
+ service_version=gcp_profiler_service_version,
+ verbose=1)
+ _LOGGER.info('Turning on Google Cloud Profiler.')
+ else:
+ raise RuntimeError('Unable to find the job id from envvar.')
+ except Exception as e: # pylint: disable=broad-except
+ _LOGGER.warning(
+ 'Unable to start google cloud profiler due to error: %s. For how to '
+ 'enable Cloud Profiler with Dataflow see '
+ 'https://cloud.google.com/dataflow/docs/guides/profiling-a-pipeline.'
+ 'For troubleshooting tips with Cloud Profiler see '
+ 'https://cloud.google.com/profiler/docs/troubleshooting.' % e)
+
+
+def _get_gcp_profiler_name_if_enabled(sdk_pipeline_options):
+ experiments = (sdk_pipeline_options.view_as(DebugOptions).experiments or [])
+ gcp_profiler_service_name = sdk_pipeline_options.view_as(
+ GoogleCloudOptions).get_cloud_profiler_service_name()
+
+ if _ENABLE_GOOGLE_CLOUD_PROFILER in experiments and \
Review Comment:
note: another way to avoid line continuation token is to use parenthesis,
e.g.
```
if (_ENABLE_GOOGLE_CLOUD_PROFILER in experiments and
not gcp_profiler_service_name):
```
I personally prefer that, but not a strong opinion.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]