mrn-aglic commented on issue #31584:
URL: https://github.com/apache/airflow/issues/31584#issuecomment-1566640559
The triggerer logs show some exceptions around startup.
One of them is this:
```
___ /| |_ /__ ___/_ /_ __ /_ __ \_ | /| / /
___ ___ | / _ / _ __/ _ / / /_/ /_ |/ |/ /
_/_/ |_/_/ /_/ /_/ /_/ \____/____/|__/
[2023-05-29 06:34:07,387] {triggerer_job.py:101} INFO - Starting the
triggerer
[2023-05-29 06:35:31,729] {triggerer_job.py:344} ERROR - Triggerer's async
thread was blocked for 0.75 seconds, likely by a badly-written trigger. Set
PYTHONASYNCIODEBUG=1 to get more information on overrunning coroutines.
[2023-05-29 06:35:32,258] {bigquery.py:51} INFO - Using the connection
google_cloud_default .
[2023-05-29 06:35:32,733] {triggerer_job.py:359} INFO - Trigger
<airflow.providers.google.cloud.triggers.bigquery.BigQueryInsertJobTrigger
conn_id=google_cloud_default,
job_id=airflow_attribution_customer_journey_extract_journey_2023_05_28T00_00_00_00_00_87d3332f973a72556e32f66c9905c253,
dataset_id=None, project_id=martech-sandbox-df9923f6, table_id=None,
poll_interval=4.0> (ID 1) starting
[2023-05-29 06:35:32,734] {bigquery.py:3041} INFO - Executing
get_job_status...
[2023-05-29 06:35:32,736] {connection.py:426} ERROR - Unable to retrieve
connection from secrets backend (CustomCloudSecretManagerBackend). Checking
subsequent secrets backend.
Traceback (most recent call last):
File
"/usr/local/lib/python3.9/site-packages/airflow/models/connection.py", line
422, in get_connection_from_secrets
conn = secrets_backend.get_connection(conn_id=conn_id)
File
"/usr/local/airflow/astro_plugins/google/cloud/secrets/custom_secret_manager.py",
line 113, in get_connection
value = self.get_conn_value(conn_id=conn_id)
File
"/usr/local/lib/python3.9/site-packages/airflow/providers/google/cloud/secrets/secret_manager.py",
line 141, in get_conn_value
return self._get_secret(self.connections_prefix, conn_id)
File
"/usr/local/airflow/astro_plugins/google/cloud/secrets/custom_secret_manager.py",
line 106, in _get_secret
return self.client.get_secret(secret_id=secret_id,
project_id=self.project_id)
File
"/usr/local/airflow/astro_plugins/google/cloud/secrets/custom_secret_manager.py",
line 88, in client
return CustomSecretManagerClient(credentials=self.credentials)
AttributeError: 'CustomCloudSecretManagerBackend' object has no attribute
'credentials'
[2023-05-29 06:35:32,745] {base.py:71} INFO - Using connection ID
'google_cloud_default' for task execution.
[2023-05-29 06:35:32,752] {bigquery.py:92} INFO - Query is still running...
[2023-05-29 06:35:32,752] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
[2023-05-29 06:35:36,755] {bigquery.py:3041} INFO - Executing
get_job_status...
[2023-05-29 06:35:36,756] {bigquery.py:92} INFO - Query is still running...
[2023-05-29 06:35:36,756] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
[2023-05-29 06:35:40,758] {bigquery.py:3041} INFO - Executing
get_job_status...
```
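The `AttributeError` in the traceback above comes from the `client` property of the custom backend reading `self.credentials` when nothing has assigned it. The sketch below reproduces that pattern in isolation. All class names here are hypothetical stand-ins, not the Airflow or Astronomer classes, and the fix shown (assigning the attribute in `__init__`) is only one possible approach, assuming the parent class does not set it:

```python
class BaseBackend:
    """Hypothetical stand-in for a parent secrets backend."""

    def __init__(self, project_id=None):
        self.project_id = project_id
        # Note: no `credentials` attribute is assigned here.


class BrokenBackend(BaseBackend):
    """Mirrors the failing pattern: the property reads an unset attribute."""

    @property
    def client(self):
        # Raises AttributeError because `credentials` was never assigned.
        return ("client", self.credentials)


class GuardedBackend(BaseBackend):
    """One possible fix: the subclass assigns the attribute itself."""

    def __init__(self, project_id=None, credentials=None):
        super().__init__(project_id=project_id)
        self.credentials = credentials

    @property
    def client(self):
        return ("client", self.credentials)


try:
    BrokenBackend().client
except AttributeError as exc:
    error_message = str(exc)  # mentions the missing 'credentials' attribute

guarded_client = GuardedBackend(credentials="fake-creds").client
```

Whether this exception is related to the stuck trigger is not clear from the logs: the next line shows Airflow falling through to the subsequent secrets backend and resolving `google_cloud_default` anyway.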
The query has already finished, but the trigger apparently keeps polling for
the job state:
```
[2023-05-29 06:41:05,224] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
[2023-05-29 06:41:08,298] {triggerer_job.py:252} INFO - 1 triggers currently
running
[2023-05-29 06:41:09,226] {bigquery.py:3041} INFO - Executing
get_job_status...
[2023-05-29 06:41:09,227] {bigquery.py:92} INFO - Query is still running...
[2023-05-29 06:41:09,227] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
[2023-05-29 06:41:13,229] {bigquery.py:3041} INFO - Executing
get_job_status...
[2023-05-29 06:41:13,231] {bigquery.py:92} INFO - Query is still running...
[2023-05-29 06:41:13,231] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
[2023-05-29 06:41:17,232] {bigquery.py:3041} INFO - Executing
get_job_status...
[2023-05-29 06:41:17,234] {bigquery.py:92} INFO - Query is still running...
[2023-05-29 06:41:17,234] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
[2023-05-29 06:41:21,236] {bigquery.py:3041} INFO - Executing
get_job_status...
[2023-05-29 06:41:21,238] {bigquery.py:92} INFO - Query is still running...
[2023-05-29 06:41:21,238] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
[2023-05-29 06:41:25,240] {bigquery.py:3041} INFO - Executing
get_job_status...
[2023-05-29 06:41:25,242] {bigquery.py:92} INFO - Query is still running...
[2023-05-29 06:41:25,244] {bigquery.py:93} INFO - Sleeping for 4.0 seconds.
```
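In the logs above, each "Query is still running..." line follows "Executing get_job_status..." within 1 to 2 ms, which may suggest the status check returns without waiting on BigQuery, though that is a guess. The sketch below is not the provider's code. It is a minimal polling loop, with hypothetical names (`poll_job`, `make_status_source`, `always_pending`, `max_polls`), showing how a loop of this shape never ends if the status source keeps reporting `"pending"`, and how an upper bound on polls would surface that:

```python
import asyncio


async def poll_job(get_status, poll_interval=4.0, max_polls=None):
    """Poll get_status() until it returns something other than "pending".

    Returns (status, number_of_polls). If max_polls is set and reached,
    returns ("timeout", number_of_polls) instead of looping forever.
    """
    polls = 0
    while True:
        status = await get_status()
        polls += 1
        if status != "pending":
            return status, polls
        if max_polls is not None and polls >= max_polls:
            return "timeout", polls
        await asyncio.sleep(poll_interval)


def make_status_source(pending_polls):
    """Build a fake status coroutine: "pending" N times, then "success"."""
    remaining = pending_polls

    async def get_status():
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return "pending"
        return "success"

    return get_status


async def always_pending():
    """Fake status source that never reaches a terminal state."""
    return "pending"


# A job that finishes after two pending polls.
result_ok = asyncio.run(poll_job(make_status_source(2), poll_interval=0))

# A status source that never changes: only the poll cap ends the loop.
result_stuck = asyncio.run(poll_job(always_pending, poll_interval=0, max_polls=5))
```

Without `max_polls`, the second call would loop indefinitely, which matches the symptom in the logs even though the underlying cause here is not established.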
I can see that the query is finished as the table was modified:
<img width="374" alt="image"
src="https://github.com/apache/airflow/assets/2858231/7d6ea4f6-c1f1-4efe-b999-91e7d6ba9ddd">
The operator still shows as deferred in the UI.
These are the operator logs:
```
AIRFLOW_CTX_EXECUTION_DATE=2023-05-28T00:00:00+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2023-05-28T00:00:00+00:00
[2023-05-29, 06:35:28 UTC] {secret_manager_client.py:78} ERROR - Google
Cloud API Call Error (NotFound): Secret ID
airflow-connections-google_cloud_default not found.
[2023-05-29, 06:35:28 UTC] {service_account_auth.py:20} INFO -
requires_token - empty connection uri
[2023-05-29, 06:35:28 UTC] {base.py:71} INFO - Using connection ID
'google_cloud_default' for task execution.
[2023-05-29, 06:35:28 UTC] {bigquery.py:2675} INFO - Executing: {'query':
{'query': '<**OMITTED THE QUERY**>l', 'useLegacySql': False}}'
[2023-05-29, 06:35:28 UTC] {credentials_provider.py:325} INFO - Getting
connection using `google.auth.default()` since no key file is defined for hook.
[2023-05-29, 06:35:28 UTC] {logging_mixin.py:137} WARNING -
/usr/local/lib/python3.9/site-packages/google/auth/_default.py:83 UserWarning:
Your application has authenticated using end user credentials from Google Cloud
SDK without a quota project. You might receive a "quota exceeded" or "API not
enabled" error. We recommend you rerun `gcloud auth application-default login`
and make sure a quota project is added. Or you can use service accounts
instead. For more information about service accounts, see
https://cloud.google.com/docs/authentication/
[2023-05-29, 06:35:28 UTC] {_default.py:638} WARNING - No project ID could
be determined. Consider running `gcloud config set project` or setting the
GOOGLE_CLOUD_PROJECT environment variable
[2023-05-29, 06:35:28 UTC] {bigquery.py:1542} INFO - Inserting job
airflow_<**omitted-this-part**>_2023_05_28T00_00_00_00_00_87d3332f973a72556e32f66c9905c253
[2023-05-29, 06:35:29 UTC] {bigquery.py:51} INFO - Using the connection
google_cloud_default .
[2023-05-29, 06:35:29 UTC] {taskinstance.py:1465} INFO - Pausing task as
DEFERRED. dag_id=some_dag_id, task_id=extract_data,
execution_date=20230528T000000, start_date=20230529T063526
[2023-05-29, 06:35:29 UTC] {local_task_job.py:159} INFO - Task exited with
return code 0
[2023-05-29, 06:35:29 UTC] {taskinstance.py:2623} INFO - 0 downstream tasks
scheduled from follow-on schedule check
```
The operator completes normally when it is run without deferral.