mik-laj commented on issue #13343:
URL: https://github.com/apache/airflow/issues/13343#issuecomment-763236585
Today I discussed this bug with the Polidea team. @turbaszek
said it is already fixed on the main branch. I can confirm that the
problem no longer occurs and that the system tests pass. There are still
some problems with the StackdriverTaskHandler, but I already have a
patch ready and am testing it.
```
root@f375df8a2fd5:/opt/airflow# pytest tests/providers/google/cloud/log/test_gcs_task_handler_system.py --system google -s
============================= test session starts ==============================
platform linux -- Python 3.6.12, pytest-6.2.1, py-1.10.0, pluggy-0.13.1 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow, configfile: pytest.ini
plugins: forked-1.3.0, timeouts-1.2.1, xdist-2.2.0, cov-2.10.1, instafail-0.4.2, flaky-3.7.0, requests-mock-1.8.0, rerunfailures-9.1.1, celery-4.4.7
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 1 item
tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs
========================= AIRFLOW ==========================
Home of the user: /root
Airflow home /root/airflow
Skipping initializing of the DB as it was initialized already.
You can re-initialize the database by adding --with-db-init flag when
running tests.
[2021-01-20 00:17:42,536] {logging_command_executor.py:33} INFO - Executing:
'gcloud config set core/project polidea-airflow'
[2021-01-20 00:17:44,288] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:17:44,288] {logging_command_executor.py:45} INFO - Stderr:
Updated property [core/project].
[2021-01-20 00:17:44,288] {logging_command_executor.py:33} INFO - Executing:
'gcloud auth activate-service-account
--key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
[2021-01-20 00:17:46,231] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:17:46,231] {logging_command_executor.py:45} INFO - Stderr:
Activated service account credentials for:
[[email protected]]
[2021-01-20 00:17:46,232] {logging_command_executor.py:33} INFO - Executing:
'gsutil mb gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv'
[2021-01-20 00:17:51,253] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:17:51,254] {logging_command_executor.py:45} INFO - Stderr:
Creating gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/...
[2021-01-20 00:17:55,495] {__init__.py:38} INFO - Loaded API auth backend:
<module 'airflow.api.auth.backend.default' from
'/opt/airflow/airflow/api/auth/backend/default.py'>
[2021-01-20 00:17:58,331] {opentelemetry_tracing.py:29} INFO - This service
is instrumented using OpenTelemetry. OpenTelemetry could not be imported;
please add opentelemetry-api and opentelemetry-instrumentation packages in
order to get BigQuery Tracing data.
Created <DagRun example_complex @ 2021-01-20 00:18:00+00:00:
manual__2021-01-20T00:18:00+00:00, externally triggered: True>
____________ _____________
____ |__( )_________ __/__ /________ __
____ /| |_ /__ ___/_ /_ __ /_ __ \_ | /| / /
___ ___ | / _ / _ __/ _ / / /_/ /_ |/ |/ /
_/_/ |_/_/ /_/ /_/ /_/ \____/____/|__/
[2021-01-20 00:18:04,044] {scheduler_job.py:1239} INFO - Starting the
scheduler
[2021-01-20 00:18:04,044] {scheduler_job.py:1244} INFO - Processing each
file at most -1 times
[2021-01-20 00:18:04,232] {dag_processing.py:250} INFO - Launched
DagFileProcessorManager with pid: 31561
[2021-01-20 00:18:04,235] {scheduler_job.py:1748} INFO - Resetting orphaned
tasks for active dag runs
[2021-01-20 00:18:04,290] {settings.py:52} INFO - Configured default
timezone Timezone('UTC')
[2021-01-20 00:18:07,141] {opentelemetry_tracing.py:29} INFO - This service
is instrumented using OpenTelemetry. OpenTelemetry could not be imported;
please add opentelemetry-api and opentelemetry-instrumentation packages in
order to get BigQuery Tracing data.
[2021-01-20 00:18:08,796] {scheduler_job.py:936} INFO - 1 tasks up for
execution:
<TaskInstance: example_complex.create_entry_group 2021-01-20
00:18:00+00:00 [scheduled]>
[2021-01-20 00:18:08,800] {scheduler_job.py:970} INFO - Figuring out tasks
to run in Pool(name=default_pool) with 128 open slots and 1 task instances
ready to be queued
[2021-01-20 00:18:08,800] {scheduler_job.py:997} INFO - DAG example_complex
has 0/16 running and queued tasks
[2021-01-20 00:18:08,801] {scheduler_job.py:1058} INFO - Setting the
following tasks to queued state:
<TaskInstance: example_complex.create_entry_group 2021-01-20
00:18:00+00:00 [scheduled]>
[2021-01-20 00:18:08,809] {scheduler_job.py:1100} INFO - Sending
TaskInstanceKey(dag_id='example_complex', task_id='create_entry_group',
execution_date=datetime.datetime(2021, 1, 20, 0, 18, tzinfo=Timezone('UTC')),
try_number=1) to executor with priority 37 and queue default
[2021-01-20 00:18:08,809] {base_executor.py:79} INFO - Adding to queue:
['airflow', 'tasks', 'run', 'example_complex', 'create_entry_group',
'2021-01-20T00:18:00+00:00', '--local', '--pool', 'default_pool', '--subdir',
'/opt/airflow/airflow/example_dags/example_complex.py']
[2021-01-20 00:18:08,816] {local_executor.py:81} INFO - QueuedLocalWorker
running ['airflow', 'tasks', 'run', 'example_complex', 'create_entry_group',
'2021-01-20T00:18:00+00:00', '--local', '--pool', 'default_pool', '--subdir',
'/opt/airflow/airflow/example_dags/example_complex.py']
[2021-01-20 00:18:08,831] {scheduler_job.py:1401} INFO - Exiting scheduler
loop as requested number of runs (1 - got to 1) has been reached
[2021-01-20 00:18:08,832] {dag_processing.py:439} INFO - Sending termination
message to manager.
[2021-01-20 00:18:08,834] {scheduler_job.py:1282} INFO - Deactivating DAGs
that haven't been touched since 2021-01-20T00:18:04.234269+00:00
[2021-01-20 00:18:08,992] {dagbag.py:440} INFO - Filling up the DagBag from
/opt/airflow/airflow/example_dags/example_complex.py
[2021-01-20 00:18:11,716] {opentelemetry_tracing.py:29} INFO - This service
is instrumented using OpenTelemetry. OpenTelemetry could not be imported;
please add opentelemetry-api and opentelemetry-instrumentation packages in
order to get BigQuery Tracing data.
Running <TaskInstance: example_complex.create_entry_group
2021-01-20T00:18:00+00:00 [queued]> on host f375df8a2fd5
[2021-01-20 00:18:14,325] {gcs_task_handler.py:182} INFO - Previous log
discarded: 404 GET
https://storage.googleapis.com/download/storage/v1/b/airflow-gcs-task-handler-tests-hiucqpbantyszrxv/o/path%2Fto%2Flogs%2Fexample_complex%2Fcreate_entry_group%2F2021-01-20T00%3A18%3A00%2B00%3A00%2F1.log?alt=media:
No such object:
airflow-gcs-task-handler-tests-hiucqpbantyszrxv/path/to/logs/example_complex/create_entry_group/2021-01-20T00:18:00+00:00/1.log:
('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK:
200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
[2021-01-20 00:18:14,735] {process_utils.py:95} INFO - Sending
Signals.SIGTERM to GPID 31561
[2021-01-20 00:18:14,736] {scheduler_job.py:1293} INFO - Exited execute loop
[2021-01-20 00:18:15,245] {dagbag.py:440} INFO - Filling up the DagBag from
/opt/airflow/airflow/example_dags/example_complex.py
[2021-01-20 00:18:15,627] {logging_command_executor.py:33} INFO - Executing:
'gcloud config set core/project polidea-airflow'
[2021-01-20 00:18:17,463] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:17,464] {logging_command_executor.py:45} INFO - Stderr:
Updated property [core/project].
[2021-01-20 00:18:17,464] {logging_command_executor.py:33} INFO - Executing:
'gcloud auth activate-service-account
--key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
[2021-01-20 00:18:19,103] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:19,104] {logging_command_executor.py:45} INFO - Stderr:
Activated service account credentials for:
[[email protected]]
[2021-01-20 00:18:21,811] {opentelemetry_tracing.py:29} INFO - This service
is instrumented using OpenTelemetry. OpenTelemetry could not be imported;
please add opentelemetry-api and opentelemetry-instrumentation packages in
order to get BigQuery Tracing data.
PASSED[2021-01-20 00:18:24,225] {logging_command_executor.py:33} INFO -
Executing: 'gcloud config set core/project polidea-airflow'
[2021-01-20 00:18:25,949] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:25,951] {logging_command_executor.py:45} INFO - Stderr:
Updated property [core/project].
[2021-01-20 00:18:25,952] {logging_command_executor.py:33} INFO - Executing:
'gcloud auth activate-service-account
--key-file=/files/airflow-breeze-config/keys/gcp_gcs.json'
[2021-01-20 00:18:27,947] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:27,948] {logging_command_executor.py:45} INFO - Stderr:
Activated service account credentials for:
[[email protected]]
[2021-01-20 00:18:27,948] {logging_command_executor.py:33} INFO - Executing:
'gsutil -m rm -r gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv'
[2021-01-20 00:18:33,055] {logging_command_executor.py:44} INFO - Stdout:
[2021-01-20 00:18:33,056] {logging_command_executor.py:45} INFO - Stderr:
Removing
gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/path/to/logs/example_complex/create_entry_group/2021-01-20T00:18:00+00:00/1.log#1611101894698364...
/ [1/1 objects] 100% Done
Operation completed over 1 objects.
Removing gs://airflow-gcs-task-handler-tests-hiucqpbantyszrxv/...
=============================== warnings summary ===============================
tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs
/usr/local/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning:
numpy.ufunc size changed, may indicate binary incompatibility. Expected 192
from C header, got 216 from PyObject
return f(*args, **kwds)
tests/providers/google/cloud/log/test_gcs_task_handler_system.py::TestGCSTaskHandlerSystemTest::test_should_read_logs
/usr/local/lib/python3.6/site-packages/boto/plugin.py:40:
DeprecationWarning: the imp module is deprecated in favour of importlib; see
the module's documentation for alternative uses
import imp
-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================== 1 passed, 2 warnings in 52.09s ========================
root@f375df8a2fd5:/opt/airflow#
```
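For context, the `Previous log discarded: 404 ... No such object` INFO line in the log above is the expected first-attempt path of the GCS task handler: it first tries to download any existing remote log for the task instance and, when the object does not exist yet, starts a fresh one instead of failing. Here is a minimal sketch of that read-or-start-fresh pattern using an in-memory stand-in for the storage client; all names (`FakeBlob`, `write_log`, the blob path) are illustrative, not the actual `GCSTaskHandler` API:

```python
class NotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound."""


class FakeBlob:
    """Illustrative in-memory blob; the real handler talks to GCS."""

    def __init__(self, store, name):
        self.store, self.name = store, name

    def download_as_bytes(self):
        if self.name not in self.store:
            raise NotFound(f"No such object: {self.name}")
        return self.store[self.name]

    def upload_from_string(self, data):
        self.store[self.name] = data.encode()


def write_log(blob, new_log):
    """Append to the previous remote log if present; otherwise start fresh."""
    try:
        previous = blob.download_as_bytes().decode()
    except NotFound as exc:
        # First try of the task: no remote log exists yet, which mirrors
        # the "Previous log discarded: 404 ... No such object" INFO line.
        print(f"Previous log discarded: {exc}")
        previous = ""
    blob.upload_from_string(previous + new_log)


store = {}
blob = FakeBlob(store, "path/to/logs/example_complex/create_entry_group/1.log")
write_log(blob, "first attempt\n")   # prints the "discarded" message
write_log(blob, "second attempt\n")  # appends silently
```

So the 404 is informational rather than an error: subsequent attempts of the same task instance find the object and append to it.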
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]