iasoon opened a new issue #17507:
URL: https://github.com/apache/airflow/issues/17507
**Apache Airflow version**: 2.1.2
**Apache Airflow Provider versions** (please include all providers that are
relevant to your bug):
```
apache-airflow-providers-amazon==2.0.0
apache-airflow-providers-celery==2.0.0
apache-airflow-providers-cncf-kubernetes==2.0.0
apache-airflow-providers-docker==2.0.0
apache-airflow-providers-elasticsearch==2.0.2
apache-airflow-providers-ftp==2.0.0
apache-airflow-providers-google==4.0.0
apache-airflow-providers-grpc==2.0.0
apache-airflow-providers-hashicorp==2.0.0
apache-airflow-providers-http==2.0.0
apache-airflow-providers-imap==2.0.0
apache-airflow-providers-microsoft-azure==3.0.0
apache-airflow-providers-mysql==2.0.0
apache-airflow-providers-postgres==2.0.0
apache-airflow-providers-redis==2.0.0
apache-airflow-providers-sendgrid==2.0.0
apache-airflow-providers-sftp==2.0.0
apache-airflow-providers-slack==4.0.0
apache-airflow-providers-sqlite==2.0.0
apache-airflow-providers-ssh==2.0.0
```
**Environment**:
I'm using the airflow-2.1.2 container from dockerhub.
- **Cloud provider or hardware configuration**: AWS
- **OS** (e.g. from /etc/os-release): `Debian GNU/Linux 10 (buster)`
- **Kernel** (e.g. `uname -a`): `Linux fe52079d9ade 5.12.13-200.fc33.x86_64
#1 SMP Wed Jun 23 16:20:26 UTC 2021 x86_64 GNU/Linux`
**What happened**:
When using the EMRStepSensor (set to reschedule mode) to monitor EMR steps,
the task will sometimes fail while the EMR step sucessfully ran. Most of the
time the sensor will work fine, but every so often this issue occurs (on the
same DAG, without modifications).
<details>
<summary>EMRStepSensor task instance debug log</summary>
```
*** Reading local file:
/opt/airflow/logs/derived.adobe_target_catalog_sporza/watch_adobe_target_catalog_sporza_job_emr_step/2021-08-07T05:28:00+00:00/1.log
[2021-08-08 05:29:20,367] {__init__.py:51} DEBUG - Loading core task runner:
StandardTaskRunner
[2021-08-08 05:29:21,594] {base_task_runner.py:62} DEBUG - Planning to run
as the user
[2021-08-08 05:29:21,597] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> from DB
[2021-08-08 05:29:22,086] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:29:22,086] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED:
True, The task did not have depends_on_past set.
[2021-08-08 05:29:22,086] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance State' PASSED:
True, Task state queued was valid.
[2021-08-08 05:29:22,128] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED:
True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:29:22,128] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED:
True, The task instance was not marked for retrying.
[2021-08-08 05:29:22,129] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance Not Running'
PASSED: True, Task is not in running state.
[2021-08-08 05:29:22,130] {taskinstance.py:896} INFO - Dependencies all met
for <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:29:22,130] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED:
True, The task did not have depends_on_past set.
[2021-08-08 05:29:22,165] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED:
True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:29:22,206] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED:
True, The task instance was not marked for retrying.
[2021-08-08 05:29:22,327] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Pool Slots Available' PASSED:
True, ('There are enough open slots in %s to execute the task', 'default_pool')
[2021-08-08 05:29:22,327] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Concurrency' PASSED:
True, Task concurrency is not set.
[2021-08-08 05:29:22,327] {taskinstance.py:896} INFO - Dependencies all met
for <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:29:22,327] {taskinstance.py:1087} INFO -
--------------------------------------------------------------------------------
[2021-08-08 05:29:22,327] {taskinstance.py:1088} INFO - Starting attempt 1
of 1
[2021-08-08 05:29:22,327] {taskinstance.py:1089} INFO -
--------------------------------------------------------------------------------
[2021-08-08 05:29:24,417] {taskinstance.py:1107} INFO - Executing
<Task(EmrStepSensor): watch_adobe_target_catalog_sporza_job_emr_step> on
2021-08-07T05:28:00+00:00
[2021-08-08 05:29:24,421] {standard_task_runner.py:52} INFO - Started
process 7426 to run task
[2021-08-08 05:29:24,427] {standard_task_runner.py:76} INFO - Running:
['airflow', 'tasks', 'run', 'derived.adobe_target_catalog_sporza',
'watch_adobe_target_catalog_sporza_job_emr_step', '2021-08-07T05:28:00+00:00',
'--job-id', '98940', '--pool', 'default_pool', '--raw', '--subdir',
'DAGS_FOLDER/adobe_target_catalog_sporza_wf.py', '--cfg-path',
'/tmp/tmpucsqr5ep', '--error-file', '/tmp/tmpi1wh32jz']
[2021-08-08 05:29:24,427] {standard_task_runner.py:77} INFO - Job 98940:
Subtask watch_adobe_target_catalog_sporza_job_emr_step
[2021-08-08 05:29:24,428] {cli_action_loggers.py:66} DEBUG - Calling
callbacks: [<function default_action_log at 0x7fc8dfbe5d90>]
[2021-08-08 05:29:26,223] {settings.py:208} DEBUG - Setting up DB connection
pool (PID 7426)
[2021-08-08 05:29:26,223] {settings.py:244} DEBUG -
settings.prepare_engine_args(): Using NullPool
[2021-08-08 05:29:26,226] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [None]> from DB
[2021-08-08 05:29:27,030] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:27,523] {logging_mixin.py:104} INFO - Running
<TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> on host
ip-172-31-58-121.eu-west-1.compute.internal
[2021-08-08 05:29:27,523] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:28,218] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:28,871] {taskinstance.py:677} DEBUG - Clearing XCom data
[2021-08-08 05:29:29,515] {taskinstance.py:684} DEBUG - XCom data cleared
[2021-08-08 05:29:32,916] {taskinstance.py:1302} INFO - Exporting the
following env vars:
AIRFLOW_CTX_DAG_OWNER=vrt
AIRFLOW_CTX_DAG_ID=derived.adobe_target_catalog_sporza
AIRFLOW_CTX_TASK_ID=watch_adobe_target_catalog_sporza_job_emr_step
AIRFLOW_CTX_EXECUTION_DATE=2021-08-07T05:28:00+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-08-07T05:28:00+00:00
[2021-08-08 05:29:32,916] {__init__.py:146} DEBUG - Preparing lineage inlets
and outlets
[2021-08-08 05:29:32,916] {__init__.py:190} DEBUG - inlets: [], outlets: []
[2021-08-08 05:29:33,486] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:36,437] {base_aws.py:368} INFO - Airflow Connection:
aws_conn_id=aws_default
[2021-08-08 05:29:36,521] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:36,562] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:29:38,654] {base_aws.py:179} INFO - No credentials retrieved
from Connection
[2021-08-08 05:29:38,654] {base_aws.py:87} INFO - Creating session with
aws_access_key_id=None region_name=None
[2021-08-08 05:29:38,655] {hooks.py:417} DEBUG - Changing event name from
creating-client-class.iot-data to creating-client-class.iot-data-plane
[2021-08-08 05:29:38,656] {hooks.py:417} DEBUG - Changing event name from
before-call.apigateway to before-call.api-gateway
[2021-08-08 05:29:38,657] {hooks.py:417} DEBUG - Changing event name from
request-created.machinelearning.Predict to
request-created.machine-learning.Predict
[2021-08-08 05:29:38,659] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.autoscaling.CreateLaunchConfiguration to
before-parameter-build.auto-scaling.CreateLaunchConfiguration
[2021-08-08 05:29:38,659] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.route53 to before-parameter-build.route-53
[2021-08-08 05:29:38,659] {hooks.py:417} DEBUG - Changing event name from
request-created.cloudsearchdomain.Search to
request-created.cloudsearch-domain.Search
[2021-08-08 05:29:38,660] {hooks.py:417} DEBUG - Changing event name from
docs.*.autoscaling.CreateLaunchConfiguration.complete-section to
docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.logs.CreateExportTask to
before-parameter-build.cloudwatch-logs.CreateExportTask
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from
docs.*.logs.CreateExportTask.complete-section to
docs.*.cloudwatch-logs.CreateExportTask.complete-section
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.cloudsearchdomain.Search to
before-parameter-build.cloudsearch-domain.Search
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from
docs.*.cloudsearchdomain.Search.complete-section to
docs.*.cloudsearch-domain.Search.complete-section
[2021-08-08 05:29:38,664] {base_aws.py:157} INFO - role_arn is None
[2021-08-08 05:29:38,667] {utils.py:364} DEBUG - IMDS ENDPOINT:
http://169.254.169.254/
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: env
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: assume-role
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: assume-role-with-web-identity
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: sso
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: shared-credentials-file
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: custom-process
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: config-file
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: ec2-credentials-file
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: boto-config
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for
credentials via: container-role
[2021-08-08 05:29:38,671] {connectionpool.py:230} DEBUG - Starting new HTTP
connection (1): 169.254.170.2:80
[2021-08-08 05:29:38,673] {connectionpool.py:442} DEBUG -
http://169.254.170.2:80 "GET
/v2/credentials/c5f7099a-d46e-4472-a48f-6c137db9e75d HTTP/1.1" 200 1307
[2021-08-08 05:29:38,674] {loaders.py:174} DEBUG - Loading JSON file:
/home/airflow/.local/lib/python3.6/site-packages/botocore/data/endpoints.json
[2021-08-08 05:29:38,683] {hooks.py:210} DEBUG - Event choose-service-name:
calling handler <function handle_service_name_alias at 0x7fc8dbee8bf8>
[2021-08-08 05:29:38,691] {loaders.py:174} DEBUG - Loading JSON file:
/home/airflow/.local/lib/python3.6/site-packages/botocore/data/emr/2009-03-31/service-2.json
[2021-08-08 05:29:38,695] {hooks.py:210} DEBUG - Event
creating-client-class.emr: calling handler <function add_generate_presigned_url
at 0x7fc8dbf947b8>
[2021-08-08 05:29:38,697] {endpoint.py:292} DEBUG - Setting elasticmapreduce
timeout as (60, 60)
[2021-08-08 05:29:38,698] {loaders.py:174} DEBUG - Loading JSON file:
/home/airflow/.local/lib/python3.6/site-packages/botocore/data/_retry.json
[2021-08-08 05:29:38,698] {client.py:166} DEBUG - Registering retry handlers
for service: emr
[2021-08-08 05:29:38,699] {emr_step.py:73} INFO - Poking step s-581MKMYKRED7
on cluster j-XDTNJDHR23RQ
[2021-08-08 05:29:38,699] {hooks.py:210} DEBUG - Event
before-parameter-build.emr.DescribeStep: calling handler <function
generate_idempotent_uuid at 0x7fc8dbf06048>
[2021-08-08 05:29:38,699] {hooks.py:210} DEBUG - Event
before-call.emr.DescribeStep: calling handler <function
inject_api_version_header_if_needed at 0x7fc8dbf077b8>
[2021-08-08 05:29:38,699] {endpoint.py:101} DEBUG - Making request for
OperationModel(name=DescribeStep) with params: {'url_path': '/',
'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target':
'ElasticMapReduce.DescribeStep', 'Content-Type': 'application/x-amz-json-1.1',
'User-Agent': 'Boto3/1.17.107 Python/3.6.13 Linux/4.14.238-182.422.amzn2.x86_64
exec-env/AWS_ECS_FARGATE Botocore/1.20.107'}, 'body': b'{"ClusterId":
"j-XDTNJDHR23RQ", "StepId": "s-581MKMYKRED7"}', 'url':
'https://eu-west-1.elasticmapreduce.amazonaws.com/', 'context':
{'client_region': 'eu-west-1', 'client_config': <botocore.config.Config object
at 0x7fc8dba19470>, 'has_streaming_input': False, 'auth_type': None}}
[2021-08-08 05:29:38,700] {hooks.py:210} DEBUG - Event
request-created.emr.DescribeStep: calling handler <bound method
RequestSigner.handler of <botocore.signers.RequestSigner object at
0x7fc8dba19438>>
[2021-08-08 05:29:38,700] {hooks.py:210} DEBUG - Event
choose-signer.emr.DescribeStep: calling handler <function
set_operation_specific_signer at 0x7fc8dbf01ea0>
[2021-08-08 05:29:38,700] {auth.py:380} DEBUG - Calculating signature using
v4 auth.
[2021-08-08 05:29:38,700] {auth.py:381} DEBUG - CanonicalRequest:
POST
/
content-type:application/x-amz-json-1.1
host:eu-west-1.elasticmapreduce.amazonaws.com
x-amz-date:20210808T052938Z
x-amz-security-token:IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd
0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON
x-amz-target:ElasticMapReduce.DescribeStep
content-type;host;x-amz-date;x-amz-security-token;x-amz-target
2a131454435119560c6fa09b6b1eeeb557aba91e61af62fe4883ccc24890c4f3
[2021-08-08 05:29:38,700] {auth.py:383} DEBUG - StringToSign:
AWS4-HMAC-SHA256
20210808T052938Z
20210808/eu-west-1/elasticmapreduce/aws4_request
29f0d229d76daf02f334b8c7d193cb06de4619f4adb7b3057683f53b2ea79c80
[2021-08-08 05:29:38,700] {auth.py:385} DEBUG - Signature:
66e13f874221cb3c1b275a052ac56a7e827d431e940bfe9b14414104e0a3b162
[2021-08-08 05:29:38,701] {endpoint.py:187} DEBUG - Sending http request:
<AWSPreparedRequest stream_output=False, method=POST,
url=https://eu-west-1.elasticmapreduce.amazonaws.com/, headers={'X-Amz-Target':
b'ElasticMapReduce.DescribeStep', 'Content-Type':
b'application/x-amz-json-1.1', 'User-Agent': b'Boto3/1.17.107 Python/3.6.13
Linux/4.14.238-182.422.amzn2.x86_64 exec-env/AWS_ECS_FARGATE
Botocore/1.20.107', 'X-Amz-Date': b'20210808T052938Z', 'X-Amz-Security-Token':
b'IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+
JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON',
'Authorization': b'AWS4-HMAC-SHA256
Credential=ASIAUEESB34CT3URIQHE/20210808/eu-west-1/elasticmapreduce/aws4_request,
SignedHeaders=content-type;host;x-amz-date;x-amz-security-token;x-amz-target,
Signature=66e13f874221cb3c1b275a052ac56a7e827d431e940bfe9b14414104e0a3b162',
'Content-Length': '59'}>
[2021-08-08 05:29:38,702] {httpsession.py:50} DEBUG - Certificate path:
/home/airflow/.local/lib/python3.6/site-packages/certifi/cacert.pem
[2021-08-08 05:29:38,702] {connectionpool.py:943} DEBUG - Starting new HTTPS
connection (1): eu-west-1.elasticmapreduce.amazonaws.com:443
[2021-08-08 05:29:38,752] {connectionpool.py:442} DEBUG -
https://eu-west-1.elasticmapreduce.amazonaws.com:443 "POST / HTTP/1.1" 200 799
[2021-08-08 05:29:38,752] {parsers.py:233} DEBUG - Response headers:
{'x-amzn-RequestId': 'ecacd9e3-44b2-4a00-a4f0-48efe0d65847', 'Content-Type':
'application/x-amz-json-1.1', 'Content-Length': '799', 'Date': 'Sun, 08 Aug
2021 05:29:38 GMT'}
[2021-08-08 05:29:38,752] {parsers.py:234} DEBUG - Response body:
b'{"Step":{"ActionOnFailure":"CONTINUE","Config":{"Args":["spark-submit","--deploy-mode","cluster","--master","yarn","--driver-memory","2G","--executor-memory","8G","--num-executors","5","--executor-cores","3","--packages","io.delta:delta-core_2.11:0.6.1","--py-files","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/adobe_target_catalog_sporza-1.0.0-py3.7_6.egg,s3://vrt-datalake-artifacts-prod/python-datalake-helpers/vrt_datalake_helpers-21-py3-none-any.whl","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/run_6.py","prod","2021-08-08
05:28:00"],"Jar":"command-runner.jar","Properties":{}},"Id":"s-581MKMYKRED7","Name":"adobe-target-catalog-sporza-job","Status":{"State":"PENDING","StateChangeReason":{},"Timeline":{"CreationDateTime":1.628400523922E9}}}}'
[2021-08-08 05:29:38,753] {hooks.py:210} DEBUG - Event
needs-retry.emr.DescribeStep: calling handler
<botocore.retryhandler.RetryHandler object at 0x7fc8dba19860>
[2021-08-08 05:29:38,753] {retryhandler.py:187} DEBUG - No retry needed.
[2021-08-08 05:29:38,753] {emr_base.py:66} INFO - Job flow currently PENDING
[2021-08-08 05:29:38,753] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:39,718] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:39,748] {taskinstance.py:1906} DEBUG - Task Duration set
to 17.617928
[2021-08-08 05:29:42,695] {taskinstance.py:1484} INFO - Rescheduling task,
marking task as UP_FOR_RESCHEDULE
[2021-08-08 05:29:42,696] {cli_action_loggers.py:84} DEBUG - Calling
callbacks: []
[2021-08-08 05:29:44,944] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:48,268] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:29:48,674] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:29:48,674] {local_task_job.py:149} INFO - Task exited with
return code 0
[2021-08-08 05:29:48,675] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [up_for_reschedule]> from DB
[2021-08-08 05:29:49,740] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:30:49,437] {__init__.py:51} DEBUG - Loading core task runner:
StandardTaskRunner
[2021-08-08 05:30:51,569] {base_task_runner.py:62} DEBUG - Planning to run
as the user
[2021-08-08 05:30:51,572] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> from DB
[2021-08-08 05:30:52,465] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:30:52,465] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED:
True, The task did not have depends_on_past set.
[2021-08-08 05:30:52,465] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance State' PASSED:
True, Task state queued was valid.
[2021-08-08 05:30:52,544] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED:
True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:30:52,544] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED:
True, The task instance was not marked for retrying.
[2021-08-08 05:30:52,545] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance Not Running'
PASSED: True, Task is not in running state.
[2021-08-08 05:30:52,546] {taskinstance.py:896} INFO - Dependencies all met
for <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:30:52,546] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED:
True, The task did not have depends_on_past set.
[2021-08-08 05:30:52,593] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED:
True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:30:52,593] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED:
True, The task instance was not marked for retrying.
[2021-08-08 05:30:52,756] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Pool Slots Available' PASSED:
True, ('There are enough open slots in %s to execute the task', 'default_pool')
[2021-08-08 05:30:52,756] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Concurrency' PASSED:
True, Task concurrency is not set.
[2021-08-08 05:30:52,756] {taskinstance.py:896} INFO - Dependencies all met
for <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:30:52,756] {taskinstance.py:1087} INFO -
--------------------------------------------------------------------------------
[2021-08-08 05:30:52,756] {taskinstance.py:1088} INFO - Starting attempt 1
of 1
[2021-08-08 05:30:52,756] {taskinstance.py:1089} INFO -
--------------------------------------------------------------------------------
[2021-08-08 05:30:53,648] {taskinstance.py:1107} INFO - Executing
<Task(EmrStepSensor): watch_adobe_target_catalog_sporza_job_emr_step> on
2021-08-07T05:28:00+00:00
[2021-08-08 05:30:53,651] {standard_task_runner.py:52} INFO - Started
process 7972 to run task
[2021-08-08 05:30:53,655] {standard_task_runner.py:76} INFO - Running:
['airflow', 'tasks', 'run', 'derived.adobe_target_catalog_sporza',
'watch_adobe_target_catalog_sporza_job_emr_step', '2021-08-07T05:28:00+00:00',
'--job-id', '98953', '--pool', 'default_pool', '--raw', '--subdir',
'DAGS_FOLDER/adobe_target_catalog_sporza_wf.py', '--cfg-path',
'/tmp/tmppdcqbh0o', '--error-file', '/tmp/tmpdky_uy36']
[2021-08-08 05:30:53,655] {standard_task_runner.py:77} INFO - Job 98953:
Subtask watch_adobe_target_catalog_sporza_job_emr_step
[2021-08-08 05:30:53,656] {cli_action_loggers.py:66} DEBUG - Calling
callbacks: [<function default_action_log at 0x7fc8dfbe5d90>]
[2021-08-08 05:30:54,515] {settings.py:208} DEBUG - Setting up DB connection
pool (PID 7972)
[2021-08-08 05:30:54,515] {settings.py:244} DEBUG -
settings.prepare_engine_args(): Using NullPool
[2021-08-08 05:30:54,517] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [None]> from DB
[2021-08-08 05:30:57,090] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:30:58,979] {logging_mixin.py:104} INFO - Running
<TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> on host
ip-172-31-58-121.eu-west-1.compute.internal
[2021-08-08 05:30:58,979] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:30:59,588] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:03,931] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:04,181] {taskinstance.py:677} DEBUG - Clearing XCom data
[2021-08-08 05:31:04,914] {taskinstance.py:684} DEBUG - XCom data cleared
[2021-08-08 05:31:05,896] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:05,934] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:31:12,815] {taskinstance.py:1302} INFO - Exporting the
following env vars:
AIRFLOW_CTX_DAG_OWNER=vrt
AIRFLOW_CTX_DAG_ID=derived.adobe_target_catalog_sporza
AIRFLOW_CTX_TASK_ID=watch_adobe_target_catalog_sporza_job_emr_step
AIRFLOW_CTX_EXECUTION_DATE=2021-08-07T05:28:00+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-08-07T05:28:00+00:00
[2021-08-08 05:31:12,815] {__init__.py:146} DEBUG - Preparing lineage inlets
and outlets
[2021-08-08 05:31:12,816] {__init__.py:190} DEBUG - inlets: [], outlets: []
[2021-08-08 05:31:13,431] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:13,718] {base_aws.py:368} INFO - Airflow Connection:
aws_conn_id=aws_default
[2021-08-08 05:31:14,251] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:14,293] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:31:14,780] {base_aws.py:179} INFO - No credentials retrieved
from Connection
[2021-08-08 05:31:14,780] {base_aws.py:87} INFO - Creating session with
aws_access_key_id=None region_name=None
[2021-08-08 05:31:14,781] {hooks.py:417} DEBUG - Changing event name from
creating-client-class.iot-data to creating-client-class.iot-data-plane
[2021-08-08 05:31:14,783] {hooks.py:417} DEBUG - Changing event name from
before-call.apigateway to before-call.api-gateway
[2021-08-08 05:31:14,784] {hooks.py:417} DEBUG - Changing event name from
request-created.machinelearning.Predict to
request-created.machine-learning.Predict
[2021-08-08 05:31:14,786] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.autoscaling.CreateLaunchConfiguration to
before-parameter-build.auto-scaling.CreateLaunchConfiguration
[2021-08-08 05:31:14,786] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.route53 to before-parameter-build.route-53
[2021-08-08 05:31:14,786] {hooks.py:417} DEBUG - Changing event name from
request-created.cloudsearchdomain.Search to
request-created.cloudsearch-domain.Search
[2021-08-08 05:31:14,787] {hooks.py:417} DEBUG - Changing event name from
docs.*.autoscaling.CreateLaunchConfiguration.complete-section to
docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.logs.CreateExportTask to
before-parameter-build.cloudwatch-logs.CreateExportTask
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from
docs.*.logs.CreateExportTask.complete-section to
docs.*.cloudwatch-logs.CreateExportTask.complete-section
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from
before-parameter-build.cloudsearchdomain.Search to
before-parameter-build.cloudsearch-domain.Search
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from
docs.*.cloudsearchdomain.Search.complete-section to
docs.*.cloudsearch-domain.Search.complete-section
[2021-08-08 05:31:14,791] {base_aws.py:157} INFO - role_arn is None
[2021-08-08 05:31:14,794] {utils.py:364} DEBUG - IMDS ENDPOINT:
http://169.254.169.254/
[2021-08-08 05:31:14,796] {credentials.py:1974} DEBUG - Looking for
credentials via: env
[2021-08-08 05:31:14,796] {credentials.py:1974} DEBUG - Looking for
credentials via: assume-role
[2021-08-08 05:31:14,796] {credentials.py:1974} DEBUG - Looking for
credentials via: assume-role-with-web-identity
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for
credentials via: sso
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for
credentials via: shared-credentials-file
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for
credentials via: custom-process
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for
credentials via: config-file
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for
credentials via: ec2-credentials-file
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for
credentials via: boto-config
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for
credentials via: container-role
[2021-08-08 05:31:14,798] {connectionpool.py:230} DEBUG - Starting new HTTP
connection (1): 169.254.170.2:80
[2021-08-08 05:31:14,799] {connectionpool.py:442} DEBUG -
http://169.254.170.2:80 "GET
/v2/credentials/c5f7099a-d46e-4472-a48f-6c137db9e75d HTTP/1.1" 200 1307
[2021-08-08 05:31:14,800] {loaders.py:174} DEBUG - Loading JSON file:
/home/airflow/.local/lib/python3.6/site-packages/botocore/data/endpoints.json
[2021-08-08 05:31:14,810] {hooks.py:210} DEBUG - Event choose-service-name:
calling handler <function handle_service_name_alias at 0x7fc8dbee9bf8>
[2021-08-08 05:31:14,817] {loaders.py:174} DEBUG - Loading JSON file:
/home/airflow/.local/lib/python3.6/site-packages/botocore/data/emr/2009-03-31/service-2.json
[2021-08-08 05:31:14,822] {hooks.py:210} DEBUG - Event
creating-client-class.emr: calling handler <function add_generate_presigned_url
at 0x7fc8dbf957b8>
[2021-08-08 05:31:14,824] {endpoint.py:292} DEBUG - Setting elasticmapreduce
timeout as (60, 60)
[2021-08-08 05:31:14,825] {loaders.py:174} DEBUG - Loading JSON file:
/home/airflow/.local/lib/python3.6/site-packages/botocore/data/_retry.json
[2021-08-08 05:31:14,825] {client.py:166} DEBUG - Registering retry handlers
for service: emr
[2021-08-08 05:31:14,826] {emr_step.py:73} INFO - Poking step s-581MKMYKRED7
on cluster j-XDTNJDHR23RQ
[2021-08-08 05:31:14,826] {hooks.py:210} DEBUG - Event
before-parameter-build.emr.DescribeStep: calling handler <function
generate_idempotent_uuid at 0x7fc8dbf07048>
[2021-08-08 05:31:14,826] {hooks.py:210} DEBUG - Event
before-call.emr.DescribeStep: calling handler <function
inject_api_version_header_if_needed at 0x7fc8dbf087b8>
[2021-08-08 05:31:14,826] {endpoint.py:101} DEBUG - Making request for
OperationModel(name=DescribeStep) with params: {'url_path': '/',
'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target':
'ElasticMapReduce.DescribeStep', 'Content-Type': 'application/x-amz-json-1.1',
'User-Agent': 'Boto3/1.17.107 Python/3.6.13 Linux/4.14.238-182.422.amzn2.x86_64
exec-env/AWS_ECS_FARGATE Botocore/1.20.107'}, 'body': b'{"ClusterId":
"j-XDTNJDHR23RQ", "StepId": "s-581MKMYKRED7"}', 'url':
'https://eu-west-1.elasticmapreduce.amazonaws.com/', 'context':
{'client_region': 'eu-west-1', 'client_config': <botocore.config.Config object
at 0x7fc8dba0feb8>, 'has_streaming_input': False, 'auth_type': None}}
[2021-08-08 05:31:14,826] {hooks.py:210} DEBUG - Event
request-created.emr.DescribeStep: calling handler <bound method
RequestSigner.handler of <botocore.signers.RequestSigner object at
0x7fc8dba0fe80>>
[2021-08-08 05:31:14,827] {hooks.py:210} DEBUG - Event
choose-signer.emr.DescribeStep: calling handler <function
set_operation_specific_signer at 0x7fc8dbf01ea0>
[2021-08-08 05:31:14,827] {auth.py:380} DEBUG - Calculating signature using
v4 auth.
[2021-08-08 05:31:14,827] {auth.py:381} DEBUG - CanonicalRequest:
POST
/
content-type:application/x-amz-json-1.1
host:eu-west-1.elasticmapreduce.amazonaws.com
x-amz-date:20210808T053114Z
x-amz-security-token:IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd
0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON
x-amz-target:ElasticMapReduce.DescribeStep
content-type;host;x-amz-date;x-amz-security-token;x-amz-target
2a131454435119560c6fa09b6b1eeeb557aba91e61af62fe4883ccc24890c4f3
[2021-08-08 05:31:14,827] {auth.py:383} DEBUG - StringToSign:
AWS4-HMAC-SHA256
20210808T053114Z
20210808/eu-west-1/elasticmapreduce/aws4_request
a2f4ed4536a8bb05d35fc345f45eec9b42f7e071c230286854f034ea4878ced2
[2021-08-08 05:31:14,827] {auth.py:385} DEBUG - Signature:
90da79c86917ea04d8aed50abba5b1b3d5152e5327941eda3bf485e6af620e6b
[2021-08-08 05:31:14,827] {endpoint.py:187} DEBUG - Sending http request:
<AWSPreparedRequest stream_output=False, method=POST,
url=https://eu-west-1.elasticmapreduce.amazonaws.com/, headers={'X-Amz-Target':
b'ElasticMapReduce.DescribeStep', 'Content-Type':
b'application/x-amz-json-1.1', 'User-Agent': b'Boto3/1.17.107 Python/3.6.13
Linux/4.14.238-182.422.amzn2.x86_64 exec-env/AWS_ECS_FARGATE
Botocore/1.20.107', 'X-Amz-Date': b'20210808T053114Z', 'X-Amz-Security-Token':
b'IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+
JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON',
'Authorization': b'AWS4-HMAC-SHA256
Credential=ASIAUEESB34CT3URIQHE/20210808/eu-west-1/elasticmapreduce/aws4_request,
SignedHeaders=content-type;host;x-amz-date;x-amz-security-token;x-amz-target,
Signature=90da79c86917ea04d8aed50abba5b1b3d5152e5327941eda3bf485e6af620e6b',
'Content-Length': '59'}>
[2021-08-08 05:31:14,828] {httpsession.py:50} DEBUG - Certificate path:
/home/airflow/.local/lib/python3.6/site-packages/certifi/cacert.pem
[2021-08-08 05:31:14,828] {connectionpool.py:943} DEBUG - Starting new HTTPS
connection (1): eu-west-1.elasticmapreduce.amazonaws.com:443
[2021-08-08 05:31:15,021] {connectionpool.py:442} DEBUG -
https://eu-west-1.elasticmapreduce.amazonaws.com:443 "POST / HTTP/1.1" 200 799
[2021-08-08 05:31:15,021] {parsers.py:233} DEBUG - Response headers:
{'x-amzn-RequestId': '12e9220a-4dc3-4ca0-9ec9-93a529032f9b', 'Content-Type':
'application/x-amz-json-1.1', 'Content-Length': '799', 'Date': 'Sun, 08 Aug
2021 05:31:14 GMT'}
[2021-08-08 05:31:15,021] {parsers.py:234} DEBUG - Response body:
b'{"Step":{"ActionOnFailure":"CONTINUE","Config":{"Args":["spark-submit","--deploy-mode","cluster","--master","yarn","--driver-memory","2G","--executor-memory","8G","--num-executors","5","--executor-cores","3","--packages","io.delta:delta-core_2.11:0.6.1","--py-files","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/adobe_target_catalog_sporza-1.0.0-py3.7_6.egg,s3://vrt-datalake-artifacts-prod/python-datalake-helpers/vrt_datalake_helpers-21-py3-none-any.whl","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/run_6.py","prod","2021-08-08
05:28:00"],"Jar":"command-runner.jar","Properties":{}},"Id":"s-581MKMYKRED7","Name":"adobe-target-catalog-sporza-job","Status":{"State":"PENDING","StateChangeReason":{},"Timeline":{"CreationDateTime":1.628400523922E9}}}}'
[2021-08-08 05:31:15,022] {hooks.py:210} DEBUG - Event
needs-retry.emr.DescribeStep: calling handler
<botocore.retryhandler.RetryHandler object at 0x7fc8db9b82e8>
[2021-08-08 05:31:15,022] {retryhandler.py:187} DEBUG - No retry needed.
[2021-08-08 05:31:15,023] {emr_base.py:66} INFO - Job flow currently PENDING
[2021-08-08 05:31:15,023] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:17,730] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:17,771] {taskinstance.py:1906} DEBUG - Task Duration set
to 25.225634
[2021-08-08 05:31:21,580] {taskinstance.py:1484} INFO - Rescheduling task,
marking task as UP_FOR_RESCHEDULE
[2021-08-08 05:31:21,581] {cli_action_loggers.py:84} DEBUG - Calling
callbacks: []
[2021-08-08 05:31:24,405] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:26,452] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:31:26,531] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:31:26,531] {local_task_job.py:149} INFO - Task exited with
return code 0
[2021-08-08 05:31:26,531] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [up_for_reschedule]> from DB
[2021-08-08 05:31:27,269] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:32:35,215] {__init__.py:51} DEBUG - Loading core task runner:
StandardTaskRunner
[2021-08-08 05:32:36,815] {base_task_runner.py:62} DEBUG - Planning to run
as the user
[2021-08-08 05:32:36,817] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> from DB
[2021-08-08 05:32:38,537] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:32:38,537] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED:
True, The task did not have depends_on_past set.
[2021-08-08 05:32:38,538] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance State' PASSED:
True, Task state queued was valid.
[2021-08-08 05:32:38,578] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED:
True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:32:38,578] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED:
True, The task instance was not marked for retrying.
[2021-08-08 05:32:38,579] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance Not Running'
PASSED: True, Task is not in running state.
[2021-08-08 05:32:38,579] {taskinstance.py:896} INFO - Dependencies all met
for <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:32:38,579] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED:
True, The task did not have depends_on_past set.
[2021-08-08 05:32:39,309] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED:
True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:32:39,310] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED:
True, The task instance was not marked for retrying.
[2021-08-08 05:32:39,477] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Pool Slots Available' PASSED:
True, ('There are enough open slots in %s to execute the task', 'default_pool')
[2021-08-08 05:32:39,477] {taskinstance.py:911} DEBUG - <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Concurrency' PASSED:
True, Task concurrency is not set.
[2021-08-08 05:32:39,477] {taskinstance.py:896} INFO - Dependencies all met
for <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:32:39,477] {taskinstance.py:1087} INFO -
--------------------------------------------------------------------------------
[2021-08-08 05:32:39,477] {taskinstance.py:1088} INFO - Starting attempt 1
of 1
[2021-08-08 05:32:39,477] {taskinstance.py:1089} INFO -
--------------------------------------------------------------------------------
[2021-08-08 05:32:43,976] {taskinstance.py:1107} INFO - Executing
<Task(EmrStepSensor): watch_adobe_target_catalog_sporza_job_emr_step> on
2021-08-07T05:28:00+00:00
[2021-08-08 05:32:43,979] {standard_task_runner.py:52} INFO - Started
process 8339 to run task
[2021-08-08 05:32:43,984] {standard_task_runner.py:76} INFO - Running:
['airflow', 'tasks', 'run', 'derived.adobe_target_catalog_sporza',
'watch_adobe_target_catalog_sporza_job_emr_step', '2021-08-07T05:28:00+00:00',
'--job-id', '98962', '--pool', 'default_pool', '--raw', '--subdir',
'DAGS_FOLDER/adobe_target_catalog_sporza_wf.py', '--cfg-path',
'/tmp/tmpbqux_rdl', '--error-file', '/tmp/tmpyb9gdojg']
[2021-08-08 05:32:43,984] {standard_task_runner.py:77} INFO - Job 98962:
Subtask watch_adobe_target_catalog_sporza_job_emr_step
[2021-08-08 05:32:43,985] {cli_action_loggers.py:66} DEBUG - Calling
callbacks: [<function default_action_log at 0x7fc8dfbe5d90>]
[2021-08-08 05:32:51,719] {settings.py:208} DEBUG - Setting up DB connection
pool (PID 8339)
[2021-08-08 05:32:51,720] {settings.py:244} DEBUG -
settings.prepare_engine_args(): Using NullPool
[2021-08-08 05:32:51,722] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [None]> from DB
[2021-08-08 05:32:53,315] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:32:54,868] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:32:55,812] {logging_mixin.py:104} INFO - Running
<TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> on host
ip-172-31-58-121.eu-west-1.compute.internal
[2021-08-08 05:32:55,813] {taskinstance.py:614} DEBUG - Refreshing
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:32:57,199] {taskinstance.py:649} DEBUG - Refreshed
TaskInstance <TaskInstance:
derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step
2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:32:57,203] {local_task_job.py:194} WARNING - Recorded pid
7972 does not match the current pid 8339
[2021-08-08 05:32:57,206] {process_utils.py:100} INFO - Sending
Signals.SIGTERM to GPID 8339
[2021-08-08 05:32:57,210] {process_utils.py:66} INFO - Process
psutil.Process(pid=8339, status='terminated', exitcode=<Negsignal.SIGTERM:
-15>, started='05:32:43') (8339) terminated with exit code Negsignal.SIGTERM
```
</details>
**What you expected to happen**:
I'd expect the EMRStepSensor to run until the EMR step succeeded, and report
a succesful run.
If my understanding is correct, these final lines in the log show the runner
terminating the task process. If I'm reading the log correctly, 8339 is the
correct PID for the task, and the recorded pid 7972 is the pid for a previous
run. Could it be possible that this pid is not correctly being updated?
```
[2021-08-08 05:32:57,203] {local_task_job.py:194} WARNING - Recorded pid
7972 does not match the current pid 8339
[2021-08-08 05:32:57,206] {process_utils.py:100} INFO - Sending
Signals.SIGTERM to GPID 8339
[2021-08-08 05:32:57,210] {process_utils.py:66} INFO - Process
psutil.Process(pid=8339, status='terminated', exitcode=<Negsignal.SIGTERM:
-15>, started='05:32:43') (8339) terminated with exit code Negsignal.SIGTERM
```
**Anything else we need to know**:
The symptoms look very similar to #17394, but I'm not using run_as_user, and
the reported pids are not the same, so I'm not sure whether this is the same
issue.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]