[ 
https://issues.apache.org/jira/browse/AIRFLOW-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-6672:
--------------------------------
    Affects Version/s:     (was: 1.10.7)
                       2.0.0

> AWS DataSync - better logging of error message
> ----------------------------------------------
>
>                 Key: AIRFLOW-6672
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6672
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: aws
>    Affects Versions: 2.0.0
>            Reporter: Bjorn Olsen
>            Assignee: Bjorn Olsen
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> When the AWS DataSync operator fails, it dumps a TaskDescription to the log. 
> The TaskDescription is in JSON format and contains many elements, which makes 
> it hard to see exactly what went wrong.
> Example 1:
> [2020-01-28 17:44:39,495] {datasync.py:354} INFO - 
> task_execution_description={"TaskExecutionArn": 
> "arn:aws:datasync:***:***:task/task-***/execution/exec-***", "Status": 
> "ERROR", "Options": {"VerifyMode": "ONLY_FILES_TRANSFERRED", "OverwriteMode": 
> "ALWAYS", "Atime": "BEST_EFFORT", "Mtime": "PRESERVE", "Uid": "INT_VALUE", 
> "Gid": "INT_VALUE", "PreserveDeletedFiles": "PRESERVE", "PreserveDevices": 
> "NONE", "PosixPermissions": "PRESERVE", "BytesPerSecond": -1, "TaskQueueing": 
> "ENABLED"}, "Excludes": [], "Includes": [{"FilterType": "SIMPLE_PATTERN", 
> "Value": "***"}], "StartTime": datetime.datetime(2020, 1, 28, 17, 36, 2, 
> 816000, tzinfo=tzlocal()), "EstimatedFilesToTransfer": 7, 
> "EstimatedBytesToTransfer": 4534925, "FilesTransferred": 7, "BytesWritten": 
> 4534925, "BytesTransferred": 4534925, "Result": {"PrepareDuration": 9795, 
> "PrepareStatus": "SUCCESS", "TotalDuration": 351660, "TransferDuration": 
> 338568, "TransferStatus": "SUCCESS", "VerifyDuration": 7006, "VerifyStatus": 
> "ERROR", "ErrorCode": "OpNotSupp", "ErrorDetail": "Operation not supported"}, 
> "ResponseMetadata": {"RequestId": "***", "HTTPStatusCode": 200, 
> "HTTPHeaders": {"date": "Tue, 28 Jan 2020 15:44:39 GMT", "content-type": 
> "application/x-amz-json-1.1", "content-length": "994", "connection": 
> "keep-alive", "x-amzn-requestid": "***"}, "RetryAttempts": 0}}
> Example 2:
> [2020-01-28 18:23:23,322] {datasync.py:354} INFO - 
> task_execution_description={"TaskExecutionArn": 
> "arn:aws:datasync:***:***:task/task-***/execution/exec-***", "Status": 
> "ERROR", "Options": {"VerifyMode": "ONLY_FILES_TRANSFERRED", "OverwriteMode": 
> "ALWAYS", "Atime": "BEST_EFFORT", "Mtime": "PRESERVE", "Uid": "INT_VALUE", 
> "Gid": "INT_VALUE", "PreserveDeletedFiles": "PRESERVE", "PreserveDevices": 
> "NONE", "PosixPermissions": "PRESERVE", "BytesPerSecond": -1, "TaskQueueing": 
> "ENABLED"}, "Excludes": [], "Includes": [{"FilterType": "SIMPLE_PATTERN", 
> "Value": "***"}], "StartTime": datetime.datetime(2020, 1, 28, 17, 45, 57, 
> 212000, tzinfo=tzlocal()), "EstimatedFilesToTransfer": 0, 
> "EstimatedBytesToTransfer": 0, "FilesTransferred": 0, "BytesWritten": 0, 
> "BytesTransferred": 0, "Result": {"PrepareDuration": 16687, "PrepareStatus": 
> "SUCCESS", "TotalDuration": 2083467, "TransferDuration": 2065744, 
> "TransferStatus": "ERROR", "VerifyDuration": 5251, "VerifyStatus": "SUCCESS", 
> "ErrorCode": "SockTlsHandshakeFailure", "ErrorDetail": "DataSync agent ran 
> into an error connecting to AWS.Please review the DataSync network 
> requirements and ensure required endpoints are accessible from the agent. 
> Please contact AWS support if the error persists."}, "ResponseMetadata": 
> {"RequestId": "***", "HTTPStatusCode": 200, "HTTPHeaders": {"date": "Tue, 28 
> Jan 2020 16:23:23 GMT", "content-type": "application/x-amz-json-1.1", 
> "content-length": "1179", "connection": "keep-alive", "x-amzn-requestid": 
> "***"}, "RetryAttempts": 0}}
>  
> Note that the 'Result' element contains the statuses and errors that are of 
> interest, but these are currently hard to spot in the log.
> Example of the 'Result' element for a successful execution:
> 'Result': {'PrepareDuration': 9663, 'PrepareStatus': 'SUCCESS', 
> 'TotalDuration': 352095, 'TransferDuration': 338358, 'TransferStatus': 
> 'SUCCESS', 'VerifyDuration': 7171, 'VerifyStatus': 'SUCCESS'},
> Suggested output: keep the existing line(s) and also add:
> [2020-01-28 17:44:39,495] {datasync.py:354} INFO/ERROR - Status=SUCCESS/ERROR
> [2020-01-28 17:44:39,495] {datasync.py:354} INFO/ERROR - PrepareStatus=SUCCESS/ERROR PrepareDuration=9795
> [2020-01-28 17:44:39,495] {datasync.py:354} INFO/ERROR - TransferStatus=SUCCESS/ERROR TransferDuration=9795
> [2020-01-28 17:44:39,495] {datasync.py:354} INFO/ERROR - VerifyStatus=SUCCESS/ERROR VerifyDuration=9795
> [2020-01-28 17:44:39,495] {datasync.py:354} ERROR - ErrorCode=OpNotSupp, ErrorDetail=Operation not supported
>  
> This should make it much clearer what the job status and errors are.
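
A minimal sketch of how the suggested summary lines could be produced, assuming the task execution description is available as a plain dict shaped like the examples in the description. The function names `summarize_task_execution` and `log_task_execution` are hypothetical and are not taken from the Airflow code base; this only illustrates the idea and is not the actual fix.

```python
import logging

log = logging.getLogger(__name__)

# Phases that appear in the 'Result' element of the examples in the description.
PHASES = ("Prepare", "Transfer", "Verify")


def _level_for(status):
    """Use ERROR for a failed status and INFO for anything else."""
    return logging.ERROR if status == "ERROR" else logging.INFO


def summarize_task_execution(description):
    """Return (level, message) pairs summarizing a task execution description.

    Hypothetical helper: `description` is assumed to be a dict with the
    'Status' and 'Result' keys shown in the log examples.
    """
    lines = []
    status = description.get("Status")
    lines.append((_level_for(status), f"Status={status}"))

    result = description.get("Result", {})
    for phase in PHASES:
        phase_status = result.get(f"{phase}Status")
        if phase_status is None:
            # The phase has not reported a status, so there is nothing to log.
            continue
        duration = result.get(f"{phase}Duration")
        lines.append(
            (
                _level_for(phase_status),
                f"{phase}Status={phase_status} {phase}Duration={duration}",
            )
        )

    # The error fields are only present in the failed examples.
    if "ErrorCode" in result or "ErrorDetail" in result:
        lines.append(
            (
                logging.ERROR,
                f"ErrorCode={result.get('ErrorCode')}, "
                f"ErrorDetail={result.get('ErrorDetail')}",
            )
        )
    return lines


def log_task_execution(description, logger=log):
    """Emit one log record per summary line at the matching level."""
    for level, message in summarize_task_execution(description):
        logger.log(level, message)
```

Building the messages separately from emitting them keeps the formatting easy to check without a logging handler. The full description could still be logged first, as the issue suggests.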



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
