hubert-pietron opened a new pull request #21540:
URL: https://github.com/apache/airflow/pull/21540


   closes: #16295 
   closes: #18482
   
   JDBC operator could not log errors when task failed. For specific type of 
error (jpype.java.sql.SQLException, jaydebeapi.DatabaseError, 
jaydebeapi.InterfaceError) this line 
https://github.com/apache/airflow/blob/0a2d0d1ecbb7a72677f96bc17117799ab40853e0/airflow/models/taskinstance.py#L1733
   would produce AttributeError
   ```
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING - Traceback (most 
recent call last):
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/opt/***/***/models/taskinstance.py", line 1726, in handle_failure
       self.log.error("Task failed with exception", 
exc_info=error.__traceback__)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/usr/local/lib/python3.8/logging/__init__.py", line 1475, in error
       self._log(ERROR, msg, args, **kwargs)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/usr/local/lib/python3.8/logging/__init__.py", line 1589, in _log
       self.handle(record)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/usr/local/lib/python3.8/logging/__init__.py", line 1599, in handle
       self.callHandlers(record)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/usr/local/lib/python3.8/logging/__init__.py", line 1661, in callHandlers
       hdlr.handle(record)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/usr/local/lib/python3.8/logging/__init__.py", line 950, in handle
       rv = self.filter(record)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/usr/local/lib/python3.8/logging/__init__.py", line 811, in filter
       result = f.filter(record)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/opt/***/***/utils/log/secrets_masker.py", line 169, in filter
       self._redact_exception_with_context(exc)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING -   File 
"/opt/***/***/utils/log/secrets_masker.py", line 150, in 
_redact_exception_with_context
       exception.args = (self.redact(v) for v in exception.args)
   [2022-02-12, 09:27:16 UTC] {logging_mixin.py:115} WARNING - AttributeError: 
can't set attribute
   ``` 
   Catching that error allow to see meaningful log when JDBC task fails.
   
   example DAG
   ```
   from datetime import datetime, timedelta
   from airflow import DAG
   from airflow.providers.jdbc.operators.jdbc import JdbcOperator
   
   with DAG(
       dag_id='jdbc_with_error',
       schedule_interval='0 0 * * *',
       start_date=datetime(2021, 1, 1),
       dagrun_timeout=timedelta(minutes=60),
       tags=['example'],
       catchup=False,
   ) as dag:
   
       start = JdbcOperator(
           task_id='test_task',
           sql='select 1 x y z',
           jdbc_conn_id='my_jdbc_connection',
           autocommit=True,
       )
   ```
   log after change
   ```
   *** Reading local file: 
/root/airflow/logs/jdbc_with_error/test_task/2022-02-12T15:38:46.534291+00:00/1.log
   [2022-02-12, 15:38:47 UTC] {taskinstance.py:1052} INFO - Dependencies all 
met for <TaskInstance: jdbc_with_error.test_task 
manual__2022-02-12T15:38:46.534291+00:00 [queued]>
   [2022-02-12, 15:38:47 UTC] {taskinstance.py:1052} INFO - Dependencies all 
met for <TaskInstance: jdbc_with_error.test_task 
manual__2022-02-12T15:38:46.534291+00:00 [queued]>
   [2022-02-12, 15:38:47 UTC] {taskinstance.py:1249} INFO - 
   
--------------------------------------------------------------------------------
   [2022-02-12, 15:38:47 UTC] {taskinstance.py:1250} INFO - Starting attempt 1 
of 1
   [2022-02-12, 15:38:47 UTC] {taskinstance.py:1251} INFO - 
   
--------------------------------------------------------------------------------
   [2022-02-12, 15:38:47 UTC] {taskinstance.py:1270} INFO - Executing 
<Task(JdbcOperator): test_task> on 2022-02-12 15:38:46.534291+00:00
   [2022-02-12, 15:38:47 UTC] {standard_task_runner.py:52} INFO - Started 
process 7755 to run task
   [2022-02-12, 15:38:47 UTC] {standard_task_runner.py:76} INFO - Running: 
['***', 'tasks', 'run', 'jdbc_with_error', 'test_task', 
'manual__2022-02-12T15:38:46.534291+00:00', '--job-id', '245', '--raw', 
'--subdir', 'DAGS_FOLDER/example_dag.py', '--cfg-path', '/tmp/tmp7ryzrpgx', 
'--error-file', '/tmp/tmp0u3p6q8m']
   [2022-02-12, 15:38:47 UTC] {standard_task_runner.py:77} INFO - Job 245: 
Subtask test_task
   [2022-02-12, 15:38:47 UTC] {logging_mixin.py:115} INFO - Running 
<TaskInstance: jdbc_with_error.test_task 
manual__2022-02-12T15:38:46.534291+00:00 [running]> on host 464ba29d61dc
   [2022-02-12, 15:38:47 UTC] {logging_mixin.py:115} WARNING - 
/opt/***/***/models/taskinstance.py:830 DeprecationWarning: Passing 
'execution_date' to 'XCom.clear()' is deprecated. Use 'run_id' instead.
   [2022-02-12, 15:38:47 UTC] {logging_mixin.py:115} WARNING - 
/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/coercions.py:518 
SAWarning: Coercing Subquery object into a select() for use in IN(); please 
pass a select() construct explicitly
   [2022-02-12, 15:38:48 UTC] {taskinstance.py:1442} INFO - Exporting the 
following env vars:
   AIRFLOW_CTX_DAG_OWNER=***
   AIRFLOW_CTX_DAG_ID=jdbc_with_error
   AIRFLOW_CTX_TASK_ID=test_task
   AIRFLOW_CTX_EXECUTION_DATE=2022-02-12T15:38:46.534291+00:00
   AIRFLOW_CTX_DAG_RUN_ID=manual__2022-02-12T15:38:46.534291+00:00
   [2022-02-12, 15:38:48 UTC] {jdbc.py:68} INFO - Executing: select 1 x y z
   [2022-02-12, 15:38:48 UTC] {base.py:70} INFO - Using connection to: id: 
my_jdbc_connection. Host: jdbc:sqlite:elo, Port: None, Schema: , Login: , 
Password: None, extra: {'extra__jdbc__drv_clsname': 'org.sqlite.JDBC', 
'extra__jdbc__drv_path': '/opt/***/sqlite-jdbc-3.36.0.3.jar'}
   [2022-02-12, 15:38:48 UTC] {dbapi.py:213} INFO - Running statement: select 1 
x y z, parameters: None
   [2022-02-12, 15:38:48 UTC] {taskinstance.py:1738} ERROR - 
org.sqlite.SQLiteException: [SQLITE_ERROR] SQL error or missing database (near 
"y": syntax error)
   [2022-02-12, 15:38:48 UTC] {taskinstance.py:1278} INFO - Marking task as 
FAILED. dag_id=jdbc_with_error, task_id=test_task, 
execution_date=20220212T153846, start_date=20220212T153847, 
end_date=20220212T153848
   [2022-02-12, 15:38:48 UTC] {local_task_job.py:162} INFO - Task exited with 
return code 1
   [2022-02-12, 15:38:48 UTC] {local_task_job.py:279} INFO - 0 downstream tasks 
scheduled from follow-on schedule check
   [2022-02-12, 15:38:48 UTC] {logging_mixin.py:115} WARNING - 
/opt/***/***/models/dag.py:1068 SADeprecationWarning: Query.value() is 
deprecated and will be removed in a future release.  Please use 
Query.with_entities() in combination with Query.scalar() (deprecated since: 1.4)
   ```
   
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to