[
https://issues.apache.org/jira/browse/AIRFLOW-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16885368#comment-16885368
]
ASF GitHub Bot commented on AIRFLOW-4961:
-----------------------------------------
fahran commented on pull request #5593: [AIRFLOW-4961] Prevent cast when
writing TaskFail model to DB
URL: https://github.com/apache/airflow/pull/5593
When writing a task failure record, convert 'duration' decimal
value -> an int before persistence, to remove reliance on the
database doing this automatically and gracefully.
-------
Make sure you have checked _all_ steps below.
### Jira
- [✅] My PR addresses the following:
https://issues.apache.org/jira/browse/AIRFLOW-4961
### Description
- [✅] Here are some details about my PR, including screenshots of any UI
changes:
I raised https://issues.apache.org/jira/browse/AIRFLOW-4961 earlier today,
to cover a minor issue we encountered whilst running airflow using CockroachDB.
That ticket covers the full explanation of this PR.
### Tests
- [ 🤔] My PR adds the following unit tests __OR__ does not need testing for
this extremely good reason:
There aren't currently unit tests for testfail.py (that's not to say there
shouldn't be), and an integration test would pass with the present code anyway,
as Postgres' error handling masks the issue.
It's also a pretty trivial change. Happy to add a test for testfail.py if we
feel it's needed though.
### Commits
- [ ✅] My commits all reference Jira issues in their subject lines, and I
have squashed multiple commits if they address the same issue. In addition, my
commits follow the guidelines from "[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
1. Subject is limited to 50 characters (not including Jira issue reference)
1. Subject does not end with a period
1. Subject uses the imperative mood ("add", not "adding")
1. Body wraps at 72 characters
1. Body explains "what" and "why", not "how"
### Documentation
- [ ✅] In case of new functionality, my PR adds documentation that describes
how to use it.
- All the public functions and the classes in the PR contain docstrings
that explain what it does
- If you implement backwards incompatible changes, please leave a note in
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so
we can assign it to a appropriate release
### Code Quality
- [ ✅ ] Passes `flake8`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> SQL Error when writing Task Failure (using CockroachDB)
> -------------------------------------------------------
>
> Key: AIRFLOW-4961
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4961
> Project: Apache Airflow
> Issue Type: Bug
> Components: database
> Affects Versions: 1.10.3
> Reporter: Fahran Wallace
> Priority: Minor
>
> We're running Airflow against CockroachDB (which is designed to be
> "compatible" with PostgresSQL
> [https://www.cockroachlabs.com/blog/why-postgres/]), but with a few subtle
> differences
> [https://www.cockroachlabs.com/docs/stable/porting-postgres.html]).
>
> It's been fairly painless so far (we've got a few minor tweaks of SQL grammar
> in our fork, especially around DB migrations), but we've come across
> something we think is worth merging back to Airflow trunk. It's both not
> quite correct behaviour in normal operation anyway, and it's a 4 character
> fix.
> When a task has failed, we get this error writing to the task_fail table in
> (taskfail.py):
> {code:java}
> value type decimal doesn't match type INT8 of column "duration"{code}
> This is because total_seconds() returns a decimal, but it's assigned to the
> variable duration, bound to an integer database column here:
>
> {code:java}
> self.duration = (self.end_date - self.start_date).total_seconds()
> {code}
>
> [https://github.com/apache/airflow/blob/b33b9898677e70ef3fcb4bf03cd28b4077681224/airflow/models/taskfail.py#L53]
> A fix looks like:
> {code:java}
> self.duration = int((self.end_date - self.start_date).total_seconds()){code}
> which means we're no longer reliant on the database to truncate and cast this
> correctly,
> The flipside of this argument is documented here:
> [https://www.postgresql.org/message-id/[email protected]] -
> postgres made a conscious decision to not throw errors in this situation.
> I'll ping a copy of this ticket over to Cockroach labs too, to see if they
> have any thoughts (it's not listed as a current incompatibility, but probably
> should be).
>
> From the stacktrace:
> {code:java}
> [2019-07-03 09:06:33,402] {{base_task_runner.py:101}} INFO - Job
> 465548343190323206: Subtask update_snapshot cursor.execute(statement,
> parameters)
> [2019-07-03 09:06:33,402] {{base_task_runner.py:101}} INFO - Job
> 465548343190323206: Subtask update_snapshot sqlalchemy.exc.ProgrammingError:
> (psycopg2.errors.DatatypeMismatch) value type decimal doesn't match type INT8
> of column "duration"
> [2019-07-03 09:06:33,402] {{base_task_runner.py:101}} INFO - Job
> 465548343190323206: Subtask update_snapshot [SQL: 'INSERT INTO task_fail
> (task_id, dag_id, execution_date, start_date, end_date, duration) VALUES
> (%(task_id)s, %(dag_id)s, %(execution_date)s, %(start_date)s, %(end_date)s,
> %(duration)s) RETURNING task_fail.id'] [parameters: {'task_id':
> 'update_snapshot', 'dag_id': 'snapshot_table_update', 'execution_date':
> <Pendulum [2019-07-03T08:30:46.221467+00:00]>, 'start_date':
> datetime.datetime(2019, 7, 3, 9, 1, 19, 436714, tzinfo=<TimezoneInfo [UTC,
> GMT, +00:00:00, STD]>), 'end_date': datetime.datetime(2019, 7, 3, 9, 6, 33,
> 337921, tzinfo=<Timezone [UTC]>), 'duration': 313.901207}] (Background on
> this error at: http://sqlalche.me/e/f405)
> {code}
> I'll attach a PR to this ticket in a sec.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)