[
https://issues.apache.org/jira/browse/AIRFLOW-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372416#comment-15372416
]
Bence Nagy commented on AIRFLOW-139:
------------------------------------
Oh, sorry, I'm subscribed to all Jira notifications but I think that's a bad
idea since I didn't see me specifically being mentioned here. So to answer a
month late:
[~mattland]'s suggestion does not sound right to me at all. Special cases are
already bad, but in this case they wouldn't be even correct; Postgres 8.0.2
does not support autocommit and even though it's unlikely to be running for
people, it is just plain wrong in my opinion to include an exception for it
just because Redshift happened to fork that version. Not only that, but then
Redshift would break as well if Amazon ever decided to change their server
version (even if just rebasing onto 8.0.3 or something.) So, yeah, my stance
here is a big 'no, don't do that'.
As I mentioned in the GitHub commit comments as well, the entire fact that
Redshift worked with the PostgresOperator is a coincidence, and as far as I
know, was never officially said to be supported by it. Due to nuances like this
one, most other products working with databases that I've seen (for instance
DataGrip and Metabase) consider Redshift to be separate from vanilla
PostgreSQL, with the former requiring a special DB driver for it, and the
latter not supporting Redshift at all even though they support Postgres. In
light of all this, I think the correct and logical solution here is to add a
RedshiftOperator, with different behavior around the value of
{{supports_autocommit}} and possibly some other differences as well down the
road if the operators get more complex. I don't have a strong opinion on
whether it should be a subclass of PostgresOperator, but I think I would
implement this like that myself, since after all, Redshift is a derivative of
Postgres, sharing quite a lot of behavior.
> Executing VACUUM with PostgresOperator
> --------------------------------------
>
> Key: AIRFLOW-139
> URL: https://issues.apache.org/jira/browse/AIRFLOW-139
> Project: Apache Airflow
> Issue Type: Bug
> Affects Versions: Airflow 1.7.0
> Reporter: Rafael
>
> Dear Airflow Maintainers,
> h1. Environment
> * Airflow version: *v1.7.0*
> * Airflow components: *PostgresOperator*
> * Python Version: *Python 3.5.1*
> * Operating System: *15.4.0 Darwin*
> h1. Description of Issue
> I am trying to execute a `VACUUM` command as part of DAG with the
> `PostgresOperator`, which fails with the following error:
> {quote}
> [2016-05-14 16:14:01,849] {__init__.py:36} INFO - Using executor
> SequentialExecutor
> Traceback (most recent call last):
> File "/usr/local/bin/airflow", line 15, in <module>
> args.func(args)
> File
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/bin/cli.py",
> line 203, in run
> pool=args.pool,
> File
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/models.py",
> line 1067, in run
> result = task_copy.execute(context=context)
> File
> "/usr/local/lib/python3.5/site-packages/airflow/operators/postgres_operator.py",
> line 39, in execute
> self.hook.run(self.sql, self.autocommit, parameters=self.parameters)
> File
> "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/hooks/dbapi_hook.py",
> line 109, in run
> cur.execute(s)
> psycopg2.InternalError: VACUUM cannot run inside a transaction block
> {quote}
> I could create a small python script that performs the operation, as
> explained in [this stackoverflow
> entry](http://stackoverflow.com/questions/1017463/postgresql-how-to-run-vacuum-from-code-outside-transaction-block).
> However, I would like to know first if the `VACUUM` command should be
> supported by the `PostgresOperator`.
> h1. Reproducing the Issue
> The operator can be declared as follows:
> {quote}
> conn = ('postgres_default')
> t4 = PostgresOperator(
> task_id='vacuum',
> postgres_conn_id=conn,
> sql=("VACUUM public.table"),
> dag=dag
> )
> {quote}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)