Hi,

Thank you for this explanation. If I summarize, I'll have to write a file_to_postgres Operator, with pg_preoperator and pg_postoperator parameters.
Just a simple question: where should I add and store this Operator in the Airflow ecosystem?

Regards,
Flo

On Wed, Mar 20, 2019 at 5:05 PM jiajie zhong <[email protected]> wrote:

> Hi, Flo. I am not good at PG, but I found code in our master branch:
>
> https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/hooks/postgres_hook.py#L63-L89
>
> I think maybe this is what you are looking for.
>
> Also, we recommend using an Operator to do this rather than a Hook. But we
> have no "local-file-to-pg" operator, so you may have to add this
> functionality yourself.
>
> BTW, I think
>
> BEGIN;
>
> CREATE TEMP TABLE catalog_tmp ON COMMIT DROP AS SELECT * FROM catalog WITH
> NO DATA;
>
> \COPY catalog_tmp (...) FROM '/home/cat/catalog.csv' WITH DELIMITER ';' CSV
> ENCODING 'LATIN1' NULL '';
>
> DELETE FROM catalog_tmp WHERE code IS NULL;
> ...
> COMMIT;
>
> what you wrote is a transaction, so it should be done in a single
> operator. You could write code just like
>
> https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/operators/hive_to_mysql.py#L70-L71
>
> that has "pg_preoperator" and "pg_postoperator" parameters, but extracts
> the data from a local file instead of Hive.
> ________________________________
> From: Flo Rance <[email protected]>
> Sent: Wednesday, March 20, 2019 23:30
> To: [email protected]
> Subject: PostgreSQL hook
>
> Hi,
>
> I don't know if this is the correct place to ask.
>
> I'm trying to implement one of my cron jobs using Airflow. One of the
> tasks is to load files into a temporary table and then update another
> table in a Postgres DB. For that, I was previously using a SQL script
> like this:
>
> BEGIN;
>
> CREATE TEMP TABLE catalog_tmp ON COMMIT DROP AS SELECT * FROM catalog WITH
> NO DATA;
>
> \COPY catalog_tmp (...) FROM '/home/cat/catalog.csv' WITH DELIMITER ';' CSV
> ENCODING 'LATIN1' NULL '';
>
> DELETE FROM catalog_tmp WHERE code IS NULL;
> ...
> COMMIT;
>
> I would like to replace \copy with copy_expert from the PostgreSQL hook.
> Is that realistic? If yes, how can I combine a SQL script and that hook
> in one task?
>
> Regards,
> Flo
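For the archives, the operator suggested in this thread could be sketched roughly as below. This is only an illustration, not existing Airflow code: the class name FileToPostgresOperator, the copy_sql helper, and the constructor parameters are all made up for this example. A real implementation would subclass airflow.models.BaseOperator and build a PostgresHook from a postgres_conn_id; here the hook is passed in as a duck-typed object so the sketch stays self-contained.

```python
class FileToPostgresOperator:
    """Hypothetical operator: load a local CSV into a Postgres table,
    optionally running SQL before and after the load (mirroring the
    preoperator/postoperator pattern of hive_to_mysql)."""

    def __init__(self, hook, table, filepath,
                 pg_preoperator=None, pg_postoperator=None,
                 delimiter=';', encoding='LATIN1', null=''):
        self.hook = hook                        # PostgresHook-like object
        self.table = table                      # target table name
        self.filepath = filepath                # local CSV file to load
        self.pg_preoperator = pg_preoperator    # SQL run before the load
        self.pg_postoperator = pg_postoperator  # SQL run after the load
        self.delimiter = delimiter
        self.encoding = encoding
        self.null = null

    def copy_sql(self):
        # Server-side COPY ... FROM STDIN; with copy_expert the client
        # streams the local file, which is what psql's \COPY does.
        return (
            "COPY {table} FROM STDIN WITH (FORMAT csv, "
            "DELIMITER '{delim}', ENCODING '{enc}', NULL '{null}')"
        ).format(table=self.table, delim=self.delimiter,
                 enc=self.encoding, null=self.null)

    def execute(self, context=None):
        if self.pg_preoperator:
            self.hook.run(self.pg_preoperator)
        # PostgresHook.copy_expert(sql, filename) feeds the local file
        # to the COPY ... FROM STDIN statement.
        self.hook.copy_expert(self.copy_sql(), self.filepath)
        if self.pg_postoperator:
            self.hook.run(self.pg_postoperator)
```

One caveat, hedged: as written, each hook call may open its own connection and transaction, so the temp-table-with-ON COMMIT DROP pattern from the original script would not survive across the three steps. To keep everything in one transaction you would need to hold a single connection (e.g. via the hook's get_conn()) for the pre-SQL, the COPY, and the post-SQL.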
