+1 to Jarek

On Sun, Dec 29, 2019, 21:58 Jarek Potiuk <[email protected]> wrote:

> I think it's a rare thing for anyone to use a generic DB operations (as
> operator). You usually know which database you have as source/target.
> Even if in one company you have more than one database, then it is not a
> problem to use one hook/operator in Postgres and another in MySQL etc.
>
> IMHO it's better to have a specialized DB hook rather than an SQLAlchemy
> one as you can use specialized SQL for that database.
> Also those specialized hooks already use the DbApiHook with all the common
> code extracted so there is not a lot of duplicated code.
>
> SQLAlchemy is good for an application that has a lot of complex logic and
> OO architecture and should rely on several different DBs as backend.
> But it is mostly only limitation in case of simple ETL scripts (which is
> the case for pretty much all Airflow tasks).
>
> J.
>
> On Fri, Dec 27, 2019 at 1:31 PM Bjorn Olsen <[email protected]>
> wrote:
>
> > Hi community
> >
> >
> >
> > I have a question/suggestion about the different SQL Database hooks +
> > operators.
> >
> >
> >
> > Examples:
> >
> > - MsSqlHook (Implements DbApiHook using pymssql)
> >
> > - MySqlHook (Implements DbApiHook using MySQLdb)
> >
> > - PostgresHook (Implements DbApiHook using psycopg2)
> >
> >
> >
> > These hooks tend to have convenience methods for doing more complex DB
> > functions (eg bulk loading), which is useful.
> >
> >
> >
> > However there is also a downside to them.
> >
> > First I need to figure out which operator and hook to use and what driver
> > it is using. Then I have to review the source code to understand what it
> is
> > doing and what it isn't doing. Finally (at my unique client site) I
> request
> > the driver lib to be whitelisted and installed, and I test it for my use
> > case.
> >
> >
> >
> > This can be quite time consuming if working with several different DBs.
> >
> >
> >
> > Does Airflow have a general-purpose SQL hook + operator?
> >
> > I didn't see one but perhaps I'm missing it :)
> >
> >
> >
> > I was considering implementing something based on SQL Alchemy.
> >
> >
> >
> > Advantages:
> >
> > - Range of database dialects
> >
> > - Simple to use esp. for prototyping
> >
> > - Consistent - no DB-specific code
> >
> > - No extra dependencies - Airflow already requires SQL Alchemy
> >
> > - Maybe SQL Alchemy features would be useful eg function-based querying
> >
> >
> >
> > Disadvantages:
> >
> > - Another way to do the same thing (some existing DB hooks)
> >
> > - Maybe less efficient than using DB specific hooks
> >
> >
> >
> > What are your thoughts?
> >
> >
> >
> > Kind regards
> >
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>

Reply via email to