If nothing else, write an ugly adapter using sync_to_async?

On Mon, Apr 8, 2024 at 1:06 PM Daniel Standish <
daniel.stand...@astronomer.io> wrote:

> https://github.com/omnilib/aiosqlite maybe?
>
> On Mon, Apr 8, 2024 at 1:03 PM Scheffler Jens (XC-AS/EAE-ADA-T)
> <jens.scheff...@de.bosch.com.invalid> wrote:
>
>> I understand the „all-in“ approach as we were dropping MSSQL… how can we
>> keep one codenase bit cooe with sqlite? I assume we must support this at
>> least for dev setups.
>>
>> Sent from Outlook for iOS<https://aka.ms/o0ukef>
>> ________________________________
>> From: Jarek Potiuk <ja...@potiuk.com>
>> Sent: Monday, April 8, 2024 8:30:18 PM
>> To: dev@airflow.apache.org <dev@airflow.apache.org>
>> Subject: Re: [DISCUSS] Asynchronous SQLAlchemy
>>
>> Yep. If we can make both Postgres and MySQL work with Async - I am also
>> all
>> for the "All" approach. If it means that we need to support only certain
>> drivers and certain versions of the DBs - so be it. As mentioned in my
>> original comments (long time ago when we had MSSQL support) - this was not
>> really possible back then - but now, by getting rid of Mssql and if we
>> have
>> the right drivers for mysql, it should be possible - I guess.
>>
>> On Mon, Apr 8, 2024 at 8:17 PM Daniel Standish
>> <daniel.stand...@astronomer.io.invalid> wrote:
>>
>> > I wholeheartedly agree with Ash that it should be all or nothing.  And
>> > *all* sounds
>> > better to me :)
>> >
>> >
>> >
>> > On Mon, Apr 8, 2024 at 10:54 AM Ash Berlin-Taylor <a...@apache.org>
>> wrote:
>> >
>> > > I’m all in favour of async SQLAlchemy. We’ve built two products
>> > > exclusively at @ Astronomer that use sqlalchemy+psycopg3+async and
>> love
>> > it.
>> > > Async does take a bit of a learning curve, but SQLA has done it nicely
>> > and
>> > > it works really well.
>> > >
>> > > I think this needs to be an all or nothing thing — having to maintain
>> > sync
>> > > and async versions of functions/features is a non-starter in my mind;
>> > it’d
>> > > just be a worryingly large amount of duplicated work. Given the only
>> DBs
>> > we
>> > > support now is postgres and mysql then I can’t think of any reason
>> users
>> > > should even care — they give it a DSN and that’s the end of their
>> > > involvement.
>> > >
>> > > Amogh: I don’t understand what you mean by point 3 below.
>> > >
>> > > -ash
>> > >
>> > > > On 8 Apr 2024, at 05:31, Amogh Desai <amoghdesai....@gmail.com>
>> wrote:
>> > > >
>> > > > I checked the content and the PR that you attached.
>> > > >
>> > > > The results do seem promising and I like the general idea of this
>> > > approach.
>> > > > But as Jarek
>> > > > also mentioned on the PR:
>> > > >
>> > > > 1. Not everyone might be on the board to go all async due to certain
>> > > > limitations around
>> > > > access to the drivers, or corporate limitations. So, we definitely
>> > need a
>> > > > way to opt-out
>> > > > for the ones who aren't interested.
>> > > >
>> > > > 2. We should have a seamless fallback to sync if async doesn't work
>> for
>> > > > whatever reasons.
>> > > >
>> > > > 3. Are we going all in or are we limiting the scope to lets say
>> > > > connections + variables and expanding
>> > > > based on the results in the long term?
>> > > >
>> > > > Looking forward to improvements async can bring in!
>> > > >
>> > > > Thanks & Regards,
>> > > > Amogh Desai
>> > > >
>> > > >
>> > > > On Sun, Apr 7, 2024 at 3:13 AM Hussein Awala <huss...@awala.fr>
>> wrote:
>> > > >
>> > > >> The Metadata Database is the brain of Airflow, where all scheduling
>> > > >> decisions, cross-communication, synchronization between components,
>> > and
>> > > >> management via the web server, are made using this database.
>> > > >>
>> > > >> One option to optimize the DB queries is to merge many into a
>> single
>> > > query
>> > > >> to reduce latency and overall time, but this is not always possible
>> > > because
>> > > >> the queries are sometimes completely independent, and it is
>> > > impossible/too
>> > > >> complicated to merge them. But in this case, we have another option
>> > > which
>> > > >> is running them concurrently since they are independent. The only
>> way
>> > > to do
>> > > >> this currently is to use multithreading (the sync_to_async
>> decorator
>> > > >> creates a thread and waits for it using an asyncio coroutine),
>> which
>> > is
>> > > >> already a good start, but by using the asyncio extension for
>> > sqlalchemy
>> > > we
>> > > >> will be able to create thousands of lightweight coroutines with the
>> > same
>> > > >> amount of resources as a few threads, which will also help to
>> reduce
>> > > >> resources consumption.
>> > > >>
>> > > >> A few months ago I started a PoC to add support for this extension
>> and
>> > > >> implement an asynchronous version of connections and variables to
>> be
>> > > able
>> > > >> to get/set them from triggers without blocking the event loop and
>> > > affecting
>> > > >> the performance of the triggerer, and the result was impressive (
>> > > >>
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fairflow%2Fpull%2F36504&data=05%7C02%7CJens.Scheffler%40de.bosch.com%7C453e379d0e284252391708dc57f9fd87%7C0ae51e1907c84e4bbb6d648ee58410f4%7C0%7C0%7C638481978411968167%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=LfaqvpAcffa830qqp1fLdsbKuVkpgqsGOSt%2FrnQL2Wk%3D&reserved=0
>> )<https://github.com/apache/airflow/pull/36504>.
>> > > >>
>> > > >> I see a good opportunity to improve the performance of our REST API
>> > and
>> > > web
>> > > >> server (for example
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fairflow%2Fissues%2F38776&data=05%7C02%7CJens.Scheffler%40de.bosch.com%7C453e379d0e284252391708dc57f9fd87%7C0ae51e1907c84e4bbb6d648ee58410f4%7C0%7C0%7C638481978411976153%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=FDHZsKZ6d5%2BTrfI43pVY%2BSHJ2RsMW93MpxIqidhlSoE%3D&reserved=0
>> )<https://github.com/apache/airflow/issues/38776>,
>> > > >> knowing that we can mix sync and async endpoints, which will help
>> for
>> > a
>> > > >> smooth migration.
>> > > >>
>> > > >> I also think that it will be possible (and very useful) to migrate
>> > some
>> > > of
>> > > >> our executors to a full asynchronous version to improve their
>> > > performance
>> > > >> (kubernetes and celery)
>> > > >>
>> > > >> I use the sqlalchemy asyncio extension in many personal and company
>> > > >> projects, and I'm very happy with it, but I would like to hear from
>> > > others
>> > > >> if they have any positive or negative feedback about it.
>> > > >>
>> > > >> I will create a new AIP for integrating the asyncio extension of
>> > > >> sqlaclhemy, and other following AIPs to migrate/support each
>> component
>> > > once
>> > > >> the first one is implemented, but first, I prefer to check what the
>> > > >> community and other committers think about this integration.
>> > > >>
>> > >
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>> > > For additional commands, e-mail: dev-h...@airflow.apache.org
>> > >
>> > >
>> >
>>
>

Reply via email to