> Maybe a stupid question but why not make SequentialExecutor extend
> LocalExecutor with parallelism set to one as you described the similarity?
I think the answer to this is that there is not much point in keeping a SequentialExecutor class around if it's just LocalExecutor with parallelism 1. Just use LocalExecutor with parallelism 1 if that's what you want; you don't need a class for that.

On Tue, Dec 17, 2024 at 11:18 AM Blain David <david.bl...@infrabel.be> wrote:

> Maybe a stupid question but why not make SequentialExecutor extend
> LocalExecutor with parallelism set to one as you described the similarity?
>
> Then you're still backward compatible (for those who would use it anyway),
> you get rid of the SequentialExecutor specific code but you still have the
> possibility to use it? Or am I missing something?
>
> Kind regards,
> David
>
> -----Original Message-----
> From: Ash Berlin-Taylor <a...@apache.org>
> Sent: Tuesday, 17 December 2024 18:33
> To: dev@airflow.apache.org
> Subject: Re: [DISCUSS] Make Sqlite3 "low-prod-ready" and get rid of the
> Sequential Executor
>
> What Jens said.
>
> I think sqlite will be a valid path forward for `airflow standalone` and
> the SequentialExecutor could almost be silently upgraded/replaced with the
> LocalExecutor, but in terms of priorities for 3.0 release it's certainly
> not one of mine.
>
> So, yes, but I don't have cycles to focus on it :)
>
> -ash
>
> > On 17 Dec 2024, at 15:48, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote:
> >
> > Hi All,
> >
> > I very much favor such a cleanup, mainly getting rid of the sequential
> > executor and some flags.
> >
> > The intent to make it "low production ready" smells dangerous to me, as
> > this would assume production stability, for which I'd rather recommend
> > going with Postgres. Maybe it could develop in this direction, but the
> > promise is too big atm.
> >
> > But positively speaking, it could really make "airflow standalone"
> > more of a first-class citizen, would make it much easier to set up a
> > single docker/machine development and debug environment, and would
> > lower the footprint considerably for developers (DAG developers, but
> > not only them).
> >
> > But seeing the stuff we have in front of us for 3.0, I'd propose to
> > focus on 3.0 first. If there is spare time then we can make it for 3.0,
> > but without any breaking changes I think we can also make it for 3.1
> > (if we maybe deprecate SequentialExecutor early, so that after 3.0 we
> > are "OK" to remove it).
> >
> > Jens
> >
> > P.S.: At the moment there are a couple of feature flags but actually for
> > me SequentialExecutor == LocalExecutor(parallelism=1)
> >
> > On 17.12.24 13:29, Jarek Potiuk wrote:
> >> Hello here,
> >>
> >> TL;DR; Recently Ash created and merged this PR
> >> https://github.com/apache/airflow/pull/44839
> >> "Remove 'single process' restrictions on SQLite in favour of using WAL
> >> mode" and I think it opens up an interesting possibility - to make
> >> SQLite a "low production ready" database.
> >>
> >> With this change, some of the limitations of SQLite integration for
> >> Airflow have been removed (multi-process access). With Airflow 3 and
> >> moving DB access out from Tasks, we are getting into the situation that
> >> all the DB access will be concentrated in the "central" place -
> >> webserver, scheduler, triggerer, dag processor, task api - and with WAL,
> >> it seems that all those **could** access the sqlite database locally if
> >> they are run on a single machine - while with things like "edge
> >> executor" the tasks could run elsewhere (or also on the same machine -
> >> with Local Executor).
> >>
> >> One thing that it enables - we could simply remove SequentialExecutor.
> >> IMHO the only reason why it continued to exist was the case with SQLite
> >> (and even there for quite some time sqlite could work with LocalExecutor
> >> with parallelism = 1). There is also a "debuggability" thing - possibly -
> >> but with `airflow dags test` - I think Sequential Executor no longer has
> >> an advantage there. And we could make LocalExecutor with n = number of
> >> available processors (maybe minus 2 or 3) the default airflow setting -
> >> which would mitigate some of the "first-time" experience of people who
> >> see that Airflow is "slow" (with sequential executor it is). And we could
> >> get rid of the pesky "Do not use sequential executor in production"
> >> warning and simplify the Executor interface (now executor has a special
> >> `is_production` flag/mode).
> >>
> >> But there is more.
> >>
> >> If we add to it "airflow standalone" and some ways (even just
> >> instructions or guidelines) for the users to back up, possibly compact
> >> and maintain the sqlite database, I don't think we are far away from
> >> announcing the Sqlite DB as "low production ready". SQLite is a "real"
> >> database, for many years it's been used in production in many, many
> >> products and I would say - we have far fewer problems with sqlite than
> >> we have with MySQL - in our CI for example. And if we combine it with
> >> "airflow standalone" - I think we **could** say "If you want to run
> >> Airflow on one machine, without big expectations about scalability -
> >> Airflow 3 + Sqlite is a **GOOD** production choice"
> >>
> >> Likely we would have to test it a bit more, and write some documentation
> >> around it, but I think that could alleviate a lot of concerns and address
> >> a bit of a "drawback" people have around Airflow that it is "difficult to
> >> start with".
> >> Currently when you try airflow - you have all the warnings
> >> "don't use this setup - it's only suitable to play with airflow" - but I
> >> think we are not too far from being able to say this:
> >>
> >> 1) run pip install apache-airflow[google,amazon,cohere]==3.0.0
> >> 2) run "airflow standalone" in whatever way you think is best to manage
> >> restarts
> >> 3) -> that's it. you have very low-scale, production-ready airflow up and
> >> running
> >>
> >> Especially if we document and figure out some of the limitations, when
> >> people should consider switching to "higher production" settings with
> >> MySQL, Postgres and maybe give them tools to do so - that could also be a
> >> very nice come-back to the original success story of Airflow - where data
> >> engineers were really installing airflow on their own to make their life
> >> easier, and after some time their companies had to adopt them and install
> >> Airflow or migrate to a managed version at scale - kind of driving
> >> Airflow adoption from the "bottom".
> >>
> >> I think the investment to make "standalone airflow with sqlite3"
> >> low-production-ready is relatively small, but being able to openly say -
> >> "it's actually SUPER EASY to run airflow for small setup" - is
> >> potentially a very powerful selling point of Airflow 3.
> >>
> >> But - of course - maybe there are some limitations of Sqlite that I am
> >> not aware of. Ash mentioned in his PR: "Will this be without problems?
> >> No, not entirely," - and yeah, likely it has some limitations and
> >> constraints, but maybe they are not as big, and maybe we **could** commit
> >> as a community to support Sqlite3 as "good" to use for really small
> >> installations.
> >>
> >> WDYT?
> >>
> >> J.
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org