Yeah. I think there are two things I wanted to ask for: 1) removing the sequential executor, and 2) whether sqlite is "good enough" that we can remove the "don't use it for anything serious" warning.
Re: 1)

>> Then you're still backward compatible (for those who would use it anyway),
>> you get rid of the SequentialExecutor specific code but you still have the
>> possibility to use it? Or I'm missing something?
>
> I think the answer to this is, there is not much point in keeping a
> sequential executor class around if it's just localexecutor with
> parallelism 1.
>
> I think sqlite will be a valid path forward for `airflow standalone` and
> the SequentialExecutor could almost be silently upgraded/replaced with the
> LocalExecutor, but in terms of priorities for 3.0 release it's certainly
> not one of mine.

Very much agree with Daniel. I don't see (so far) any reason why we should keep Sequential Executor - since SQLite can now also use LocalExecutor - not only with parallelism = 1 but also potentially with more parallelism. Backwards compatibility is not a concern: for one, we can break it in Airflow 3, and also it has only been used/recommended for a basic quick start. I don't even think we need to deprecate it - the biggest deprecation, which has been there forever, is the warning in the UI "don't use it for anything serious". I cannot think of a better deprecation message :).

I think it's a low-cost, nice cleanup - we can mark it as a good-first-issue and let anyone implement it. Not a high priority, but I think it nicely follows our "Axe everything that might be a distraction later and is painless to remove now" approach. Unless I hear otherwise, I will formally propose to remove it via "[LAZY CONSENSUS]".

Re: 2) Is sqlite "good" for small prod setup

> The intent to make it "low production ready" smells dangerous for me as
> this would assume production stability. Which I'd recommend rather to go
> with Postgres. Maybe it could develop into this direction but the
> promise is too big atm.
>
> I think sqlite will be a valid path forward for `airflow standalone`

Yeah, putting more focus on this for Airflow 3.0 is not a good idea.
I thought a bit more about it - I think yes, we should not promote it as "production ready" (and promise too much), but on the other hand, making "airflow standalone" the first and foremost way of interacting with Airflow as your first-time experience (one that you might want to carry on with for quite some time, even for more serious tests/runs) is quite a good "story" to tell. I.e. "You thought it's difficult to run Airflow? Not any more - with Airflow 3 you can use "airflow standalone" and get the full experience of running locally without doing much more." I think this is the story I would like to be able to tell with Airflow 3 - to dispel the perception that Airflow is "notoriously difficult to set up", which I have heard multiple times.

I am not sure if there is any action here other than maybe doing some tests with "airflow standalone" and updating and restructuring our documentation - to put more focus on the "first time user experience" - and possibly slightly modifying (or removing?) the warning that is displayed when you use sqlite (with Local Executor, as I assume we remove the Sequential one).

Maybe that's the scope of 2) that we might feel good about for Airflow 3?

J.

On Wed, Dec 18, 2024 at 2:20 AM Daniel Standish <daniel.stand...@astronomer.io.invalid> wrote:

> > Maybe a stupid question but why not make SequentialExecutor extend
> > LocalExecutor with parallelism set to one as you described the similarity?
>
> I think the answer to this is, there is not much point in keeping a
> sequential executor class around if it's just localexecutor with
> parallelism 1. Just use local executor with parallelism 1 if that's what
> you want; don't need a class for that.
>
> On Tue, Dec 17, 2024 at 11:18 AM Blain David <david.bl...@infrabel.be> wrote:
>
> > Maybe a stupid question but why not make SequentialExecutor extend
> > LocalExecutor with parallelism set to one as you described the similarity?
> >
> > Then you're still backward compatible (for those who would use it anyway),
> > you get rid of the SequentialExecutor specific code but you still have the
> > possibility to use it? Or I'm missing something?
> >
> > Kind regards,
> > David
> >
> > -----Original Message-----
> > From: Ash Berlin-Taylor <a...@apache.org>
> > Sent: Tuesday, 17 December 2024 18:33
> > To: dev@airflow.apache.org
> > Subject: Re: [DISCUSS] Make Sqlite3 "low-prod-ready" and get rid of the
> > Sequential Executor
> >
> > EXTERNAL MAIL: If you do not know the sender of this e-mail and do not
> > trust them, do not click on a link or open any attachments. When in doubt,
> > forward this e-mail as an attachment to ab...@infrabel.be.
> >
> > What Jens said.
> >
> > I think sqlite will be a valid path forward for `airflow standalone` and
> > the SequentialExecutor could almost be silently upgraded/replaced with the
> > LocalExecutor, but in terms of priorities for 3.0 release it's certainly
> > not one of mine.
> >
> > So, yes, but I don't have cycles to focus on it :)
> >
> > -ash
> >
> > > On 17 Dec 2024, at 15:48, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote:
> > >
> > > Hi all,
> > >
> > > I very much favor such cleanup. Mainly getting rid of sequential
> > > executor and some flags.
> > >
> > > The intent to make it "low production ready" smells dangerous for me as
> > > this would assume production stability. Which I'd recommend rather to go
> > > with Postgres. Maybe it could develop into this direction but the
> > > promise is too big atm.
> > >
> > > But positively speaking it really could enable "airflow standalone"
> > > being more like a first class citizen and would allow a much easier
> > > single docker/machine development and debug environment, and would
> > > lower the footprint very much for (DAG, but not limited to) developers.
> > >
> > > But seeing the stuff we have in front of us for 3.0, I'd propose to
> > > focus on 3.0 first. If there is spare time then we can make it for 3.0,
> > > but also w/o any breaking changes I think we can also make it for 3.1
> > > (if we maybe deprecate SequentialExecutor early, so that after 3.0 we
> > > are "OK" to remove it).
> > >
> > > Jens
> > >
> > > P.S.: At the moment there are a couple of feature flags but actually for
> > > me SequentialExecutor == LocalExecutor(parallelism=1)
> > >
> > > On 17.12.24 13:29, Jarek Potiuk wrote:
> > >> Hello here,
> > >>
> > >> TL;DR: Recently Ash created and merged this PR
> > >> https://github.com/apache/airflow/pull/44839
> > >> "Remove 'single process' restrictions on SQLite in favour of using WAL
> > >> mode" and I think it opens up an interesting possibility - to make
> > >> SQLite a "low production ready" database.
> > >>
> > >> With this change, some of the limitations of SQLite integration for
> > >> Airflow have been removed (multi-process access). With Airflow 3 and
> > >> moving DB access out from Tasks, we are getting into the situation that
> > >> all the DB access will be concentrated in the "central" place -
> > >> webserver, scheduler, triggerer, dag processor, task api - and with
> > >> WAL, it seems that all those **could** access the sqlite database
> > >> locally if they are run on a single machine - while with things like
> > >> "edge executor" the tasks could run elsewhere (or also on the same
> > >> machine - with Local Executor).
> > >>
> > >> One thing that it enables - we could simply remove SequentialExecutor.
> > >> IMHO the only reason why it continued to exist was the case with SQLite
> > >> (and even there for quite some time sqlite could work with
> > >> LocalExecutor with parallelism = 1). There is also a "debuggability"
> > >> thing - possibly - but with `airflow dag test` - I think Sequential
> > >> Executor no longer has an advantage there. And we could make
> > >> LocalExecutor with n = num available processors (maybe -2 or -3) the
> > >> default airflow setting - which would mitigate some of the "first-time"
> > >> experience of people who see that Airflow is "slow" (with sequential
> > >> executor it is). And we could get rid of the pesky "Do not use
> > >> sequential executor in production" warning and simplify the Executor
> > >> interface (now executor has a special `is_production` flag/mode).
> > >>
> > >> But there is more.
> > >>
> > >> If we add to it "airflow standalone" and some ways (even just
> > >> instructions or guidelines) for the users to back up, possibly compact
> > >> and maintain the sqlite database, I don't think we are far away from
> > >> announcing the Sqlite DB as "low production ready". SQLite is a "real"
> > >> database, for many years it's been used in production in many, many
> > >> products and I would say - we have far fewer problems with sqlite than
> > >> we have with MySQL - in our CI for example. And if we combine it with
> > >> "airflow standalone" - I think we **could** say "If you want to run
> > >> Airflow on one machine, without big expectations about scalability -
> > >> Airflow 3 + Sqlite is a **GOOD** production choice"
> > >>
> > >> Likely we would have to test it a bit more, and do some documentation
> > >> around it, but I think that could alleviate a lot of concerns and
> > >> address a bit of a "drawback" people have around Airflow that it is
> > >> "difficult to start with". Currently when you try airflow - you have
> > >> all the warnings "don't use this setup - it's only suitable to play
> > >> with airflow" - but I think we are not too far from saying this:
> > >>
> > >> 1) run pip install airflow[google,amazon,cohere]==3.0.0
> > >> 2) run "airflow standalone" in whatever way you think is best to manage
> > >>    restarts
> > >> 3) -> that's it. you have very low-scale, production-ready airflow up
> > >>    and running
> > >>
> > >> Especially if we document and figure out some of the limitations, when
> > >> people should consider switching to more "higher production" settings
> > >> with MySQL, Postgres and maybe give them tools to do so - that could
> > >> also be a very nice come-back to the original success story of Airflow
> > >> - where data engineers were really installing airflow on their own to
> > >> make their life easier, and after some time their companies had to
> > >> adopt them and install Airflow or migrate to a managed version at scale
> > >> - kind of driving Airflow adoption from the "bottom".
> > >>
> > >> I think the investment to make "standalone airflow with sqlite3"
> > >> low-production-ready is relatively small, but being able to openly say
> > >> - "it's actually SUPER EASY to run airflow for small setup" - is
> > >> potentially a very powerful selling point of Airflow 3.
> > >>
> > >> But - of course - maybe there are some limitations of Sqlite that I am
> > >> not aware of. Ash mentioned in his PR: "Will this be without problems?
> > >> No, not entirely," - and yeah, likely it has some limitations and
> > >> constraints, but maybe they are not as big, and maybe we **could**
> > >> commit as a community to support Sqlite3 as "good" to use for really
> > >> small installations.
> > >>
> > >> WDYT?
> > >>
> > >> J.
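
[Editorial sketch] The proposal above mentions WAL mode plus "instructions or guidelines" for backing up and compacting the sqlite database. As a rough illustration of what such guidance might boil down to, here is a minimal sketch using only Python's standard `sqlite3` module. The function names and the file paths are hypothetical and are not part of Airflow; this is not how Airflow itself configures SQLite, only an assumption about what a small maintenance helper could look like.

```python
import sqlite3
from pathlib import Path


def enable_wal(db_path: Path) -> str:
    """Switch the database to write-ahead logging and return the active mode."""
    conn = sqlite3.connect(db_path)
    try:
        # WAL is a persistent setting: it is stored in the database file
        # and stays in effect for later connections.
        (mode,) = conn.execute("PRAGMA journal_mode=WAL").fetchone()
        return mode
    finally:
        conn.close()


def backup(db_path: Path, backup_path: Path) -> None:
    """Copy the database to backup_path using SQLite's online backup API."""
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    try:
        # Connection.backup copies page by page and is safe to run
        # while other connections are using the source database.
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def compact(db_path: Path) -> None:
    """Rebuild the database file to reclaim space left by deleted rows."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
```

A periodic job could call `backup()` before `compact()`; how often, and whether the services should be stopped first, is exactly the kind of limitation the thread says would still need testing and documentation.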
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > For additional commands, e-mail: dev-h...@airflow.apache.org