One more proposal on that: why don't we make Airflow fail hard when the Sequential Executor is used with Postgres/MySQL?
I think we might avoid some confusion. Kaxil and I had a long discussion in which (after 2 years of working with Airflow) I was (wrongly) almost 100% sure that Postgres/MySQL already used the Local Executor by default (because 1.5 years ago we configured it like that for our system tests), and I had not realized that this is not the default.

I do not think there is any benefit to using the Sequential Executor with Postgres/MySQL now, so we can simply fail hard if it is set for those (with a "Please change to Local Executor" message).

This might be a 2.0-only change.

J.

On Mon, Oct 12, 2020 at 1:35 AM Daniel Imberman <[email protected]> wrote:

> +1 to the general notes of the convo, not much to add
>
> On Wed, Oct 7, 2020 at 6:05 AM, Kaxil Naik <[email protected]> wrote:
>
> As long as we make sure LocalExecutor works fine with Sqlite, I am fine
> with that. But if we find any issues with making Sqlite work with
> LocalExecutor, we should keep the SequentialExecutor, as with it new users
> can easily start Airflow without having to worry about DB setup.
>
> Regards,
> Kaxil
>
> On Wed, Oct 7, 2020, 10:36 Jarek Potiuk <[email protected]> wrote:
>
>> Right - if we make sqlite work with LocalExecutor, there is no reason to
>> keep Sequential Executor :).
>>
>> J.
>>
>> On Wed, Oct 7, 2020 at 11:26 AM Ash Berlin-Taylor <[email protected]> wrote:
>>
>>> Oh good point.
>>>
>>> I'll take a look -- I think our "don't use SQLite from more than one
>>> process" is over-zealous, as SQLite has built in locking and can be used by
>>> multiple processes at the same time, with a few caveats.
>>>
>>> http://www.sqlite.org/draft/faq.html#q5
>>>
>>> Multiple processes can have the same database open at the same time.
>>> Multiple processes can be doing a SELECT at the same time. But only one
>>> process can be making changes to the database at any moment in time,
>>> however.
>>>
>>> SQLite uses reader/writer locks to control access to the database.
>>> (Under Win95/98/ME which lacks support for reader/writer locks, a
>>> probabilistic simulation is used instead.) But use caution: this locking
>>> mechanism might not work correctly if the database file is kept on an NFS
>>> filesystem. This is because fcntl() file locking is broken on many NFS
>>> implementations. You should avoid putting SQLite database files on NFS if
>>> multiple processes might try to access the file at the same time. On
>>> Windows, Microsoft's documentation says that locking may not work under FAT
>>> filesystems if you are not running the Share.exe daemon. People who have a
>>> lot of experience with Windows tell me that file locking of network files
>>> is very buggy and is not dependable. If what they say is true, sharing an
>>> SQLite database between two or more Windows machines might cause unexpected
>>> problems.
>>>
>>> We are aware of no other *embedded* SQL database engine that supports
>>> as much concurrency as SQLite. SQLite allows multiple processes to have the
>>> database file open at once, and for multiple processes to read the database
>>> at once. When any process wants to write, it must lock the entire database
>>> file for the duration of its update. But that normally only takes a few
>>> milliseconds. Other processes just wait on the writer to finish then
>>> continue about their business. Other embedded SQL database engines
>>> typically only allow a single process to connect to the database at once.
>>>
>>>
>>> So it's doable, we've just been overly cautious in the past.
>>>
>>> You're right that the change isn't just removing the executor though! I
>>> think worth it overall though.
>>>
>>> -ash
>>>
>>> On Oct 7 2020, at 10:07 am, Jarek Potiuk <[email protected]> wrote:
>>>
>>> How about sqlite? I believe it only runs with Sequential Executor?
>>>
>>> On Wed, Oct 7, 2020 at 10:59 AM Ash Berlin-Taylor <[email protected]>
>>> wrote:
>>>
>>> Hi everyone,
>>>
>>> I've just had a thought: the sequential executor gives an all around
>>> pretty bad experience (it blocks the scheduler, and you'll see "scheduler
>>> stopped heartbeating" messages if your task run takes a while).
>>>
>>> So I'd like to propose we change the default executor to LocalExecutor
>>> -- to do this we should probably change the default number of
>>> slots/processes from 16 to num cpus.
>>>
>>> Thoughts?
>>>
>>> None of this has to happen for 2.0 (I don't have time to do it), but
>>> just wanted to suggest it.
>>>
>>> -ash
>>>
>>> --
>>>
>>> Jarek Potiuk
>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>
>>> M: +48 660 796 129 <+48660796129>
>>> [image: Polidea] <https://www.polidea.com/>
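
For reference, the fail-hard check proposed at the top of this thread could look roughly like the sketch below. It is only an illustration: `ExecutorBackendError`, `dialect_of`, `check_executor` and the `SERVER_DIALECTS` set are hypothetical names and are not taken from the Airflow code base. The connection string is assumed to be a SQLAlchemy-style URL.

```python
class ExecutorBackendError(ValueError):
    """Hypothetical error raised when executor and metadata DB do not match."""


# Dialect names treated as server databases in this sketch (an assumption).
SERVER_DIALECTS = {"postgresql", "postgres", "mysql"}


def dialect_of(sql_alchemy_conn: str) -> str:
    """Return the dialect of a SQLAlchemy-style URL, e.g. 'postgresql'."""
    scheme = sql_alchemy_conn.split("://", 1)[0]
    # Drop an optional driver suffix such as '+psycopg2'.
    return scheme.split("+", 1)[0].lower()


def check_executor(executor: str, sql_alchemy_conn: str) -> None:
    """Fail hard if SequentialExecutor is combined with Postgres/MySQL."""
    dialect = dialect_of(sql_alchemy_conn)
    if executor == "SequentialExecutor" and dialect in SERVER_DIALECTS:
        raise ExecutorBackendError(
            f"SequentialExecutor is not supported with {dialect}. "
            "Please change to Local Executor."
        )
```

Calling `check_executor("SequentialExecutor", "mysql://user@host/airflow")` raises the error, while the same executor with a `sqlite:///` URL, or `LocalExecutor` with any backend, passes silently.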

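
Ash's point that SQLite can be shared by several processes can be tried out with a small experiment like the one below. This is a sketch under assumptions, not how Airflow accesses its database: the table name `log`, the helper `run_concurrent_writers` and the 30 second timeout are made up for illustration, and the database file is assumed to live on a local (non-NFS) filesystem, as the FAQ quoted above advises. Each writer is a separate Python process that takes the write lock with `BEGIN IMMEDIATE` and waits for it instead of failing.

```python
import sqlite3
import subprocess
import sys

# Script run by every writer process: argv = [db_path, writer_tag, row_count].
WRITER = """
import sqlite3, sys
# timeout=30: wait up to 30 s for the write lock instead of failing at once.
conn = sqlite3.connect(sys.argv[1], timeout=30, isolation_level=None)
for n in range(int(sys.argv[3])):
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    conn.execute("INSERT INTO log (writer, n) VALUES (?, ?)", (sys.argv[2], n))
    conn.execute("COMMIT")
conn.close()
"""


def run_concurrent_writers(db_path, writers=4, rows_each=25):
    """Start several writer processes on one SQLite file and count the rows."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS log (writer TEXT, n INTEGER)")
    conn.commit()
    conn.close()

    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WRITER, db_path, f"w{k}", str(rows_each)]
        )
        for k in range(writers)
    ]
    exit_codes = [p.wait() for p in procs]

    conn = sqlite3.connect(db_path)
    (count,) = conn.execute("SELECT COUNT(*) FROM log").fetchone()
    conn.close()
    return exit_codes, count
```

If the locking behaves as the FAQ describes, every writer should exit cleanly and the row count should equal `writers * rows_each`; the writers simply take turns, which is the "wait on the writer to finish" behaviour from the quoted text.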