+1 to the general notes of the convo; not much to add.

On Wed, Oct 7, 2020 at 6:05 AM, Kaxil Naik <[email protected]> wrote:
As long as we make sure LocalExecutor works fine with Sqlite, I am fine with that. But if we find any issues with making Sqlite work with LocalExecutor, we should keep the SequentialExecutor, because with it new users can easily start Airflow without having to worry about DB setup.
Regards, Kaxil
On Wed, Oct 7, 2020, 10:36 Jarek Potiuk <[email protected]> wrote:
Right - if we make sqlite work with LocalExecutor, there is no reason to keep the SequentialExecutor :).
J.


On Wed, Oct 7, 2020 at 11:26 AM Ash Berlin-Taylor <[email protected]> wrote:
Oh good point.
I'll take a look -- I think our "don't use SQLite from more than one process" rule is over-zealous, as SQLite has built-in locking and can be used by multiple processes at the same time, with a few caveats.

http://www.sqlite.org/draft/faq.html#q5

Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.

SQLite uses reader/writer locks to control access to the database. (Under Win95/98/ME which lacks support for reader/writer locks, a probabilistic simulation is used instead.) But use caution: this locking mechanism might not work correctly if the database file is kept on an NFS filesystem. This is because fcntl() file locking is broken on many NFS implementations. You should avoid putting SQLite database files on NFS if multiple processes might try to access the file at the same time. On Windows, Microsoft's documentation says that locking may not work under FAT filesystems if you are not running the Share.exe daemon. People who have a lot of experience with Windows tell me that file locking of network files is very buggy and is not dependable. If what they say is true, sharing an SQLite database between two or more Windows machines might cause unexpected problems.

We are aware of no other embedded SQL database engine that supports as much concurrency as SQLite. SQLite allows multiple processes to have the database file open at once, and for multiple processes to read the database at once. When any process wants to write, it must lock the entire database file for the duration of its update. But that normally only takes a few milliseconds. Other processes just wait on the writer to finish then continue about their business. Other embedded SQL database engines typically only allow a single process to connect to the database at once.

So it's doable, we've just been overly cautious in the past.
You're right that the change isn't just removing the executor though! I think it's worth it overall though.
-ash
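
As a rough illustration of the locking behaviour described in the FAQ excerpt above, here is a minimal Python sketch in which several separate processes write to the same SQLite file on a local disk. It is not Airflow code: the `events` table, the `WRITER` script and the `run_concurrent_writers` helper are hypothetical names invented for this example, and the 30 second `timeout` is an arbitrary choice for how long a writer waits on another process's lock.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

# Script executed by each child process: insert rows into a shared table.
# (Hypothetical example schema; nothing here comes from Airflow.)
WRITER = """
import sqlite3, sys
path, worker, rows = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
# timeout = seconds to wait for another process's lock before giving up
conn = sqlite3.connect(path, timeout=30)
for i in range(rows):
    with conn:  # one short transaction per insert, committed on success
        conn.execute("INSERT INTO events (worker, seq) VALUES (?, ?)", (worker, i))
conn.close()
"""


def run_concurrent_writers(path, workers=4, rows=25):
    """Start `workers` processes that all write to the SQLite file at `path`.

    Returns the list of child exit codes and the final row count.
    """
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (worker INTEGER, seq INTEGER)")
    conn.commit()
    conn.close()

    # Launch all writers first so that they really run at the same time.
    procs = [
        subprocess.Popen([sys.executable, "-c", WRITER, path, str(w), str(rows)])
        for w in range(workers)
    ]
    codes = [p.wait() for p in procs]

    conn = sqlite3.connect(path)
    (count,) = conn.execute("SELECT COUNT(*) FROM events").fetchone()
    conn.close()
    return codes, count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        codes, count = run_concurrent_writers(os.path.join(tmp, "demo.db"))
        print("exit codes:", codes, "rows written:", count)
```

Each writer keeps its transactions short, so the others only wait briefly for the file lock. The NFS and Windows network-share caveats from the FAQ still apply; this sketch assumes a local filesystem.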
On Oct 7 2020, at 10:07 am, Jarek Potiuk <[email protected]> wrote:
How about sqlite? I believe it only runs with Sequential Executor?

On Wed, Oct 7, 2020 at 10:59 AM Ash Berlin-Taylor <[email protected]> wrote:
Hi everyone,
I've just had a thought: the sequential executor gives an all-around pretty bad experience (it blocks the scheduler, and you'll see "scheduler stopped heartbeating" messages if your task run takes a while).
So I'd like to propose we change the default executor to LocalExecutor. To do this we should probably change the default number of slots/processes from 16 to the number of CPUs.
Thoughts?
None of this has to happen for 2.0 (I don't have time to do it), but just wanted to suggest it.
-ash
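
For the "number of CPUs" default proposed above, one possible way to compute it is sketched below. `default_worker_slots` is a hypothetical helper written for this example, not an existing Airflow function or setting, and the fallback of 1 is an assumption for platforms where the CPU count cannot be determined.

```python
import os


def default_worker_slots(fallback=1):
    """Hypothetical helper: how many worker processes to start by default."""
    # os.cpu_count() returns None when the number of CPUs is unknown,
    # so fall back to a conservative value in that case.
    return os.cpu_count() or fallback


if __name__ == "__main__":
    print("default worker slots:", default_worker_slots())
```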

-- Jarek Potiuk Polidea [https://www.polidea.com/] | Principal Software Engineer M: +48 660 796 129 [tel:+48660796129] [https://www.polidea.com/]

