+1 to the general notes of the convo; not much to add.
On Wed, Oct 7, 2020 at 6:05 AM, Kaxil Naik <[email protected]> wrote:
As long as we make sure LocalExecutor works fine with SQLite, I am fine
with that. But if we find any issues with making SQLite work with
LocalExecutor, we should keep the SequentialExecutor, since with it new
users can easily start Airflow without having to worry about DB setup.
Regards, Kaxil
On Wed, Oct 7, 2020, 10:36 Jarek Potiuk <[email protected]> wrote:
Right - if we make SQLite work with LocalExecutor, there is no reason to
keep SequentialExecutor :).
J.
On Wed, Oct 7, 2020 at 11:26 AM Ash Berlin-Taylor <[email protected]> wrote:
Oh good point.
I'll take a look -- I think our "don't use SQLite from more than one
process" rule is over-zealous, as SQLite has built-in locking and can be
used by multiple processes at the same time, with a few caveats.
http://www.sqlite.org/draft/faq.html#q5
Multiple processes can have the same database open at the same time.
Multiple processes can be doing a SELECT at the same time. But only one
process can be making changes to the database at any moment in time,
however. SQLite uses reader/writer locks to control access to the database.
(Under Win95/98/ME which lacks support for reader/writer locks, a
probabilistic simulation is used instead.) But use caution: this locking
mechanism might not work correctly if the database file is kept on an NFS
filesystem. This is because fcntl() file locking is broken on many NFS
implementations. You should avoid putting SQLite database files on NFS if
multiple processes might try to access the file at the same time. On
Windows, Microsoft's documentation says that locking may not work under FAT
filesystems if you are not running the Share.exe daemon. People who have a
lot of experience with Windows tell me that file locking of network files
is very buggy and is not dependable. If what they say is true, sharing an
SQLite database between two or more Windows machines might cause unexpected
problems. We are aware of no other embedded SQL database engine that
supports as much concurrency as SQLite. SQLite allows multiple processes to
have the database file open at once, and for multiple processes to read the
database at once. When any process wants to write, it must lock the entire
database file for the duration of its update. But that normally only takes
a few milliseconds. Other processes just wait on the writer to finish then
continue about their business. Other embedded SQL database engines
typically only allow a single process to connect to the database at once.
So it's doable, we've just been overly cautious in the past.
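As a rough illustration of the FAQ text quoted above (a sketch, not something from this thread or from Airflow's code), the following starts a few separate Python processes that all insert into one SQLite file on local disk. The table name `events`, the helper `concurrent_writes`, and the 30-second `timeout` are assumptions made up for this example; `timeout` is the standard `sqlite3.connect` argument that controls how long a connection waits for another writer's lock.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

# Script run by each worker process: open the shared file and insert rows.
# The "events" table and the 30 s timeout are illustrative choices only.
WORKER = """
import sqlite3, sys
path, worker_id, n_rows = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
# timeout = how long to wait on another writer's lock before giving up
conn = sqlite3.connect(path, timeout=30)
for i in range(n_rows):
    with conn:  # one short write transaction per row
        conn.execute(
            "INSERT INTO events (worker, seq) VALUES (?, ?)", (worker_id, i)
        )
conn.close()
"""


def concurrent_writes(n_workers=3, n_rows=10):
    """Let several processes write to one SQLite file; return the row count."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.db")

        # Create the table once, before any worker starts.
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE events (worker INTEGER, seq INTEGER)")
        conn.commit()
        conn.close()

        # Each worker is a separate OS process with its own connection.
        procs = [
            subprocess.Popen(
                [sys.executable, "-c", WORKER, path, str(w), str(n_rows)]
            )
            for w in range(n_workers)
        ]
        for proc in procs:
            if proc.wait() != 0:
                raise RuntimeError("a writer process failed")

        # Count what all workers wrote.
        conn = sqlite3.connect(path)
        (count,) = conn.execute("SELECT COUNT(*) FROM events").fetchone()
        conn.close()
        return count


if __name__ == "__main__":
    print(concurrent_writes())
```

Writers take turns on the file lock rather than running in parallel, so this only shows that concurrent access is safe on a local filesystem, not that it is fast. The NFS caveat from the FAQ still applies.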
You're right that the change isn't just removing the executor, though! I
think it's worth it overall.
-ash
On Oct 7 2020, at 10:07 am, Jarek Potiuk <[email protected]> wrote:
How about SQLite? I believe it only runs with SequentialExecutor?
On Wed, Oct 7, 2020 at 10:59 AM Ash Berlin-Taylor <[email protected]> wrote:
Hi everyone,
I've just had a thought: the sequential executor gives an all-around
pretty bad experience (it blocks the scheduler, and you'll see "scheduler
stopped heartbeating" messages if your task run takes a while).
So I'd like to propose we change the default executor to LocalExecutor --
to do this we should probably change the default number of slots/processes
from 16 to the number of CPUs.
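For anyone wanting to try this locally before any default changes, here is a minimal sketch of the equivalent per-user override. It assumes the "slots/processes" setting meant here is `[core] parallelism`, and it uses Airflow's `AIRFLOW__{SECTION}__{KEY}` environment-variable convention; the helper name `local_executor_env` is made up for this example.

```python
import os


def local_executor_env(cpu_count=None):
    """Return env-var overrides selecting LocalExecutor with one slot per CPU.

    Assumption: "slots/processes" maps to the [core] parallelism option.
    """
    if cpu_count is None:
        # os.cpu_count() can return None if the count is undeterminable.
        cpu_count = os.cpu_count() or 1
    return {
        "AIRFLOW__CORE__EXECUTOR": "LocalExecutor",
        "AIRFLOW__CORE__PARALLELISM": str(cpu_count),
    }


if __name__ == "__main__":
    # Print shell "export" lines that could be sourced before starting Airflow.
    for key, value in local_executor_env().items():
        print(f"export {key}={value}")
```

This changes nothing in Airflow itself; it only shows what the proposed defaults would look like as an explicit configuration.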
Thoughts?
None of this has to happen for 2.0 (I don't have time to do it), but just
wanted to suggest it.
-ash
-- Jarek Potiuk
Polidea [https://www.polidea.com/] | Principal Software Engineer
M: +48 660 796 129 [tel:+48660796129]
[https://www.polidea.com/]