One more proposal on that. Why don't we make Airflow fail hard on
Postgres/MySQL when the Sequential Executor is used?

I think we might avoid some confusion.

Kaxil and I had a long discussion about this. After 2 years of working
with Airflow, I was (wrongly) almost 100% sure that Postgres/MySQL
already use the Local Executor by default (because 1.5 years ago we configured
it like that for our system tests), and I had not realized that this is not
the default.

I do not think there is any benefit in using the Sequential Executor with
Postgres/MySQL now, so we can simply fail hard if it is set for those, with
a "Please change to Local Executor" message.
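
To make the idea concrete, here is a minimal sketch of such a check. It is only an illustration: `check_executor`, `ExecutorConfigError` and `SERVER_BACKENDS` are hypothetical names, not existing Airflow code, and I am assuming the backend can be read from the dialect part of the SQLAlchemy connection URL.

```python
# Hypothetical sketch, not Airflow's actual API.
SERVER_BACKENDS = {"postgresql", "postgres", "mysql"}


class ExecutorConfigError(Exception):
    """Raised when the executor/database combination is rejected."""


def check_executor(executor: str, sql_alchemy_conn: str) -> None:
    """Fail hard if SequentialExecutor is configured with Postgres/MySQL."""
    # "postgresql+psycopg2://..." -> "postgresql"
    dialect = sql_alchemy_conn.split("://", 1)[0].split("+", 1)[0].lower()
    if executor == "SequentialExecutor" and dialect in SERVER_BACKENDS:
        raise ExecutorConfigError(
            f"SequentialExecutor should not be used with {dialect}. "
            "Please change to LocalExecutor."
        )
```

Sqlite with SequentialExecutor would still pass this check, so the out-of-the-box setup keeps working.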

This might be a 2.0-only change.

J.




On Mon, Oct 12, 2020 at 1:35 AM Daniel Imberman <[email protected]>
wrote:

> +1 to the general notes of the convo, not much to add.
>
> On Wed, Oct 7, 2020 at 6:05 AM, Kaxil Naik <[email protected]> wrote:
>
> As long as we make sure LocalExecutor works fine with Sqlite, I am fine
> with that. But if we find any issues with making Sqlite work with
> LocalExecutor, we should keep the SequentialExecutor, so that new users can
> easily start Airflow without having to worry about DB setup.
>
> Regards,
> Kaxil
>
> On Wed, Oct 7, 2020, 10:36 Jarek Potiuk <[email protected]> wrote:
>
>> Right - if we make sqlite work with LocalExecutor, there is no reason to
>> keep Sequential Executor :).
>>
>> J.
>>
>>
>>
>> On Wed, Oct 7, 2020 at 11:26 AM Ash Berlin-Taylor <[email protected]> wrote:
>>
>>> Oh good point.
>>>
>>> I'll take a look -- I think our "don't use SQLite from more than one
>>> process" is over-zealous, as SQLite has built in locking and can be used by
>>> multiple processes at the same time, with a few caveats.
>>>
>>> http://www.sqlite.org/draft/faq.html#q5
>>>
>>> Multiple processes can have the same database open at the same time.
>>> Multiple processes can be doing a SELECT at the same time. But only one
>>> process can be making changes to the database at any moment in time,
>>> however.
>>>
>>> SQLite uses reader/writer locks to control access to the database.
>>> (Under Win95/98/ME which lacks support for reader/writer locks, a
>>> probabilistic simulation is used instead.) But use caution: this locking
>>> mechanism might not work correctly if the database file is kept on an NFS
>>> filesystem. This is because fcntl() file locking is broken on many NFS
>>> implementations. You should avoid putting SQLite database files on NFS if
>>> multiple processes might try to access the file at the same time. On
>>> Windows, Microsoft's documentation says that locking may not work under FAT
>>> filesystems if you are not running the Share.exe daemon. People who have a
>>> lot of experience with Windows tell me that file locking of network files
>>> is very buggy and is not dependable. If what they say is true, sharing an
>>> SQLite database between two or more Windows machines might cause unexpected
>>> problems.
>>>
>>> We are aware of no other *embedded* SQL database engine that supports
>>> as much concurrency as SQLite. SQLite allows multiple processes to have the
>>> database file open at once, and for multiple processes to read the database
>>> at once. When any process wants to write, it must lock the entire database
>>> file for the duration of its update. But that normally only takes a few
>>> milliseconds. Other processes just wait on the writer to finish then
>>> continue about their business. Other embedded SQL database engines
>>> typically only allow a single process to connect to the database at once.
>>>
>>>
>>> So it's doable, we've just been overly cautious in the past.
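
The FAQ's claim is easy to try out. Below is a rough sketch (not Airflow code) that starts several separate Python processes, each inserting rows into the same SQLite file. The `events` table, the `concurrent_write_demo` helper and the 30-second `timeout` are assumptions made for the demo; the `timeout` argument to `sqlite3.connect` controls how long a writer waits for another process's lock before giving up.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

# Script run by each writer process: argv = [db path, worker id, row count].
WRITER = """
import sqlite3, sys
path, worker, rows = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
conn = sqlite3.connect(path, timeout=30)  # wait up to 30s for the write lock
for i in range(rows):
    with conn:  # commit one transaction per insert
        conn.execute("INSERT INTO events (worker, seq) VALUES (?, ?)", (worker, i))
conn.close()
"""


def concurrent_write_demo(workers: int = 4, rows: int = 25) -> int:
    """Let several processes write to one SQLite file; return the row count."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE events (worker INTEGER, seq INTEGER)")
        conn.commit()
        conn.close()

        procs = [
            subprocess.Popen([sys.executable, "-c", WRITER, path, str(w), str(rows)])
            for w in range(workers)
        ]
        for proc in procs:
            if proc.wait() != 0:
                raise RuntimeError("a writer process failed")

        conn = sqlite3.connect(path)
        (count,) = conn.execute("SELECT COUNT(*) FROM events").fetchone()
        conn.close()
        return count
```

If every writer succeeds, the returned count equals `workers * rows`. This uses a local temporary directory; per the FAQ caveats above, the same test on NFS or a network share may behave differently.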
>>>
>>> You're right that the change isn't just removing the executor though! I
>>> think it's worth it overall though.
>>>
>>> -ash
>>>
>>> On Oct 7 2020, at 10:07 am, Jarek Potiuk <[email protected]>
>>> wrote:
>>>
>>> How about sqlite? I believe it only runs with Sequential Executor?
>>>
>>> On Wed, Oct 7, 2020 at 10:59 AM Ash Berlin-Taylor <[email protected]>
>>> wrote:
>>>
>>> Hi everyone,
>>>
>>> I've just had a thought: the sequential executor gives an all-around
>>> pretty bad experience (it blocks the scheduler, and you'll see "scheduler
>>> stopped heartbeating" messages if your task run takes a while).
>>>
>>> So I'd like to propose we change the default executor to LocalExecutor
>>> -- to do this we should probably change the default number of
>>> slots/processes from 16 to num cpus.
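
As a rough illustration of the "num cpus" default: the sketch below assumes a hypothetical `default_parallelism` helper (not Airflow's actual config code) that honours an explicitly configured value and otherwise falls back to the CPU count reported by the OS.

```python
import os
from typing import Optional


def default_parallelism(configured: Optional[int] = None) -> int:
    """Return the number of worker slots: explicit setting, else CPU count."""
    if configured is not None:
        if configured < 1:
            raise ValueError("parallelism must be at least 1")
        return configured
    # os.cpu_count() can return None when the count cannot be determined.
    return os.cpu_count() or 1
```

A fixed value such as the current 16 can still be passed explicitly; only the unset case changes.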
>>>
>>> Thoughts?
>>>
>>> None of this has to happen for 2.0 (I don't have time to do it), but
>>> just wanted to suggest it.
>>>
>>> -ash
>>>
>>>
>>>
>>> --
>>>
>>> Jarek Potiuk
>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>
>>> M: +48 660 796 129 <+48660796129>
>>> [image: Polidea] <https://www.polidea.com/>
>>>
>>>
>>
