That wasn't actually quite what I had in mind :)
I was thinking that we _wouldn't_ go cross-process at all; in the
"local"/direct mode we would call the handler code as directly as
possible. So for local/no-isolation we would still use the same
handler as the RPC does, it's just not "remote" there.
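
To make the distinction concrete, here is a minimal sketch of the idea,
not actual Airflow code. All names (HANDLERS, get_task_state,
LocalClient, RemoteClient, fake_transport) are hypothetical, and the
"transport" is only a JSON round-trip standing in for a real RPC layer.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry of "coarse" internal API handlers.
HANDLERS: Dict[str, Callable[..., Any]] = {}


def handler(name: str):
    """Register a function as the handler for an internal API method."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@handler("get_task_state")
def get_task_state(dag_id: str, task_id: str) -> str:
    # Stand-in for code that would really query the database.
    return f"{dag_id}.{task_id}:success"


class LocalClient:
    """No-isolation mode: call the handler in-process, no socket involved."""

    def call(self, method: str, **kwargs: Any) -> Any:
        return HANDLERS[method](**kwargs)


class RemoteClient:
    """DB isolation mode: same method names, sent through a transport."""

    def __init__(self, transport: Callable[[str, dict], Any]):
        self._transport = transport

    def call(self, method: str, **kwargs: Any) -> Any:
        return self._transport(method, kwargs)


def fake_transport(method: str, kwargs: dict) -> Any:
    # Assumption: placeholder for gRPC/Thrift. It only serializes the
    # request to JSON and back, then runs the very same handler.
    payload = json.loads(json.dumps({"method": method, "kwargs": kwargs}))
    return HANDLERS[payload["method"]](**payload["kwargs"])
```

Both clients end up in the same handler; only the path to it differs:
LocalClient().call("get_task_state", dag_id="d", task_id="t") and
RemoteClient(fake_transport).call(...) with the same arguments return
the same value.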
-ash
On Wed, Feb 16 2022 at 13:01:11 +0100, Jarek Potiuk <[email protected]>
wrote:
Hey Everyone,
Based on the feedback, I updated AIP-44
<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API>
- the "implementation notes", with an improved approach.
Ash had a good suggestion (which I really like) that instead of
inventing our own decorators and different way of handling the
internal and external communication for the "coarse" functions that
require the database, we could approach it differently - namely we
could always use RPC - no matter if we are in DB isolation mode or
"no isolation" mode. Of course in case of the "no isolation" mode,
the communication should have very low overhead (local TCP or
Sockets, no authorization). I looked at existing RPC implementations
we could use for that and narrowed the choice of technologies down to
gRPC and Apache Thrift.
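
As a rough illustration of "always RPC, but over a cheap local socket
with no authorization", here is a small sketch. It deliberately does
NOT use gRPC or Thrift: it swaps in newline-delimited JSON over
socket.socketpair() from the Python standard library, purely to show
the shape of the call. The function names (serve_one, call, demo) and
the "add" method are made up for the example.

```python
import json
import socket
import threading


def serve_one(conn, handlers):
    """Read one JSON request line, dispatch it, and write the JSON reply."""
    with conn, conn.makefile("rwb") as stream:
        request = json.loads(stream.readline())
        result = handlers[request["method"]](**request["params"])
        stream.write(json.dumps({"result": result}).encode() + b"\n")
        stream.flush()


def call(conn, method, **params):
    """Send one request over the local socket and wait for the result."""
    with conn.makefile("rwb") as stream:
        request = {"method": method, "params": params}
        stream.write(json.dumps(request).encode() + b"\n")
        stream.flush()
        return json.loads(stream.readline())["result"]


def demo():
    # Hypothetical handler table; a real one would wrap DB-access code.
    handlers = {"add": lambda a, b: a + b}
    client_sock, server_sock = socket.socketpair()
    server = threading.Thread(target=serve_one, args=(server_sock, handlers))
    server.start()
    try:
        return call(client_sock, "add", a=2, b=3)
    finally:
        client_sock.close()
        server.join()
```

A real implementation would get the serialization, framing and error
handling from the chosen RPC library instead of hand-rolling them as
above.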
This approach has multiple advantages:
* we can leverage existing RPC implementations (Thrift and gRPC are
both mature and have integration with HTTPS, various authentication
options and can be also run using local sockets)
* the code will be much simpler to maintain - we will use existing
serialization mechanisms from those protocols
* no custom code for communication needed - both Thrift and gRPC have
all that is needed for scalable, robust communication
I think this way we will be able to implement a more robust and
maintainable solution much faster.
I also reached out to Apache Beam (they have support for both gRPC
and Thrift and are in the process of transitioning from Thrift to
gRPC as the primary protocol), and I am sure they have done a lot of
analysis that can help us make the final decision.
This approach changes only the implementation details of AIP-44;
everything else, including the overall approach and the deployment
options, remains untouched by this change.
If you have any comments on that, feel free to share them. I will also
discuss it today at the meeting, and if there is general consensus
that the direction is right I would love to start voting on AIP-44,
ideally tomorrow, so that next week we can start implementing it. I am
not sure if we want to make a final decision about gRPC/Thrift yet
(maybe there are people who have good experience with both and can
share it here?).
I think a more detailed POC and benchmarking might be the first step
of the AIP, where we make the final choice based on an attempt to
implement a POC for both, but I am also happy to listen to those who
have more experience with both (and maybe Beam's experience will help
with that).
J.
On Tue, Feb 15, 2022 at 1:49 PM Jarek Potiuk <[email protected]> wrote:
The meeting is tomorrow :) Feel free to join. I will also record it
and publish the minutes!
On Tue, Feb 15, 2022 at 12:31 PM Giorgio Zoppi <[email protected]> wrote:
>
> Hello Everyone,
> is there any follow-up to this meeting? I would like to
> participate if it's possible.
> Best Regards,
> Giorgio
>
> On Tue, Feb 1, 2022 at 15:29 Jarek Potiuk <[email protected]> wrote:
>>
>> Hello Everyone,
>>
>> I think it's about time for the next sig-multitenancy meeting.
>>
>> I created a doodle poll for next week - please mark your
>> availability by Friday the 4th.
>>
>> <https://doodle.com/poll/axvu2gz7zhv8ieye?utm_source=poll&utm_medium=link>
>>
>> I think the rough agenda will be:
>>
>> * AIP-43 Dag Processor Separation [1] - implementation progress - Mateusz
>> * AIP-44 Airflow Internal API [2] - voting progress (hopefully) - Jarek
>> * AIP-45 Remove double DAG parsing [3] - discussion - Ping
>> * AIP-46 Docker runtime isolation [4] - discussion - Ping
>> * Also there are some ideas (not yet in AIP form) around
>>   optimizing DagProcessorLoop that might be good to talk about - also Ping.
>>
>> If there are any more proposals - feel free to ping me.
>> I also encourage everyone to comment on the AIP-45/46 proposals
>> from Ping before the meeting.
>>
>> [1] <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-43+DAG+Processor+separation>
>> [2] <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API>
>> [3] <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-45+Remove+double+dag+parsing+in+airflow+run>
>> [4] <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-46+Add+support+for+docker+runtime+isolation+for+airflow+tasks+and+dag+parsing>
>>
>> J.
>>
>>
>
>
> --
> Life is a chess game - Anonymous.