Re: [VOTE] AIP-44 - Airflow Internal API

Abhishek Bhakat Thu, 11 Aug 2022 02:06:36 -0700

 +1 (non-binding) for gRPC only.

On 11-Aug-2022 at 8:42:52 AM, Ash Berlin-Taylor <a...@apache.org> wrote:


> +1 (binding) now with that change. Thank you very much.
>
> I'm also okay with us to proceed with only gRPC for now -- with this
> architectural change it's much easier to replace it in future if we
> want/need to, and for others (such as service providers like
> Google/Amazon/Astronomer) to experiment with custom API implementations.
>
> I'm not able to point at any examples, as the ones I know of personally
> aren't open source; they were all written for companies projects/products.
>
> I'm am sorry that we missed it -- speaking personally I've had _a lot_
> going on at home and I've barely had time to "look around" at what else is
> going on in Airflow over the last few months. I'm only now just having the
> head space to actually think in detail about other things that what is
> right in front of me, hence why it came now.
>
> -ash
>
> On Thu, Aug 11 2022 at 09:27:39 +02:00:00, Jarek Potiuk <ja...@potiuk.com>
> wrote:
>
> I made the updates in
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API
> to reflect the above. As I suspected, there were really very few changes
> needed, to make it "GRPC/JSON OpenAPI" agnostic. Ir does not change the
> "gist" and "purpose" of the AIP and we can easily turn it into
> implementation detail after extended POC is complete.
>
> On Thu, Aug 11, 2022 at 9:08 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> Hey Ash, Andrew,
>>
>> TL;DR; I slept over it to try to understand what just happened, and I
>> have a proposal.
>>
>> * I am happy to extend my POC with the pure JSon/Open API approach -
>> providing that Andrew/Ash point me to some good examples where the RPC
>> style API has been implemented. If you could point me in the right
>> direction, I am a maverick kind of person and just find my ways there.
>> * I propose we conclude the vote (with added reservation in it, that
>> final choice of the transport will be based on the extended POC) as planned.
>> * I am rather curious to try the things that both of you mentioned you
>> are concerned about (threading/multiprocessing/load balancing) and see how
>> they compare in both approaches - before making final decision and doing
>> first, founding PR
>>
>> More comments (personal feeling):
>>
>> I really treat your comments - (now that they finally came), seriously.
>> And value your expert knowledge and experience. And I personally expect the
>> same in return. I always respond to any AIPs/discussions when I am given
>> enough time and opportunity to comment, I do - In due time. This time it
>> did not happen  - not sure why. I personally feel this is a little
>> disrespectful to my work, but I do not hold any grudges. I am happy to put
>> that completely aside - now that I have your attention and can use
>> your experience, the goal is that we do not make any personal complaints or
>> escalations here, we are all here to make Airflow a better product.
>> Community is the most important, respect for each other's work is an
>> important part of it. And I hope this can improve in the future AIP
>> discussions.
>>
>> But going back to the merit of the discussion:
>>
>> Just to let everyone know I am not too "vested" in gRPC . I spend a
>> little time on implementing the POC with it and after doing the POC and
>> reading testimonies of people I feel this is the right choice. But for this
>> AIP. I made deliberate decisions in the way that the transport can be
>> swapped out if we find reasons we should.
>>
>> I am reasonably happy with the way proposed by Ash (In fact I am going to
>> update the AIP with it in a moment). For me the way how we actually
>> implement the "if"  is an implementation detail that will be worked out on
>> the first PR. The way proposed is good for me, though I would rather
>> experiment a bit with decorators and see if we can make it a bit nicer -
>> but  this is not a dealbreaker at all for me. One concern I have is that we
>> will have another abstraction layer (possibly needless) and that we will
>> have to again repeat all the methods signatures and parameters and keep
>> them in sync (they will now be repeated in 4 or 5 places). But again - this
>> is something that can be likely done in the first "founding" PR we are
>> going to iterate on and work out the best balance between duplication and
>> flexibility/maintainability. And we can always update the AIP with this
>> implementation detail later - very much like it happend with a number of
>> AIPs before - including AIP-39, AIP-40, AIP-42 - all of them went through
>> similar rounds of updates and clarifications as implementation was
>> progressing. It's hard to work out all the details "on paper" or even in
>> "POC". Also I would REALLY love to tap in the experience of people like the
>> both of you, but it seems that the only way to get some good and serious
>> feedback is to call a vote (or make a PR with the intention of merging it).
>>
>> But I even can go further than that - I think independently from voting
>> whether the whole AIP-44 is a good or bad idea. I think there is no doubt
>> we need it, and the "general scope" and approach seems to already reach
>> general consensus, so if we can just continue and complete the vote. I am
>> happy to continue running the POC on using OpenAPI spec and gRPC in
>> parallel and see how they compare. I am always eager to learn and try other
>> things, and if you have valid concerns I am happy to address them by trying
>> out. I would personally like to compare both from the development
>> "friction" point of view, performance, as well as doing some tests trying
>> to address and test the operational (process/thread/load balancing)
>> concerns you both have and see how they can be solved in both and compare.
>> I think there is nothing like "show me the code" and performing
>> actual working POC.
>>
>> And I am super happy to continue with the POC and extend it with a pure
>> JSON/OpenAPI based proposal after voting completes and make the final
>> decision during the first founding PR. And we can even arrange extra votes
>> or lazy consensus before the first PR lands - after seeing all the
>> "ins/outs".
>> The Founding PR is still quite a bit away - I do not want to make any
>> commits before we branch-off the 2.4 - and I don't even want to take too
>> much of your time for that other than discussion and raising concerns and
>> commenting on my findings. I am happy to do all the ground-work here.
>>
>> That's all I ask for. Just treating the work I do seriously.
>>
>> So Ash, Andrew
>>
>> Can you please point me to some examples where RPC-API like ours has been
>> implemented with Open API/JSON? I am curious to learn from those and turn
>> them into POC.
>>
>> And I propose - let's just continue the vote as planned. We already have
>> 3 binding votes, and more +1s than -1s and the time has already passed, but
>> I am happy to let it run till the end of day tomorrow to see if my proposal
>> above will be good to conclude the vote with more consent from all the
>> people involved.
>>
>> J.
>>
>>
>>
>>
>> On Thu, Aug 11, 2022 at 7:49 AM Eugen Kosteev <eu...@kosteev.com> wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Thu, Aug 11, 2022 at 12:08 AM Ash Berlin-Taylor <a...@apache.org>
>>> wrote:
>>>
>>>> So my concerns (in addition to the ones Andrew already pointed out)
>>>> with gRPC: it uses threads! Threads + python always makes me nervous.
>>>> Doubly so when you then couple that with DB access via sqlalchemy - we're
>>>> heading down paths less well travelled if we do this.
>>>>
>>>> From https://github.com/grpc/grpc/blob/master/doc/fork_support.md
>>>>
>>>> > gRPC Python wraps gRPC core, which uses multithreading for
>>>> performance, and hence doesn't support fork(). Historically, we didn't
>>>> support forking in gRPC, but some users seemed to be doing fine until their
>>>> code started to break on version 1.6. This was likely caused by the
>>>> addition of background c-threads and a background Python thread
>>>>
>>>> And there's https://github.com/grpc/grpc/issues/16001 which may or may
>>>> not be a problem for us, I'm unclear:
>>>>
>>>> > The main constraint that must be satisified is that your process must
>>>> only invoke the fork() syscall *before* you create your gRPC server(s).
>>>>
>>>>
>>>> But anyway, the only change I'd like to see is to make the internal API
>>>> more pluggable.
>>>>
>>>> So instead of something like this:
>>>>
>>>>     def process_file(
>>>>         self,
>>>>         file_path: str,
>>>>         callback_requests: List[CallbackRequest],
>>>>         pickle_dags: bool = False,
>>>>     ) -> Tuple[int, int]:
>>>>         if self.use_grpc:
>>>>             return self.process_file_grpc(
>>>>                 file_path=file_path,
>>>> callback_requests=callback_requests, pickle_dags=pickle_dags
>>>>             )
>>>>         return self.process_file_db(
>>>>             file_path=file_path, callback_requests=callback_requests,
>>>> pickle_dags=pickle_dags
>>>>         )
>>>>
>>>> I'd very much like us to have it be something like this:
>>>>
>>>>     def process_file(
>>>>         self,
>>>>         file_path: str,
>>>>         callback_requests: List[CallbackRequest],
>>>>         pickle_dags: bool = False,
>>>>     ) -> Tuple[int, int]:
>>>>         if settings.DATABASE_ACCESS_ISOLATION:
>>>>             return InternalAPIClient.dagbag_process_file(
>>>>                 file_path=file_path,
>>>> callback_requests=callback_requests, pickle_dags=pickle_dags
>>>>             )
>>>>         return self.process_file_db(
>>>>             file_path=file_path, callback_requests=callback_requests,
>>>> pickle_dags=pickle_dags
>>>>         )
>>>>
>>>> i.e. all API access is "marshalled" (perhaps the wrong word) via a
>>>> single pluggable (and eventually configurable/replaceable) API client.
>>>> Additionally (though less important) `self.use_grpc` is changed to a
>>>> property on the `airflow.settings` module (as it is a global setting, not
>>>> an attribute/property of any single instance.)
>>>>
>>>> (In my mind the API client would include the
>>>> from_protobuff/to_protobuff methods that you added to TaskInstance on your
>>>> POC PR. Or the from_protbuff could live in the FileProcessorServiceServicer
>>>> class in your example, but that probably doesn't scale when multiple
>>>> services would take a TI/SimpleTI. I guess they could still also live on
>>>> TaskInstance et al and just not be used - but that doesn't feel as clean to
>>>> me is all)
>>>>
>>>> Thoughts? It's not too big change to encapsulate things like this I
>>>> hope?
>>>>
>>>> Sorry again that we didn't look at the recent work on this AIP sooner.
>>>>
>>>> -ash
>>>>
>>>>
>>>>
>>>> On Wed, Aug 10 2022 at 13:51:02 -06:00:00, Andrew Godwin
>>>> <andrew.god...@astronomer.io.INVALID> wrote:
>>>>
>>>> I also wish we'd had this discussion before!
>>>>
>>>> I've converted several million lines of code into API-driven
>>>> services over the past decade so I have my ways and I'm set in them, I
>>>> guess :)
>>>>
>>>> Notice that I didn't say "use REST" - I don't think REST maps well to
>>>> RPC style codebases, and agree the whole method thing is confusing. I just
>>>> mean "ship JSON in a POST and receive JSON in the response".
>>>>
>>>> As you said before though, the line where you draw the abstraction is
>>>> what matters more than the transport layer, and "fat endpoints" (doing
>>>> transactions and multiple calls on the API server etc.) is, I agree, the
>>>> way this has to go, so it's not like this is something I think is totally
>>>> wrong, just that I've repeatedly tried gRPC on projects like this and been
>>>> disappointed with what it actually takes to ship changes and
>>>> prototype things quickly. Nothing beats the ability to just throw curl at a
>>>> problem - or maybe that's just me.
>>>>
>>>> Anyway, I'll leave you to it - I have my own ideas in this area I'll be
>>>> noodling on, but it's a bit of a different take and aimed more at execution
>>>> as a whole, so I'll come back and discuss them if they're successful.
>>>>
>>>> Andrew
>>>>
>>>> On Wed, Aug 10, 2022 at 1:36 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>
>>>>> 1st of all - I wish we had this discussion before :)
>>>>>
>>>>> > The autogenerated code also is available from OpenAPI specs, of
>>>>> course, and the request/response semantics in the same thread are 
>>>>> precisely
>>>>> what make load balancing these calls a little harder, and horizontal
>>>>> scaling to multiple threads comes with every HTTP server, but I digress -
>>>>> clearly you've made a choice here to write an RPC layer rather than a
>>>>> lightweight API layer, and go with the more RPC-style features, and I get
>>>>> that.
>>>>>
>>>>> OpenAPI is REST not RPC and it does not map well to RPC style calls. I
>>>>> tried to explain in detail in the AIP (and in earlier discussions). And 
>>>>> the
>>>>> choice is basically made for us because of our expectation to have
>>>>> minimal impact on the existing code (and ability to switch off the remote
>>>>> part). We really do not want to introduce new API calls.  We want to make
>>>>> sure that existing "DB transactions" (i.e. coarse grained calls) are
>>>>> remotely executed. So we are not really talking about lightweight API
>>>>> almost by definition. Indeed, Open API also maps a definition described in
>>>>> a declarative way to python code.  but it has this non-nice part that
>>>>> OpenAPI/REST, it is supposed to be run on some resources. We have no
>>>>> resources to run it on - every single method we call is doing something.
>>>>> Even from the REST/OpenAPI semantics I'd have a really hard time to decide
>>>>> whether it should be GET, POST or PUT or DELETE. In most cases this will 
>>>>> be
>>>>> a combination of those 4 on several different resources. All the "nice
>>>>> parts" of Open API (Swagger UI etc.) become next to useless if you try to
>>>>> map such remote procedures we have, to REST calls.
>>>>>
>>>>>
>>>>> > I still think it's choosing something that's more complex to
>>>>> maintain over a simpler, more accessible option, but since I won't be the
>>>>> one writing it, that's not really for me to worry about. I am very curious
>>>>> to see how this evolves as the work towards multitenancy really kicks in
>>>>> and all the API schemas need to change to add namespacing!
>>>>>
>>>>> The gRPC (proto) is designed from ground-up with maintainability in
>>>>> mind. The ability to evolve the API, add new parameters etc. is built-in
>>>>> the protobuf definition. From the user perspective it's actually easier to
>>>>> use than OpenAPI when it comes to remote method calls, because you
>>>>> basically - call a method.
>>>>>
>>>>> And also coming back to monitoring - literally every monitoring
>>>>> platform supports gRPC to monitor. Grafana, NewRelic, CloudWatch, Datadog,
>>>>> you name it. In our case.
>>>>>
>>>>> Also load-balancing:
>>>>>
>>>>> Regardless of the choice we talk about HTTP request/response happening
>>>>> for 1 call. This is the boundary. Each of the calls we have will be a
>>>>> separate transaction, separate call, not connected to any other call. The
>>>>> server handling it will be stateless (state will be stored in a DB when
>>>>> each call completes). I deliberately put the "boundary" of each of
>>>>> the remotely callable methods, to be a complete DB transaction to achieve
>>>>> it.
>>>>>
>>>>> So It really does not change whether we use gRPC or REST/JSON.
>>>>> REST/JSON vs. gRPC is just the content of the message, but this is the 
>>>>> very
>>>>> same HTTP call, with same authentication added on top, same headers - just
>>>>> how the message is encoded is different. The same tools for load balancing
>>>>> works in the same way regardless if we use gRPC or REST/JSON. This is
>>>>> really a higher layer than the one involved in load balancing.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Aug 10, 2022 at 9:10 PM Andrew Godwin
>>>>> <andrew.god...@astronomer.io.invalid> wrote:
>>>>>
>>>>>> Well, binary representation serialization performance is worse for
>>>>>> protobuf than for JSON in my experience (in Python specifically) - unless
>>>>>> you mean size-on-the-wire, which I can agree with but tend to not worry
>>>>>> about very much since it's rarely a bottleneck.
>>>>>>
>>>>>> The autogenerated code also is available from OpenAPI specs, of
>>>>>> course, and the request/response semantics in the same thread are 
>>>>>> precisely
>>>>>> what make load balancing these calls a little harder, and horizontal
>>>>>> scaling to multiple threads comes with every HTTP server, but I digress -
>>>>>> clearly you've made a choice here to write an RPC layer rather than a
>>>>>> lightweight API layer, and go with the more RPC-style features, and I get
>>>>>> that.
>>>>>>
>>>>>> I still think it's choosing something that's more complex to maintain
>>>>>> over a simpler, more accessible option, but since I won't be the one
>>>>>> writing it, that's not really for me to worry about. I am very curious to
>>>>>> see how this evolves as the work towards multitenancy really kicks in and
>>>>>> all the API schemas need to change to add namespacing!
>>>>>>
>>>>>> Andrew
>>>>>>
>>>>>> On Wed, Aug 10, 2022 at 12:52 PM Jarek Potiuk <ja...@potiuk.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Sure I can explain - the main reasons are:
>>>>>>>
>>>>>>> - 1) binary representation performance - impact of this is
>>>>>>> rather limited because our API calls are doing rather heavy processing
>>>>>>> compared to the data being transmitted. But I believe it's not 
>>>>>>> negligible.
>>>>>>> - 2) automated tools to automatically generate strongly typed Python
>>>>>>> code (that's the ProtoBuf part). The strongly typed Python code is what
>>>>>>> convinced me (see my POC). The tooling we got for that is excellent. Far
>>>>>>> more superior than dealing with json-encoded data even with schema.
>>>>>>> - 2) built-in "remote procedure" layer - where we have
>>>>>>> request/response semantics optimisations (for multiple calls over the 
>>>>>>> same
>>>>>>> chanel) and exception handling done for us (This is basically what 
>>>>>>> "Remote
>>>>>>> Procedure" interface provide us)
>>>>>>> - 3) built-in server that efficiently distributes the method called
>>>>>>> from multiple client into a multi-threaded/multi-threaded execution (all
>>>>>>> individual calls are stateless so multi-processing can be added on top
>>>>>>> regardless from the "transport" chosen).
>>>>>>>
>>>>>>> BTW. If you look at my POC code, there is nothing that strongly
>>>>>>> "binds" us to gRPC. The nice thing is that once it is implemented, it 
>>>>>>> can
>>>>>>> be swapped out very easily. The only Proto/gRPC code that "leaks" to
>>>>>>> "generic" Airflow code is mapping of some (most complex) parameters to
>>>>>>> Proto . And this is only for most complex cases - literally only few of 
>>>>>>> our
>>>>>>> types require custom serialisation - most of the mapping is handled
>>>>>>> automatically by generated protobuf code. And we can easily put
>>>>>>> the "complex" mapping in a separate package. Plus there is an "if"
>>>>>>> statement for each of the ~ 30 or so methods that we will have to turn 
>>>>>>> into
>>>>>>> remotely-callable. We can even (as I proposed it as an option) add a 
>>>>>>> little
>>>>>>> python magic and add a simple decorator to handle the "ifs". Then the
>>>>>>> decorator "if" can be swapped with some other "remote call" 
>>>>>>> implementation.
>>>>>>>
>>>>>>> The actual bulk of the implementation is to make sure that all the
>>>>>>> places are covered (that's the testing harness).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 10, 2022 at 8:25 PM Andrew Godwin
>>>>>>> <andrew.god...@astronomer.io.invalid> wrote:
>>>>>>>
>>>>>>>> Hi Jarek,
>>>>>>>>
>>>>>>>> Apologies - I was as not involved as I wanted to be in the AIP-44
>>>>>>>> process, and obviously my vote is non-binding anyway - but having done 
>>>>>>>> a
>>>>>>>> lot of Python API development over the past decade or so I wanted to 
>>>>>>>> know
>>>>>>>> why the decision was made to go with gRPC over just plain HTTP+JSON 
>>>>>>>> (with a
>>>>>>>> schema, of course).
>>>>>>>>
>>>>>>>> The AIP covers why XMLRPC and Thrift lost out to gRPC, which I
>>>>>>>> agree with - but does not go into the option of using a standard Python
>>>>>>>> HTTP server with JSON schema enforcement, such as FastAPI. In my
>>>>>>>> experience, the tools for load balancing, debugging, testing and 
>>>>>>>> monitoring
>>>>>>>> JSON/HTTP are superior and more numerous than those for gRPC, and in
>>>>>>>> addition the asynchronous support for gRPC servers is lacking compared 
>>>>>>>> to
>>>>>>>> their plain HTTP counterparts, and the fact that you can interact and 
>>>>>>>> play
>>>>>>>> with the APIs in prototyping stages without having to handle obtaining
>>>>>>>> correct protobuf versions for the Airflow version you're using.
>>>>>>>>
>>>>>>>> I wouldn't go so far as to suggest a veto, but I do want to see the
>>>>>>>> AIP address why gRPC would win over this option. Apologies again for 
>>>>>>>> the
>>>>>>>> late realisation that gRPC got chosen and was being voted on - it's 
>>>>>>>> been a
>>>>>>>> very busy summer.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Andrew
>>>>>>>>
>>>>>>>> On Wed, Aug 10, 2022 at 12:12 PM Jarek Potiuk <ja...@potiuk.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Just let me express my rather strong dissatisfaction with the way
>>>>>>>>> this "last minute" is raised.
>>>>>>>>>
>>>>>>>>> It is very late to come up with such a statement - not that it
>>>>>>>>> comes at all, but when it comes when everyone had a chance to take a 
>>>>>>>>> look
>>>>>>>>> and comment, including taking a look at the POC and result of checks. 
>>>>>>>>> This
>>>>>>>>> has never been raised even 4 months ago where the only choices were 
>>>>>>>>> Thrift
>>>>>>>>> and gRPc).
>>>>>>>>>
>>>>>>>>> I REALLY hope the arguments are very strong and backed by real
>>>>>>>>> examples and data why it is a bad choice rather than opinions.
>>>>>>>>>
>>>>>>>>> J,.
>>>>>>>>>
>>>>>>>>> On Wed, Aug 10, 2022 at 7:50 PM Ash Berlin-Taylor <a...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Sorry to weigh in at the last minute, but I'm wary of gRPC over
>>>>>>>>>> just JSON, so -1 to that specific choice. Everything else I'm happy 
>>>>>>>>>> with.
>>>>>>>>>>
>>>>>>>>>> I (or Andrew G) will follow up with more details shortly.
>>>>>>>>>>
>>>>>>>>>> -ash
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 10 2022 at 19:38:59 +02:00:00, Jarek Potiuk <
>>>>>>>>>> ja...@potiuk.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Oh yeah :)
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 10, 2022 at 7:23 PM Ping Zhang <pin...@umich.edu>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> ah, good call.
>>>>>>>>>>>
>>>>>>>>>>> I guess the email template can be updated:
>>>>>>>>>>>
>>>>>>>>>>> Only votes from PMC members are binding, but members of the
>>>>>>>>>>>> community are encouraged to check the AIP and vote with 
>>>>>>>>>>>> "(non-binding)".
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Ping
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 10, 2022 at 10:20 AM Jarek Potiuk <ja...@potiuk.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thank you . And BTW. It's binding Ping :). For AIP's commiter's
>>>>>>>>>>>> votes are binding. See
>>>>>>>>>>>> https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#commit-policy
>>>>>>>>>>>> :D
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Aug 10, 2022 at 7:16 PM Ping Zhang <pin...@umich.edu>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ping
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 4, 2022 at 1:42 AM Jarek Potiuk <ja...@potiuk.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to cast a vote for "AIP-44 - Airflow Internal
>>>>>>>>>>>>>> API".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The AIP-44 is here:
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Discussion thread:
>>>>>>>>>>>>>> https://lists.apache.org/thread/nsmo339m618kjzsdkwq83z8omrt08zh3
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The voting will last for 5 days (until 9th of August 2022
>>>>>>>>>>>>>> 11:00 CEST), and until at least 3 binding votes have been
>>>>>>>>>>>>>> cast.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please vote accordingly:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [ ] + 1 approve
>>>>>>>>>>>>>> [ ] + 0 no opinion
>>>>>>>>>>>>>> [ ] - 1 disapprove with the reason
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Only votes from PMC members are binding, but members of the
>>>>>>>>>>>>>> community are encouraged to check the AIP and vote with
>>>>>>>>>>>>>> "(non-binding)".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ----
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just a summary of where we are:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's been long in the making, but I think it might be a great
>>>>>>>>>>>>>> step-forward to our long-term multi-tenancy goal. I believe the 
>>>>>>>>>>>>>> proposal I
>>>>>>>>>>>>>> have is quite well thought out and discussed a lot in the past:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * we have a working POC for implementation used for
>>>>>>>>>>>>>> performance testing:
>>>>>>>>>>>>>> https://github.com/apache/airflow/pull/25094
>>>>>>>>>>>>>> * it is based on on industry-standard open-source gRPC (which
>>>>>>>>>>>>>> is already our dependency) which fits better the RPC "model" we 
>>>>>>>>>>>>>> need than
>>>>>>>>>>>>>> our public REST API.
>>>>>>>>>>>>>> * it has moderate performance impact and rather good
>>>>>>>>>>>>>> maintainability features (very little impact on regular 
>>>>>>>>>>>>>> development effort)
>>>>>>>>>>>>>> * it is fully backwards compatible - the new DB isolation
>>>>>>>>>>>>>> will be an optional feature
>>>>>>>>>>>>>> * has a solid plan for full test coverage in our CI
>>>>>>>>>>>>>> * has a backing and plans for more extensive complete testing
>>>>>>>>>>>>>> in "real" environment with Google Composer team support
>>>>>>>>>>>>>> * allows for further extensions as part of AIP-1 (I am
>>>>>>>>>>>>>> planning to re-establish sig-multitenancy effort for follow up 
>>>>>>>>>>>>>> AIPs once
>>>>>>>>>>>>>> this one is well in progress).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> J.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>
>>> --
>>> Eugene
>>>
>>

Re: [VOTE] AIP-44 - Airflow Internal API

Reply via email to