Thanks for all the pointers. I have finally gotten around to implementing a POC
based on gRPC and I am super-happy with it so far. It has all the
modern support we need in Airflow and seems performant enough to serve
our case.

J.

On Thu, Feb 17, 2022 at 6:52 PM Kenneth Knowles <[email protected]> wrote:
>
> Another TL;DR that may not be covered in the history is that we initially set 
> out with a couple of goals that have since been abandoned:
>
> 1. Allow Beam to be used in a particular language/ecosystem without a 
> dependency on the portability framework (NO - we want everything to use the 
> portability framework)
> 2. Allow Beam's portable model to be independent of transport (NO - since we 
> use protobuf for the messages, it really only makes sense to use protobuf + 
> gRPC for transport)
> 2a. Potentially allow Beam's portable model to be represented in multiple 
> serialization formats (NO - there are enough impedance mismatches that it is 
> just not worthwhile; even though proto has lots of problems, at least we 
> only have to develop workarounds once)
>
> We never did develop with anything other than proto+gRPC in mind.
>
> Kenn
>
> On Thu, Feb 17, 2022 at 4:55 AM Jarek Potiuk <[email protected]> wrote:
>>
>> Thank you! I will dive deeper, but having just those pointers is a good 
>> start (I likely mixed up gRPC-Thrift bridges with replacing Thrift 
>> Luke!)
>>
>> On Thu, Feb 17, 2022 at 5:28 AM Kenneth Knowles <[email protected]> wrote:
>>>
>>> I can find you that fun mailing list pointer, if you like. Here's a 
>>> starting point with the subject "[DISCUSS] Beam data plane serialization 
>>> tech"
>>>
>>> https://lists.apache.org/thread/dz24chmm18skzgcmxl2jxookd3yn79r1
>>>
>>> Kenn
>>>
>>> On Wed, Feb 16, 2022 at 10:23 AM Luke Cwik <[email protected]> wrote:
>>>>
>>>> Apache Beam never had an RPC layer for the internal workings of the 
>>>> project until the portability project[1] started so there never was a 
>>>> transition from Apache Thrift to gRPC.
>>>>
>>>> Generally, HTTP/2 support and long-lived streaming connections were the 
>>>> key differentiators for gRPC.
>>>>
>>>> 1: https://beam.apache.org/roadmap/portability/
>>>>
>>>> On Wed, Feb 16, 2022 at 2:38 AM Jarek Potiuk <[email protected]> wrote:
>>>>>
>>>>> Hello Beam friends,
>>>>>
>>>>> I have a question. As part of 
>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API 
>>>>> we are preparing to split Airflow into more components, which will 
>>>>> communicate using RPC.
>>>>>
>>>>> Basically, we need to extract some of the internal methods into remote 
>>>>> procedure calls, which we would then like to be able to call either 
>>>>> "really remotely" (over HTTPS) or locally (via local TCP/Unix domain 
>>>>> sockets).
>>>>>
>>>>> I have narrowed down our options to Apache Thrift and gRPC. I 
>>>>> know that Apache Beam was (is?) in a transition period from Thrift to gRPC, 
>>>>> and I am sure you have some experience to share. Having followed your 
>>>>> mailing lists, I am also sure a deep analysis of those two was done 
>>>>> before you decided to switch.
>>>>>
>>>>> Before I start searching through your mailing list, maybe someone knows a 
>>>>> document or some summary of the two that you could share with us - that 
>>>>> probably could save us a lot of effort deciding which of those two might 
>>>>> be better for our needs.
>>>>>
>>>>> Is there something you know of that can easily be shared?
>>>>>
>>>>> J.
>>>>>
