Hey Giorgio,

We use JSONSchema for validation in our JSONSerializer when we need it. We can do the same in this case. But we can also choose not to do it, depending on the actual implementation and on testing how much it costs. This is typical practice when you have full control of both sides and you can run a comprehensive test suite (which we will).
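To make the "validate or skip it" choice concrete, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: the `method`/`params` payload shape, the `get_dag_run` method name, and the hand-rolled `check_payload`, which merely stands in for a real JSON Schema validator. It is not the actual JSONSerializer code.

```python
import json

# Hypothetical shape of one remoted call; field names are illustrative only.
CALL_SCHEMA = {"method": str, "params": dict}


def check_payload(payload, schema=CALL_SCHEMA):
    """Hand-rolled stand-in for a JSON Schema validator: required keys and types."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    for key, expected_type in schema.items():
        if key not in payload:
            raise ValueError(f"missing field: {key}")
        if not isinstance(payload[key], expected_type):
            raise ValueError(f"wrong type for field: {key}")


def decode_call(raw, validate=True):
    """Decode a JSON-encoded call, validating only when asked to."""
    payload = json.loads(raw)
    if validate:
        check_payload(payload)
    return payload["method"], payload["params"]


# Hypothetical call: the method name and parameters are made up for the example.
raw = json.dumps({"method": "get_dag_run", "params": {"dag_id": "example", "run_id": "r1"}})
method, params = decode_call(raw)
```

With `validate=False` the check is skipped entirely, which is the option we keep open when both sides are under our control and covered by tests.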
1) The method inventory is in the AIP docs: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API. It might be slightly outdated, as Airflow is constantly being developed and we deliberately have not put details in the docs, but you can find all the methods in the code and check which parameters they take and what they return. The model is basically the parameters and return values of each method that we are "remoting".

2) Those are already provided. As you can see from the initial communication, both POCs, mine (https://github.com/apache/airflow/pull/25094) and Mateusz's (you can find it at the beginning of the thread), contain the code that we used for testing. If you want to experiment with those, feel free.

> P.S when we'll feel the need of speed, PyO3 + Rust is the way to go or also
> without going native, asyncio+uvloop.

Absolutely. If you need the best speed, those would be my favourites too. PyO3 + Rust is precisely what Pydantic v2 uses (see the plan that Samuel came up with: https://pydantic-docs.helpmanual.io/blog/pydantic-v2/). Unfortunately Pydantic v2 is still months away (likely more than a few). Maybe one day we will switch, once we no longer have to fight with its teething problems.

We are in a slightly different situation than most "public" APIs out there. All the methods that we are going to remote make a (usually remote) relational database query: converting Python ORM objects to SQL (usually a pretty heavy query at that), executing it in the DB and going back. Our tests confirm that this is where the time goes, and since then optimising the serialization part is a non-goal for us (or rather has much lower priority than familiarity with other parts of the codebase). We do not want to micro-optimise the part of the process that can give us only a low, single-digit percentage improvement. And we can always do it in the future if we get to the point where this is our bottleneck, and it will be easy to switch if we decide to.

J.

On Tue, Nov 8, 2022 at 6:13 PM Giorgio Zoppi <[email protected]> wrote:
>
> Makes sense.
> It's ok exchanging a json, but it's also important to provide a schema for
> input validation in those cases.
> Yes, you'll have to maintain the schema, but safer is better than sorry. Two
> questions:
> 1. which is the model that you want to serialize? I don't see a clear
> separation of concern between rpc rest call.
> 2. And also can you provide the tests for minimal experimentation?
> Best Regards,
> Giorgio
> P.S when we'll feel the need of speed, PyO3 + Rust is the way to go or also
> without going native, asyncio+uvloop.
> in my recent REST service tests, the latency became comparable to Go.
>
