> vs
>
>         conf:
>           type: object
>           description: >
>             JSON object describing additional configuration parameters.


I looked at the swagger/OpenAPI definition and this one will indeed validate
that we pass arbitrary object with properties/values.

It appeared in OpenAPI 3.0 (so very late) and it's called a FreeFormObject
(https://swagger.io/docs/specification/data-models/dictionaries/)
but it has a known problem that both Kamil and I describe.

The automated client generation does not work if this is a "true"
FreeForm object
without type specification. One of the benefits of using OpenAPI is that you can
generate client implementation in many languages.

>From what I understand - that single choice prevents our customers to
use auto-generation in many languages
= reasons for that explained by Kamil). Here are few examples:

* Typescript: 
https://www.gitmemory.com/issue/OpenAPITools/openapi-generator/4119/540506783
* Java: https://github.com/OpenAPITools/openapi-generator/issues/2037
* REST client for JIRA (not sure which language) :
https://community.developer.atlassian.com/t/jira-rest-client-generation-fails-swagger-open-api-generator/36398
* GO: 
https://medium.com/real-apis/free-form-objects-with-array-values-72e85c044e4a

The problem is that the right code cannot be generated for those
languages for freeForm
unless you specify "additionalProperties" field which will tell at
least what types the keys and values
have. Which we cannot do in our case.

Unless someone proves that we can generate automatically, usable
clients in those
languages automatically this one will be vetoed from my side.

I think the ease-of-use and the Open API specification we have is to
make it possible
for the users to use the API easily. but in this case, by one line
decision we are basically
forcing our users to hand-write their whole code
instead of generating it all because of one "FreeForm" object in the API.

I don't think there is a lot of empathy to our users if we make this decision.
To be honest I see it as a bit selfish decision (if what I understand
is correct).

> The first one means technically, acording to the spec you can submit `{
> "conf": "abc" }` -- but that will fail. Where as the second means that
> both the OpenAPI client, and the automatic validation from Connexion on
> the server will handle that for us.

Not fully really because in at least few languages the OpenAPI client
- and validation -
will have to be manually written anyway.

> (With the string case, we would have to JSON deserialize it and check
> it's an JSON object/ python dict. Nothing else is allowed.)

Yeah. Now that we know that it's not allowed we know it needs to be dictionary.
Indeed - we will have to validate that ourselves. There is no
performance loss though
more security  (open API has to do it anyway while parsing for "object
type" and does not
have to do it for string)

As I've written before - the only real validation of the data is done by the
DAG reading the values. Nothing prevents the user from sending { "a":
{"b" : [] } },
when dag expects {"a": "b"}. We cannot *really* validate the data with Open API.

And I think single `isinstance(json.loads(conf),dict)` is far, far, far
much easier to write than hand-coding your client class instead of
generating it.

As I see it now in this case, the real trade-off we are dealing with here is:

* "should we add this line of code to validate" vs.
*  "should we make our users suffer and spend extra xx days to write their
   client libraries instead of generating them".

> To use your example DagRun class, the signature could become:
>
>   public DagRun(String dagRunId, String executionDate, Object conf);
>
> and then the client can do `gson.toJsonTree(conf).getAsJsonObject()` 
> internally.

Actually - It does not work that easily with the dynamic Objects and
structure of the objects.
If you want to pass conf as the conf object to the class and get it working
as you described, you will have to define a separate class for EVERY
TYPE of conf object
that you pass to different DAG and know which of the Type object
should be passed
to the DagRun constructor.

This basically means that if you have two DAGs with two different set
of config parameters you need to define
different Class for each DAG and know which class shoudl be passed to
the constructor
Basically this means that the users will have to develop, recompile
and release Java client
every time they add a new DAG that might require some configuration.

With the string solution the natural thing will be that conf is
hand"coded" and generated dynamically rather
then recompiled as a new Java class.

And you have not shown how you would convert the "client" case I
mentioned where you specify the conf object
via command line as string. How would you decide which class to use?
Based on what? How do you know
which DAG is which conf class? I think this is a really difficult one
to pull by the client if they use
statically typed language like Java.

> JSON-encoded-string in a JSON api is a "code smell" to me.

I think we are giving our static language client users much more than
a "smell" here.

Reply via email to