nicolaferraro commented on issue #1980:
URL: https://github.com/apache/camel-k/issues/1980#issuecomment-773151586
Let's do another iteration on this...
I've been thinking about your comments and I like the idea of also having stuff as
CRs. I remember some brainstorming with @lburgazzoli about how dynamic schemas
may work in this model. The idea was to let Kamelets define their schemas, if
known in advance, but also let KameletBindings redefine them, if needed.
DataFormats are generic in Camel, but when talking about connectors (a.k.a.
Kamelets), I think it's better for the Kamelet to enumerate all the possible
dataformats it supports. E.g. @davsclaus was talking about sources that can
only produce `binary` data (i.e. no dataformat), but there are many other
examples: e.g. a "hello world" string cannot be transformed into FHIR data by
simply plugging in the FHIR JSON dataformat, and not all data is suitable
for CSV encoding.
I also see that we're talking about formats and schemas as if they were the
same thing, but even if they are related (i.e. dataFormat + Kamelet [+ Binding
Properties] may imply a Schema), maybe we can do a better job at treating them
as separate entities.
I think the following model may be good for the in-Kamelet specification of
a "format":
```yaml
kind: Kamelet
apiVersion: camel.apache.org/v1alpha1
metadata:
  name: chuck-source
  # ...
spec:
  definition:
    properties:
      format:
        title: Format
        type: string
        enum:
        - JSON
        - Avro
        default: JSON
  # ...
  formats:
  - name: JSON
    # optional, useful in case of in/out Kamelets
    scope: out
    schema:
      mediaType: "application/json"
      data: # the JSON schema inline
      url: # alternative link to the schema
      ref: # alternative Kubernetes reference to the schema (see below)
        name: # ...
    # the source produces JSON by default, no libs or transformations needed
  - name: Avro
    schema:
      type: avro-schema
      mediaType: "application/avro"
      data: # the avro schema inline
      url: # alternative link to the schema
      ref: # alternative Kubernetes reference to the schema (see below)
        name: # ...
    dataFormat:
      # optional, but if not provided "no format" is assumed
      id: "avro"
      properties: # only if "id" is present
        class-name: org.apache.camel.xxx.MyClass
        compute-schema: true|false
        # ...
    dependencies:
    - camel:jackson
    - camel:avro
    - mvn:org.acme/my-artifact/1.0.0
```
Note the `scope` property, which lets a Kamelet define the specific details
of the transformation separately for the input and the output of a particular
format. I'd not complicate life and assume that users will choose only one
format using the standard `format` property (not an `inputFormat` and an
`outputFormat`). So if I choose `CSV`, the Kamelet will consume and produce
CSV. Still, the shape (schema) of the input CSV can be different from that of
the output CSV (and that's described in the Kamelet).
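To make the in/out case concrete, here is a rough sketch of how an in/out Kamelet might declare the same `CSV` format twice, once per scope. This is only an illustration of the proposal: listing the same `name` twice with different `scope` values, the `text/csv` media type and the `csv` dataformat id are my assumptions, not something already agreed.

```yaml
# Hypothetical fragment of an in/out Kamelet (sketch of the proposal only)
spec:
  definition:
    properties:
      format:
        title: Format
        type: string
        enum:
        - CSV
        default: CSV
  formats:
  # shape of the CSV the Kamelet consumes
  - name: CSV
    scope: in
    schema:
      mediaType: "text/csv"   # assumed media type
      data: # schema of the input CSV, inline
    dataFormat:
      id: "csv"               # assumed dataformat id
  # shape of the CSV the Kamelet produces
  - name: CSV
    scope: out
    schema:
      mediaType: "text/csv"   # assumed media type
      data: # schema of the output CSV, inline
    dataFormat:
      id: "csv"               # assumed dataformat id
```

The user still sets a single `format: CSV` property; the two entries only differ in the schema attached to each direction.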
The `schema` here is declared inline in the Kamelet, to make it
self-contained, but we can also create a `Schema` CR:
```yaml
kind: Schema
apiVersion: camel.apache.org/v1alpha1
metadata:
  name: my-avro-schema
spec:
  type: avro-schema
  mediaType: application/avro
  data: # the avro schema inline
  url: # alternative URL reference
  # no, ref is forbidden here
```
Structure is almost the same as the inline version.
The binding can use the predefined schema:
```yaml
kind: KameletBinding
apiVersion: camel.apache.org/v1alpha1
metadata:
  name: chuck-to-channel
spec:
  source:
    kind: Kamelet
    apiVersion: camel.apache.org/v1alpha1
    name: chuck-source
    properties:
      # may have been omitted, since it's the default
      format: JSON
  sink:
    # ...
```
The binding above will produce objects in JSON format with the inline
definition of the schema. The one below is using a custom schema:
```yaml
kind: KameletBinding
apiVersion: camel.apache.org/v1alpha1
metadata:
  name: chuck-to-channel
spec:
  source:
    kind: Kamelet
    apiVersion: camel.apache.org/v1alpha1
    name: chuck-source
    properties:
      # since there's no inline format named "my-avro", it refers to the
      # external one
      format: Avro
    schema:
      # since it's a source, we assume this is the schema of the output
      ref:
        name: my-avro-schema
      # or alternatively also inline
      data: #...
      url: # ...
  sink:
    # ...
```
This mechanism may be used also in cases where the schema can be computed
dynamically before running the integration. In this case, an external entity
saves the schema in a CR and references it in the KameletBinding.
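As a sketch of that flow, an external tool could write something like the following before the integration starts. The resource name, the record fields and the use of a YAML block scalar for `data` are made up for illustration; the proposal does not fix any of them.

```yaml
# Hypothetical Schema CR created by an external entity that computed the schema
kind: Schema
apiVersion: camel.apache.org/v1alpha1
metadata:
  name: chuck-computed-schema   # illustrative name
spec:
  type: avro-schema
  mediaType: application/avro
  # assumption: the inline schema is carried as a plain string
  data: |
    {
      "type": "record",
      "name": "Joke",
      "fields": [
        {"name": "id", "type": "string"},
        {"name": "value", "type": "string"}
      ]
    }
---
# The binding then only needs to point at the precomputed schema
kind: KameletBinding
apiVersion: camel.apache.org/v1alpha1
metadata:
  name: chuck-to-channel
spec:
  source:
    kind: Kamelet
    apiVersion: camel.apache.org/v1alpha1
    name: chuck-source
    properties:
      format: Avro
    schema:
      ref:
        name: chuck-computed-schema
  sink:
    # ...
```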
Using the Schema CR to sync external entities (like registries) is possible,
but we should think more about it because of edge cases: sometimes the schema
is known only at runtime, and sometimes it varies from message to message. In
those cases, it's the integration itself that needs to update the registries.
It would probably be cleaner if the integration always updated the registry.