Hi,

While putting together a doc on the schema for the Kafka source connector,
I noticed that the existing schema is something like this:

{
  "type": "record",
  "name": "source-connector",
  "namespace": "com.apache.plc4x.kafka.config",
  "fields": [
    {
      "name": "running",
      "type": "int16"
    },
    {
      "name": "conveyorLeft",
      "type": "int16"
    },
    ....
  ]
}
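
For illustration, a message conforming to this schema carries one strongly
typed field per tag, along these lines (values made up):

{
  "running": 1,
  "conveyorLeft": 0
}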

With this schema, the record definition has to be modified whenever the set
of tags being collected changes.

If we change it so that the tags are included in an array of name/value
pairs, then we wouldn't have to modify the schema when tags are
added/deleted. The new schema would be something like this:

{
  "type": "record",
  "name": "source-connector",
  "namespace": "com.apache.plc4x.kafka.config",
  "fields": [
    {
      "name": "tag",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "tag",
          "fields": [
            {"name": "name", "type": "string"},
            {"name": "value", "type": "string"}
          ]
        }
      }
    },
    {
      "name": "timestamp",
      "type": "string"
    }
  ]
}

The "value" field would eventually become a union of different types.
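For example, the "value" field could be declared along these lines (the
exact branches would depend on which PLC4X data types we need to cover):

{"name": "value", "type": ["string", "boolean", "int", "long", "float", "double"]}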

What are your thoughts on changing this? It would save us from having to
send the schema within each packet.

It does increase the packet size in the specific case where the tags never
change and the schema isn't being included in each packet, but I think such
cases would be few and far between.
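
To make the trade-off concrete, a message under the proposed schema would
repeat the tag names in every packet, along these lines (values made up):

{
  "tag": [
    {"name": "running", "value": "1"},
    {"name": "conveyorLeft", "value": "0"}
  ],
  "timestamp": "2021-01-01T00:00:00Z"
}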

Ben
