Hi all,
I would like to bring up a topic that I want to address in the next few
days.
We currently have multiple scraper-based integration modules. All of these need
configuration.
However, the configuration strategies differ slightly, and I think it would be
good to unify them.
Even though I know there will be slight differences for every integration, I
think the general concept should be the same.
The Logstash integration follows a strategy which is directly linked to the
scraper's internal structure:
## logstash pipeline config - input
input {
  ## use plc4x plugin (logstash-input-plc4x)
  plc4x {
    ## define sources (opc-ua examples)
    sources => {
      source1 => "opcua:tcp://opcua-server:4840/"
      source2 => "opcua:tcp://opcua-server1:4840/"
    }
    ## define jobs
    jobs => {
      job1 => {
        # pull rate in milliseconds
        rate => 1000
        # sources queried by job1
        sources => ["source1"]
        # defined queries [logstash_internal_fieldname => "IIoT query"]
        queries => {
          PreStage => "ns=2;i=3"
          MidStage => "ns=2;i=4"
          PostStage => "ns=2;i=5"
          Motor => "ns=2;i=6"
          ConvoyerBeltTimestamp => "ns=2;i=7"
          RobotArmTimestamp => "ns=2;i=8"
        }
      }
    }
  }
}
For the Kafka Connect adapter I took a slightly different route:
name=plc-0
connector.class=org.apache.plc4x.kafka.Plc4xSourceConnector
default-topic=machineData
tasks.max=2
sources=machineA
sources.machineA.connectionString=s7://10.10.64.20
sources.machineA.jobReferences=s7-dashboard,s7-heartbeat
sources.machineA.jobReferences.s7-heartbeat.topic=heartbeat
jobs=s7-dashboard,s7-heartbeat
jobs.s7-dashboard.interval=1000
jobs.s7-dashboard.fields=running,conveyorEntry,load,unload,transferLeft,transferRight,conveyorLeft,conveyorRight,numLargeBoxes,numSmallBoxes
jobs.s7-dashboard.fields.running=%DB3.DB31.0:BOOL
jobs.s7-dashboard.fields.conveyorEntry=%Q0.0:BOOL
jobs.s7-dashboard.fields.load=%Q0.1:BOOL
jobs.s7-dashboard.fields.unload=%Q0.2:BOOL
jobs.s7-dashboard.fields.transferLeft=%Q0.3:BOOL
jobs.s7-dashboard.fields.transferRight=%Q0.4:BOOL
jobs.s7-dashboard.fields.conveyorLeft=%Q0.5:BOOL
jobs.s7-dashboard.fields.conveyorRight=%Q0.6:BOOL
jobs.s7-dashboard.fields.numLargeBoxes=%DB3.DBW32:INT
jobs.s7-dashboard.fields.numSmallBoxes=%DB3.DBW34:INT
jobs.s7-heartbeat.interval=500
jobs.s7-heartbeat.fields=active
jobs.s7-heartbeat.fields.active=%DB3.DB31.0:BOOL
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
The main difference is that in Logstash the "sources" are somewhat "dumb" and
are referenced from the jobs.
In my case the sources carry more information, as each source references the
jobs it should be used with.
The reason I did this was that adding a new PLC into the picture then only
requires defining the source and listing the jobs whose data we want to
collect from it.
I guess the other philosophy is that the PLCs are sort of static and the jobs
are the part that changes frequently.
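Just to make the comparison concrete, here is a rough, untested sketch of how the
Kafka Connect example could look if we inverted it to follow the Logstash
philosophy (dumb sources, jobs referencing them). The keys jobs.<name>.sources
and jobs.<name>.topic do not exist in the current adapter; they are purely
illustrative:

name=plc-0
connector.class=org.apache.plc4x.kafka.Plc4xSourceConnector
default-topic=machineData
tasks.max=2
# sources only carry connection information, nothing else
sources=machineA
sources.machineA.connectionString=s7://10.10.64.20
# jobs reference the sources they should run against (hypothetical keys)
jobs=s7-dashboard,s7-heartbeat
jobs.s7-dashboard.interval=1000
jobs.s7-dashboard.sources=machineA
jobs.s7-dashboard.fields=running
jobs.s7-dashboard.fields.running=%DB3.DB31.0:BOOL
jobs.s7-heartbeat.interval=500
jobs.s7-heartbeat.sources=machineA
jobs.s7-heartbeat.topic=heartbeat
jobs.s7-heartbeat.fields=active
jobs.s7-heartbeat.fields.active=%DB3.DB31.0:BOOL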
I personally prefer the option where the source specifies which jobs should run
on it, but I guess Julian and his team prefer the other approach (as they built
the scraper that way) …
So I would like to hear some general feedback on which way we should be going
and then I’ll make sure all integrations use a similar strategy.
Chris