You will have to enrich the incoming data, for example
{ "equipment-id" : "1-234", "sensor-id" : "1-vcy", ..... }. Since you will
most likely have a KeyedStream keyed on equipment-id + sensor-id or on
equipment-id alone, you can add a control stream that carries the
equipment-to-workshop/factory mapping, something like
{ "equipment-id" : "1-234", "workshop-id" : "1-234", "factory-id" : "1-vcy", ..... }.
You can then connect the two streams and use a CoProcessFunction to join
them into an enriched stream. Once you have the enriched stream, you can
aggregate at whatever level you want (equipment, workshop or factory).
See [1] or [2] for samples and reference material.

[1] https://www.youtube.com/watch?v=cJS18iKLUIY
[2] https://training.ververica.com/exercises/eventTimeJoin.html
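
As a rough illustration of the enrich-then-aggregate flow described above,
here is a minimal sketch in plain Java with no Flink dependency. It only
models the logic that a KeyedCoProcessFunction (processElement1 for sensor
events, processElement2 for mapping records) would carry out per
equipment-id. The class and method names, the ids "w-1" and "f-1", and the
choice to buffer sensor events that arrive before their mapping are all
assumptions made for the example, not something taken from your job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the connect + CoProcessFunction enrichment pattern. */
class EnrichmentSketch {

    /** Control-stream record: where a piece of equipment lives (hypothetical shape). */
    static final class Mapping {
        final String workshopId;
        final String factoryId;

        Mapping(String workshopId, String factoryId) {
            this.workshopId = workshopId;
            this.factoryId = factoryId;
        }
    }

    // Per-key state, keyed by equipment-id. In a real Flink job this would be
    // keyed state (for example a ValueState for the mapping and a ListState
    // for buffered events) rather than in-memory maps.
    private final Map<String, Mapping> mappingByEquipment = new HashMap<>();
    private final Map<String, List<String>> pendingSensors = new HashMap<>();

    // Stand-in for the downstream aggregation: enriched events per factory-id.
    private final Map<String, Integer> eventsPerFactory = new HashMap<>();

    /** Counterpart of processElement1: a sensor event arrives. */
    void onSensorEvent(String equipmentId, String sensorId) {
        Mapping mapping = mappingByEquipment.get(equipmentId);
        if (mapping == null) {
            // Mapping not seen yet: buffer the event instead of dropping it.
            pendingSensors.computeIfAbsent(equipmentId, k -> new ArrayList<>()).add(sensorId);
        } else {
            emitEnriched(mapping);
        }
    }

    /** Counterpart of processElement2: a mapping record arrives on the control stream. */
    void onMapping(String equipmentId, String workshopId, String factoryId) {
        Mapping mapping = new Mapping(workshopId, factoryId);
        mappingByEquipment.put(equipmentId, mapping);
        // Flush any sensor events that were waiting for this mapping.
        List<String> buffered = pendingSensors.remove(equipmentId);
        if (buffered != null) {
            for (int i = 0; i < buffered.size(); i++) {
                emitEnriched(mapping);
            }
        }
    }

    // Models "key the enriched stream by factory-id and count".
    private void emitEnriched(Mapping mapping) {
        eventsPerFactory.merge(mapping.factoryId, 1, Integer::sum);
    }

    int countForFactory(String factoryId) {
        return eventsPerFactory.getOrDefault(factoryId, 0);
    }

    public static void main(String[] args) {
        EnrichmentSketch job = new EnrichmentSketch();
        job.onSensorEvent("1-234", "1-vcy");   // arrives before its mapping: buffered
        job.onMapping("1-234", "w-1", "f-1");  // flushes the buffered event
        job.onSensorEvent("1-234", "1-vcy");   // enriched immediately
        System.out.println(job.countForFactory("f-1")); // prints 2
    }
}
```

In an actual job both streams would need to be keyed by equipment-id before
they are connected, so that a sensor event and its mapping reach the same
keyed state. Whether to buffer, drop or side-output events that have no
mapping yet is a design choice that depends on your data.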

Hemant

On Wed, May 6, 2020 at 3:10 AM Aissa Elaffani <aissaelaff...@gmail.com>
wrote:

> Hello guys,
> I am new to the real-time streaming field, and I am trying to build a big
> data architecture for processing real-time streams. I have some sensors
> that generate data in JSON format. The data is sent to an Apache Kafka
> cluster, and I want to consume it with Apache Flink in order to do some
> aggregation. The problem is that the data coming from Kafka contains only
> "the sensor ID, the ID of the equipment in which it is installed, and the
> status of the equipment". Each sensor is installed in a piece of
> equipment, the equipment is linked to a workshop, and the workshop is
> itself linked to a factory. So I need another data source for the
> workshops and factories, because I want to aggregate per factory, and the
> data sent by the sensors contains just the sensor ID and the equipment ID.
> I am new to this field and I am stuck on this. Can someone please help me
> achieve my goal and explain how I can do this complex aggregation? Is
> there any optimisation I should do? Sorry for disturbing you!
> AISSA
>
>
