Awesome.

If you can, could I ask you to:

1. Mark the PR as ready for review in GitHub
2. Rebase or squash it to a single commit and force-push to your branch to 
clean it up



> On May 27, 2021, at 06:45, Iñigo Angulo <[email protected]> wrote:
> 
> Hi Otto, Chris
> 
> we have finally committed the updates on the NiFi processor to the pull 
> request. The changes we have made are the following:
> - Deducing Avro datatypes from the PlcResponse. You may want to check the 
> method (org.apache.plc4x.nifi.util.Plc4xCommon.createSchema()) here, to see 
> whether it is the best way to do it.
> - We have added an "ignoredDependency" for org.apache.nifi:nifi-standard-nar 
> in plc4j/pom.xml, as it is used at runtime and was raising errors during 
> compilation.
> - We have changed the onScheduled method in Plc4xSourceRecordProcessor 
> (compared to BaseProcessor), as we have added the possibility of having an 
> input connection into the processor and indicating the target addressMap 
> through flowfile attributes. The addressMap is now created in the onTrigger 
> method (see the sketch after this list).
> - We have tested the performance with the S7 and Modbus protocols (using a 
> Siemens S7-1200 and a Schneider M221). We will upload an updated NiFi 
> template for both protocols, but regarding this, do you have any testing 
> environment to simulate PLCs? If so, we could prepare the processor 
> configuration (connection strings and addressMaps) to match it.
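> 
> For illustration, a minimal sketch of what we mean by building the addressMap 
> from flowfile attributes in onTrigger (the "plc4x.address." attribute prefix 
> and the surrounding processor code are hypothetical, just to show the idea):
> 
>     @Override
>     public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
>         FlowFile flowFile = session.get();
>         if (flowFile == null) {
>             return;
>         }
>         // Collect the target addresses from the incoming flowfile attributes,
>         // e.g. "plc4x.address.myVar" -> "%DB1.DBW0:INT"
>         Map<String, String> addressMap = new HashMap<>();
>         for (Map.Entry<String, String> attribute : flowFile.getAttributes().entrySet()) {
>             if (attribute.getKey().startsWith("plc4x.address.")) {
>                 addressMap.put(attribute.getKey().substring("plc4x.address.".length()),
>                         attribute.getValue());
>             }
>         }
>         // ... build and execute the PLC read request from addressMap ...
>     }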
> 
> Please take a look at the code; any suggestions will be very welcome.
> 
> iñigo
> 
> ----------------------------------------- 
> Iñigo Angulo 
> 
> ZYLK.net :: consultoría.openSource 
> telf.: 747412337 
> Ribera de Axpe, 11 
> Edificio A, modulo 201-203 
> 48950 Erandio (Bizkaia) 
> -----------------------------------------
> 
> ----- Original Message -----
> From: "Iñigo Angulo" <[email protected]>
> To: "dev" <[email protected]>
> Sent: Wednesday, May 12, 2021 14:42:22
> Subject: Re: Nifi integration record oriented processor for reading
> 
> Hi Otto, Chris,
> 
> we have been working on the processor to include the logic for getting the 
> variable values from the PLC response. We tested it with the S7-1200 and it 
> seems to work fine; however, we would like to run some further tests before 
> committing it.
> 
> Regarding the actual method that takes the datatype from the response 
> object, we did it in the following way:
> 
> // PlcReadResponse readResponse
> Map<String, ? extends PlcValue> responseDataStructure = 
>         readResponse.getAsPlcValue().getStruct();
> for (Map.Entry<String, ? extends PlcValue> entry : responseDataStructure.entrySet()) {
> 
>     // fieldName would typically come from entry.getKey(); the builder object 
>     // is used to build the Avro schema with the desired datatypes 
>     // (for example intType())
>     if (entry.getValue() instanceof PlcINT) {
>         builder.name(fieldName).type().unionOf().nullBuilder().endNull()
>                .and().intType().endUnion().noDefault();
>     } else if (entry.getValue() instanceof PlcREAL) {
>         builder.name(fieldName).type().unionOf().nullBuilder().endNull()
>                .and().doubleType().endUnion().noDefault();
>     }
>     // ... and so on for the rest of the classes in the package 
>     // (org.apache.plc4x.java.spi.values.*)
> }
> 
> 
> Is this the solution you had in mind? If you think there is a better way to 
> access PlcValues, please let us know and we will update it.
> 
> We will upload the code soon so that you can take a deeper look.
> 
> thank you!
> iñigo
>       
> ----------------------------------------- 
> Iñigo Angulo 
> 
> ZYLK.net :: consultoría.openSource 
> telf.: 747412337 
> Ribera de Axpe, 11 
> Edificio A, modulo 201-203 
> 48950 Erandio (Bizkaia) 
> -----------------------------------------
> 
> ----- Original Message -----
> From: "ottobackwards" <[email protected]>
> To: "dev" <[email protected]>
> Sent: Friday, April 30, 2021 14:50:11
> Subject: Re: Nifi integration record oriented processor for reading
> 
> Sounds like a good plan!
> 
>> On Apr 30, 2021, at 05:26, Iñigo Angulo <[email protected]> wrote:
>> 
>> Hi Otto, Chris,
>> 
>> we have been reviewing the comments on the pull request and have started to 
>> think about the approach of extracting values from the response directly. 
>> Next week we will work on this and update the code with the suggestions you 
>> made. We will keep you informed.
>> 
>> thank you,
>> 
>> iñigo
>> 
>> ----------------------------------------- 
>> Iñigo Angulo 
>> 
>> ZYLK.net :: consultoría.openSource 
>> telf.: 747412337 
>> Ribera de Axpe, 11 
>> Edificio A, modulo 201-203 
>> 48950 Erandio (Bizkaia) 
>> -----------------------------------------
>> 
>> ----- Original Message -----
>> From: "Christofer Dutz" <[email protected]>
>> To: "dev" <[email protected]>
>> Sent: Friday, April 23, 2021 18:10:35
>> Subject: AW: AW: Nifi integration record oriented processor for reading
>> 
>> Hi Inigo,
>> 
>> especially if you have a look at the KNX protocol. This doesn't define the 
>> usual IEC datatypes we tried to use for all normal PLC drivers; here we have 
>> hundreds of datatypes that don't match any other protocol. I think the 
>> PlcValue approach would be the simplest.
>> The one thing you have to keep in mind is that you should check a PlcValue 
>> to see whether it's a list (array type) or a structure (which sort of 
>> relates to complex types with a more sophisticated structure).
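>> 
>> For illustration, a minimal sketch of that check (assuming the 
>> isList()/getList() and isStruct()/getStruct() methods of the plc4j PlcValue 
>> API; the handleValue() helper is hypothetical):
>> 
>>     private void handleValue(String fieldName, PlcValue value) {
>>         if (value.isList()) {
>>             // Array type: handle every element
>>             for (PlcValue element : value.getList()) {
>>                 handleValue(fieldName, element);
>>             }
>>         } else if (value.isStruct()) {
>>             // Structure: recurse into the named children
>>             for (Map.Entry<String, ? extends PlcValue> child : value.getStruct().entrySet()) {
>>                 handleValue(fieldName + "." + child.getKey(), child.getValue());
>>             }
>>         } else {
>>             // Simple value: map it to the corresponding Avro type
>>         }
>>     }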
>> 
>> Chris
>> 
>> 
>> -----Original Message-----
>> From: Iñigo Angulo <[email protected]> 
>> Sent: Friday, April 23, 2021 15:34
>> To: dev <[email protected]>
>> Subject: Re: AW: Nifi integration record oriented processor for reading
>> 
>> Hi Otto, Chris,
>> 
>> Yes, I think the approach you propose will be best. For now, we are 
>> generating the schema ourselves. We have a record writer that is in charge 
>> of reading the PLC values, and the schema is defined before the values are 
>> read. We build this schema by getting the protocol from the 
>> 'connectionString' (S7, Modbus) and the specified variable type from the 
>> 'PLC resource address String' containing the list of variables to read. 
>> From this we deduce the expected Avro datatype when reading, for instance, 
>> a word in S7 or a coil in Modbus.
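>> 
>> For illustration, a rough sketch of that kind of mapping (the address syntax 
>> shown and the lookup table are hypothetical, just to make the idea concrete):
>> 
>>     // e.g. "s7://192.168.0.1:102" -> "s7"
>>     String protocol = connectionString.substring(0, connectionString.indexOf("://"));
>> 
>>     // e.g. an address such as "%DB1.DBW20:INT" -> the trailing type token "INT"
>>     String plcType = addressString.substring(addressString.lastIndexOf(':') + 1);
>> 
>>     // hypothetical lookup table from PLC type tokens to Avro types
>>     Map<String, Schema.Type> avroTypeByPlcType = Map.of(
>>             "BOOL", Schema.Type.BOOLEAN,
>>             "INT", Schema.Type.INT,
>>             "REAL", Schema.Type.DOUBLE);
>>     Schema.Type avroType = avroTypeByPlcType.getOrDefault(plcType, Schema.Type.STRING);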
>> 
>> However, as you mentioned, the other approach would be much clearer and 
>> more useful: ideally, getting the actual datatype from the PlcValue when 
>> getting the response. Regarding this, we tried to keep the previously 
>> described 'mapping' separate from the rest of the code, so that hopefully 
>> it can easily be replaced.
>> 
>> We have created the pull request; we hope you can take a look at the code 
>> and let us know what you think. We will fill in the ICLA document too.
>> 
>> thank you
>> iñigo
>> 
>> 
>> 
>> -----------------------------------------
>> Iñigo Angulo 
>> 
>> ZYLK.net :: consultoría.openSource
>> telf.: 747412337
>> Ribera de Axpe, 11
>> Edificio A, modulo 201-203
>> 48950 Erandio (Bizkaia)
>> -----------------------------------------
>> 
>> ----- Original Message -----
>> From: "Christofer Dutz" <[email protected]>
>> To: "dev" <[email protected]>
>> Sent: Thursday, April 22, 2021 17:12:49
>> Subject: AW: Nifi integration record oriented processor for reading
>> 
>> Hi all,
>> 
>> Well, you get PlcValues from the response that wrap the different 
>> datatypes, so generally you shouldn't have to care about the concrete type.
>> 
>> However, you can call getObject(), which returns the core value the 
>> PlcValue holds ... so if it's the PlcValue for a Short, getObject() will 
>> return a short value.
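>> 
>> For illustration, a minimal usage sketch (assuming the getFieldNames() and 
>> getObject(fieldName) methods of the plc4j PlcReadResponse API):
>> 
>>     for (String fieldName : readResponse.getFieldNames()) {
>>         Object value = readResponse.getObject(fieldName);
>>         // value is the plain Java value, e.g. a Short for a PlcValue wrapping a Short;
>>         // its runtime class can then be used to pick the Avro type for the schema
>>         System.out.println(fieldName + " (" + value.getClass().getSimpleName() + ") = " + value);
>>     }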
>> 
>> Does that help?
>> 
>> Chris
>> 
>> 
>> -----Original Message-----
>> From: Otto Fowler <[email protected]>
>> Sent: Thursday, April 22, 2021 15:21
>> To: [email protected]
>> Subject: Re: Nifi integration record oriented processor for reading
>> 
>> So, you are generating the schema yourself, such that downstream, if they 
>> inherit the schema, they will just get what you generate?  And you are 
>> trying to do that from the connection string?  If so, a different way I 
>> could imagine doing it would be to get the 'types' of the data from the 
>> responses themselves.  This would be more generic.  The flow I could 
>> imagine (in onTrigger):
>> 
>> DO READ
>> IF NOT HAS SCHEMA
>>      GENERATE SCHEMA FROM RESPONSE AND CACHE IN ATOMIC
>> WRITE WITH SCHEMA
>> 
>> Maybe Chris can speak to how to get the types from the responses.
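>> 
>> For illustration, a minimal sketch of that flow (the doRead(), 
>> generateSchemaFromResponse() and writeRecords() helpers and the 
>> AtomicReference cache are hypothetical, just to make the idea concrete):
>> 
>>     private final AtomicReference<Schema> cachedSchema = new AtomicReference<>();
>> 
>>     @Override
>>     public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
>>         PlcReadResponse response = doRead(context);          // DO READ
>>         Schema schema = cachedSchema.get();
>>         if (schema == null) {                                // IF NOT HAS SCHEMA
>>             schema = generateSchemaFromResponse(response);   // GENERATE SCHEMA FROM RESPONSE
>>             cachedSchema.compareAndSet(null, schema);        // AND CACHE IN ATOMIC
>>         }
>>         writeRecords(session, schema, response);             // WRITE WITH SCHEMA
>>     }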
>> 
>> 
>>> On Apr 22, 2021, at 05:48, Iñigo Angulo <[email protected]> wrote:
>>> 
>>> Hi Chris, Otto,
>>> 
>>> Regarding the Record Processor concept, I will try to give an overview. In 
>>> NiFi, information packages are called Flowfiles, and these are the actual 
>>> units of information that are exchanged between Processors all along the 
>>> dataflow. Flowfiles have two sections where we can manage data: Attributes 
>>> and Content. In the "traditional" NiFi approach, you work with both 
>>> sections, extracting information from the Content to the Attributes and 
>>> vice versa to perform operations. This approach can have one limitation 
>>> when you are processing batch data (lines from a CSV file, for instance), 
>>> where you need to split each of the lines into different Flowfiles. Thus, 
>>> a 1000-line CSV file leads to 1000 Flowfiles to process, each of them 
>>> containing a single record.
>>> 
>>> In later versions of the product, they introduced the Record-oriented 
>>> approach. This approach allows you to manage multiple records in a single 
>>> Flowfile's Content, as long as these records all have the same schema. 
>>> This means that the operations defined by the Processors are applied to 
>>> the whole content at once. Following the previous example, a 1000-line CSV 
>>> file could produce a single Flowfile with a content of 1000 records.
>>> 
>>> To do this, NiFi uses Avro to serialize the Flowfile's Content. The 
>>> Record-oriented Processors then use Writers and Readers to present this 
>>> information in the desired format (such as Avro, JSON, CSV, etc.). 
>>> Basically, with the record-oriented approach, NiFi introduced multiple new 
>>> Processors and also added Record versions of many of the "old" ones. Using 
>>> this Record approach, NiFi performance improves notably, especially when 
>>> working with large amounts of structured information.
>>> 
>>> The work we did was to create a Record-oriented Processor, based on the 
>>> previously existing Plc4xSourceProcessor, to read values from the devices. 
>>> We have also included a README in the 
>>> plc4x/plc4j/integrations/apache-nifi module explaining the Processor 
>>> configuration and giving an example. Moreover, we added a NiFi template 
>>> with a dataflow for testing these processors, in case it is useful.
>>> 
>>> Otto, regarding the idea behind this new Processor, that is right. We 
>>> added the writer capability to the existing Plc4xSourceProcessor, so that 
>>> it formats the output to the desired configuration in a record manner. In 
>>> the current implementation, we do this "protocol adaptation" from the 
>>> syntax of the particular properties in the Processor's configuration. For 
>>> example, from the connection string 's7://IP:PORT', we extract the S7 
>>> identifier and map variable datatypes to the actual Avro datatypes to 
>>> build the record output schema. However, we don't have vast experience 
>>> with the PLC4X libraries here, and there will surely be better ways of 
>>> doing this.
>>> Also, about the Base Processor, we were thinking that maybe the best 
>>> approach could be to have this Base Processor and then implement readers 
>>> for particular protocols as Controller Services. But here also, it would 
>>> be very helpful to have your opinion.
>>> 
>>> Lastly, regarding the pull request, do you have any documentation on 
>>> how to do this? I mean, maybe you have defined some naming 
>>> conventions or an expected structure to facilitate later work. At 
>>> present, we have a fork of the project where we have been working on 
>>> these NiFi changes. We updated the content of our fork (fetch/merge 
>>> upstream) about 2 weeks ago and committed our changes to the 'develop' 
>>> branch. Should we rather create a new branch with our commits? How do 
>>> you prefer to receive the code? (We are not experts with git, just in 
>>> case we could cause some problems...)
>>> 
>>> thank you in advance
>>> 
>>> iñigo
>>> 
>>> 
>>> -----------------------------------------
>>> Iñigo Angulo
>>> 
>>> ZYLK.net :: consultoría.openSource
>>> telf.: 747412337
>>> Ribera de Axpe, 11
>>> Edificio A, modulo 201-203
>>> 48950 Erandio (Bizkaia)
>>> -----------------------------------------
>>> 
>>> ----- Original Message -----
>>> From: "Christofer Dutz" <[email protected]>
>>> To: "dev" <[email protected]>
>>> Sent: Wednesday, April 21, 2021 20:01:15
>>> Subject: AW: Nifi integration record oriented processor for reading
>>> 
>>> The more I think about it,
>>> 
>>> Perhaps we should also think about providing some information on the 
>>> supported configuration options.
>>> Wouldn't it be cool if the driver could say: "I generally have these 
>>> options, and they have these datatypes and mean this"?
>>> Additionally, the transports could also say: "I generally have these 
>>> options, and they have these datatypes and mean this."
>>> 
>>> I would bet our StreamPipes friends would love something like that, right?
>>> 
>>> Chris
>>> 
>>> 
>>> -----Original Message-----
>>> From: Otto Fowler <[email protected]>
>>> Sent: Wednesday, April 21, 2021 17:46
>>> To: [email protected]
>>> Subject: Re: Nifi integration record oriented processor for reading
>>> 
>>> Hi Inigo,
>>> 
>>> I’m a committer on Apache NiFi as well as PLC4X; I would be happy to 
>>> review your processor.
>>> If I understand what you are saying correctly, you have a single processor 
>>> which supports record-writing output?
>>> 
>>> plc4x -> records
>>> 
>>> And that, for configuration purposes, you have created per-protocol 
>>> support for configuration and validation in that processor?
>>> 
>>> If there is per-protocol configuration/validation etc., it may be better 
>>> to have a base processor and derived processors per protocol to handle 
>>> those differences.
>>> 
>>> I look forward to seeing the code.
>>> 
>>> 
>>>> On Apr 21, 2021, at 04:05, Iñigo Angulo <[email protected]> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> I am writing because we have been working on the Apache NiFi integration 
>>>> part of the project. We have created a Record-oriented processor for 
>>>> reading PLC data. It is based on the previously existing SourceProcessor, 
>>>> but it works with records, using a NiFi Writer (such as Avro, JSON, and 
>>>> so on) to write data to the flowfile content.
>>>> 
>>>> We updated the code in our fork from the current PLC4X git repo about 2 
>>>> weeks ago, and tested it from NiFi, reading values over S7 from an 
>>>> S7-1200 CPU. Also, one of our customers has recently started to use it 
>>>> for validation.
>>>> 
>>>> Currently, it works with S7 and Modbus over TCP. This is because we had 
>>>> to write some classes to map the connectionString and variableList 
>>>> properties (syntax) of the processor to the actual protocol, in order to 
>>>> build the Avro schema for the output flowfile, taking into account 
>>>> variable datatypes, etc. We only did this for S7 and Modbus. I am sure 
>>>> there is a better way to do this, so at this point you could maybe take a 
>>>> look to find the best solution and avoid needing this mapping.
>>>> 
>>>> If you find this useful, we could open a pull request to the main PLC4X 
>>>> repo. Let us know what you think.
>>>> 
>>>> best regards,
>>>> iñigo
>>>> 
>>>> -----------------------------------------
>>>> Iñigo Angulo
>>>> 
>>>> ZYLK.net :: consultoría.openSource
>>>> telf.: 747412337
>>>> Ribera de Axpe, 11
>>>> Edificio A, modulo 201-203
>>>> 48950 Erandio (Bizkaia)
>>>> -----------------------------------------
