Christian, you might want to look at XSD substitution groups. I've seen them used to great effect to leave "open doors" for document extension while keeping the number of new namespaces small.

Besides the above, I need to back Julian up a bit, since no one really knows how to handle this. Part of the blame is on us, as we haven't read the DFDL spec yet.

XML Schema is a great tool and works quite reliably when data is structured in a certain way. In most cases XML Schema was introduced to define a contract between a new service and its callers; in others it was added as a verification mechanism for existing XML payloads. All of that is somewhat at odds with what PLC communication is: every PLC already has a communication contract in the form of its native protocol. These protocols are distinct from XML and (mostly) use byte payloads instead of text formats. I do see a clear match between XSD/DFDL and these protocols, namely primitive types, complex types and payload definitions, but do we have any chance of using the validation capabilities of these XSDs at any point? Only if we generate a parser.

Keeping DFDL does make sense, because it will be very helpful in all the "enterprise-ish" scenarios which require more XML-like type definitions. I think that if we had any traditional ESB programmers from big corporations here, we would already have their applause.

We know that PLC frames are mostly sequential and usually have a length field somewhere near the beginning; some have a CRC embedded at the beginning or the end. More advanced formats might have complex types, and some might ultimately be encrypted.

I have come across very few binary protocols during my career, and PLCs are not the first ones. I have had some contact with EBus (not to be confused with EEBus), WM-Bus and BACnet. These represent completely different ways of communicating with devices: the first is a typical serial interface, the second is radio with a rich application layer and possible AES encryption, and the third has a rich, abstract and extensible structure. I can't say much about S7, ADS or KNX/IP simply because I don't know these protocols yet.

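
To make the "sequential frame with a length field and a CRC" shape a bit more concrete, here is a minimal sketch in Python (picked only for brevity). The layout is an assumption made up for illustration and not taken from any real PLC protocol: a 2-byte big-endian length, the payload, then a 4-byte CRC-32 over length and payload. The names build_frame and parse_frame are hypothetical.

```python
import struct
import zlib

# Hypothetical layout (illustration only, not a real protocol):
#   [length: 2 bytes, big-endian][payload: <length> bytes][crc32: 4 bytes]
HEADER_SIZE = 2
CRC_SIZE = 4


def build_frame(payload: bytes) -> bytes:
    """Wrap a payload in the hypothetical length + CRC envelope."""
    header = struct.pack(">H", len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + struct.pack(">I", crc)


def parse_frame(frame: bytes) -> bytes:
    """Check length field and CRC, then return the raw payload."""
    if len(frame) < HEADER_SIZE + CRC_SIZE:
        raise ValueError("frame too short")
    (length,) = struct.unpack_from(">H", frame, 0)
    if len(frame) != HEADER_SIZE + length + CRC_SIZE:
        raise ValueError("length field does not match frame size")
    body_end = HEADER_SIZE + length
    (expected_crc,) = struct.unpack_from(">I", frame, body_end)
    # The CRC covers the length field and the payload in this sketch.
    if zlib.crc32(frame[:body_end]) != expected_crc:
        raise ValueError("CRC mismatch")
    return frame[HEADER_SIZE:body_end]
```

Even in this toy case the checks (size, length consistency, CRC) are the part a validating schema alone would not give us; they only exist once a parser is written or generated.
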
However, based on my BACnet knowledge I can already tell you that modeling it will be difficult without leaving a plug point at the parser and application layers. Doing that in XML Schema will be super difficult, as we would have a "main" protocol and then "something" to plug in vendor-specific extensions (if any). BACnet's extensibility starts at frame types and ends at the enumerations used to describe elements.

WM-Bus, on the other hand, is simpler but raises a few points of its own. A frame is structured as control byte, header, CRC, payload and CRC again. The payload can be turned into a series of variable-length elements in which the first bytes and their bits determine the kind of data, multiplier, unit and so on. I can't see myself doing bit operations in plain Java, and I can't see this being readable in XML Schema/DFDL. To add more context on WM-Bus: as it is radio based, the payload can be encrypted with AES, so the parser/receiver needs an extra lookup to obtain the device key which allows the data to be decrypted.

BACnet supports encryption too; however, to my knowledge it is very rarely used in the field due to the complicated nature of that part of the standard. One of BACnet's unique abilities is support for segmentation of payloads, meaning that longer answers are spread over several frames (AFAIR up to 250 or 255). This is yet another problem which needs to be addressed at the parser/generator level, so that application layer structures are not built until the data is fully collected. To make things even more complicated, BACnet has several layers:
- virtual link
- network layer
- application layer

The 2016 version of the BACnet ASN.1, with comments but without all enumerations, is 4600 lines long.

I'm writing all this to add further edge cases which we will need to cover, with DFDL or any other tool. It won't be easy either way!

Coming to the point, I am not sure DFDL is the best way of describing payload and application layer formats. As you have already committed a lot of time and will to it, you know it best and it feels natural to you.

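
Going back for a moment to the bit-level decoding mentioned for WM-Bus payload elements, here is a minimal sketch of what "first bytes and their bits determine kind of data, multiplier and so on" means in practice. The bit layout, the KINDS table and the names Descriptor, decode_descriptor and scale are all assumptions invented for this example; this is not the actual WM-Bus encoding.

```python
from dataclasses import dataclass

# Hypothetical mapping of the low nibble to a kind of data (illustration only).
KINDS = {0x0: "energy", 0x1: "volume", 0x2: "temperature"}


@dataclass
class Descriptor:
    kind: str            # what the value represents
    exponent: int        # value is raw * 10 ** exponent
    has_extension: bool  # another descriptor byte follows


def decode_descriptor(byte: int) -> Descriptor:
    """Decode one descriptor byte of the made-up layout:

    bit 7      extension flag
    bits 6..4  exponent, stored with an offset of 3
    bits 3..0  kind of data
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("descriptor must fit in one byte")
    kind_code = byte & 0x0F
    exponent = ((byte >> 4) & 0x07) - 3
    has_extension = bool(byte & 0x80)
    return Descriptor(KINDS.get(kind_code, "unknown"), exponent, has_extension)


def scale(raw: int, descriptor: Descriptor) -> float:
    """Apply the decimal multiplier carried by the descriptor."""
    return raw * 10.0 ** descriptor.exponent
```

With this layout the byte 0xA1 would decode to a "volume" value with exponent -1 and the extension flag set. A handful of masks and shifts is manageable in code; expressing the same thing per element in a schema is where I expect readability to suffer.
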
However, do we want each and every person who would like to contribute to PLC4X to go through a 244-page PDF [1] and check some of its more than 30 external references, just to write their protocol description? To me this is quite a hurdle, especially since it seems to be a mixture of XML Schema, XPath, XQuery and/or Schematron. None, or very few, of the folks who work with low-level stuff will be able to get through it.

I know most of the above, and I would say that I am fluent in XML Schema, as I did a lot of gymnastics with WSDL and SOAP and have written plenty of transformations with XSLT. It always worked well because these fit perfectly into the XML universe. I'm not saying that DFDL does not fit what PLC4X is trying to do, but PLCs don't do any XML processing. The commonalities I mentioned above (types, payloads) do not speak loudly enough to justify investing in such a thing.

I wouldn't mind having a simplistic way to describe frames, with all the edge cases we know of, and generating DFDL out of it. With this in mind we could retain the work which has already been done to generate parsers (reusing what Apache Daffodil gives us), but also limit the amount of work for new protocols and reduce the verbosity of the first step. Once someone feels comfortable enough to get into DFDL, we still have that option. After all, the big and fast adoption of protobuf, Thrift and other formats started from a very simplistic descriptor format.

Cheers,
Łukasz
--
http://connectorio.com

[1] https://www.ogf.org/documents/GFD.207.pdf

On 08.05.2019 15:20, Julian Feinauer wrote:
> Hi Chris,
>
> sorry for only responding now.
> I guess you are the only one who is deeply involved in the DFDL schema and
> stuff.
> I have only scratched the surface and guess it is incredibly powerful.
> But as everybody knows... with great power comes great complexity.
>
> So, on the one hand, your suggestion sounds reasonable, to have some kind of
> "base" definition.
> But on the other hand I dislike the "central" nature of that.
> I mean, there are some standard types which will be used by all drivers, like
> uints, ints, C-strings and so on with LE/BE, but other than that I'm pretty sure
> (only a feeling, nothing I could prove) that every protocol we touch
> brings something we have not thought about yet.
>
> So I personally would prefer that we define these types directly in the protocol,
> perhaps simply with a clear description of how to deserialize them (byte
> shifts, bit shifts, ...).
> Otherwise we get some kind of coupling between all these drivers through the
> back door, and in the worst case we would make one driver fail silently
> because we fixed some kind of bug with the types in another one.
>
> Don’t we have some way to do something like that?
>
> Julian
>
> On 08.05.19, 15:12, "Christofer Dutz" wrote:
>
>     Hi all,
>
>     I also just had another idea ...
>
>     No matter how we define the schemas, we'll always have one problem in the
>     end ... how to map some type like an "unsigned-16-bit-integer" onto something
>     the language can understand.
>     So we were thinking of some language adapters ... now this could handle
>     the mapping to code, but we don't have control over how these types are
>     defined in the protocol specifications.
>     Each protocol spec currently defines all the types it needs locally.
>
>     Now I had an idea that might help solve both problems:
>     - I create a "plc4x-dfdl" schema which contains definitions for all of
>       the base types
>     - We use and import this schema into DFDL protocol specs to have the same
>       baseline in all plc4x protocol specs
>     - When we write new language packs, we do so by providing implementations
>       for all of the types in the plc4x-dfdl schema
>
>     Guess this should be a pretty clean definition of what plc4x provides,
>     what protocol engineers need to define in their drivers and what language
>     engineers need to provide in their language templates.
>
>     Chris
>
>
>
>     On 08.05.19, 11:29, "Christofer Dutz" wrote:
>
>         Hi,
>
>         I think while refactoring the DFDL schemas a little more, I came up
>         with an idea on how we can support inheritance with DFDL:
>
>           * In all cases with inheritance, we have a “choice” element in
>             the schema
>           * Some sort of “type” element is parsed before the choice element
>             itself
>
>         Now the idea is that if a type contains a choice, the name of
>         the base class of all sub-types is based on the name of the element that
>         contains the choice.
>
>         Example:
>
>         <xs:complexType name="S7RequestMessage">
>             <xs:sequence>
>                 <!-- Reserved value always 0x0000 -->
>                 <xs:element name="reserved" type="s7:short" fixed="0"/>
>                 <xs:element name="tpduReference" type="s7:short"/>
>                 <xs:element name="parametersLength" type="s7:short"/>
>                 <xs:element name="payloadsLength" type="s7:short"/>
>                 <xs:element name="parameters" minOccurs="0"
>                             dfdl:lengthKind="explicit"
>                             dfdl:lengthUnits="bytes" dfdl:length="{../parametersLength}"
>                             dfdl:occursCountKind="expression"
>                             dfdl:occursCount="{if(../parametersLength gt 0) then 1 else 0}">
>                     <xs:complexType>
>                         <xs:sequence>
>                             <xs:element name="parameter" maxOccurs="unbounded">
>                                 <xs:complexType>
>                                     <xs:sequence>
>                                         <xs:element name="type" type="s7:byte"/>
>                                         <xs:choice dfdl:choiceDispatchKey="{xs:string(type)}">
>                                             <xs:element dfdl:choiceBranchKey="240" name="s7GeneralParameterSetupCommunication"
>                                                         type="s7:S7GeneralParameterSetupCommunication"/>
>                                             <xs:element dfdl:choiceBranchKey="4" name="s7RequestParameterReadVar"
>                                                         type="s7:S7RequestParameterReadVar"/>
>                                             <xs:element dfdl:choiceBranchKey="5" name="s7RequestParameterWriteVar"
>                                                         type="s7:S7RequestParameterWriteVar"/>
>                                         </xs:choice>
>                                     </xs:sequence>
>                                 </xs:complexType>
>                             </xs:element>
>                         </xs:sequence>
>                     </xs:complexType>
>                 </xs:element>
>
>         In this case we would have an S7RequestMessage type which contains a
>         property “parameters” of type “List<Parameter>”.
>         Parameter (containing a choice) would be an abstract class with an
>         abstract “getDenominator” method.
>         S7GeneralParameterSetupCommunication would extend Parameter.
>
>         You think that’s a path to go? … Had to add some artificial elements
>         in order to set the boundaries of the types.
>
>         Chris
>
