[
https://issues.apache.org/jira/browse/DAFFODIL-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285277#comment-17285277
]
Mike Beckerle commented on DAFFODIL-2202:
-----------------------------------------
There is a separate ticket, DAFFODIL-2402, for making the TDML runner's
floating point comparisons value based, not string based. It enables xsi:type
annotations on elements to be respected for simple types.
> Code Gen Framework
> ------------------
>
> Key: DAFFODIL-2202
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2202
> Project: Daffodil
> Issue Type: Improvement
> Components: Back End
> Affects Versions: 2.4.0
> Reporter: Mike Beckerle
> Assignee: John Interrante
> Priority: Minor
>
> We have built an initial C code generator backend for Apache Daffodil.
> Currently the C code generator can generate C code to read and write binary
> real and integer numbers, arrays of such numbers, and choices of alternative
> structures within wire protocol packets. We plan to continue building out the
> C code generator until it supports a minimal subset of the DFDL 1.0
> specification for embedded devices.
> Here are some notes to keep track of changes that have been requested by
> collaborators or reviewers so we don't forget them. If someone wants to help
> (which would be appreciated), please add a comment to this issue or let the
> dev list know in order to avoid duplication.
> 1. Validation of "fixed" values
> Is there a way for us to find a fixed="value" attribute in a schema within
> runtime2 so we can generate C code to check that the corresponding C struct
> member has the matching value? Suppose a schema has
> {code:xml}
> <xs:complexType name="Limits">
>   <xs:sequence>
>     <xs:element name="sync0" fixed="210" type="idl:uint8"/>
>     <xs:element name="checksum" type="idl:uint16"/>
>   </xs:sequence>
> </xs:complexType>
> <xs:element name="LimitsDecl" type="idl:Limits"/>
> {code}
> If a binary data file does not have the number 210 in sync0's position, we
> would want the generated C code to report an error like:
> {noformat}
> Validation error: The value of element 'sync0' does not match the value of
> its 'fixed' attribute.
> {noformat}
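The check described in item 1 could look roughly like the sketch below. This is only an illustration under stated assumptions: the Limits struct layout, validate_fixed_u8(), and Limits_validate() are hypothetical names invented here, not existing runtime2 code, and the constant 210 is assumed to be copied from the schema's fixed attribute at code generation time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical C struct mirroring the "Limits" complex type above.
typedef struct Limits
{
    uint8_t  sync0;    // schema declares fixed="210"
    uint16_t checksum;
} Limits;

// Hypothetical helper: compares a parsed value with its schema "fixed" value.
// On a mismatch it writes a validation message into msg and returns false;
// on a match it clears msg and returns true.
static bool
validate_fixed_u8(const char *name, uint8_t actual, uint8_t fixed,
                  char *msg, size_t msg_size)
{
    if (actual == fixed)
    {
        if (msg_size > 0) msg[0] = '\0';
        return true;
    }
    snprintf(msg, msg_size,
             "Validation error: The value of element '%s' does not match "
             "the value of its 'fixed' attribute.",
             name);
    return false;
}

// What generated code for Limits might call after parsing (assumption).
static bool
Limits_validate(const Limits *instance, char *msg, size_t msg_size)
{
    return validate_fixed_u8("sync0", instance->sync0, 210, msg, msg_size);
}
```

A caller would run Limits_validate() on a parsed Limits instance and report the message when it returns false. Whether a mismatch should stop processing or only be reported is left open here.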
> 2. C struct/field name collisions
> To avoid possible name collisions, we should prepend namespace prefixes to
> struct names and field names if their infoset elements have non-null
> namespace prefixes.
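One way the naming rule in item 2 might be expressed is sketched below. The helper name qualified_c_name() and the underscore separator are assumptions made for illustration. In practice the generator would build these names in its own implementation language rather than in C.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: builds a C identifier from an optional namespace
// prefix and a local name. Writes "<prefix>_<local>" when the prefix is
// non-null and non-empty, otherwise just "<local>". Like snprintf, it
// returns the length the full name needs, so callers can detect truncation.
static int
qualified_c_name(char *out, size_t out_size,
                 const char *prefix, const char *local)
{
    if (prefix != NULL && prefix[0] != '\0')
    {
        return snprintf(out, out_size, "%s_%s", prefix, local);
    }
    return snprintf(out, out_size, "%s", local);
}
```

With this sketch, the prefix "idl" and the local name "Limits" would give "idl_Limits", while an element with no prefix would keep its local name unchanged.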
> 3. Anonymous/multiple choice groups
> In addition to handling elements with xs:choice complex types, we should
> detect anonymous choice groups and refine the choice runtime structure in
> order to allow multiple choice groups to be inlined into parent elements.
> Example schema and corresponding C code:
> {code:xml}
> <xs:complexType name="NestedUnionType">
>   <xs:sequence>
>     <xs:element name="first_tag" type="idl:int32"/>
>     <xs:choice dfdl:choiceDispatchKey="{xs:string(./first_tag)}">
>       <xs:element name="foo" type="idl:FooType" dfdl:choiceBranchKey="1 2"/>
>       <xs:element name="bar" type="idl:BarType" dfdl:choiceBranchKey="3 4"/>
>     </xs:choice>
>     <xs:element name="second_tag" type="idl:int32"/>
>     <xs:choice dfdl:choiceDispatchKey="{xs:string(./second_tag)}">
>       <xs:element name="fie" type="idl:FieType" dfdl:choiceBranchKey="1"/>
>       <xs:element name="fum" type="idl:FumType" dfdl:choiceBranchKey="2"/>
>     </xs:choice>
>   </xs:sequence>
> </xs:complexType>
> {code}
> {code:c}
> typedef struct NestedUnion
> {
>     InfosetBase _base;
>     int32_t     first_tag;
>     size_t      _choice_1; // choice of which union field to use
>     union
>     {
>         foo foo;
>         bar bar;
>     };
>     int32_t     second_tag;
>     size_t      _choice_2; // choice of which union field to use
>     union
>     {
>         fie fie;
>         fum fum;
>     };
> } NestedUnion;
> {code}
> 4. Choice dispatch key expressions
> We currently support only a very restricted and simple subset of choice
> dispatch key expressions. We would like to refactor the DPath expression
> compiler and make it generate C code in order to support more kinds of choice
> dispatch key expressions.
> 5. No match between choice dispatch key and choice branch keys
> Right now c-daffodil is stricter than scala-daffodil when unparsing infoset
> XML files with no match (or a mismatch) between choice dispatch keys and
> branch keys. Perhaps c-daffodil should load such an XML file and unparse the
> infoset to a binary data file without raising a "no match" processing error
> in either step. We would have to code and call a choice branch resolver in C
> which peeks at the next XML element, figures out which branch of the choice
> group that element indicates, and initializes the choice and element runtime
> data (_choice and childNode->erd member fields) accordingly. We probably
> would replace the initChoice() call in walkInfosetNode() with a call to that
> choice branch resolver, and we might not need to call initChoice() in
> unparseSelf(). When I called initChoice() in all these parse, walk, and
> unparse places, I was pondering removing the _choice member field and calling
> initChoice() as a function to tell us which element to visit next, but we
> probably should have a mutable choice runtime data structure.
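The core of such a choice branch resolver might be sketched as follows. Everything here is hypothetical: ChoiceBranches, NO_CHOICE, and resolve_choice_branch() are simplified stand-ins invented for illustration, and the XML "peek" step and the update of childNode->erd are left out because they depend on runtime2 internals not shown in this issue.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Sentinel for "no branch selected" (an assumption of this sketch).
#define NO_CHOICE ((size_t)-1)

// Hypothetical, simplified description of one choice group: the local
// names of its branch elements, in schema order.
typedef struct ChoiceBranches
{
    size_t             count;
    const char *const *names;
} ChoiceBranches;

// Picks the branch from the name of the next XML element instead of from
// the dispatch key, so a tag value that matches no branch key does not by
// itself cause an error. Stores the branch index in *choice and returns
// true when the element names a branch; otherwise stores NO_CHOICE and
// returns false.
static bool
resolve_choice_branch(const ChoiceBranches *branches,
                      const char *next_element, size_t *choice)
{
    for (size_t i = 0; i < branches->count; i++)
    {
        if (strcmp(branches->names[i], next_element) == 0)
        {
            *choice = i;
            return true;
        }
    }
    *choice = NO_CHOICE;
    return false;
}
```

Under these assumptions, the first choice group in the NestedUnion example would use the branch names "foo" and "bar", and the result would be stored in the mutable _choice_1 member.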
> 6. Floating point numbers
> Right now runtime2 prints floating point numbers in XML infoset files
> slightly differently than runtime1 does. This means TDML tests may need to
> use different XML infoset files for different runtimes. We should be able to
> make the TDML Runner compare floating point numbers numerically, not
> textually, so that TDML tests won't have to use two different XML infoset
> files.
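The idea of comparing numerically rather than textually is sketched below in C for consistency with the other examples. The TDML Runner itself is not written in C, so this shows only the comparison logic. The function names are hypothetical, and parsing both values as double with exact equality is an assumption: an xs:float element might be better compared at float precision (strtof).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// Parses text as a double. Returns false if the text does not start with a
// number or has trailing characters other than whitespace.
static bool
parse_double(const char *text, double *value)
{
    char *end = NULL;
    *value = strtod(text, &end);
    if (end == text) return false;
    while (isspace((unsigned char)*end)) end++;
    return *end == '\0';
}

// Compares two infoset text values by numeric value, not by spelling, so
// that "1.0" and "1" are treated as equal. Two NaN values are treated as
// equal here, unlike the == operator (an assumption of this sketch).
static bool
float_text_equal(const char *expected, const char *actual)
{
    double e, a;
    if (!parse_double(expected, &e) || !parse_double(actual, &a))
    {
        return false;
    }
    if (e != e && a != a) return true; // both are NaN
    return e == a;
}
```

With such a comparison, two runtimes that print the same value as "1.5E2" and "150.0" could share one expected infoset file.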
> 7. Arrays
> Instead of expanding arrays inline within childrenERDs, we may want to store
> a single entry for an array in childrenERDs giving the array's offset and
> size of all its elements. We would have to write special-case code to treat
> array member fields differently from scalar member fields, but we could
> save space/memory in childrenERDs for use cases with very large arrays. An
> array element's ERD should have minOccurs and maxOccurs, where minOccurs is
> unsigned and maxOccurs is signed with -1 meaning "unbounded". The actual
> number of children in an array instance would have to be stored in the array
> instance object (where, in the C struct or what?). An array node has to be a
> different kind of infoset node with a place to store this actual number of
> children. Probably all ERDs should just get minOccurs and maxOccurs: a
> scalar is one with 1, 1 as those values, an optional element is 0, 1, and an
> array is any other legal combination (N, -1 or N, M with N<=M). A
> restriction that minOccurs is 0, 1, or equal to maxOccurs (which is not -1)
> is acceptable. A restriction that maxOccurs is 1, -1, or equal to minOccurs
> is also fine (it means variable-length arrays always have an unbounded
> number of elements).
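The minOccurs/maxOccurs classification described in item 7 might be sketched as follows. The Occurs struct, the OccursKind enum, and both function names are hypothetical and not part of any existing ERD definition. The sketch does not enforce the optional extra restrictions mentioned above and does not answer where the actual child count should live.

```c
#include <stdbool.h>
#include <stdint.h>

#define UNBOUNDED (-1) // maxOccurs value meaning "unbounded"

// Hypothetical occurrence fields that every ERD could carry.
typedef struct Occurs
{
    uint32_t minOccurs; // unsigned
    int32_t  maxOccurs; // signed, UNBOUNDED (-1) means no upper limit
} Occurs;

typedef enum OccursKind
{
    OCCURS_INVALID,
    OCCURS_SCALAR,   // 1, 1
    OCCURS_OPTIONAL, // 0, 1
    OCCURS_ARRAY     // any other legal combination: N, -1 or N, M with N<=M
} OccursKind;

// Classifies an element from its occurrence bounds.
static OccursKind
classify_occurs(Occurs o)
{
    if (o.maxOccurs < UNBOUNDED) return OCCURS_INVALID;
    if (o.maxOccurs != UNBOUNDED && (int64_t)o.minOccurs > (int64_t)o.maxOccurs)
    {
        return OCCURS_INVALID;
    }
    if (o.minOccurs == 1 && o.maxOccurs == 1) return OCCURS_SCALAR;
    if (o.minOccurs == 0 && o.maxOccurs == 1) return OCCURS_OPTIONAL;
    return OCCURS_ARRAY;
}

// Checks an actual number of children against the bounds.
static bool
count_is_valid(Occurs o, uint32_t count)
{
    if (count < o.minOccurs) return false;
    return o.maxOccurs == UNBOUNDED || (int64_t)count <= (int64_t)o.maxOccurs;
}
```

The comparisons are widened to int64_t so that the unsigned minOccurs and the signed maxOccurs are never mixed in one comparison.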
> 8. Daffodil module/subdirectory names
> When Daffodil is ready to move from a 3.x to a 4.x release, rename the
> modules to have shorter and easier to understand names as discussed in
> DAFFODIL-2406.