[ 
https://issues.apache.org/jira/browse/DAFFODIL-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294944#comment-17294944
 ] 

Mike Beckerle commented on DAFFODIL-2202:
-----------------------------------------

I think a document is better than Jira issue threads when working at this level 
of detail where we're really working on a design and concepts. 

I prefer to do design docs as asciidoc in the dev subtree of the daffodil-site. 
IDEA has good asciidoc support, and there is also the AsciidocFX tool. That way 
the content is maintained in the daffodil-site git repo. 

Check out [https://daffodil.apache.org/dev] which is a tree of developer design 
notes.

[https://daffodil.apache.org/dev/aboutAsciiDoc/] is a page about asciidoc 
showing example content and with hints on how to use asciidoc for design 
documents.

An actual design doc page there is 
[https://daffodil.apache.org/dev/design-notes/term-sharing-in-schema-compiler/]

The creation of that page is what convinced me that asciidoc and related tools 
like asciiflow ([https://asciiflow.com/#/], which is what I used to draw most of 
the pictures on that page) are the way to go.

These things live in the daffodil-site git repo in the dev subtree, i.e., here: 
https://github.com/apache/daffodil-site/tree/master/site/dev

> Code Gen Framework
> ------------------
>
>                 Key: DAFFODIL-2202
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2202
>             Project: Daffodil
>          Issue Type: Improvement
>          Components: Back End
>    Affects Versions: 2.4.0
>            Reporter: Mike Beckerle
>            Assignee: John Interrante
>            Priority: Minor
>
> We have built an initial C code generator backend for Apache Daffodil. 
> Currently the C code generator can support binary boolean, integer, and real 
> numbers, arrays of simple and complex elements, choice groups using 
> dispatch/branch keys, validation of "fixed" values, and padding of explicit 
> length complex elements with fill bytes. We plan to continue building out the 
> C code generator until it supports a minimal subset of the DFDL 1.0 
> specification for embedded devices.
> Here are some changes which have been requested by collaborators or reviewers 
> so we don't forget them. If someone wants to help (which would be 
> appreciated), please add a comment to this issue or let the dev list know in 
> order to avoid duplication.
> h3. C struct/field name collisions
> To avoid possible name collisions, we should prepend struct names and field 
> names with namespace prefixes if their infoset elements have non-null 
> namespace prefixes.
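
As a rough illustration of the naming rule above, the sketch below builds a C identifier from an optional namespace prefix and a local name. The function name qualified_c_name and the "prefix_name" form with an underscore separator are assumptions made for this example, not something the issue specifies.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: writes "prefix_name" into buf when the prefix is
// non-null and non-empty, otherwise just "name". Returns snprintf's result
// (the length the full identifier needs, excluding the terminating NUL).
static int qualified_c_name(char *buf, size_t size, const char *prefix, const char *name)
{
    if (prefix != NULL && prefix[0] != '\0')
    {
        // Assumed separator: a single underscore
        return snprintf(buf, size, "%s_%s", prefix, name);
    }
    return snprintf(buf, size, "%s", name);
}
```

With this rule, an element foo with prefix idl would become idl_foo, while an element without a prefix keeps its plain name.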
> h3. Error reporting 
> To make runtime2 error messages easier to format and translate for 
> internationalization, we should change the way runtime2 functions report 
> errors to callers. Currently runtime2 functions report errors by returning a 
> non-null pointer to a constant char array (that is, a pointer to a string 
> literal). It would be better to report errors by returning a non-null pointer 
> to an error struct object with member fields initialized to report an error. 
> Only the runtime2 function which prints error messages would need to perform 
> formatting and translation - all the other functions only need to fill in 
> some member fields and return a pointer.
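
The error struct idea above could look roughly like the following sketch. All names here (ErrorCode, Error, check_fixed, format_error) and the message texts are hypothetical and chosen only for illustration; they are not runtime2's actual definitions.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical error codes
enum ErrorCode
{
    ERR_NONE = 0,
    ERR_CHOICE_KEY,
    ERR_FIXED_VALUE
};

// Hypothetical error struct: callers fill in fields, nothing is formatted yet
typedef struct Error
{
    enum ErrorCode code;
    const char    *name;  // name of the element involved
    long           value; // offending value
} Error;

// Reports a mismatch by returning a non-null pointer to a filled-in struct,
// or NULL when there is no error. The static object keeps the sketch short
// but is not thread-safe.
static const Error *check_fixed(const char *name, long actual, long expected)
{
    static Error error;
    if (actual == expected)
    {
        return NULL;
    }
    error.code = ERR_FIXED_VALUE;
    error.name = name;
    error.value = actual;
    return &error;
}

// The only function that knows message formats, so translation would
// happen in one place
static int format_error(const Error *error, char *buf, size_t size)
{
    switch (error->code)
    {
    case ERR_CHOICE_KEY:
        return snprintf(buf, size, "no match for choice dispatch key %ld in %s",
                        error->value, error->name);
    case ERR_FIXED_VALUE:
        return snprintf(buf, size, "value %ld does not match fixed value of %s",
                        error->value, error->name);
    default:
        return snprintf(buf, size, "unknown error");
    }
}
```

A caller would test the returned pointer against NULL and pass a non-null result up the call chain unchanged; only the top-level reporting code would call format_error().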
> h3. Anonymous/multiple choice groups
> In addition to handling elements with xs:choice complex types, we should 
> detect anonymous choice groups and refine the choice runtime structure in 
> order to allow multiple choice groups to be inlined into parent elements. 
> Example schema and corresponding C code:
> {code:xml}
>   <xs:complexType name="NestedUnionType">
>     <xs:sequence>
>       <xs:element name="first_tag" type="idl:int32"/>
>       <xs:choice dfdl:choiceDispatchKey="{xs:string(./first_tag)}">
>         <xs:element name="foo" type="idl:FooType" dfdl:choiceBranchKey="1 2"/>
>         <xs:element name="bar" type="idl:BarType" dfdl:choiceBranchKey="3 4"/>
>       </xs:choice>
>       <xs:element name="second_tag" type="idl:int32"/>
>       <xs:choice dfdl:choiceDispatchKey="{xs:string(./second_tag)}">
>         <xs:element name="fie" type="idl:FieType" dfdl:choiceBranchKey="1"/>
>         <xs:element name="fum" type="idl:FumType" dfdl:choiceBranchKey="2"/>
>       </xs:choice>
>     </xs:sequence>
>   </xs:complexType>
> {code}
> {code:c}
> typedef struct NestedUnion
> {
>     InfosetBase _base;
>     int32_t     first_tag;
>     size_t      _choice_1; // choice of which union field to use
>     union
>     {
>         foo foo;
>         bar bar;
>     };
>     int32_t     second_tag;
>     size_t      _choice_2; // choice of which union field to use
>     union
>     {
>         fie fie;
>         fum fum;
>     };
> } NestedUnion;
> {code}
> h3. Choice dispatch key expressions
> We currently support only a very restricted and simple subset of choice 
> dispatch key expressions. We would like to refactor the DPath expression 
> compiler and make it generate C code in order to support more kinds of choice 
> dispatch key expressions.
> h3. No match between choice dispatch key and choice branch keys
> Right now c-daffodil is more strict than scala-daffodil when unparsing 
> infoset XML files with no matches (or mismatches) between choice dispatch 
> keys and branch keys. Perhaps c-daffodil should load such an XML file and 
> unparse the infoset to a binary data file without raising a no-match 
> processing error in either step. We would have to code and call a choice 
> branch resolver in C which peeks at the next XML element, figures out which 
> branch of the choice group that element indicates, and 
> initializes the choice and element runtime data (_choice and childNode->erd 
> member fields) accordingly. We probably would replace the initChoice() call 
> in walkInfosetNode() with a call to that choice branch resolver and we might 
> not need to call initChoice() in unparseSelf(). When I called initChoice() in 
> all these parse, walk, and unparse places, I was pondering removing the 
> _choice member field and calling initChoice() as a function to tell us which 
> element to visit next, but we probably should have a mutable choice runtime 
> data structure.
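
A choice branch resolver along the lines described above might be sketched as follows. The ChoiceState struct, the NO_CHOICE marker, and resolve_choice_branch() are hypothetical names; the sketch assumes the name of the next XML element has already been peeked by the XML reader and that each branch is identified by its element name.

```c
#include <stddef.h>
#include <string.h>

// Marker meaning "no branch selected" (an assumption for this sketch)
#define NO_CHOICE ((size_t)-1)

// Hypothetical mutable choice runtime data: the element name of each branch
// plus the index of the branch that was resolved
typedef struct ChoiceState
{
    const char *const *branchNames; // one element name per branch
    size_t             numBranches;
    size_t             choice;      // resolved branch index, or NO_CHOICE
} ChoiceState;

// Picks the branch whose element name matches the next XML element.
// Returns 0 on success, or -1 (leaving choice set to NO_CHOICE) when the
// element does not belong to any branch of the choice group.
static int resolve_choice_branch(ChoiceState *state, const char *nextElementName)
{
    state->choice = NO_CHOICE;
    for (size_t i = 0; i < state->numBranches; i++)
    {
        if (strcmp(state->branchNames[i], nextElementName) == 0)
        {
            state->choice = i;
            return 0;
        }
    }
    return -1;
}
```

Because the branch is chosen from the element that is actually present, the dispatch key value in the XML file would no longer need to match a branch key for the file to load.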
> h3. Floating point numbers
> Right now runtime2 prints floating point numbers in XML infoset files 
> slightly differently than runtime1 does. This means TDML tests may need to 
> use different XML infoset files for different runtimes. We should be able to 
> make the TDML Runner compare floating point numbers numerically, not 
> textually, so that TDML tests won't have to use two different XML infoset 
> files.
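
The difference between textual and numeric comparison can be illustrated with a small sketch. It is written in C only to match the other examples (the TDML Runner itself is not C), and same_float_text() is a hypothetical helper, not an existing function.

```c
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper: returns true when two text values denote the same
// floating point number, even if they are spelled differently.
static bool same_float_text(const char *a, const char *b)
{
    char  *endA;
    char  *endB;
    double x = strtod(a, &endA);
    double y = strtod(b, &endB);

    // Reject empty input and trailing characters that are not part of a number
    if (endA == a || endB == b || *endA != '\0' || *endB != '\0')
    {
        return false;
    }
    // NaN never compares equal to itself, so treat two NaNs as a match
    if (isnan(x) || isnan(y))
    {
        return isnan(x) && isnan(y);
    }
    return x == y;
}
```

Under this comparison "1.0E2" and "100.0" match even though the strings differ, which is the property that would let both runtimes share one expected infoset file.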
> h3. Arrays
> Instead of expanding arrays inline within childrenERDs, we may want to store 
> a single entry for an array in childrenERDs giving the array's offset and 
> size of all its elements. We would have to write code for special case 
> treatment of array member fields versus scalar member fields but we could 
> save space/memory in childrenERDs for use cases with very large arrays. An 
> array element's ERD should have minOccurs and maxOccurs where minOccurs is 
> unsigned and maxOccurs is signed with -1 meaning "unbounded". The actual 
> number of children in an array instance would have to be stored in the array 
> instance object (where, in the C struct or what?). An array node has to be a 
> different kind of infoset node with a place for this number of actual 
> children to be stored. Probably all ERDs should just get minOccurs and 
> maxOccurs: a scalar is just one with 1, 1 as those values, an optional 
> element is 0, 1, and an array is all other legal combinations, that is, N, -1 
> and N, M with N<=M. A restriction that minOccurs is 0, 1, or equal to maxOccurs 
> (which is not -1) is acceptable. A restriction that maxOccurs is 1, -1, or 
> equal to minOccurs is also fine (it means variable-length arrays always have 
> an unbounded number of elements).
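
The minOccurs/maxOccurs classification above can be sketched as follows. The Occurs struct and the helper functions are hypothetical; the field types follow the text (unsigned minOccurs, signed maxOccurs with -1 meaning "unbounded"), and where these fields would live in an ERD is left open.

```c
#include <stdbool.h>
#include <stddef.h>

#define UNBOUNDED (-1)

// Hypothetical occurrence fields that every ERD could carry
typedef struct Occurs
{
    size_t minOccurs; // unsigned
    long   maxOccurs; // signed; UNBOUNDED (-1) means "unbounded"
} Occurs;

// Legal combinations: N, -1 and N, M with N <= M
static bool is_legal(const Occurs *o)
{
    if (o->maxOccurs == UNBOUNDED)
    {
        return true;
    }
    return o->maxOccurs >= 0 && o->minOccurs <= (size_t)o->maxOccurs;
}

// A scalar has 1, 1
static bool is_scalar(const Occurs *o)
{
    return o->minOccurs == 1 && o->maxOccurs == 1;
}

// An optional element has 0, 1
static bool is_optional(const Occurs *o)
{
    return o->minOccurs == 0 && o->maxOccurs == 1;
}

// An array is every other legal combination
static bool is_array(const Occurs *o)
{
    return is_legal(o) && !is_scalar(o) && !is_optional(o);
}
```

For example, {2, UNBOUNDED} and {3, 3} classify as arrays, while {3, 2} is rejected as illegal.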
> h3. Daffodil module/subdirectory names
> When Daffodil is ready to move from a 3.x to a 4.x release, rename the 
> modules to have shorter and easier to understand names as discussed in 
> DAFFODIL-2406.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
