stevedlawrence commented on a change in pull request #525: URL: https://github.com/apache/daffodil/pull/525#discussion_r609635867
########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. -The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. +Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. -For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. Review comment: Minor, but I actually prefer the full URL here. The url is something that some people might want to know so we might not want to hide it. 
########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. -The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. +Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. -For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. 
## Build Requirements * JDK 8 or higher * SBT 0.13.8 or higher -* C compiler (for daffodil-runtime2 only) -* Mini-XML Version 3.2 or higher (for daffodil-runtime2 only) +* C compiler C99 or higher +* Mini-XML Version 3.2 or higher + +Since Daffodil has a DFDL to C backend, you will need a C compiler +([gcc] or [clang]), the [Mini-XML] library, and possibly the GNU +[argp] library if your system's C library doesn't include it. You can +install gcc and libmxml as system packages on most Unix based +platforms with distribution-specific packager commands such as (Debian +and Ubuntu): + + # Just mentioning all other packages you might need too + sudo apt install build-essential curl git libmxml-dev Review comment: I'm wondering if it would be better to have a separate Markdown file that describes in detail how to get the required packages from different distros, if that's something we want to maintain. That way the main readme just lists the dependencies. Some users can easily figure that out, and then a more detailed page is available for those that need commands. Or at the very least this can be broken down into a "Build Setup" section with different sections, e.g. DNF-based distros, APT-based distros, Windows, etc. so people can easily skip to the section they need. ########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. 
-The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. +Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. -For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. ## Build Requirements * JDK 8 or higher * SBT 0.13.8 or higher -* C compiler (for daffodil-runtime2 only) -* Mini-XML Version 3.2 or higher (for daffodil-runtime2 only) +* C compiler C99 or higher +* Mini-XML Version 3.2 or higher + +Since Daffodil has a DFDL to C backend, you will need a C compiler Review comment: Are there any thoughts on making this C backend optional? Is it even possible for it to be optional? I know of people that build Daffodil from source and build an RPM, and they probably would not want the C backend, so a way to turn it off so that the potential attack surface is lessened might be a beneficial option. Is it even possible to turn it off in the build right now, or is it too tightly coupled into the Daffodil compiler? 
########## File path: daffodil-runtime2/src/main/resources/c/libcli/xml_reader.c ########## @@ -53,41 +54,44 @@ strtobool(const char *numptr, const char **errstrp) } else { - error_msg = "Error converting XML data to boolean"; + static Error error = {ERR_STRTOBOOL, {NULL}}; + error.s = numptr; + *errorptr = &error; } - *errstrp = error_msg; return value; } // Convert an XML element's text to a double (call strtod with our own // error checking) static double -strtodnum(const char *numptr, const char **errstrp) +strtodnum(const char *numptr, const Error **errorptr) { char *endptr = NULL; // Clear errno to detect error after calling strtod errno = 0; const double value = strtod(numptr, &endptr); - // Report any issues converting the string to a number + // Check for any errors converting the string to a number if (errno != 0) { - *errstrp = "Error converting XML data to number"; + static Error error = {ERR_STRTOD_ERRNO, {NULL}}; + error.s = numptr; + *errorptr = &error; Review comment: It's been a while since I've done C, but I feel like I recall that static variables should generally be avoided. I guess it's safe in this case, but I wonder if there is an alternative approach? Is the goal here just to avoid allocations? Personally, it would feel nice to me if everything, including the numptr part, could be done in one line. I think it's the static-ness of this that means we can't do that? Some alternative might also make it harder to forget to set the error parameters. I imagine if we forgot to set error.s here we might get a null pointer dereference, or at the very least an error message with the string "(null)" or something. It also seems like this could become an issue if this is ever used for parallel parsing in the same process. These statics are essentially globals that other threads could clobber, so I guess this isn't thread-safe? 
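One possible alternative, as a minimal sketch only: the caller owns the error storage and the function fills it in with a single C99 compound-literal assignment, so the code and the offending string are always set together and no function-local static is shared between threads. The names `SketchError`, `SketchErrorCode`, and `sketch_strtodnum` are hypothetical stand-ins, not the PR's `Error` type or its `strtodnum`, and no heap allocation is assumed.

```c
#include <assert.h> // for assert
#include <errno.h>  // for errno
#include <stddef.h> // for NULL
#include <stdlib.h> // for strtod

// Hypothetical stand-ins for the PR's ErrorCode and Error types
enum SketchErrorCode
{
    SKETCH_OK,
    SKETCH_STRTOD_ERRNO,
    SKETCH_STRTONUM_NOT
};

typedef struct SketchError
{
    enum SketchErrorCode code;
    const char *s; // offending text, or NULL when code is SKETCH_OK
} SketchError;

// Convert text to a double, reporting the outcome through caller-owned
// storage instead of a static variable; each branch sets both fields
// in one assignment, so neither can be forgotten
static double
sketch_strtodnum(const char *numptr, SketchError *error)
{
    assert(numptr != NULL && error != NULL);

    char *endptr = NULL;

    // Clear errno to detect error after calling strtod
    errno = 0;
    const double value = strtod(numptr, &endptr);

    if (errno != 0)
    {
        *error = (SketchError){SKETCH_STRTOD_ERRNO, numptr};
    }
    else if (endptr == numptr || *endptr != '\0')
    {
        *error = (SketchError){SKETCH_STRTONUM_NOT, numptr};
    }
    else
    {
        *error = (SketchError){SKETCH_OK, NULL};
    }
    return value;
}
```

A caller would declare `SketchError error;` on its own stack and pass `&error`, which keeps the no-allocation property while letting two threads convert numbers at the same time without sharing state. The trade-off is that the error only lives as long as the caller's variable, so it would have to be copied if it needs to outlive that scope.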
########## File path: daffodil-runtime2/src/main/resources/c/libruntime/errors.h ########## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef ERRORS_H +#define ERRORS_H + +#include <stdio.h> // for FILE, size_t +#include <stdint.h> // for int64_t + +// ErrorCode - types of errors which could occur + +enum ErrorCode +{ + ERR_CHOICE_KEY, + ERR_FILE_CLOSE, + ERR_FILE_FLUSH, + ERR_FILE_OPEN, + ERR_FIXED_VALUE, + ERR_INFOSET_READ, + ERR_INFOSET_WRITE, + ERR_PARSE_BOOL, + ERR_STACK_EMPTY, + ERR_STACK_OVERFLOW, + ERR_STACK_UNDERFLOW, + ERR_STREAM_EOF, + ERR_STREAM_ERROR, + ERR_STRTOBOOL, + ERR_STRTOD_ERRNO, + ERR_STRTOI_ERRNO, + ERR_STRTONUM_EMPTY, + ERR_STRTONUM_NOT, + ERR_STRTONUM_RANGE, + ERR_XML_DECL, + ERR_XML_ELEMENT, + ERR_XML_ERD, + ERR_XML_GONE, + ERR_XML_INPUT, + ERR_XML_LEFT, + ERR_XML_MISMATCH, + ERR_XML_WRITE +}; Review comment: Runtime1 has different kinds of diagnostics, e.g. error, warning, validation error, recoverable error. Are these all just considered "errors" in runtime1? Are there any diagnostics that are fatal errors? 
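To make the question concrete, here is a rough sketch of how a diagnostic kind could be carried next to an error code in the C runtime. Everything in it is hypothetical: `SketchSeverity`, `SketchDiagnostic`, and `sketch_must_stop` are illustrative names, the severity list simply mirrors the kinds mentioned above, and nothing here reflects how the PR actually classifies its errors.

```c
#include <assert.h>  // for assert
#include <stdbool.h> // for bool
#include <stddef.h>  // for NULL

// Hypothetical severity levels, loosely mirroring the kinds of
// diagnostics listed in the comment above
enum SketchSeverity
{
    SKETCH_WARNING,
    SKETCH_VALIDATION_ERROR,
    SKETCH_RECOVERABLE_ERROR,
    SKETCH_FATAL_ERROR
};

typedef struct SketchDiagnostic
{
    enum SketchSeverity severity;
    int                 code; // stand-in for an ErrorCode value
} SketchDiagnostic;

// In this sketch only fatal diagnostics stop the program; how the
// other kinds would be collected or reported is left open
static bool
sketch_must_stop(const SketchDiagnostic *diagnostic)
{
    assert(diagnostic != NULL);
    return diagnostic->severity == SKETCH_FATAL_ERROR;
}
```

With something along these lines, a caller could keep collecting validation diagnostics and only exit when `sketch_must_stop` returns true.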
########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. -The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. +Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. -For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. 
## Build Requirements * JDK 8 or higher * SBT 0.13.8 or higher -* C compiler (for daffodil-runtime2 only) -* Mini-XML Version 3.2 or higher (for daffodil-runtime2 only) +* C compiler C99 or higher +* Mini-XML Version 3.2 or higher + +Since Daffodil has a DFDL to C backend, you will need a C compiler +([gcc] or [clang]), the [Mini-XML] library, and possibly the GNU +[argp] library if your system's C library doesn't include it. You can +install gcc and libmxml as system packages on most Unix based +platforms with distribution-specific packager commands such as (Debian +and Ubuntu): + + # Just mentioning all other packages you might need too + sudo apt install build-essential curl git libmxml-dev + +You will need the Java Software Development Kit ([JDK]) and the Scala +Build Tool ([SBT]) to build Daffodil, run all tests, create packages, +and more. [SDK] offers an easy and uniform way to install both java +and sbt on any Unix based platform: + + curl -s "https://get.sdkman.io" | bash + sdk install java + sdk install sbt + +You can edit the Compile / cCompiler setting in build.sbt if you don't +want sbt to call your C compiler with "cc" as the driver command. + +On Windows, the easiest way to install gcc and libargp is to install +[MSYS2]'s collection of free tools and libraries although MSYS2 has no +package for libmxml which you'll need to build from source. First +install [MSYS2] following its website's installation instructions, +then run the following commands in a "MSYS2 MSYS" window: + + pacman -S gcc git libargp-devel make pkgconf + git clone https://github.com/michaelrsweet/mxml.git + cd mxml + ./configure --prefix=/usr --disable-shared --disable-threads + make + make install + +You also need to install [JDK} and [SBT] from their Windows +installation packages and define an environment variable using +Windows' control panel for editing environment variables. Define an +environment variable with the name `MSYS2_PATH_TYPE` and the value +`inherit`. 
Now when you open a new "MSYS2 MSYS" window from the Start +Menu, you will be able to type your sbt commands in the MSYS2 window +and both sbt and daffodil will be able to call the C compiler. ## Getting Started -You will need the full Java Software Development Kit ([JDK] or [SDK]), -not the Java Runtime Environment (JRE), to build Daffodil. You also -will need [SBT] to build Daffodil, run all tests, create packages, and -more. - -In order to build daffodil-runtime2, you will need a C compiler (for -example, [gcc]), the [Mini-XML] library, and possibly the [argp] -library if your system doesn't include it in its C library. - Below are some of the more common commands used for Daffodil development. ### Compile -```text -$ sbt compile -``` + sbt compile ### Tests Run all unit tests: -```text -$ sbt test -``` + sbt test Run all command line interface tests: -```text -$ sbt it:test -``` + sbt it:test ### Command Line Interface -Create Linux and Windows shell scripts in `daffodil-cli/target/universal/stage/bin/`. See -the [Command Line Interface] documentation for details on its usage: +Create Linux and Windows shell scripts in +`daffodil-cli/target/universal/stage/bin/`. 
See the [Command Line +Interface] documentation for details on its usage: -```btext -$ sbt daffodil-cli/stage -``` + sbt daffodil-cli/stage ### License Check -Generate an [Apache RAT] license check report located in ``target/rat.txt`` and error if -any unapproved licenses are found: +Generate an [Apache RAT] license check report located in +``target/rat.txt`` and error if any unapproved licenses are found: -```text -$ sbt ratCheck -``` + sbt ratCheck ### Test Coverage Report Generate an [sbt-scoverage] test coverage report located in ``target/scala-ver/scoverage-report/``: -```text -$ sbt clean coverage test it:test -$ sbt coverageAggregate -``` + sbt clean coverage test it:test + sbt coverageAggregate ## Getting Help For questions, we can be reached at the [email protected] or [email protected] mailing lists. Bugs can be reported via the [Daffodil JIRA]. [email protected] mailing lists. Bugs can be reported via the +[Daffodil JIRA]. ## License Apache Daffodil is licensed under the [Apache License, v2.0]. 
[Apache License, v2.0]: https://www.apache.org/licenses/LICENSE-2.0 [Apache RAT]: https://creadur.apache.org/rat/ -[CodeCov]: https://codecov.io/gh/apache/daffodil/ +[CodeCov]: https://app.codecov.io/gh/apache/daffodil [Command Line Interface]: https://daffodil.apache.org/cli/ -[DFDL specification]: http://www.ogf.org/dfdl -[Daffodil JIRA]: https://issues.apache.org/jira/projects/DAFFODIL +[DFDL specification]: https://daffodil.apache.org/docs/dfdl/ +[Daffodil JIRA]: https://issues.apache.org/jira/projects/DAFFODIL/ [Github Actions]: https://github.com/apache/daffodil/actions?query=branch%3Amaster+ -[JDK]: https://docs.oracle.com/en/java/javase/11/install/overview-jdk-installation.html +[JDK]: https://adoptopenjdk.net/ [Mini-XML]: https://www.msweet.org/mxml/ +[MSYS2]: https://www.msys2.org/ [Releases]: http://daffodil.apache.org/releases/ -[SBT]: http://www.scala-sbt.org -[SDK]: https://sdkman.io -[Website]: https://daffodil.apache.org +[SBT]: https://www.scala-sbt.org/ +[SDK]: https://sdkman.io/ Review comment: Should this be SDKMan or something else? SDK normally means something different. ########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. -The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. 
+Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. -For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. ## Build Requirements * JDK 8 or higher * SBT 0.13.8 or higher -* C compiler (for daffodil-runtime2 only) -* Mini-XML Version 3.2 or higher (for daffodil-runtime2 only) +* C compiler C99 or higher +* Mini-XML Version 3.2 or higher + +Since Daffodil has a DFDL to C backend, you will need a C compiler +([gcc] or [clang]), the [Mini-XML] library, and possibly the GNU +[argp] library if your system's C library doesn't include it. You can +install gcc and libmxml as system packages on most Unix based +platforms with distribution-specific packager commands such as (Debian +and Ubuntu): + + # Just mentioning all other packages you might need too + sudo apt install build-essential curl git libmxml-dev + +You will need the Java Software Development Kit ([JDK]) and the Scala +Build Tool ([SBT]) to build Daffodil, run all tests, create packages, +and more. [SDK] offers an easy and uniform way to install both java +and sbt on any Unix based platform: + + curl -s "https://get.sdkman.io" | bash + sdk install java + sdk install sbt + +You can edit the Compile / cCompiler setting in build.sbt if you don't +want sbt to call your C compiler with "cc" as the driver command. 
+ +On Windows, the easiest way to install gcc and libargp is to install +[MSYS2]'s collection of free tools and libraries although MSYS2 has no +package for libmxml which you'll need to build from source. First +install [MSYS2] following its website's installation instructions, +then run the following commands in a "MSYS2 MSYS" window: + + pacman -S gcc git libargp-devel make pkgconf + git clone https://github.com/michaelrsweet/mxml.git + cd mxml + ./configure --prefix=/usr --disable-shared --disable-threads + make + make install + +You also need to install [JDK} and [SBT] from their Windows Review comment: Should be [JDK] ########## File path: daffodil-runtime2/src/main/resources/c/libcli/daffodil_main.c ########## @@ -85,23 +93,25 @@ main(int argc, char *argv[]) output = fopen_or_exit(output, daffodil_parse.outfile, "w"); // Parse the input file into our infoset. - PState pstate = {input, 0, NULL}; + PState pstate = {input, 0, NULL, NULL}; root->erd->parseSelf(root, &pstate); - continue_or_exit(pstate.error_msg); + print_diagnostics(pstate.validati); Review comment: Is ``validati`` a typo? ########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. -The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. 
+Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. -For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. ## Build Requirements * JDK 8 or higher * SBT 0.13.8 or higher -* C compiler (for daffodil-runtime2 only) -* Mini-XML Version 3.2 or higher (for daffodil-runtime2 only) +* C compiler C99 or higher +* Mini-XML Version 3.2 or higher + +Since Daffodil has a DFDL to C backend, you will need a C compiler +([gcc] or [clang]), the [Mini-XML] library, and possibly the GNU +[argp] library if your system's C library doesn't include it. You can +install gcc and libmxml as system packages on most Unix based +platforms with distribution-specific packager commands such as (Debian +and Ubuntu): + + # Just mentioning all other packages you might need too + sudo apt install build-essential curl git libmxml-dev + +You will need the Java Software Development Kit ([JDK]) and the Scala +Build Tool ([SBT]) to build Daffodil, run all tests, create packages, +and more. [SDK] offers an easy and uniform way to install both java +and sbt on any Unix based platform: + + curl -s "https://get.sdkman.io" | bash Review comment: I don't think we should recommend that people download a script from the internet and pipe it to bash, especially when I'd assume most Linux distros already support installing openjdk. 
I'd also rather reference the sbt download page, which has deb, rpm, and Windows installers. ########## File path: daffodil-runtime2/src/main/resources/c/libcli/daffodil_main.c ########## @@ -113,22 +123,24 @@ main(int argc, char *argv[]) if (strcmp(daffodil_unparse.infoset_converter, "xml") == 0) { // Initialize our infoset's values from the XML data. - XMLReader xmlReader = { - xmlReaderMethods, input, root, NULL, NULL}; - const char *error_msg = + XMLReader xmlReader = {xmlReaderMethods, input, root, NULL, + NULL}; + const Error *error = walkInfoset((VisitEventHandler *)&xmlReader, root); - continue_or_exit(error_msg); + continue_or_exit(error); } else { - error(EXIT_FAILURE, 0, "Cannot read infoset type '%s'", - daffodil_unparse.infoset_converter); + const Error error = {ERR_INFOSET_READ, + {daffodil_unparse.infoset_converter}}; Review comment: Thoughts on increasing the max length for lines? Seems like a lot of lines are wrapped and I think it makes it a bit harder to read. We're not stuck in the days of 72 character devices anymore. I figure anything that can be viewed in GitHub review without scrolling is reasonable, and that's somewhere around 110 I think. ########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. 
-The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. +Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. -For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. ## Build Requirements * JDK 8 or higher * SBT 0.13.8 or higher -* C compiler (for daffodil-runtime2 only) -* Mini-XML Version 3.2 or higher (for daffodil-runtime2 only) +* C compiler C99 or higher +* Mini-XML Version 3.2 or higher + +Since Daffodil has a DFDL to C backend, you will need a C compiler +([gcc] or [clang]), the [Mini-XML] library, and possibly the GNU +[argp] library if your system's C library doesn't include it. You can +install gcc and libmxml as system packages on most Unix based +platforms with distribution-specific packager commands such as (Debian +and Ubuntu): + + # Just mentioning all other packages you might need too + sudo apt install build-essential curl git libmxml-dev + +You will need the Java Software Development Kit ([JDK]) and the Scala +Build Tool ([SBT]) to build Daffodil, run all tests, create packages, +and more. 
[SDK] offers an easy and uniform way to install both java +and sbt on any Unix based platform: + + curl -s "https://get.sdkman.io" | bash + sdk install java + sdk install sbt + +You can edit the Compile / cCompiler setting in build.sbt if you don't +want sbt to call your C compiler with "cc" as the driver command. + +On Windows, the easiest way to install gcc and libargp is to install +[MSYS2]'s collection of free tools and libraries although MSYS2 has no +package for libmxml which you'll need to build from source. First +install [MSYS2] following its website's installation instructions, +then run the following commands in a "MSYS2 MSYS" window: + + pacman -S gcc git libargp-devel make pkgconf + git clone https://github.com/michaelrsweet/mxml.git + cd mxml + ./configure --prefix=/usr --disable-shared --disable-threads + make + make install + +You also need to install [JDK} and [SBT] from their Windows +installation packages and define an environment variable using +Windows' control panel for editing environment variables. Define an +environment variable with the name `MSYS2_PATH_TYPE` and the value +`inherit`. Now when you open a new "MSYS2 MSYS" window from the Start +Menu, you will be able to type your sbt commands in the MSYS2 window +and both sbt and daffodil will be able to call the C compiler. ## Getting Started -You will need the full Java Software Development Kit ([JDK] or [SDK]), -not the Java Runtime Environment (JRE), to build Daffodil. You also -will need [SBT] to build Daffodil, run all tests, create packages, and -more. - -In order to build daffodil-runtime2, you will need a C compiler (for -example, [gcc]), the [Mini-XML] library, and possibly the [argp] -library if your system doesn't include it in its C library. - Below are some of the more common commands used for Daffodil development. 
### Compile -```text -$ sbt compile -``` + sbt compile ### Tests Run all unit tests: -```text -$ sbt test -``` + sbt test Run all command line interface tests: -```text -$ sbt it:test -``` + sbt it:test Review comment: This is now ``IntegrationTest / test``, might cause conflicts when this is eventually rebased. ########## File path: daffodil-runtime2/src/main/resources/c/libcli/xml_writer.c ########## @@ -16,83 +16,93 @@ */ #include "xml_writer.h" -#include "stack.h" // for stack_is_empty, stack_pop, stack_push, stack_top, stack_init, stack_is_full #include <assert.h> // for assert -#include <mxml.h> // for mxmlNewOpaquef, mxml_node_t, mxmlElementSetAttr, mxmlNewElement, mxmlDelete, mxmlNewXML, mxmlSaveFile, MXML_NO_CALLBACK +#include <mxml.h> // for mxmlNewOpaquef, mxml_node_t, mxmlElementSetAttr, mxmlGetOpaque, mxmlNewElement, mxmlDelete, mxmlGetElement, mxmlNewXML, mxmlSaveFile, MXML_NO_CALLBACK #include <stdbool.h> // for bool #include <stdint.h> // for int16_t, int32_t, int64_t, int8_t, uint16_t, uint32_t, uint64_t, uint8_t -#include <stdio.h> // for NULL, fflush +#include <stdio.h> // for NULL #include <string.h> // for strcmp +#include "errors.h" // for Error, ERR_XML_DECL, ERR_XML_ELEMENT, ERR_XML_WRITE, Error::(anonymous) +#include "stack.h" // for stack_is_empty, stack_pop, stack_push, stack_top, stack_init -// Push new XML document on stack. This function is not -// thread-safe since it uses static storage. +// Push new XML document on stack (note the stack is stored in a +// static array which could overflow and stop the program; it also +// means none of those functions are thread-safe) -static const char * +static const Error * xmlStartDocument(XMLWriter *writer) { -#define MAX_DEPTH 100 + enum + { + MAX_DEPTH = 100 Review comment: Thoughts on having a file for all these constants? Feels like you're trying really hard to avoid allocating things on the heap, but this can be pretty limiting. 
I imagine it would be nice for people with different resources to be able to easily bump up these constants in a single file rather than having to find all the places where constants are limiting the size of things that can be parsed/unparsed. ########## File path: daffodil-runtime2/src/main/resources/c/libcli/xml_reader.c ########## @@ -327,70 +346,75 @@ xmlNumberElem(XMLReader *reader, const ERD *erd, void *number) { if (strcmp(name_from_xml, name_from_erd) == 0) { + static Error error_erd = {ERR_XML_ERD, {NULL}}; + // Check for any errors getting the number - const char *errstr = NULL; + const Error *error = NULL; // Handle varying bit lengths of both signed & unsigned numbers const enum TypeCode typeCode = erd->typeCode; switch (typeCode) { case PRIMITIVE_BOOLEAN: - *(bool *)number = strtobool(number_from_xml, &errstr); + *(bool *)number = strtobool(number_from_xml, &error); break; case PRIMITIVE_FLOAT: - *(float *)number = strtofnum(number_from_xml, &errstr); + *(float *)number = strtofnum(number_from_xml, &error); break; case PRIMITIVE_DOUBLE: - *(double *)number = strtodnum(number_from_xml, &errstr); + *(double *)number = strtodnum(number_from_xml, &error); break; case PRIMITIVE_INT16: *(int16_t *)number = (int16_t)strtonum( - number_from_xml, INT16_MIN, INT16_MAX, &errstr); + number_from_xml, INT16_MIN, INT16_MAX, &error); break; case PRIMITIVE_INT32: *(int32_t *)number = (int32_t)strtonum( - number_from_xml, INT32_MIN, INT32_MAX, &errstr); + number_from_xml, INT32_MIN, INT32_MAX, &error); break; case PRIMITIVE_INT64: *(int64_t *)number = (int64_t)strtonum( - number_from_xml, INT64_MIN, INT64_MAX, &errstr); + number_from_xml, INT64_MIN, INT64_MAX, &error); break; case PRIMITIVE_INT8: *(int8_t *)number = (int8_t)strtonum(number_from_xml, INT8_MIN, - INT8_MAX, &errstr); + INT8_MAX, &error); break; case PRIMITIVE_UINT16: *(uint16_t *)number = - (uint16_t)strtounum(number_from_xml, UINT16_MAX, &errstr); + (uint16_t)strtounum(number_from_xml, UINT16_MAX, 
&error); break; case PRIMITIVE_UINT32: *(uint32_t *)number = - (uint32_t)strtounum(number_from_xml, UINT32_MAX, &errstr); + (uint32_t)strtounum(number_from_xml, UINT32_MAX, &error); break; case PRIMITIVE_UINT64: *(uint64_t *)number = - (uint64_t)strtounum(number_from_xml, UINT64_MAX, &errstr); + (uint64_t)strtounum(number_from_xml, UINT64_MAX, &error); break; case PRIMITIVE_UINT8: *(uint8_t *)number = - (uint8_t)strtounum(number_from_xml, UINT8_MAX, &errstr); + (uint8_t)strtounum(number_from_xml, UINT8_MAX, &error); break; default: - errstr = "Unexpected ERD typeCode while reading number from " - "XML data"; + error_erd.d64 = typeCode; + error = &error_erd; Review comment: Can error_erd be defined here instead of at the top, or does the C version you're targeting not allow that? ########## File path: daffodil-runtime2/src/main/resources/c/libruntime/errors.c ########## @@ -0,0 +1,200 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#include "errors.h" +#include <error.h> // for error +#include <inttypes.h> // for PRId64 +#include <stdio.h> // for NULL, feof, ferror, FILE, size_t +#include <stdlib.h> // for EXIT_FAILURE + +// error_message - return an internationalized error message + +static const char * +error_message(enum ErrorCode code) +{ + switch (code) + { + case ERR_CHOICE_KEY: + return "no match between choice dispatch key %" PRId64 + " and any branch key"; + case ERR_FILE_CLOSE: + return "error closing file"; + case ERR_FILE_FLUSH: + return "error flushing stream to file"; + case ERR_FILE_OPEN: + return "error opening file '%s'"; + case ERR_FIXED_VALUE: + return "value of element '%s' does not match value of its " + "'fixed' attribute"; + case ERR_INFOSET_READ: + return "cannot read infoset type '%s'"; + case ERR_INFOSET_WRITE: + return "cannot write infoset type '%s'"; + case ERR_PARSE_BOOL: + return "error parsing binary value %" PRId64 " as either true or false"; + case ERR_STACK_EMPTY: + return "stack empty, stopping program"; + case ERR_STACK_OVERFLOW: + return "stack overflow, stopping program"; + case ERR_STACK_UNDERFLOW: + return "stack underflow, stopping program"; + case ERR_STREAM_EOF: + return "EOF in stream, stopping program"; + case ERR_STREAM_ERROR: + return "error in stream, stopping program"; + case ERR_STRTOBOOL: + return "error converting XML data '%s' to boolean"; + case ERR_STRTOD_ERRNO: + return "error converting XML data '%s' to number"; + case ERR_STRTOI_ERRNO: + return "error converting XML data '%s' to integer"; + case ERR_STRTONUM_EMPTY: + return "found no number in XML data '%s'"; + case ERR_STRTONUM_NOT: + return "found non-number characters in XML data '%s'"; + case ERR_STRTONUM_RANGE: + return "number in XML data '%s' out of range"; + case ERR_XML_DECL: + return "error making new XML declaration"; + case ERR_XML_ELEMENT: + return "error making new XML element '%s'"; + case ERR_XML_ERD: + return "unexpected ERD typeCode %" PRId64 " while reading 
XML data"; + case ERR_XML_GONE: + return "ran out of XML data"; + case ERR_XML_INPUT: + return "unable to read XML data from input file"; + case ERR_XML_LEFT: + return "did not consume all of the XML data, '%s' left"; + case ERR_XML_MISMATCH: + return "found mismatch between XML data and infoset '%s'"; + case ERR_XML_WRITE: + return "error writing XML document"; + default: + return "unrecognized error code, shouldn't happen"; Review comment: Can this throw an assertion if this case is hit? ########## File path: README.md ########## @@ -23,107 +26,134 @@ [<img src="https://img.shields.io/maven-central/v/org.apache.daffodil/daffodil-core_2.12.svg?color=brightgreen&label=version" align="right"/>][Releases] <br clear="both" /> -Apache Daffodil is an open-source implementation of the [DFDL specification] -that uses DFDL data descriptions to parse fixed format data into an infoset. -This infoset is commonly converted into XML or JSON to enable the use of -well-established XML or JSON technologies and libraries to consume, inspect, -and manipulate fixed format data in existing solutions. Daffodil is also -capable of serializing or "unparsing" data back to the original data format. -The DFDL infoset can also be converted directly to/from the data structures -carried by data processing frameworks so as to bypass any XML/JSON overheads. +Apache Daffodil is an open-source implementation of the [DFDL +specification] that uses DFDL data descriptions to parse fixed format +data into an infoset. This infoset is commonly converted into XML or +JSON to enable the use of well-established XML or JSON technologies +and libraries to consume, inspect, and manipulate fixed format data in +existing solutions. Daffodil is also capable of serializing or +"unparsing" data back to the original data format. The DFDL infoset +can also be converted directly to/from the data structures carried by +data processing frameworks so as to bypass any XML/JSON overheads. 
-For more information about Daffodil, see https://daffodil.apache.org/. +For more information about Daffodil, see the [Website]. ## Build Requirements * JDK 8 or higher * SBT 0.13.8 or higher -* C compiler (for daffodil-runtime2 only) -* Mini-XML Version 3.2 or higher (for daffodil-runtime2 only) +* C compiler C99 or higher +* Mini-XML Version 3.2 or higher + +Since Daffodil has a DFDL to C backend, you will need a C compiler +([gcc] or [clang]), the [Mini-XML] library, and possibly the GNU +[argp] library if your system's C library doesn't include it. You can +install gcc and libmxml as system packages on most Unix based +platforms with distribution-specific packager commands such as (Debian +and Ubuntu): + + # Just mentioning all other packages you might need too + sudo apt install build-essential curl git libmxml-dev + +You will need the Java Software Development Kit ([JDK]) and the Scala +Build Tool ([SBT]) to build Daffodil, run all tests, create packages, +and more. [SDK] offers an easy and uniform way to install both java +and sbt on any Unix based platform: + + curl -s "https://get.sdkman.io" | bash + sdk install java + sdk install sbt + +You can edit the Compile / cCompiler setting in build.sbt if you don't +want sbt to call your C compiler with "cc" as the driver command. Review comment: Is this something people might commonly need to do? If so, I wonder if it would be better to recommend actual commands to run, e.g. ``set Compile / cCompiler := "gcc"``, rather than changing build configs? Or maybe we should set cCompiler based on an environment variable if one is set? Does the ccPlugin not have logic for finding the appropriate compiler? ########## File path: daffodil-runtime2/src/main/resources/c/libruntime/errors.c ########## @@ -0,0 +1,200 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. 
See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include "errors.h" +#include <error.h> // for error +#include <inttypes.h> // for PRId64 +#include <stdio.h> // for NULL, feof, ferror, FILE, size_t +#include <stdlib.h> // for EXIT_FAILURE + +// error_message - return an internationalized error message + +static const char * +error_message(enum ErrorCode code) +{ + switch (code) + { + case ERR_CHOICE_KEY: + return "no match between choice dispatch key %" PRId64 + " and any branch key"; + case ERR_FILE_CLOSE: + return "error closing file"; + case ERR_FILE_FLUSH: + return "error flushing stream to file"; + case ERR_FILE_OPEN: + return "error opening file '%s'"; + case ERR_FIXED_VALUE: + return "value of element '%s' does not match value of its " + "'fixed' attribute"; + case ERR_INFOSET_READ: + return "cannot read infoset type '%s'"; + case ERR_INFOSET_WRITE: + return "cannot write infoset type '%s'"; + case ERR_PARSE_BOOL: + return "error parsing binary value %" PRId64 " as either true or false"; + case ERR_STACK_EMPTY: + return "stack empty, stopping program"; + case ERR_STACK_OVERFLOW: + return "stack overflow, stopping program"; + case ERR_STACK_UNDERFLOW: + return "stack underflow, stopping program"; + case ERR_STREAM_EOF: + return "EOF in stream, stopping program"; + case ERR_STREAM_ERROR: + return "error in stream, 
stopping program"; + case ERR_STRTOBOOL: + return "error converting XML data '%s' to boolean"; + case ERR_STRTOD_ERRNO: + return "error converting XML data '%s' to number"; + case ERR_STRTOI_ERRNO: + return "error converting XML data '%s' to integer"; + case ERR_STRTONUM_EMPTY: + return "found no number in XML data '%s'"; + case ERR_STRTONUM_NOT: + return "found non-number characters in XML data '%s'"; + case ERR_STRTONUM_RANGE: + return "number in XML data '%s' out of range"; + case ERR_XML_DECL: + return "error making new XML declaration"; + case ERR_XML_ELEMENT: + return "error making new XML element '%s'"; + case ERR_XML_ERD: + return "unexpected ERD typeCode %" PRId64 " while reading XML data"; + case ERR_XML_GONE: + return "ran out of XML data"; + case ERR_XML_INPUT: + return "unable to read XML data from input file"; + case ERR_XML_LEFT: + return "did not consume all of the XML data, '%s' left"; + case ERR_XML_MISMATCH: + return "found mismatch between XML data and infoset '%s'"; + case ERR_XML_WRITE: + return "error writing XML document"; + default: + return "unrecognized error code, shouldn't happen"; + } +} + +// print_maybe_stop - print a message and maybe stop the program + +static void +print_maybe_stop(const Error *err, int status) +{ + const int errnum = 0; + const char *format = "%s"; + const char *msg = error_message(err->code); + + switch (err->code) + { + case ERR_FILE_OPEN: + case ERR_FIXED_VALUE: + case ERR_INFOSET_READ: + case ERR_INFOSET_WRITE: + case ERR_STRTOBOOL: + case ERR_STRTOD_ERRNO: + case ERR_STRTOI_ERRNO: + case ERR_STRTONUM_EMPTY: + case ERR_STRTONUM_NOT: + case ERR_STRTONUM_RANGE: + case ERR_XML_ELEMENT: + case ERR_XML_LEFT: + case ERR_XML_MISMATCH: + error(status, errnum, msg, err->s); + break; + case ERR_CHOICE_KEY: + case ERR_PARSE_BOOL: + case ERR_XML_ERD: + error(status, errnum, msg, err->d64); + break; + default: + error(status, errnum, format, msg); + break; + } +} + +// need_diagnostics - return pointer to validation 
diagnostics + +Diagnostics * +need_diagnostics(void) +{ + static Diagnostics validati; + return &validati; Review comment: I can't say I've ever seen this usage of a static variable before. Is this just a way to have a global variable without it actually looking like a global variable? ########## File path: daffodil-runtime2/src/main/resources/c/libruntime/errors.h ########## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#ifndef ERRORS_H +#define ERRORS_H + +#include <stdio.h> // for FILE, size_t +#include <stdint.h> // for int64_t + +// ErrorCode - types of errors which could occur + +enum ErrorCode +{ + ERR_CHOICE_KEY, + ERR_FILE_CLOSE, + ERR_FILE_FLUSH, + ERR_FILE_OPEN, + ERR_FIXED_VALUE, + ERR_INFOSET_READ, + ERR_INFOSET_WRITE, + ERR_PARSE_BOOL, + ERR_STACK_EMPTY, + ERR_STACK_OVERFLOW, + ERR_STACK_UNDERFLOW, + ERR_STREAM_EOF, + ERR_STREAM_ERROR, + ERR_STRTOBOOL, + ERR_STRTOD_ERRNO, + ERR_STRTOI_ERRNO, + ERR_STRTONUM_EMPTY, + ERR_STRTONUM_NOT, + ERR_STRTONUM_RANGE, + ERR_XML_DECL, + ERR_XML_ELEMENT, + ERR_XML_ERD, + ERR_XML_GONE, + ERR_XML_INPUT, + ERR_XML_LEFT, + ERR_XML_MISMATCH, + ERR_XML_WRITE +}; + +// Error - specific error occuring now + +typedef struct Error +{ + enum ErrorCode code; + union + { + const char *s; // for %s + int64_t d64; // for %d64 + }; +} Error; + +// Diagnostics - array of validation errors + +typedef struct Diagnostics +{ + Error array[100]; + size_t length; +} Diagnostics; + +// PState - mutable state while parsing data + +typedef struct PState +{ + FILE * stream; // input to read data from + size_t position; // 0-based position in stream + Diagnostics *validati; // any validation diagnostics + const Error *error; // any error which stops program +} PState; + +// UState - mutable state while unparsing infoset + +typedef struct UState +{ + FILE * stream; // output to write data to + size_t position; // 0-based position in stream + Diagnostics *validati; // any validation diagnostics + const Error *error; // any error which stops program +} UState; + +// need_diagnostics - return pointer to validation diagnostics + +extern Diagnostics *need_diagnostics(void); Review comment: Why get diagnostics from the static variable, rather than just having the PState/UState have a Diagnostics instance? Then PState/UState can be passed around and this state can be mutated as diagnostics/errors are created. 
Avoids potential issues with global state/threading. ########## File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/BinaryAbstractCodeGenerator.scala ########## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.daffodil.runtime2.generators + +import org.apache.daffodil.dsom.ElementBase +import org.apache.daffodil.schema.annotation.props.gen.BitOrder +import org.apache.daffodil.schema.annotation.props.gen.ByteOrder +import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind + +trait BinaryAbstractCodeGenerator { + + def binaryAbstractGenerateCode(e: ElementBase, initialValue: String, prim: String, + parseArgs: String, unparseArgs: String, cgState: CodeGeneratorState): Unit = { + + // For the time being this is a very limited back end. + // So there are some restrictions to enforce. 
+ e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, "Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.") + e.schemaDefinitionUnless(e.byteOrderEv.isConstant, "Runtime dfdl:byteOrder expressions not supported.") + e.schemaDefinitionUnless(e.elementLengthInBitsEv.isConstant, "Runtime dfdl:length expressions not supported.") + + val fieldName = e.namedQName.local + val byteOrder = e.byteOrderEv.constValue + val conv = if (byteOrder eq ByteOrder.BigEndian) "be" else "le" + val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) e.maxOccurs else 0 + val fixed = e.xml.attribute("fixed") + val fixedValue = if (fixed.isDefined) fixed.get.text else "" + + def addStatements(deref: String): Unit = { + val initStatement = s" instance->$fieldName$deref = $initialValue;" + val parseStatement = + s""" parse_${conv}_$prim(&instance->$fieldName$deref, $parseArgs); + | if (pstate->error) return;""".stripMargin + val unparseStatement = + s""" unparse_${conv}_$prim(instance->$fieldName$deref, $unparseArgs); + | if (ustate->error) return;""".stripMargin + cgState.addSimpleTypeStatements(initStatement, parseStatement, unparseStatement) + + if (fixedValue.nonEmpty) { + val init2 = "" + val parse2 = + s""" parse_validate_fixed(instance->$fieldName$deref == $fixedValue, "$fieldName", pstate); Review comment: Wondering if there's a potential security issue here? Looks like we just inject the fixed value from the schema directly into this C code, which means I think a schema could look something like this to inject some code? ```xml <xs:element name="foo" ... fixed='0, "foo", pstate); malicious C code here; //' /> ``` ########## File path: daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/generators/BinaryAbstractCodeGenerator.scala ########## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. 
See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.daffodil.runtime2.generators + +import org.apache.daffodil.dsom.ElementBase +import org.apache.daffodil.schema.annotation.props.gen.BitOrder +import org.apache.daffodil.schema.annotation.props.gen.ByteOrder +import org.apache.daffodil.schema.annotation.props.gen.OccursCountKind + +trait BinaryAbstractCodeGenerator { + + def binaryAbstractGenerateCode(e: ElementBase, initialValue: String, prim: String, + parseArgs: String, unparseArgs: String, cgState: CodeGeneratorState): Unit = { + + // For the time being this is a very limited back end. + // So there are some restrictions to enforce. 
+ e.schemaDefinitionUnless(e.bitOrder eq BitOrder.MostSignificantBitFirst, "Only dfdl:bitOrder 'mostSignificantBitFirst' is supported.") + e.schemaDefinitionUnless(e.byteOrderEv.isConstant, "Runtime dfdl:byteOrder expressions not supported.") + e.schemaDefinitionUnless(e.elementLengthInBitsEv.isConstant, "Runtime dfdl:length expressions not supported.") + + val fieldName = e.namedQName.local + val byteOrder = e.byteOrderEv.constValue + val conv = if (byteOrder eq ByteOrder.BigEndian) "be" else "le" + val arraySize = if (e.occursCountKind == OccursCountKind.Fixed) e.maxOccurs else 0 + val fixed = e.xml.attribute("fixed") + val fixedValue = if (fixed.isDefined) fixed.get.text else "" Review comment: Would prefer we add this fixed logic to the dsom. For example, right now there is nothing parsing the fixed value and making sure it matches the type. For example ``` <xs:element name="someBool" type="xs:boolean" fixed="invalidBooleanValue" /> ``` Ideally the dsom would convert the "fixed" value to match the primitive type, which the DSOM should be doing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
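To make the two comments on the `fixed` attribute concrete, here is a minimal sketch of the kind of check being asked for: parse the attribute text according to the element's primitive type and only emit a literal the generator produced itself. Everything in it is hypothetical and not part of Daffodil or this pull request: the `FixedValue` object, the `toCLiteral` helper, and the primitive names (`bool`, `int8` ... `uint64`) are assumptions chosen for illustration, and floating-point types are deliberately left out.

```scala
import scala.util.Try

// Hypothetical helper (not Daffodil API): convert the text of a schema's
// "fixed" attribute into a literal that is safe to splice into generated C,
// or report why it was rejected.
object FixedValue {
  // Inclusive range of a signed integer with the given number of bits.
  private def signed(bits: Int): (BigInt, BigInt) =
    (-(BigInt(1) << (bits - 1)), (BigInt(1) << (bits - 1)) - 1)

  // Inclusive range of an unsigned integer with the given number of bits.
  private def unsigned(bits: Int): (BigInt, BigInt) =
    (BigInt(0), (BigInt(1) << bits) - 1)

  // Assumed primitive names; a real generator would use its own type codes.
  private val intRanges: Map[String, (BigInt, BigInt)] = Map(
    "int8" -> signed(8), "int16" -> signed(16),
    "int32" -> signed(32), "int64" -> signed(64),
    "uint8" -> unsigned(8), "uint16" -> unsigned(16),
    "uint32" -> unsigned(32), "uint64" -> unsigned(64)
  )

  def toCLiteral(prim: String, raw: String): Either[String, String] = {
    val text = raw.trim
    prim match {
      case "bool" =>
        text match {
          case "true" | "1"  => Right("true")
          case "false" | "0" => Right("false")
          case _             => Left(s"'$text' is not a valid xs:boolean")
        }
      case p if intRanges.contains(p) =>
        val (lo, hi) = intRanges(p)
        // Only the re-printed number is returned, never the schema text,
        // so extra characters cannot reach the generated C source.
        // (Sketch only: C literal suffixes for wide values are not handled.)
        Try(BigInt(text)).toOption match {
          case Some(n) if n >= lo && n <= hi => Right(n.toString)
          case Some(_) => Left(s"'$text' is out of range for $p")
          case None    => Left(s"'$text' is not a valid integer")
        }
      case other =>
        Left(s"fixed values of type '$other' are not handled in this sketch")
    }
  }
}
```

With a check like this, the generator would interpolate only the `Right` value into `parse_validate_fixed(...)` and raise a schema definition error on `Left`, so both the `invalidBooleanValue` example and the injected `0, "foo", pstate); ...` string would be rejected before any C code is written.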
