tuxji commented on a change in pull request #422:
URL: https://github.com/apache/incubator-daffodil/pull/422#discussion_r526947840
##########
File path: .gitignore
##########
@@ -52,23 +43,35 @@ daffodil-extra/*
.idea_modules
*.iml
#
-# FOR EMACS ENSIME
+# For Emacs & ensime
#
-.ensime_cache
-# these are autosave emacs files
-\#*#
+*~
.\#*
.\#.*
+.ensime_cache
+\#*#
#
-# FOR VIM
+# For vim
#
.*.swp
#
# For Eclipse
#
.classpath
.project
-bin/
+.settings
/eclipse-projects
-.settings/
-
+bin
+#
+# For C, Scala Metals, Visual Studio Code & zig cc
+#
+*.a
+*.o
+.bloop
+.metals
+.vscode
+GPATH
+GRTAGS
+GTAGS
+daffodil-runtime2/src/main/resources/daffodil
+zig-cache
Review comment:
This .gitignore is an old version. A more recent pull request on the main
branch removed most IDE-specific entries from it, and I will keep this
.gitignore as clean as possible when I rebase runtime-2202 to keep up with
changes to the main branch.
##########
File path: build.sbt
##########
@@ -68,9 +100,9 @@ lazy val tdmlProc =
Project("daffodil-tdml-processor", file("daffodil-td
.settings(commonSettings)
lazy val cli = Project("daffodil-cli",
file("daffodil-cli")).configs(IntegrationTest)
- .dependsOn(tdmlProc, sapi, japi, udf % "it->test") // causes sapi/japi to be pulled in to the helper zip/tar
+ .dependsOn(tdmlProc, runtime2, sapi, japi, udf % "it->test") // causes runtime2/sapi/japi to be pulled in to the helper zip/tar
Review comment:
Even though we're using Class.forName to avoid any compile time
dependency from daffodil core or cli classes on runtime2 classes, we still need
to ensure that the runtime2 classes are included in the daffodil cli's
classpath.
##########
File path: build.sbt
##########
@@ -43,6 +47,34 @@ lazy val runtime1 = Project("daffodil-runtime1",
file("daffodil-runtime1
.dependsOn(io, lib % "test->test", udf, macroLib
% "compile-internal, test-internal")
.settings(commonSettings, usesMacros)
+val runtime2CFiles = Library("libruntime2.a")
+lazy val runtime2 = Project("daffodil-runtime2",
file("daffodil-runtime2")).configs(IntegrationTest)
+ .enablePlugins(CcPlugin)
+ .dependsOn(core, core % "test->test", tdmlProc)
+ .settings(commonSettings)
+ .settings(publishArtifact in (Compile, packageDoc) := false)
+ .settings(
+ Compile / ccTargets := ListSet(runtime2CFiles),
+ Compile / cSources := Map(
+ runtime2CFiles -> (
+ ((Compile / resourceDirectory).value / "c" * GlobFilter("*.c")).get() ++
+ ((Compile / resourceDirectory).value / "examples" * GlobFilter("*.c")).get()
+ )
+ ),
+ Compile / cIncludeDirectories := Map(
+ runtime2CFiles -> Seq(
+ (Compile / resourceDirectory).value / "c",
+ (Compile / resourceDirectory).value / "examples"
+ )
+ ),
+ Compile / cFlags := (Compile / cFlags).value.withDefaultValue(Seq(
+ "-g",
+ "-Wall",
+ "-Wextra",
+ "-Wno-missing-field-initializers",
+ ))
+ )
+
Review comment:
Strictly speaking, we don't need to compile the C source files in "sbt
compile" because nothing uses their object files (our runtime2 code always
builds an executable directly from C source files at runtime). The goal of
these sbt-cc settings is to warn developers as quickly as possible if they
change something in the C source files that doesn't compile correctly;
otherwise, developers may not realize it until they run the runtime2 unit
tests. I'm willing to take out these sbt-cc settings if some developers don't
want to have any C compiler installed on their systems. Then only developers
working with the C source files will need to install a C compiler, and they can
get early warnings by using an IDE with a C plugin, like Visual Studio Code, to
edit the C source files.
##########
File path: daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala
##########
@@ -1337,11 +1398,42 @@ object Main extends Logging {
0
}
- case _ => {
- // This should never happen, this is caught by validation
- Assert.impossible()
- // 1
+ case Some(conf.generate) => {
+ conf.subcommands match {
+ case List(conf.generate, conf.generate.c) => {
+ val generateOpts = conf.generate.c
+
+ // Read any config file and any tunables given as arguments
+ val cfgFileNode = generateOpts.config.toOption match {
+ case None => None
+ case Some(pathToConfig) => Some(this.loadConfigurationFile(pathToConfig))
+ }
+ val tunables = retrieveTunables(generateOpts.tunables, cfgFileNode)
+
+ // Create a CodeGenerator from the DFDL schema
+ val generator = createGeneratorFromSchema(generateOpts.schema(), generateOpts.rootNS.toOption,
+ tunables, generateOpts.language)
+
+ // Ask the CodeGenerator to generate source code from the DFDL schema
+ val rootNS = generateOpts.rootNS.toOption
+ val outputDir = generateOpts.outputDir.toOption.getOrElse(".")
+ val rc = generator match {
+ case Some(generator) => {
+ Timer.getResult("generating", generator.generateCode(rootNS, outputDir))
+ displayDiagnostics(generator)
+ if (generator.isError) 1 else 0
+ }
+ case None => 1
+ }
+ rc
+ }
+ // Required to avoid "match may not be exhaustive", but should never happen
+ case _ => Assert.impossible()
+ }
Review comment:
We can generate code for more languages by adding more cases here. The
code inside the case probably would need no change except for the first two
lines, so if we add another case we should extract the code in our "c" case
into a function that can be called from another language-specific case.
```
case List(conf.generate, conf.generate.c) => {
val generateOpts = conf.generate.c
```
##########
File path:
daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
##########
@@ -388,6 +396,14 @@
</xs:documentation>
</xs:annotation>
</xs:element>
+ <xs:element name="tdmlImplementation" type="xs:string"
default="daffodil" minOccurs="0">
+ <xs:annotation>
+ <xs:documentation>
+ TDMLDFDLProcessorFactory implementation to use when running TDML
tests.
+ Allowed values are "daffodil" (default), "daffodil-runtime2",
and "ibm".
+ </xs:documentation>
+ </xs:annotation>
+ </xs:element>
<xs:element name="unqualifiedPathStepPolicy"
type="daf:TunableUnqualifiedPathStepPolicy" default="noNamespace" minOccurs="0">
Review comment:
TDML tests can set this tunable to tell Daffodil which TDML processor
class should be used to run them (they already have to say which TDML
implementations they can work with if they want to be compatible with multiple
processors). Otherwise, we would have to use a different CLI executable to run
TDML tests with runtime2's TDML processor class, or else modify `daffodil test
<tdml>` to accept an option similar to this tunable.
##########
File path: daffodil-runtime2/src/main/resources/examples/ex_int32.c
##########
@@ -0,0 +1,260 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "ex_int32.h" // for generated code structs
+#include <endian.h> // for be32toh, htobe32
+#include <stddef.h> // for ptrdiff_t
+#include <stdio.h> // for NULL, fread, fwrite, size_t, FILE
+
+// Prototypes needed for compilation
+
+static void c2_initSelf(c2 *instance);
+static const char *c2_parseSelf(c2 *instance, const PState *pstate);
+static const char *c2_unparseSelf(const c2 *instance, const UState *ustate);
+static void c1_initSelf(c1 *instance);
+static const char *c1_parseSelf(c1 *instance, const PState *pstate);
+static const char *c1_unparseSelf(const c1 *instance, const UState *ustate);
+
+// Metadata singletons
+
+static const ERD e1_ERD = {
+ {
+ "ex", // namedQName.prefix
+ "e1", // namedQName.local
+ NULL, // namedQName.ns
+ },
+ PRIMITIVE_INT32, // typeCode
+ 0, // numChildren
+ NULL, // offsets
+ NULL, // childrenERDs
+ NULL, // initSelf
+ NULL, // parseSelf
+ NULL, // unparseSelf
+};
+
+static const ERD e2_ERD = {
+ {
+ "ex", // namedQName.prefix
+ "e2", // namedQName.local
+ NULL, // namedQName.ns
+ },
+ PRIMITIVE_INT32, // typeCode
+ 0, // numChildren
+ NULL, // offsets
+ NULL, // childrenERDs
+ NULL, // initSelf
+ NULL, // parseSelf
+ NULL, // unparseSelf
+};
+
+static const ERD e3_ERD = {
+ {
+ "ex", // namedQName.prefix
+ "e3", // namedQName.local
+ NULL, // namedQName.ns
+ },
+ PRIMITIVE_INT32, // typeCode
+ 0, // numChildren
+ NULL, // offsets
+ NULL, // childrenERDs
+ NULL, // initSelf
+ NULL, // parseSelf
+ NULL, // unparseSelf
+};
+
+static const c2 c2_compute_ERD_offsets;
+
+static const ptrdiff_t c2_offsets[2] = {
+ (char *)&c2_compute_ERD_offsets.e2 - (char *)&c2_compute_ERD_offsets,
+ (char *)&c2_compute_ERD_offsets.e3 - (char *)&c2_compute_ERD_offsets};
+
+static const ERD *c2_childrenERDs[2] = {&e2_ERD, &e3_ERD};
+
+static const ERD c2_ERD = {
+ {
+ "ex", // namedQName.prefix
+ "c2", // namedQName.local
+ NULL, // namedQName.ns
+ },
+ COMPLEX, // typeCode
+ 2, // numChildren
+ c2_offsets, // offsets
+ c2_childrenERDs, // childrenERDs
+ (ERDInitSelf)&c2_initSelf, // initSelf
+ (ERDParseSelf)&c2_parseSelf, // parseSelf
+ (ERDUnparseSelf)&c2_unparseSelf, // unparseSelf
+};
+
+static const c1 c1_compute_ERD_offsets;
+
+static const ptrdiff_t c1_offsets[2] = {
+ (char *)&c1_compute_ERD_offsets.e1 - (char *)&c1_compute_ERD_offsets,
+ (char *)&c1_compute_ERD_offsets.c2 - (char *)&c1_compute_ERD_offsets};
+
+static const ERD *c1_childrenERDs[2] = {&e1_ERD, &c2_ERD};
+
+static const ERD c1_ERD = {
+ {
+ "ex", // namedQName.prefix
+ "c1", // namedQName.local
+ "http://example.com", // namedQName.ns
+ },
+ COMPLEX, // typeCode
+ 2, // numChildren
+ c1_offsets, // offsets
+ c1_childrenERDs, // childrenERDs
+ (ERDInitSelf)&c1_initSelf, // initSelf
+ (ERDParseSelf)&c1_parseSelf, // parseSelf
+ (ERDUnparseSelf)&c1_unparseSelf, // unparseSelf
+};
+
+// Return a root element to be used for parsing or unparsing
+
+InfosetBase *
+rootElement()
+{
+ static c1 instance;
+ InfosetBase *root = &instance._base;
+ c1_ERD.initSelf(root);
+ return root;
+}
+
+// Methods to initialize, parse, and unparse infoset nodes
+
+static void
+c2_initSelf(c2 *instance)
+{
+ instance->e2 = 0xCDCDCDCD;
+ instance->e3 = 0xCDCDCDCD;
+ instance->_base.erd = &c2_ERD;
+}
+
+static const char *
+c2_parseSelf(c2 *instance, const PState *pstate)
+{
+ const char *error_msg = NULL;
+ if (!error_msg)
+ {
+ char buffer[4];
+ size_t count = fread(&buffer, 1, sizeof(buffer), pstate->stream);
+ if (count < sizeof(buffer))
+ {
+ error_msg = eof_or_error_msg(pstate->stream);
+ }
+ instance->e2 = be32toh(*((uint32_t *)(&buffer)));
+ }
+ if (!error_msg)
+ {
+ char buffer[4];
+ size_t count = fread(&buffer, 1, sizeof(buffer), pstate->stream);
+ if (count < sizeof(buffer))
+ {
+ error_msg = eof_or_error_msg(pstate->stream);
+ }
+ instance->e3 = be32toh(*((uint32_t *)(&buffer)));
+ }
+ return error_msg;
+}
+
+static const char *
+c2_unparseSelf(const c2 *instance, const UState *ustate)
+{
+ const char *error_msg = NULL;
+ if (!error_msg)
+ {
+ union
+ {
+ char c_val[4];
+ uint32_t i_val;
+ } buffer;
+ buffer.i_val = htobe32(instance->e2);
+ size_t count = fwrite(buffer.c_val, 1, sizeof(buffer), ustate->stream);
+ if (count < sizeof(buffer))
+ {
+ error_msg = eof_or_error_msg(ustate->stream);
+ }
+ }
+ if (!error_msg)
+ {
+ union
+ {
+ char c_val[4];
+ uint32_t i_val;
+ } buffer;
+ buffer.i_val = htobe32(instance->e3);
+ size_t count = fwrite(buffer.c_val, 1, sizeof(buffer), ustate->stream);
+ if (count < sizeof(buffer))
+ {
+ error_msg = eof_or_error_msg(ustate->stream);
+ }
+ }
+ return error_msg;
+}
+
+static void
+c1_initSelf(c1 *instance)
+{
+ instance->e1 = 0xCDCDCDCD;
+ c2_initSelf(&instance->c2);
+ instance->_base.erd = &c1_ERD;
+}
+
+static const char *
+c1_parseSelf(c1 *instance, const PState *pstate)
+{
+ const char *error_msg = NULL;
+ if (!error_msg)
+ {
+ char buffer[4];
+ size_t count = fread(&buffer, 1, sizeof(buffer), pstate->stream);
+ if (count < sizeof(buffer))
+ {
+ error_msg = eof_or_error_msg(pstate->stream);
+ }
+ instance->e1 = be32toh(*((uint32_t *)(&buffer)));
+ }
+ if (!error_msg)
+ {
+ error_msg = c2_parseSelf(&instance->c2, pstate);
+ }
+ return error_msg;
+}
+
+static const char *
+c1_unparseSelf(const c1 *instance, const UState *ustate)
+{
+ const char *error_msg = NULL;
+ if (!error_msg)
+ {
+ union
+ {
+ char c_val[4];
+ uint32_t i_val;
+ } buffer;
+ buffer.i_val = htobe32(instance->e1);
+ size_t count = fwrite(buffer.c_val, 1, sizeof(buffer), ustate->stream);
+ if (count < sizeof(buffer))
+ {
+ error_msg = eof_or_error_msg(ustate->stream);
+ }
+ }
+ if (!error_msg)
+ {
+ error_msg = c2_unparseSelf(&instance->c2, ustate);
+ }
+ return error_msg;
+}
Review comment:
This C source file's directory and name now clearly label it as a
generated code example. You can link it with the files in the `c` directory to
debug and test any part of the code. Over time, we'll create more generated
code examples and put them in this directory too.
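As a hedged illustration of what each generated parse/unparse step does, here
is a minimal sketch that reads and writes one big-endian 32-bit integer. It
uses byte shifts instead of `be32toh`/`htobe32` and a pointer cast, so it does
not depend on `<endian.h>`. The helper names `read_be_int32` and
`write_be_int32` are hypothetical and are not part of this PR, and the error
strings are placeholders rather than the PR's `eof_or_error_msg`.

```c
#include <stdint.h> // for int32_t, uint32_t
#include <stdio.h>  // for FILE, fread, fwrite, size_t, NULL

// Hypothetical helper (not in the PR): read 4 bytes in big-endian order.
// Returns NULL on success or a static placeholder message on failure.
static const char *
read_be_int32(FILE *stream, int32_t *value)
{
    unsigned char buffer[4];
    size_t        count = fread(buffer, 1, sizeof(buffer), stream);
    if (count < sizeof(buffer))
    {
        return "short read";
    }
    uint32_t bits = ((uint32_t)buffer[0] << 24) | ((uint32_t)buffer[1] << 16) |
                    ((uint32_t)buffer[2] << 8) | (uint32_t)buffer[3];
    // Conversion to int32_t assumes the usual two's complement behavior
    *value = (int32_t)bits;
    return NULL;
}

// Hypothetical helper (not in the PR): write 4 bytes in big-endian order.
// Returns NULL on success or a static placeholder message on failure.
static const char *
write_be_int32(FILE *stream, int32_t value)
{
    uint32_t      bits = (uint32_t)value;
    unsigned char buffer[4] = {
        (unsigned char)(bits >> 24), (unsigned char)(bits >> 16),
        (unsigned char)(bits >> 8), (unsigned char)bits};
    size_t count = fwrite(buffer, 1, sizeof(buffer), stream);
    if (count < sizeof(buffer))
    {
        return "short write";
    }
    return NULL;
}
```

A caller would chain these the same way the generated code chains its steps:
stop at the first non-NULL message and return it.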
##########
File path: daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala
##########
@@ -728,6 +764,31 @@ object Main extends Logging {
pf
}
+ def createGeneratorFromSchema(schema: URI, rootNS: Option[RefQName],
tunables: Map[String, String],
+ language: String): Option[DFDL.CodeGenerator]
= {
+ val compiler = {
+ val c = Compiler().withTunables(tunables)
+ rootNS match {
+ case None => c
+ case Some(RefQName(_, root, ns)) => c.withDistinguishedRootNode(root,
ns.toStringOrNullIfNoNS)
+ }
+ }
+
+ val schemaSource = URISchemaSource(schema)
+ val cg = Timer.getResult("compiling", {
+ val processorFactory = compiler.compileSource(schemaSource)
+ if (!processorFactory.isError) {
+ val generator = processorFactory.forLanguage(language)
+ displayDiagnostics(generator)
+ Some(generator)
+ } else {
+ displayDiagnostics(processorFactory)
+ None
+ }
+ })
+ cg
+ }
+
Review comment:
Creating a generator from a schema is now similar to creating a
processor from a schema; both use the same factory. You compile the schema
first to get a factory, then you call `processorFactory.forLanguage(language)`
to get a code generator instead of calling `processorFactory.onPath(path)` to
get a data processor.
##########
File path:
daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
##########
@@ -380,6 +380,14 @@
</xs:documentation>
</xs:annotation>
</xs:element>
+ <xs:element name="runtime" type="xs:string" default="runtime1"
minOccurs="0">
+ <xs:annotation>
+ <xs:documentation>
+ Runtime implementation to use when running daffodil parse or
unparse.
+ Allowed values are "runtime1" (default) and "runtime2".
+ </xs:documentation>
+ </xs:annotation>
+ </xs:element>
Review comment:
I will remove this tunable since nothing uses it and I've realized we
don't need or want it.
##########
File path: daffodil-runtime2/src/main/resources/.clang-format
##########
@@ -0,0 +1,22 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+AlignConsecutiveDeclarations: true
+AllowShortFunctionsOnASingleLine: None
+AlwaysBreakAfterReturnType: TopLevelDefinitions
+BasedOnStyle: llvm
+BreakBeforeBraces: Allman
+IndentWidth: 4
+KeepEmptyLinesAtTheStartOfBlocks: false
Review comment:
This clang-format file should be kept in source control so developers
can format the C source files consistently.
##########
File path: daffodil-runtime2/src/main/resources/.vscode/launch.json
##########
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+{
+ // Use IntelliSense to learn about possible attributes.
+ // Hover to view descriptions of existing attributes.
+ // For more information, visit:
https://go.microsoft.com/fwlink/?linkid=830387
+ "version": "0.2.0",
+ "configurations": [
+
+ {
+ "name": "Debug daffodil",
+ "type": "cppdbg",
+ "request": "launch",
+ "program": "${workspaceFolder}/daffodil",
+ "args": ["parse",
"../../test/resources/org/apache/daffodil/runtime2/parse_int32"],
+ "stopAtEntry": false,
+ "cwd": "${workspaceFolder}",
+ "environment": [],
+ "externalConsole": false,
+ "MIMode": "gdb",
+ "setupCommands": [
+ {
+ "description": "Enable pretty-printing for gdb",
+ "text": "-enable-pretty-printing",
+ "ignoreFailures": true
+ }
+ ],
+ "preLaunchTask": "Build daffodil",
+ "miDebuggerPath": "/usr/bin/gdb"
+ }
+ ]
+}
Review comment:
On reflection, however, we probably shouldn't keep this IDE-specific file, or
the other IDE-specific file below, in source control.
##########
File path:
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/CodeGenerator.scala
##########
@@ -0,0 +1,177 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.runtime2
+
+import java.io.File
+import java.nio.file.FileSystems
+import java.nio.file.Files
+import java.nio.file.Paths
+import java.util.Collections
+
+import org.apache.daffodil.api.DFDL
+import org.apache.daffodil.api.Diagnostic
+import org.apache.daffodil.dsom.Root
+import org.apache.daffodil.dsom.SchemaDefinitionError
+import org.apache.daffodil.runtime2.generators.CodeGeneratorState
+import org.apache.daffodil.util.Misc
+import org.apache.daffodil.xml.RefQName
+
+/**
+ * Generates and compiles C source files from a DFDL schema encapsulated in a
[[Root]].
+ * Implements the [[DFDL.CodeGenerator]] trait to allow it to be called by
Daffodil code.
+ * @param root Provides the DFDL schema for code generation
+ */
+class CodeGenerator(root: Root) extends DFDL.CodeGenerator {
+ // Used by compileCode and pickCompiler methods
+ private lazy val isWindows =
System.getProperty("os.name").toLowerCase().startsWith("windows")
+ // Used by WithDiagnostics methods
+ private var diagnostics: Seq[Diagnostic] = Nil
+ private var errorStatus: Boolean = false
+
+ // Writes C source files into a "c" subdirectory of the given output
directory.
+ // Removes the "c" subdirectory if it existed before.
+ override def generateCode(rootNS: Option[RefQName], outputDirArg: String):
os.Path = {
+ // Get the paths of the output directory and its code subdirectory
+ val outputDir = os.Path(Paths.get(outputDirArg).toAbsolutePath)
+ val codeDir = outputDir/"c"
+
+ // Ensure our output directory exists while our code subdirectory does not
+ os.makeDir.all(outputDir)
+ os.remove.all(codeDir)
+
+ // Copy our resource directory and all its C source files to our code
subdirectory
+ val resourceUri = Misc.getRequiredResource("/c")
+ val fileSystem = if (resourceUri.getScheme == "jar")
+ FileSystems.newFileSystem(resourceUri, Collections.emptyMap(), null)
+ else
+ null
+ try {
+ val resourceDir = os.Path(if (fileSystem != null)
fileSystem.getPath("/c") else Paths.get(resourceUri))
+ os.copy(resourceDir, codeDir)
+ }
+ finally
+ if (fileSystem != null) fileSystem.close()
+
+ // Generate C code from the DFDL schema
+ val rootElementName = rootNS.getOrElse(root.refQName).local
+ val codeGeneratorState = new CodeGeneratorState()
+ Runtime2CodeGenerator.generateCode(root.document, codeGeneratorState)
+ val codeHeaderText = codeGeneratorState.generateCodeHeader
+ val codeFileText = codeGeneratorState.generateCodeFile(rootElementName)
+
+ // Write the generated C code into our code subdirectory
+ val generatedCodeHeader = codeDir/"generated_code.h"
+ val generatedCodeFile = codeDir/"generated_code.c"
+ os.write(generatedCodeHeader, codeHeaderText)
+ os.write(generatedCodeFile, codeFileText)
+
+ // Return our output directory in case caller wants to call compileCode
next
+ outputDir
+ }
+
+ // Compiles any C source files inside a "c" subdirectory of the given output
directory.
+ // Returns the path of the newly created executable to use in TDML tests or
something else.
+ override def compileCode(outputDir: os.Path): os.Path = {
+ // Get the paths of the code subdirectory and the executable we will build
+ val codeDir = outputDir/"c"
+ val exe = if (isWindows) codeDir/"daffodil.exe" else codeDir/"daffodil"
+
+ try {
+ // Assemble the compiler's command line arguments
+ val compiler = pickCompiler
+ val files = os.list(codeDir).filter(_.ext == "c")
+ val libs = Seq("-lmxml", if (isWindows) "-largp" else "-lpthread")
+
+ // Call the compiler if it was found. We run the compiler in the output
directory,
+ // not in the "c" subdirectory, in order to let the compiler (which
might be "zig cc")
+ // cache/reuse previously built files (which might be in a "zig_cache"
subdirectory).
+ // We can't let "zig_cache" be put into "c" because we always remove and
re-generate
+ // everything in "c" from scratch.
+ if (compiler.nonEmpty) {
+ val result = os.proc(compiler, "-I", codeDir, files, libs, "-o",
exe).call(cwd = outputDir, stderr = os.Pipe)
+
+ // Report any compiler output as a warning
+ if (result.out.text.nonEmpty || result.err.text.nonEmpty) {
+ warning("Unexpected compiler output on stdout: %s on stderr: %s",
result.out.text, result.err.text)
+ }
+ }
+ } catch {
+ // Report any subprocess termination error as an error
+ case e: os.SubprocessException =>
+ error("Error compiling generated code: %s wd: %s",
Misc.getSomeMessage(e).get, outputDir.toString)
+ }
+
+ // Report any failure to build the executable as an error
+ if (!os.exists(exe)) error("No executable was built: %s", exe.toString)
+ exe
+ }
+
+ /**
+ * Searches for any available C compiler on the system. Tries to find the
+ * compiler given by `CC` if `CC` exists in the environment, then tries to
+ * find any compiler from the following list:
+ *
+ * - zig cc
+ * - gcc
+ * - clang
+ * - cc
+ *
+ * Returns the first compiler found as a sequence of strings in case the
+ * compiler is a program with a subcommand argument. Returns the empty
+ * sequence if no compiler could be found in the user's PATH.
+ */
+ lazy val pickCompiler: Seq[String] = {
+ val ccEnv = System.getenv("CC")
+ val compilers = Seq(ccEnv, "zig cc", "gcc", "clang", "cc")
+ val path = System.getenv("PATH").split(File.pathSeparatorChar)
+ def inPath(compiler: String): Boolean = {
+ (compiler != null) && {
+ val exec = compiler.takeWhile(_ != ' ')
+ val exec2 = exec + ".exe"
+ path.exists(dir => Files.isExecutable(Paths.get(dir, exec))
+ || (isWindows && Files.isExecutable(Paths.get(dir, exec2))))
+ }
+ }
+ val compiler = compilers.find(inPath)
+ if (compiler.isDefined)
+ compiler.get.split(' ').toSeq
+ else
+ Seq.empty[String]
+ }
+
+ /**
+ * Adds a warning message to the diagnostics
+ */
+ def warning(formatString: String, args: Any*): Unit = {
+ val sde = new SchemaDefinitionError(None, None, formatString, args: _*)
+ diagnostics :+= sde
+ }
+
+ /**
+ * Adds an error message to the diagnostics and sets isError true
+ */
+ def error(formatString: String, args: Any*): Unit = {
+ val sde = new SchemaDefinitionError(None, None, formatString, args: _*)
+ diagnostics :+= sde
+ errorStatus = true
+ }
+
+ // Implements the WithDiagnostics methods
+ override def getDiagnostics: Seq[Diagnostic] = diagnostics
+ override def isError: Boolean = errorStatus
+}
Review comment:
This is the core functionality that enables `daffodil generate c -s
<schema> <outdir>`. It also builds an executable to be run by TDML tests. If
you install [zig
cc](https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html#:~:text=Install%20simply%20by%20unzipping%20a,%2C%20and%20you're%20done.)
as your compiler, you will be able to run a large suite of TDML tests more
quickly because the `zig cc` frontend will cache the object files compiled from
the static C source files and compile only the generated code whenever it
changes.
##########
File path: daffodil-runtime2/src/main/resources/c/infoset.h
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef INFOSET_H
+#define INFOSET_H
+
+#include <stddef.h> // for ptrdiff_t
+#include <stdint.h> // for int32_t
+#include <stdio.h> // for FILE, size_t
+
+// Prototypes needed for compilation
+
+struct ElementRuntimeData;
+struct InfosetBase;
+struct PState;
+struct UState;
+struct VisitEventHandler;
+
+typedef struct ElementRuntimeData ERD;
+typedef struct InfosetBase InfosetBase;
+typedef struct PState PState;
+typedef struct UState UState;
+typedef struct VisitEventHandler VisitEventHandler;
+
+typedef void (*ERDInitSelf)(InfosetBase *infoNode);
+typedef const char *(*ERDParseSelf)(InfosetBase * infoNode,
+ const PState *pstate);
+typedef const char *(*ERDUnparseSelf)(const InfosetBase *infoNode,
+ const UState * ustate);
+
+typedef const char *(*VisitStartDocument)(const VisitEventHandler *handler);
+typedef const char *(*VisitEndDocument)(const VisitEventHandler *handler);
+typedef const char *(*VisitStartComplex)(const VisitEventHandler *handler,
+ const InfosetBase * base);
+typedef const char *(*VisitEndComplex)(const VisitEventHandler *handler,
+ const InfosetBase * base);
+typedef const char *(*VisitInt32Elem)(const VisitEventHandler *handler,
+ const ERD *erd, const int32_t *location);
+
+// NamedQName - name of an infoset element
+
+typedef struct NamedQName
+{
+ const char *prefix; // prefix (optional, may be NULL)
+ const char *local; // local name
+ const char *ns; // namespace URI (optional, may be NULL)
+} NamedQName;
+
+// TypeCode - type of an infoset element
+
+enum TypeCode
+{
+ COMPLEX,
+ PRIMITIVE_INT32
+};
+
+// ERD - element runtime data needed to parse/unparse objects
+
+typedef struct ElementRuntimeData
+{
+ const NamedQName namedQName;
+ const enum TypeCode typeCode;
+ const size_t numChildren;
+ const ptrdiff_t * offsets;
+ const ERD ** childrenERDs;
+
+ const ERDInitSelf initSelf;
+ const ERDParseSelf parseSelf;
+ const ERDUnparseSelf unparseSelf;
+} ERD;
+
+// InfosetBase - representation of an infoset element
+
+typedef struct InfosetBase
+{
+ const ERD *erd;
+} InfosetBase;
+
+// PState - parser state while parsing input
+
+typedef struct PState
+{
+ FILE *stream; // input to read from
+} PState;
+
+// UState - unparser state while unparsing infoset
+
+typedef struct UState
+{
+ FILE *stream; // output to write to
+} UState;
+
+// VisitEventHandler - methods to be called when walking an infoset
+
+typedef struct VisitEventHandler
+{
+ const VisitStartDocument visitStartDocument;
+ const VisitEndDocument visitEndDocument;
+ const VisitStartComplex visitStartComplex;
+ const VisitEndComplex visitEndComplex;
+ const VisitInt32Elem visitInt32Elem;
+} VisitEventHandler;
+
+// get_erd_name, get_erd_xmlns, get_erd_ns - get name and xmlns
+// attribute/value from ERD to use for XML element
+
+extern const char *get_erd_name(const ERD *erd);
+extern const char *get_erd_xmlns(const ERD *erd);
+extern const char *get_erd_ns(const ERD *erd);
+
+// rootElement - return a root element to walk while parsing or unparsing
+
+// (actual definition will be in generated_code.c, not infoset.c)
+extern InfosetBase *rootElement();
+
+// walkInfoset - walk an infoset and call VisitEventHandler methods
+
+extern const char *walkInfoset(const VisitEventHandler *handler,
+ const InfosetBase * infoset);
+
+// eof_or_error_msg - check if a stream has its eof or error indicator set
+
+extern const char *eof_or_error_msg(FILE *stream);
+
+#endif // INFOSET_H
Review comment:
If you throw away the CLI and XML reader/writer, you're left with this
infoset API and the actual generated code.
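To make the role of `offsets` and `childrenERDs` concrete, here is a minimal,
self-contained sketch of how a generic walker could locate child fields from
metadata alone. It uses simplified stand-in types and names (`MiniERD`,
`MiniBase`, `Inner`, `Outer`, `sum_int32`), all of which are hypothetical. It
does not include `infoset.h` and is not the PR's `walkInfoset`.

```c
#include <stddef.h> // for offsetof, size_t, NULL
#include <stdint.h> // for int32_t, int64_t

enum MiniType
{
    MINI_COMPLEX,
    MINI_INT32
};

// Simplified stand-in for the ERD struct (hypothetical, fewer fields)
typedef struct MiniERD
{
    const char                  *name;
    enum MiniType                typeCode;
    size_t                       numChildren;
    const size_t                *offsets;      // byte offset of each child
    const struct MiniERD *const *childrenERDs; // metadata of each child
} MiniERD;

typedef struct MiniBase
{
    const MiniERD *erd;
} MiniBase;

typedef struct Inner
{
    MiniBase _base;
    int32_t  e2;
    int32_t  e3;
} Inner;

typedef struct Outer
{
    MiniBase _base;
    int32_t  e1;
    Inner    c2;
} Outer;

static const MiniERD e1_erd = {"e1", MINI_INT32, 0, NULL, NULL};
static const MiniERD e2_erd = {"e2", MINI_INT32, 0, NULL, NULL};
static const MiniERD e3_erd = {"e3", MINI_INT32, 0, NULL, NULL};

static const size_t inner_offsets[2] = {offsetof(Inner, e2),
                                        offsetof(Inner, e3)};
static const MiniERD *const inner_children[2] = {&e2_erd, &e3_erd};
static const MiniERD        inner_erd = {"c2", MINI_COMPLEX, 2, inner_offsets,
                                         inner_children};

static const size_t outer_offsets[2] = {offsetof(Outer, e1),
                                        offsetof(Outer, c2)};
static const MiniERD *const outer_children[2] = {&e1_erd, &inner_erd};
static const MiniERD        outer_erd = {"c1", MINI_COMPLEX, 2, outer_offsets,
                                         outer_children};

// Recursively sum every int32 leaf reachable from a complex node, using
// only the node's metadata to find where each child lives in memory
static int64_t
sum_int32(const MiniBase *node)
{
    int64_t     total = 0;
    const char *bytes = (const char *)node;
    for (size_t i = 0; i < node->erd->numChildren; i++)
    {
        const MiniERD *child = node->erd->childrenERDs[i];
        const void    *location = bytes + node->erd->offsets[i];
        if (child->typeCode == MINI_COMPLEX)
        {
            total += sum_int32((const MiniBase *)location);
        }
        else
        {
            total += *(const int32_t *)location;
        }
    }
    return total;
}
```

The walker never names `e1`, `e2`, or `e3` directly. It depends only on the
metadata tables, which is why the same walking code could serve any generated
schema in a design like this one.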
##########
File path:
daffodil-runtime2/src/main/scala/org/apache/daffodil/runtime2/Runtime2DataProcessor.scala
##########
@@ -0,0 +1,206 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.runtime2
+
+import java.io.File
+import java.io.InputStream
+import java.io.OutputStream
+
+import org.apache.daffodil.api.DFDL
+import org.apache.daffodil.api.DaffodilTunables
+import org.apache.daffodil.api.DataLocation
+import org.apache.daffodil.api.ValidationMode
+import org.apache.daffodil.externalvars.Binding
+import org.apache.daffodil.processors.Failure
+import org.apache.daffodil.processors.ProcessorResult
+import org.apache.daffodil.processors.Success
+import org.apache.daffodil.processors.VariableMap
+import org.apache.daffodil.processors.WithDiagnosticsImpl
+import org.apache.daffodil.processors.parsers.ParseError
+import org.apache.daffodil.processors.unparsers.UnparseError
+import org.apache.daffodil.util.Maybe
+import org.apache.daffodil.util.Maybe.Nope
+
+/**
+ * Effectively a scala proxy object that does its work via the underlying C-code.
+ * Will need to consider how to use features of underlying C-code to get infoset,
+ * walk infoset, generate XML for use by TDML tests.
+ */
+class Runtime2DataProcessor(executableFile: os.Path) extends DFDL.DataProcessorBase {
+
+ override def withValidationMode(mode: ValidationMode.Type): DFDL.DataProcessor = ???
+
+ override def withTunable(name: String, value: String): DFDL.DataProcessor = ???
+
+ override def withTunables(tunables: Map[String, String]): DFDL.DataProcessor = ???
+
+ override def withExternalVariables(extVars: Map[String, String]): DFDL.DataProcessor = ???
+
+ override def withExternalVariables(extVars: File): DFDL.DataProcessor = ???
+
+ override def withExternalVariables(extVars: Seq[Binding]): DFDL.DataProcessor = ???
+
+ override def validationMode: ValidationMode.Type = ???
+
+ override def getTunables(): DaffodilTunables = ???
+
+ override def save(output: DFDL.Output): Unit = ???
+
+ override def variableMap: VariableMap = ???
+
+ override def setValidationMode(mode: ValidationMode.Type): Unit = ???
+
+ override def setExternalVariables(extVars: Map[String, String]): Unit = ???
+
+ override def setExternalVariables(extVars: File): Unit = ???
+
+ override def setExternalVariables(extVars: File, tunable: DaffodilTunables): Unit = ???
+
+ override def setExternalVariables(extVars: Seq[Binding]): Unit = ???
+
+ override def setTunable(tunable: String, value: String): Unit = ???
+
+ override def setTunables(tunables: Map[String, String]): Unit = ???
+
+ /**
+ * Returns an object which contains the result, and/or diagnostic information.
+ */
+ def parse(input: InputStream): ParseResult = {
+ val tempDir = os.temp.dir()
+ val infile = tempDir/"infile"
+ val outfile = tempDir/"outfile"
+ try {
+ os.write(infile, input)
+ val result = os.proc(executableFile, "parse", "-I", "xml", "-o", outfile, infile).call(cwd = tempDir, stderr = os.Pipe)
+ if (result.out.text.isEmpty && result.err.text.isEmpty) {
+ val parseResult = new ParseResult(outfile, Success)
+ parseResult
+ } else {
+ val msg = s"Unexpected daffodil output on stdout: ${result.out.text} on stderr: ${result.err.text}"
+ val parseError = new ParseError(Nope, Nope, Nope, Maybe(msg))
+ val parseResult = new ParseResult(outfile, Failure(parseError))
+ parseResult.addDiagnostic(parseError)
+ parseResult
+ }
+ } catch {
+ case e: os.SubprocessException =>
+ val parseError = if (e.result.out.text.isEmpty && e.result.err.text.isEmpty) {
+ new ParseError(Nope, Nope, Maybe(e), Nope)
+ } else {
+ val msg = s"${e.getMessage} with stdout: ${e.result.out.text} and stderr: ${e.result.err.text}"
+ new ParseError(Nope, Nope, Nope, Maybe(msg))
+ }
+ val parseResult = new ParseResult(outfile, Failure(parseError))
+ parseResult.addDiagnostic(parseError)
+ parseResult
+ }
+ }
+
+ /**
+ * Unparses (that is, serializes) data to the output, returns an object which contains any diagnostics.
+ */
+ def unparse(input: InputStream, output: OutputStream): UnparseResult = {
+ val tempDir = os.temp.dir()
+ val infile = tempDir/"infile"
+ val outfile = tempDir/"outfile"
+ try {
+ os.write(infile, input)
+ val result = os.proc(executableFile, "unparse", "-I", "xml", "-o", outfile, infile).call(cwd = tempDir, stderr = os.Pipe)
+ val finalBitPos0b = os.size(outfile) * 8 // File sizes are bytes, so must multiply to get final position in bits
+ os.read.stream(outfile).writeBytesTo(output)
+ if (result.out.text.isEmpty && result.err.text.isEmpty) {
+ val unparseResult = new UnparseResult(finalBitPos0b, Success)
+ unparseResult
+ } else {
+ val msg = s"Unexpected daffodil output on stdout: ${result.out.text} on stderr: ${result.err.text}"
+ val unparseError = new UnparseError(Nope, Nope, Nope, Maybe(msg))
+ val unparseResult = new UnparseResult(finalBitPos0b, Failure(unparseError))
+ unparseResult.addDiagnostic(unparseError)
+ unparseResult
+ }
+ } catch {
+ case e: os.SubprocessException =>
+ val unparseError = if (e.result.out.text.isEmpty && e.result.err.text.isEmpty) {
+ new UnparseError(Nope, Nope, Maybe(e), Nope)
+ } else {
+ val msg = s"${e.getMessage} with stdout: ${e.result.out.text} and stderr: ${e.result.err.text}"
+ new UnparseError(Nope, Nope, Nope, Maybe(msg))
+ }
+ val finalBitPos0b = 0L
+ val unparseResult = new UnparseResult(finalBitPos0b, Failure(unparseError))
+ unparseResult.addDiagnostic(unparseError)
+ unparseResult
+ }
+ }
+}
+
+object Runtime2DataLocation {
+ class Runtime2DataLocation(_isAtEnd: Boolean,
+ _bitPos1b: Long,
+ _bytePos1b: Long) extends DataLocation {
+ override def isAtEnd: Boolean = _isAtEnd
+ override def bitPos1b: Long = _bitPos1b
+ override def bytePos1b: Long = _bytePos1b
+ }
+
+ def apply(isAtEnd: Boolean = true,
+ bitPos1b: Long = 0L,
+ bytePos1b: Long = 0L): DataLocation = {
+ new Runtime2DataLocation(isAtEnd, bitPos1b, bytePos1b)
+ }
+}
+
+final class ParseResult(outfile: os.Path,
+ override val processorStatus: ProcessorResult,
+ loc: DataLocation = Runtime2DataLocation())
+ extends DFDL.ParseResult
+ with DFDL.State
+ with WithDiagnosticsImpl {
+
+ override def resultState: DFDL.State = this
+
+ override def validationStatus: Boolean = processorStatus.isSuccess
+
+ override def currentLocation: DataLocation = loc
+
+ def infosetAsXML : scala.xml.Elem = {
+ val xml = scala.xml.XML.loadFile(outfile.toIO)
+ xml
+ }
+}
+
+final class UnparseResult(val finalBitPos0b: Long,
+ override val processorStatus: ProcessorResult,
+ loc: DataLocation = Runtime2DataLocation())
+ extends DFDL.UnparseResult
+ with DFDL.State
+ with WithDiagnosticsImpl {
+ /**
+ * Data is 'scannable' if it consists entirely of textual data, and that data
+ * is all in the same encoding.
+ */
+ override def isScannable: Boolean = false // Safest answer since we don't know for sure
+
+ override def encodingName: String = ??? // We don't need encoding unless isScannable is true
+
+ override def validationStatus: Boolean = processorStatus.isSuccess
+
+ override def currentLocation: DataLocation = loc
+
+ override def resultState: DFDL.State = this
+}
Review comment:
This is another key piece of functionality allowing TDML tests to use
the executable built by the code generator.
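
As a rough illustration of the success rule used by parse and unparse above (a run counts as a success only if the executable writes nothing to stdout or stderr), here is a minimal sketch that uses only scala.sys.process from the standard library instead of os-lib. SubprocessSketch, RunResult, and run are hypothetical names for illustration and are not part of this pull request. The explicit exit-code check is an added assumption; in the real code a nonzero exit surfaces as os.SubprocessException instead.

```scala
import scala.sys.process.{Process, ProcessLogger}

object SubprocessSketch {
  // Hypothetical result type: exit code plus everything the child printed.
  final case class RunResult(exitCode: Int, out: String, err: String) {
    // Same rule as the diff above: any output at all is treated as a failure.
    def isSuccess: Boolean = exitCode == 0 && out.isEmpty && err.isEmpty
  }

  // Runs the command, waits for it to exit, and captures stdout and stderr.
  def run(command: Seq[String]): RunResult = {
    val out = new StringBuilder
    val err = new StringBuilder
    val logger = ProcessLogger(
      line => { out.append(line).append('\n'); () },
      line => { err.append(line).append('\n'); () }
    )
    val exitCode = Process(command).!(logger)
    RunResult(exitCode, out.toString, err.toString)
  }
}
```

A TDML-style test could then assert on isSuccess and, on failure, report the captured out and err text as the diagnostic message.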
##########
File path: daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala
##########
@@ -543,11 +542,48 @@ class CLIConf(arguments: Array[String]) extends scallop.ScallopConf(arguments)
val info = tally(descr = "increment test result information output level, one level for each -i")
}
+ // Generate Subcommand Options
+ val generate = new scallop.Subcommand("generate") {
+ descr("generate <language> code from a DFDL schema")
+
+ banner("""|Usage: daffodil [GLOBAL_OPTS] generate <language> [SUBCOMMAND_OPTS]
+ |""".stripMargin)
+ shortSubcommandsHelp()
+ footer("""|
+ |Run 'daffodil generate <language> --help' for subcommand specific options""".stripMargin)
+
+ val c = new scallop.Subcommand("c") {
+ banner("""|Usage: daffodil generate c -s <schema> [-r [{namespace}]<root>]
+ | [-c <file>] [outputDir]
+ |
+ |Generate C code from a DFDL schema to parse or unparse data
+ |
+ |Generate Options:""".stripMargin)
+
+ descr("generate C code from a DFDL schema")
+ helpWidth(76)
+
+ val language = "c"
+ val schema = opt[URI]("schema", required = true, argName = "file", descr = "the annotated DFDL schema to use to generate source code.")
+ val rootNS = opt[RefQName]("root", argName = "node", descr = "the root element of the XML file to use. An optional namespace may be provided. This needs to be one of the top-level elements of the DFDL schema defined with --schema. Requires --schema. If not supplied uses the first element of the first schema")
+ val tunables = props[String]('T', keyName = "tunable", valueName = "value", descr = "daffodil tunable to be used when compiling schema.")
+ val config = opt[String](short = 'c', argName = "file", descr = "path to file containing configuration items.")
+ val outputDir = trailArg[String](required = false, descr = "output directory in which to generate source code. If not specified, uses current directory.")
+
+ validateOpt(schema) {
+ case None => Left("No schemas specified using the --schema option")
+ case _ => Right(Unit)
+ }
+ }
+ addSubcommand(c)
+ }
+
Review comment:
I like how easy these changes turned out to be. By chaining two
subcommands ("generate" followed by the language), we can add code generators
for other languages later, even though we support only C at this time, and each
language can have its own code generator options if necessary.
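
To illustrate the design choice, here is a minimal sketch of two-level dispatch written with plain pattern matching rather than scallop. GenerateDispatchSketch, generators, and dispatch are hypothetical names, and the returned strings merely stand in for running a real code generator.

```scala
object GenerateDispatchSketch {
  // Hypothetical per-language generators, keyed by the second subcommand.
  // Each one receives only the arguments that follow its own name, so it
  // can define its own options independently of the other languages.
  private val generators: Map[String, List[String] => String] = Map(
    "c" -> (opts => s"generate C code with options: ${opts.mkString(" ")}")
  )

  // The first token selects "generate"; the second selects the language.
  def dispatch(args: List[String]): Either[String, String] = args match {
    case "generate" :: language :: opts =>
      generators.get(language) match {
        case Some(run) => Right(run(opts))
        case None      => Left(s"unsupported language: $language")
      }
    case "generate" :: Nil => Left("generate requires a <language> subcommand")
    case _                 => Left("unknown subcommand")
  }
}
```

Under this sketch, supporting another language would only mean adding one more entry to the generators map, leaving the C entry and its options untouched.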