This is an automated email from the ASF dual-hosted git repository.
olabusayoT pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/daffodil.git
The following commit(s) were added to refs/heads/main by this push:
new 64ece5030 Set default `infosetWalkerMode` to "streaming"
64ece5030 is described below
commit 64ece503025ece5a539528d00ea8253aa3547906
Author: olabusayoT <[email protected]>
AuthorDate: Tue Jun 16 11:26:26 2026 -0400
Set default `infosetWalkerMode` to "streaming"
- update documentation accordingly
DAFFODIL-3089
Deprecation/Compatibility
This will revert the change in DAFFODIL-3070 that set the default
InfosetWalkerMode to nonStreaming.
---
.../daffodil/runtime1/infoset/InfosetWalker.scala | 26 ++++++++++++----------
.../resources/org/apache/daffodil/xsd/dafext.xsd | 9 ++++----
2 files changed, 18 insertions(+), 17 deletions(-)
diff --git a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala
index 2906e9dd3..b2d36dc92 100644
--- a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala
+++ b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala
@@ -31,15 +31,17 @@ import org.apache.daffodil.lib.util.MStackOfInt
* Two concrete implementations exist, selectable via the `infosetWalkerMode`
* tunable:
*
- * - [[StreamingInfosetWalker]] (`infosetWalkerMode = "streaming"`): emits events
- * incrementally as elements are finalized during parsing. Keeps memory usage
- * bounded for large or deeply-nested infosets, but incurs overhead from
+ * - [[StreamingInfosetWalker]] (`infosetWalkerMode = "streaming"`, default): emits
+ * events incrementally as elements are finalized during parsing. Keeps memory
+ * usage bounded for large or deeply-nested infosets, but incurs overhead from
* repeated speculative walk attempts.
*
- * - [[NonStreamingInfosetWalker]] (`infosetWalkerMode = "nonStreaming"`, default):
- * defers all output until the entire infoset is available, then walks it in
- * one pass. Faster for schemas where the infoset fits comfortably in memory,
- * because it avoids the overhead of incremental walk attempts.
+ * - [[NonStreamingInfosetWalker]] (`infosetWalkerMode = "nonStreaming"`): defers
+ * all output until the entire infoset is available, then walks it in one pass.
+ * Faster for schemas where the infoset fits comfortably in memory because it
+ * avoids the overhead of incremental walk attempts. If you want potentially
+ * better performance, set this tunable to "nonStreaming", but there may be
+ * a significant memory impact.
*
* Callers invoke [[walk]] periodically during parsing. When `lastWalk = true`
* the walker must flush any remaining events before returning. [[isFinished]]
@@ -77,11 +79,11 @@ trait InfosetWalker {
* then walks the entire infoset in a single pass when `walk(lastWalk = true)`
* is called. Intermediate `walk()` calls are no-ops.
*
- * This is the default walker (tunable `infosetWalkerMode = "nonStreaming"`).
- * It is faster than [[StreamingInfosetWalker]] for most schemas because it
- * avoids the overhead of repeated speculative walk attempts, at the cost of
- * holding the full infoset in memory until parsing finishes. For very large
- * infosets or memory-constrained environments, prefer [[StreamingInfosetWalker]].
+ * If you want potentially better performance, set `infosetWalkerMode = "nonStreaming"`
+ * (which uses this walker), but there may be a significant memory impact. It is faster than
+ * [[StreamingInfosetWalker]] for most schemas because it avoids the overhead of
+ * repeated speculative walk attempts, at the cost of holding the full infoset in
+ * memory until parsing finishes.
*
* @param root The root [[DIElement]] of the infoset to walk.
* @param outputter The [[api.infoset.InfosetOutputter]] that receives events.
diff --git a/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd b/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
index a66c2be63..db5568846 100644
--- a/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
+++ b/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
@@ -239,15 +239,14 @@
</xs:restriction>
</xs:simpleType>
</xs:element>
- <xs:element name="infosetWalkerMode" type="daf:TunableInfosetWalkerMode" default="nonStreaming" minOccurs="0">
+ <xs:element name="infosetWalkerMode" type="daf:TunableInfosetWalkerMode" default="streaming" minOccurs="0">
<xs:annotation>
<xs:documentation>
Daffodil can periodically walk the internal infoset to send events to the configured
InfosetOutputter (streaming) or it can walk the internal infoset once at the end of
- parsing (nonStreaming). The idea being that simple schemas or schemas with lots of
- points of uncertainty would benefit from the nonStreaming infoset walker, while
- very large schemas or situations where memory is contrained would benefit
- from the streaming infoset walker.
+ parsing (nonStreaming). The default is "streaming", which keeps memory usage bounded
+ for large infosets. If you want potentially better performance, set this tunable to
+ "nonStreaming", but there may be a significant memory impact.
</xs:documentation>
</xs:annotation>
</xs:element>
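
For readers who relied on the previous default, the following is a minimal sketch of a Daffodil configuration file that opts back in to the non-streaming walker. The tunable name and its two values come from the diff above. The root element, the `tunables` wrapper, and the namespace URI are assumptions based on the usual Daffodil config file layout and are not part of this commit, so check them against the documentation for your Daffodil version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: dfdlConfig, tunables, and the namespace URI are assumed, not taken from this commit. -->
<daf:dfdlConfig xmlns:daf="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext">
  <daf:tunables>
    <!-- Walk the infoset once at the end of parsing. Potentially faster,
         but the full infoset is held in memory until parsing finishes. -->
    <daf:infosetWalkerMode>nonStreaming</daf:infosetWalkerMode>
  </daf:tunables>
</daf:dfdlConfig>
```

Omitting the `infosetWalkerMode` element leaves the new default, "streaming", in effect.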