(daffodil) branch main updated: Set default `infosetWalkerMode` to "streaming"

olabusayo Tue, 16 Jun 2026 09:20:37 -0700

This is an automated email from the ASF dual-hosted git repository.

olabusayoT pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/daffodil.git



The following commit(s) were added to refs/heads/main by this push:
     new 64ece5030 Set default `infosetWalkerMode` to "streaming"
64ece5030 is described below

commit 64ece503025ece5a539528d00ea8253aa3547906
Author: olabusayoT <[email protected]>
AuthorDate: Tue Jun 16 11:26:26 2026 -0400

    Set default `infosetWalkerMode` to "streaming"
    
     - update documentation accordingly
    
    DAFFODIL-3089
    
    Deprecation/Compatibility
    This will revert the change in DAFFODIL-3070 that set the default
    InfosetWalkerMode to nonStreaming.
---
 .../daffodil/runtime1/infoset/InfosetWalker.scala  | 26 ++++++++++++----------
 .../resources/org/apache/daffodil/xsd/dafext.xsd   |  9 ++++----
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala
index 2906e9dd3..b2d36dc92 100644
--- a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala
+++ b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala
@@ -31,15 +31,17 @@ import org.apache.daffodil.lib.util.MStackOfInt
  * Two concrete implementations exist, selectable via the `infosetWalkerMode`
  * tunable:
  *
- *  - [[StreamingInfosetWalker]] (`infosetWalkerMode = "streaming"`): emits events
- *    incrementally as elements are finalized during parsing. Keeps memory usage
- *    bounded for large or deeply-nested infosets, but incurs overhead from
+ *  - [[StreamingInfosetWalker]] (`infosetWalkerMode = "streaming"`, default): emits
+ *    events incrementally as elements are finalized during parsing. Keeps memory
+ *    usage bounded for large or deeply-nested infosets, but incurs overhead from
  *    repeated speculative walk attempts.
  *
- *  - [[NonStreamingInfosetWalker]] (`infosetWalkerMode = "nonStreaming"`, default):
- *    defers all output until the entire infoset is available, then walks it in
- *    one pass. Faster for schemas where the infoset fits comfortably in memory,
- *    because it avoids the overhead of incremental walk attempts.
+ *  - [[NonStreamingInfosetWalker]] (`infosetWalkerMode = "nonStreaming"`): defers
+ *    all output until the entire infoset is available, then walks it in one pass.
+ *    Faster for schemas where the infoset fits comfortably in memory because it
+ *    avoids the overhead of incremental walk attempts. If you want potentially
+ *    better performance, set this tunable to "nonStreaming", but there may be
+ *    a significant memory impact.
  *
  * Callers invoke [[walk]] periodically during parsing. When `lastWalk = true`
  * the walker must flush any remaining events before returning. [[isFinished]]
@@ -77,11 +79,11 @@ trait InfosetWalker {
  * then walks the entire infoset in a single pass when `walk(lastWalk = true)`
  * is called. Intermediate `walk()` calls are no-ops.
  *
- * This is the default walker (tunable `infosetWalkerMode = "nonStreaming"`).
- * It is faster than [[StreamingInfosetWalker]] for most schemas because it
- * avoids the overhead of repeated speculative walk attempts, at the cost of
- * holding the full infoset in memory until parsing finishes. For very large
- * infosets or memory-constrained environments, prefer [[StreamingInfosetWalker]].
+ * If you want potentially better performance, set `infosetWalkerMode = "nonStreaming"`
+ * (which uses this walker), but there may be a significant memory impact. It is faster than
+ * [[StreamingInfosetWalker]] for most schemas because it avoids the overhead of
+ * repeated speculative walk attempts, at the cost of holding the full infoset in
+ * memory until parsing finishes.
  *
  * @param root      The root [[DIElement]] of the infoset to walk.
  * @param outputter The [[api.infoset.InfosetOutputter]] that receives events.
diff --git a/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd b/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
index a66c2be63..db5568846 100644
--- a/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
+++ b/daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd
@@ -239,15 +239,14 @@
             </xs:restriction>
           </xs:simpleType>
         </xs:element>
-        <xs:element name="infosetWalkerMode" type="daf:TunableInfosetWalkerMode" default="nonStreaming" minOccurs="0">
+        <xs:element name="infosetWalkerMode" type="daf:TunableInfosetWalkerMode" default="streaming" minOccurs="0">
           <xs:annotation>
             <xs:documentation>
               Daffodil can periodically walk the internal infoset to send events to the configured
               InfosetOutputter (streaming) or it can walk the internal infoset once at the end of
-              parsing (nonStreaming). The idea being that simple schemas or schemas with lots of
-              points of uncertainty would benefit from the nonStreaming infoset walker, while
-              very large schemas or situations where memory is contrained would benefit
-              from the streaming infoset walker.
+              parsing (nonStreaming). The default is "streaming", which keeps memory usage bounded
+              for large infosets. If you want potentially better performance, set this tunable to
+              "nonStreaming", but there may be a significant memory impact.
             </xs:documentation>
           </xs:annotation>
         </xs:element>
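
Note for readers of this change: the documentation above says the tunable can be set back to "nonStreaming" when performance matters more than memory. A minimal sketch of a Daffodil configuration file that does so follows. It assumes the usual `dfdlConfig`/`tunables` layout and the `daf` extension namespace; only the tunable name `infosetWalkerMode` and the value `nonStreaming` come from this commit, and whether the tunable is honored depends on running a build that includes it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: opts out of the new "streaming" default described in this commit. -->
<!-- Assumes the standard Daffodil config layout (dfdlConfig > tunables). -->
<daf:dfdlConfig xmlns:daf="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext">
  <daf:tunables>
    <!-- Walk the infoset once at the end of parsing; the full infoset stays in memory. -->
    <daf:infosetWalkerMode>nonStreaming</daf:infosetWalkerMode>
  </daf:tunables>
</daf:dfdlConfig>
```

Omitting the `infosetWalkerMode` element leaves the schema default in effect, which after this commit is "streaming".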
