[
https://issues.apache.org/jira/browse/DAFFODIL-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Lawrence updated DAFFODIL-1598:
-------------------------------------
Fix Version/s: 4.2.0
(was: 4.1.0)
> Unparser: For strings that truncate, the dfdl:valueLength function cannot
> suspend
> ---------------------------------------------------------------------------------
>
> Key: DAFFODIL-1598
> URL: https://issues.apache.org/jira/browse/DAFFODIL-1598
> Project: Daffodil
> Issue Type: Bug
> Components: Back End, General
> Reporter: Mike Beckerle
> Assignee: Olabusayo Kilo
> Priority: Major
> Fix For: 4.2.0
>
>
> When unparsing, the strategy used to determine the target length for an
> element is to determine the value length by allowing the unparsing to go
> forward but into a buffering data output stream. The value length is
> determined by capturing the starting position, the ending position (both of
> which are in the buffered output), and subtracting.
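
The start/end subtraction described above can be pictured with a toy model. This is only a sketch, not Daffodil's implementation: `measure_value_length` is a hypothetical helper, `io.BytesIO` stands in for the buffering data output stream, and lengths are counted in bytes for simplicity.

```python
import io


def measure_value_length(unparse_value, out):
    """Return how many bytes `unparse_value` writes into `out`.

    Hypothetical helper: capture the position before and after
    unparsing into the buffer, then subtract.
    """
    start = out.tell()
    unparse_value(out)
    end = out.tell()
    return end - start


buf = io.BytesIO()
buf.write(b"HDR")  # content already buffered before the element
length = measure_value_length(lambda o: o.write("hello".encode("ascii")), buf)
print(length)
```

Both positions are taken from the buffer itself, so the measurement does not need the final location of the element in the real output.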
> However, if the string is a truncated string (lengthKind 'explicit' or
> 'implicit', with dfdl:truncateSpecifiedLengthString 'yes'), we have a cycle
> because the value-length of an element is the post-truncation length, yet to
> determine the target length we will often need to know the value-length.
> For example:
> {code}
> <xs:element name="len" type="xs:int" dfdl:outputValueCalc="{
> if (dfdl:valueLength(../data) lt 100) then 100 else
> dfdl:valueLength(../data)
> }" />
> <xs:element name="data" type="xs:string" dfdl:lengthKind='explicit'
> dfdl:length='{ ../len }'
> dfdl:truncateSpecifiedLengthString='yes'/>
> {code}
> In the above, the length expression depends on the 'len' element value. The
> len element value requires the valueLength of the 'data' element. Without
> truncation, this would work, as we could unparse the value of the 'data' into
> a buffering data output stream, and measure its length, and that would
> unblock the suspended 'len' element's dfdl:outputValueCalc expression, which
> would allow the 'data' element's length expression to be evaluated, and we
> would then know how much padding/fill to add to the 'data' element
> representation.
> But if the 'data' string can be truncated, as in the example above, then this
> fails, because we can't unparse it and allow the value length to be derived
> from the unparsed representation in a buffer, since the valueLength is
> supposed to be the post-truncation value.
> So a cyclic deadlock will occur unparsing things like the above.
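
The deadlock can be illustrated with a small dependency model. This is a sketch under stated assumptions, not Daffodil's suspension mechanism: the node names (`data.valueLength`, `len.value`, `data.length`) and the `resolve` function are invented for illustration.

```python
def resolve(waits_on):
    """Repeatedly complete any node whose dependencies are all done.

    `waits_on` maps a node name to the set of node names it waits on.
    Returns (completion_order, stuck_nodes).
    """
    done = []
    pending = dict(waits_on)
    progress = True
    while pending and progress:
        progress = False
        for name in sorted(pending):
            if pending[name] <= set(done):
                done.append(name)
                del pending[name]
                progress = True
    return done, sorted(pending)


# Without truncation, the value length can be measured from the buffer alone.
no_truncation = {
    "data.valueLength": set(),
    "len.value": {"data.valueLength"},
    "data.length": {"len.value"},
}

# With truncation, the value length is post-truncation, so it needs the
# target length first, which closes the cycle.
with_truncation = dict(no_truncation)
with_truncation["data.valueLength"] = {"data.length"}

print(resolve(no_truncation))
print(resolve(with_truncation))
```

In the first graph every node completes; in the second, no node can ever start, which is the cyclic deadlock described above.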
> The question is: is this a problem? The above example could be re-written as:
> {code}
> <xs:element name="len" type="xs:int" dfdl:outputValueCalc="{
> if (fn:string-length(../data) lt 100) then 100 else
> fn:string-length(../data)
> }" />
> <xs:element name="data" type="xs:string" dfdl:lengthKind='explicit'
> dfdl:length='{ ../len }'
> dfdl:truncateSpecifiedLengthString='yes'/>
> {code}
> The fn:string-length function provides the pre-truncation length of the
> element. This eliminates the cycle.
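
A rough model of why the rewritten schema has no cycle: the 'len' value depends only on the infoset string, so it can be computed before 'data' is unparsed. The function names and the space pad character are assumptions for illustration, and lengths are taken in characters (as if dfdl:lengthUnits were 'characters').

```python
def len_output_value_calc(data: str) -> int:
    """Model of the outputValueCalc; len(data) stands in for fn:string-length(../data)."""
    n = len(data)
    return 100 if n < 100 else n


def unparse_data(data: str, target_len: int, pad: str = " ") -> str:
    """Truncate to the target length, then pad up to it (pad character is assumed)."""
    truncated = data[:target_len]
    return truncated.ljust(target_len, pad)


short_value = "abc"
long_value = "x" * 150

short_len = len_output_value_calc(short_value)
long_len = len_output_value_calc(long_value)

short_rep = unparse_data(short_value, short_len)
long_rep = unparse_data(long_value, long_len)
print(short_len, long_len, len(short_rep), len(long_rep))
```

Because the target length is never smaller than the string's own length here, the truncation step never removes anything in this particular example.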
> We can detect this error at runtime, as the ElementRuntimeData structure
> contains optTruncateSpecifiedLengthString, which can be examined at runtime
> by the valueLength function, which can error that dfdl:valueLength is being
> called on a truncated string, and the diagnostic can suggest that
> fn:string-length is preferable.
> However, it's not an error to call dfdl:valueLength on a string that may be
> truncated. It's only an error to do so in a way that creates this deadlock.
> The DFDL spec does not preclude calling dfdl:valueLength on a string element
> that might be truncated, and the spec is clear that this would be the
> post-truncation value-region length.
> So we need a mechanism that produces a runtime SDE, not every time
> dfdl:valueLength is called on a string that might be truncated, but only
> when a deadlocked cycle is found and examining it shows that it is caused
> by taking dfdl:valueLength of a string that might be truncated. The runtime
> SDE about this particular cyclic definition problem would then be issued
> only in the cases where the call actually creates a cycle.
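
One way to picture such a mechanism is sketched below. It is a toy model, not Daffodil code: `find_cycle`, `diagnose`, the node naming scheme, and the message wording are all hypothetical. The idea is to report the error only when a cycle exists and contains a valueLength node for an element that may be truncated.

```python
def find_cycle(waits_on):
    """Return one dependency cycle as a list of node names, or None."""
    visiting, visited = [], set()

    def dfs(node):
        if node in visiting:
            return visiting[visiting.index(node):]
        if node in visited:
            return None
        visiting.append(node)
        for dep in sorted(waits_on.get(node, ())):
            cycle = dfs(dep)
            if cycle:
                return cycle
        visiting.pop()
        visited.add(node)
        return None

    for start in sorted(waits_on):
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


def diagnose(waits_on, truncatable):
    """Return a diagnostic string for a deadlock, or None if there is no cycle."""
    cycle = find_cycle(waits_on)
    if cycle is None:
        return None
    culprits = [n for n in cycle
                if n.endswith(".valueLength") and n.rsplit(".", 1)[0] in truncatable]
    if culprits:
        return ("Runtime SDE: dfdl:valueLength of truncatable string in cycle "
                + " -> ".join(cycle) + "; consider fn:string-length")
    return "Deadlock in cycle " + " -> ".join(cycle)


cyclic = {
    "data.length": {"len.value"},
    "len.value": {"data.valueLength"},
    "data.valueLength": {"data.length"},
}
acyclic = {
    "data.length": {"len.value"},
    "len.value": {"data.valueLength"},
    "data.valueLength": set(),
}
print(diagnose(cyclic, {"data"}))
print(diagnose(acyclic, {"data"}))
```

A call to dfdl:valueLength on a truncatable string that does not participate in a cycle produces no diagnostic in this model, matching the requirement that such calls remain legal.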
> That would be ideal, but an acceptable interim solution might be:
> (a) have tutorials about cycles that include this example
> (b) disallow this at schema compile time as something Daffodil does not
> allow, along with the fix, which is to use fn:string-length instead.
> or both.