[ 
https://issues.apache.org/jira/browse/DAFFODIL-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939692#comment-16939692
 ] 

Mike Beckerle commented on DAFFODIL-2208:
-----------------------------------------

This change has not yet been agreed to by the DFDL Workgroup. Daffodil's 
current behavior may in fact be compliant with the spec.

Section 9.2.5 may simply be incorrect that zero-length strings with no framing 
can be NormalRep.

Discussion on the mailing list suggests two modes: one where ZL strings are 
NormalRep and one where ZL strings are EmptyRep, with NormalRep being a NEW 
mode of behavior.

(using dfdlx:emptyElementParsePolicy - a new enum value for it.)
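
To make the two modes concrete, here is a toy sketch in Python. It is not 
Daffodil code and does not use any Daffodil API. It only models how the 
"data//data" line from the issue description quoted below would parse under 
each mode. The function names parse_line and unparse_line are hypothetical. 
The policy strings are the proposed enum names from this issue (subject to 
change), and mapping "treatAsEmpty" to the new NormalRep behavior is my 
reading of the description, not a settled definition.

```python
# Toy model only: this is NOT Daffodil code or its API.
# Policy names mirror the proposed dfdlx:emptyElementParsePolicy values
# mentioned in this issue; the names are subject to change.

def parse_line(line, policy):
    """Split a '/'-separated line into a list of field values (a stand-in infoset)."""
    fields = line.split("/")
    if policy == "treatAsEmpty":
        # A zero-length string counts as a value: keep an empty field element.
        return fields
    if policy == "treatAsMissing":
        # A zero-length string counts as absent: nothing is added for it.
        return [f for f in fields if f != ""]
    raise ValueError(f"unknown policy: {policy}")


def unparse_line(fields):
    """Join field values back into a '/'-separated line."""
    return "/".join(fields)


line = "data//data"
kept = parse_line(line, "treatAsEmpty")
dropped = parse_line(line, "treatAsMissing")

print(kept, unparse_line(kept))        # ['data', '', 'data'] data//data
print(dropped, unparse_line(dropped))  # ['data', 'data'] data/data
```

Only the first mode round trips the input, which is the behavior the issue 
asks for.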

> Empty strings never allowed as optional repeats - not compliant with DFDL 
> spec.
> -------------------------------------------------------------------------------
>
>                 Key: DAFFODIL-2208
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2208
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Back End
>    Affects Versions: 2.4.0
>            Reporter: Mike Beckerle
>            Assignee: Mike Beckerle
>            Priority: Major
>             Fix For: 2.5.0
>
>
> Excerpts here from emails on the [[email protected]|mailto:[email protected]] 
> mailing list.
> {noformat}
> Problem: simple format that is impossible to model
> Mike Beckerle <[email protected]> 1:47 PM (35 minutes ago)
> to DFDL-WG 
> I have a dead-simple little format:
>     data/data/data/data
>     data/data/data/data
> it is lines of "/" separated strings. All elements are optional. 
> I simply want this:
>    data//data
> to round trip. For that to happen I need it to parse into    
> <field>data</field><field></field><field>data</field>
> That is, I require that empty field element in the middle to be created and 
> put into the infoset.
> I can find no way to do this. 
> The strings have no initiator/terminator, so dfdl:emptyValueDelimiterPolicy
> is not relevant. All the elements are optional, so default values 
> aren't relevant.
> The spec states:
> 9.4.2.2      Simple element (xs:string or xs:hexBinary)
> Required occurrence: If the element has a default value then an item is 
> added to the infoset using the default value, otherwise an item is added
> to the Infoset using empty string (type xs:string) or empty hexBinary 
> (type xs:hexBinary) as the value.
> Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none'[12] then
> an item is added to the Infoset using empty string (type xs:string) or empty 
> hexBinary (type xs:hexBinary) as the value, otherwise nothing is added to the 
> Infoset.
> There are errata/actions to clarify wording here around 
> dfdl:emptyValueDelimiterPolicy being in effect or not (because there is 
> no initiator/terminator for it to use as opposed to the property in 
> isolation just being 'none'). 
> But that doesn't change anything about this issue.
> If this very simple format is not possible, then we need a property or new 
> property enum value that makes it possible. 
> Thoughts?{noformat}
> Subsequent to that, I figured out what I believe is the spec flaw.
>  
> {noformat}
> To start discussion on my own issue.....
> The problem here may be that for a string (or hexBinary), if there is no 
> initiator/terminator, there is no way to distinguish EmptyRep from NormalRep.
> I.e., an empty string is a "normal" value for a string.
> Sections 9.2.3 and 9.2.4 seem to define EmptyRep and NormalRep such that an 
> empty string will be an EmptyRep, not a NormalRep.
> However section 9.2.5 on zero-length says:
>    "The normal representation can be a zero-length representation if the type 
> is xs:string or xs:hexBinary and there is no framing."
> That suggests that when there is no framing, a zero-length string is 
> NormalRep, not EmptyRep, which is the opposite conclusion from what is in 
> sections 9.2.3 and 9.2.4.
> If this latter clarification is correct, then my format *should* work as I 
> expect, because the empty string elements will be considered NormalRep and 
> infoset values will be created for them.
> It simply doesn't work because of a bug in daffodil which has not interpreted 
> this correctly.{noformat}
> That's the bug to fix: Strings and HexBinary with no framing are NormalRep, 
> not EmptyRep.
>  
> Note that some tests in our test suite will have to be revised to take this 
> into account.
> Behavior for public schemas should not change, as the above behavior is all 
> subject to the new property (still a proposal) dfdlx:emptyElementParsePolicy 
> being "treatAsEmpty" (the enum names are subject to change).
> The IBM-created schemas for EDIFACT and others depend on a behavior in IBM 
> DFDL that we call dfdlx:emptyElementParsePolicy='treatAsMissing' (again enums 
> subject to change). That behavior doesn't allow empty strings to be 
> distinguished from absent strings. Under that policy the behavior of daffodil 
> shouldn't change, so those schemas should still interoperate.
> This bug fix is needed to implement a generic schema for a format called 
> USMTF, which is, unfortunately, not public. But the simplified examples 
> above illustrate the issue.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
