Hi Steve,

No, that is not a viable solution. As you observed, it doesn't scale. For a 
fixed length field of 10, we would need to specify something like 10-factorial 
alternatives (or is it 2^^10 alternatives?) in dfdl:nilValue.

/Roger

-----Original Message-----
From: Steve Lawrence <slawre...@apache.org> 
Sent: Wednesday, August 10, 2022 8:25 AM
To: users@daffodil.apache.org
Subject: [EXT] Re: Conflicting requirements: fixed length field, nillable, some 
enumeration values shorter than the required length

I'm not sure I love this possible solution, and it doesn't scale very 
well, but what about something like this:

   <element name="field" type="xs:string" nillable="true"
     dfdl:lengthKind="explicit"
     dfdl:length="3"
     dfdl:textStringJustification="left"
     dfdl:textTrimKind="padChar"
     dfdl:textPadKind="padChar"
     dfdl:textStringPadCharacter="%SP;"
     dfdl:nilKind="literalValue"
     dfdl:nilValue="- %SP;- %SP;%SP;-" />

So the field is left-justified and right-padded with spaces. Leading 
spaces are not trimmed, so a field like " A " will show up in the 
infoset with the leading space and fail validation. And the nilValue is 
set to all the combinations of the nil character preceded by zero or 
more spaces.

Like I said, this doesn't scale because you need N nilValues for a 
string of length N. And this doesn't scale at all for delimited-length 
fields where you don't know the length of the field, unless you just add 
a bunch of nilValues up to some size.
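
To make the "N nilValues for a string of length N" count concrete, here is a small Python sketch that builds the list for a given length. It is only an illustration, not anything Daffodil provides: the function name nil_values is made up, and it assumes (as in the schema above) that right padding is trimmed, so only the number of leading spaces distinguishes the alternatives.

```python
def nil_values(length: int, nil_char: str = "-") -> str:
    """Build a whitespace-separated dfdl:nilValue list for a fixed-length field.

    Assumption: the field is left-justified and trailing pad spaces are
    trimmed, so the alternatives differ only in their number of leading
    spaces (0 .. length-1). That gives `length` alternatives in total.
    """
    return " ".join("%SP;" * lead + nil_char for lead in range(length))


if __name__ == "__main__":
    print(nil_values(3))                  # the list used for the length-3 field
    print(len(nil_values(10).split()))    # number of alternatives for length 10
```

For a length of 3 this reproduces the nilValue shown above, and the count grows linearly with the field length.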

If we had something like %SP*; (similar to how we have %WSP*;), then the 
nilValue could just be "%SP*;-" and this would scale without issue, and 
work for both fixed length and delimited length fields. I believe %SP*; 
has come up in the past, so this might be another argument to add it.
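
As a rough model of what a nilValue of "%SP*;-" is meant to accept, here is a Python sketch that classifies a raw field. %SP*; is not an existing DFDL entity, so the regex is only a stand-in for "zero or more spaces, then a hyphen". The names ENUM, NIL and classify, and the returned strings, are invented for illustration. It assumes only right padding is trimmed before the nil and enumeration checks.

```python
import re

ENUM = {"AB", "ABC"}        # enumeration values used in this thread
NIL = re.compile(r" *-")    # stand-in for the hypothetical "%SP*;-"


def classify(field: str) -> str:
    """Classify a raw field as nil, a valid value, or invalid."""
    # Left-justified data: strip trailing pad spaces only, keep leading ones.
    trimmed = field.rstrip(" ")
    if NIL.fullmatch(trimmed):
        return "nil"
    if trimmed in ENUM:
        return "value:" + trimmed
    return "invalid"


if __name__ == "__main__":
    for raw in ["AB ", "ABC", "-  ", " - ", "  -", " AB"]:
        print(repr(raw), "->", classify(raw))
```

Under these assumptions the hyphen is accepted anywhere in the field, while a right-justified value such as " AB" keeps its leading space and is rejected.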


On 8/10/22 7:54 AM, Roger L Costello wrote:
> Thanks Mike. I implemented your approach. It fails to detect invalid input.
> Let me explain.
> 
> Input specifications:
> 
>    * Fixed length field (3)
>    * Nillable, hyphen is the nil value, the hyphen may be anywhere within the
>      3-character field
>    * Values must be left-justified
> 
> Here are examples of valid inputs:
> 
> …/AB /…
> 
> …/ABC/…
> 
> …/-  /…
> 
> …/ - /…
> 
> …/  -/…
> 
> Your solution permits this input (I tested it, Daffodil gives no error or 
> warning):
> 
> …/ AB/…
> 
> Notice that the value is right-justified. That is invalid.
> 
> /Roger
> 
> *From:* Mike Beckerle <mbecke...@apache.org>
> *Sent:* Monday, August 8, 2022 3:58 PM
> *To:* users@daffodil.apache.org
> *Subject:* [EXT] Re: Conflicting requirements: fixed length field, nillable,
> some enumeration values shorter than the required length
> 
> So I think your requirements are this:
> 
> * fixed length 5
> 
> * the hyphen nil indicator may have spaces around it
> 
> * canonical form is left justified for "-" or any value.
> 
> This is the best I could do. I had to surround the nillable element with
> another element so as to get left-justification by way of filling the unused
> region of a complex type with fillByte, which is %SP;.
> 
> If you want center-justified hyphens for the nil case and left-justified
> strings for the value case, then I think it's not possible to model this
> without using separate elements for the nil and value. (That solution is not
> shown here.)
> 
> <element name="Foo"
>   dfdl:length="5"
>   dfdl:lengthKind="explicit"
>   dfdl:terminator="/"
>   dfdl:fillByte="%SP;">
>   <!--
>     The above achieves canonical unparse
>     as left-justified fixed length because
>     the fillByte will be used to fill unused
>     space on the right.
> 
>     This only works for fixed length left-justified data.
>     If this was right-justified, this trick would not work.
>     -->
>   <complexType>
>     <sequence>
>       <!--
>       The below achieves trimming of spaces either side,
>       but only when parsing. Nothing is added when unparsing.
>       -->
>       <element name="value" nillable="true"
>         dfdl:nilValue="-"
>         dfdl:lengthKind="delimited"
>         dfdl:textStringJustification="center"
>         dfdl:textTrimKind="padChar"
>         dfdl:textPadKind="none">
>         <simpleType>
>           <restriction base="xs:string">
>             <enumeration value="AB"/>
>             <enumeration value="ABC"/>
>           </restriction>
>         </simpleType>
>       </element>
>     </sequence>
>   </complexType>
> </element>
> 
> The TDML file I created for this has these tests in it showing that this
> works:
> 
>     <parserTestCase name="foo1" root="Foo" model="s" roundTrip="onePass">
>       <document>-    /</document>
>       <infoset>
>         <dfdlInfoset>
>           <ex:Foo xmlns=""><value xsi:nil="true"/></ex:Foo>
>         </dfdlInfoset>
>       </infoset>
>     </parserTestCase>
> 
>     <parserTestCase name="foo2" root="Foo" model="s" roundTrip="twoPass">
>       <document> -   /</document>
>       <infoset>
>         <dfdlInfoset>
>           <ex:Foo xmlns=""><value xsi:nil="true"/></ex:Foo>
>         </dfdlInfoset>
>       </infoset>
>     </parserTestCase>
> 
>     <parserTestCase name="foo3" root="Foo" model="s" roundTrip="twoPass">
>       <document> AB  /</document>
>       <infoset>
>         <dfdlInfoset>
>           <ex:Foo xmlns=""><value>AB</value></ex:Foo>
>         </dfdlInfoset>
>       </infoset>
>     </parserTestCase>
> 
>     <parserTestCase name="foo4" root="Foo" model="s" roundTrip="onePass">
>       <document>AB   /</document>
>       <infoset>
>         <dfdlInfoset>
>           <ex:Foo xmlns=""><value>AB</value></ex:Foo>
>         </dfdlInfoset>
>       </infoset>
>     </parserTestCase>
> 
> On Mon, Aug 8, 2022 at 10:22 AM Roger L Costello <coste...@mitre.org
> <mailto:coste...@mitre.org>> wrote:
> 
>      Hi Mike,
> 
>      I gave your suggested approach a try. It failed.
> 
>      With this input:
> 
>      …/AB /…
> 
>      it works.
> 
>      With this input:
> 
>      …/ - /…
> 
>      it fails, producing this error:
> 
>      [error] Validation Error: Foo failed facet checks due to: facet
>      enumeration(s): AB|ABC
> 
>      Further, even if the approach were to work with this example where the
>      field length is 3, it would be an untenable approach for longer fixed
>      fields. For example, if the field length was 10, then the nilValue would
>      need something like 10-factorial whitespace-separated values.
> 
>      Do you have another suggested approach?
> 
>      /Roger
> 
>      *From:* Mike Beckerle <mbecke...@apache.org
>      <mailto:mbecke...@apache.org>>
>      *Sent:* Monday, August 8, 2022 9:38 AM
>      *To:* users@daffodil.apache.org <mailto:users@daffodil.apache.org>
>      *Subject:* [EXT] Re: Conflicting requirements: fixed length field,
>      nillable, some enumeration values shorter than the required length
> 
>      I would try making the nilValue "%SP;-%SP; -". That is two separate
>      possibilities for nilValue, one is space-hyphen-space, the other just
>      hyphen. (It's a whitespace-separated list of nil value tokens.)
> 
>      The first one will be used for unparsing. Both will be tried for parsing.
> 
>      That along with justification left might work.
> 
>      On Mon, Aug 8, 2022 at 8:01 AM Roger L Costello <coste...@mitre.org
>      <mailto:coste...@mitre.org>> wrote:
> 
>          Hi Folks,
> 
>          I have an input field that is fixed length (3). If there is no data,
>          the field is to be populated with a hyphen (of course, it must be
>          padded with spaces to the required length). The schema has a
>          simpleType with enumeration facets. Some enumeration values are less
>          than the required length.
> 
>          Here's how I specify the field:
> 
>          <xs:element name="Foo"
>               nillable="true"
>               dfdl:nilKind="literalValue"
>               dfdl:nilValue="-"
>               dfdl:lengthKind="explicit"
>               dfdl:length="3"
>               dfdl:textTrimKind="padChar"
>               dfdl:textPadKind="padChar"
>               dfdl:textStringPadCharacter="%SP;"
>               dfdl:textStringJustification="center">
>               <xs:simpleType>
>                   <xs:restriction base="xs:string">
>                       <xs:enumeration value="AB"/>
>                       <xs:enumeration value="ABC"/>
>                   </xs:restriction>
>               </xs:simpleType>
>          </xs:element>
> 
>          Notice dfdl:textStringJustification="center" which is fine for the
>          nillable value (hyphen) but not for a regular value such as AB which
>          should be left justified. As the schema is, the input could contain
>          this (assume slash separators):
> 
>          .../ AB/...
> 
>          which is incorrect.
> 
>          So, there are conflicting requirements: the nillable value needs
>          dfdl:textStringJustification="center" whereas the normal values need
>          dfdl:textStringJustification="left". What to do about this?
> 
>          /Roger
> 
