So, Daffodil seems to have some bugs here.

I see a couple of problems with your schema.

1) base=" xs:unsignedShort "
I would have expected this to work, but Daffodil doesn't seem to like the 
spaces in this attribute. Removing the spaces fixes one of the crashes I was 
observing. At a minimum, Daffodil has a diagnostics bug, since it crashes 
instead of reporting a clear error, and it may also be a bug that the padded 
value is not accepted at all.

Also, there is not much point in using unsignedShort here. From Daffodil's 
perspective, this just says that the element is an integer type. Daffodil will 
happily populate it with an integer that does not fit into an unsigned short if 
one of your values is sufficiently big. The restrictions on the base type are 
only relevant to XSD validation, which Daffodil does not do by default. This 
shouldn't cause any issues, but I thought it worth pointing out.

2) dfdlx:repType="xs:unsignedByte"
It might be possible for this to work depending on what settings you put in 
your <dfdl:format> tag. I suspect the issue here is that Daffodil does not know 
how many bytes are in an xs:unsignedByte, because you have 
lengthKind='explicit'. If you set lengthKind='implicit' in your dfdl:format 
annotation, this might work. With lengthKind='explicit', you would need to set 
the minLength and maxLength facets on xs:unsignedByte, which you have no way of 
doing.

Honestly, we hadn't really considered using built in types for the repType. We 
might end up prohibiting it altogether, so I would avoid relying on this 
working based on the <dfdl:format> annotation.

What you are supposed to do here is define your own type with Daffodil 
annotations describing its physical representation. This is why my example 
explicitly defines uint8 instead of using the built-in type.

I was able to get your second example working with the below schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:tns="urn:a"
           xmlns:ex="http://example.com"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:dfdlx="http://www.ogf.org/dfdl/dfdl-1.0/extensions"
           targetNamespace="urn:a" >
  <xs:include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd" />

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format ref="tns:GeneralFormat"
        lengthUnits="bytes"
        byteOrder="littleEndian" bitOrder="leastSignificantBitFirst"
        representation="binary"
       />
    </xs:appinfo>
  </xs:annotation>

<xs:simpleType name="uint8" dfdl:lengthKind="explicit" dfdl:length="1">
  <xs:restriction base="xs:unsignedShort"/>
</xs:simpleType>

<xs:simpleType name="SomeEnumType" dfdlx:repType="tns:uint8">
  <xs:restriction base="xs:unsignedShort">
    <xs:enumeration value="55" dfdlx:repValues="0" />
    <xs:enumeration value="56" dfdlx:repValues="1" />
    <xs:enumeration value="57" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>

<xs:element name="a" type="tns:SomeEnumType"/>

</xs:schema>


Switching back to a string base type did not cause any additional 
difficulties:

<xs:simpleType name="SomeEnumType" dfdlx:repType="tns:uint8">
  <xs:restriction base="xs:string">
    <xs:enumeration value="ENUM_1" dfdlx:repValues="0" />
    <xs:enumeration value="ENUM_2" dfdlx:repValues="1" />
    <xs:enumeration value="ENUM_3" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>
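
In case it helps to picture what the repType/repValues pair does, here is a 
rough Python model of the mapping above. This is only a conceptual sketch, not 
how Daffodil is implemented; the names REP_TO_LOGICAL, parse_enum and 
unparse_enum are made up for illustration, and the ValueError is just a 
stand-in for whatever error Daffodil reports on an unmatched value. The point 
is that the physical representation (one byte, from uint8) is independent of 
the logical type of the enumeration values.

```python
# Conceptual model only: this is NOT Daffodil's implementation.
# REP_TO_LOGICAL mirrors the dfdlx:repValues -> enumeration value pairs above.
REP_TO_LOGICAL = {0: "ENUM_1", 1: "ENUM_2", 2: "ENUM_3"}
LOGICAL_TO_REP = {name: rep for rep, name in REP_TO_LOGICAL.items()}


def parse_enum(data: bytes) -> str:
    """Read the 1-byte physical value (the uint8 repType) and return its logical name."""
    if len(data) < 1:
        raise ValueError("need at least one byte")
    rep = data[0]  # uint8: explicit length of 1 byte
    if rep not in REP_TO_LOGICAL:
        raise ValueError(f"no enumeration matches repValue {rep}")
    return REP_TO_LOGICAL[rep]


def unparse_enum(name: str) -> bytes:
    """Map a logical name back to its 1-byte physical value."""
    return bytes([LOGICAL_TO_REP[name]])


print(parse_enum(b"\x01"))  # ENUM_2
```

Swapping the string names for 55, 56 and 57 gives the integer variant from the 
full schema; the one-byte read stays the same either way.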

> It complains that shorts are 16-bits in length (the repValue base is 8-bits).
I wasn't able to reproduce this particular complaint, but it also sounds like a 
bug. Would you mind posting your complete schema?


________________________________
From: Pirow Engelbrecht <[email protected]>
Sent: Wednesday, October 2, 2019 5:34 AM
To: [email protected] <[email protected]>
Subject: repValue enumaration translation issue


Hello,

This issue was originally posted on Stack Overflow here: 
https://stackoverflow.com/questions/58168427/dfdl-decoding-of-enumerated-binary-data

Since then, I’ve realised that this mailing list is probably the better 
audience :-)

Here is my original post (with some edits to keep things a bit shorter):

I'm currently working on a DFDL schema to translate a legacy (custom) binary 
file format used in a system into either XML or JSON. Some of the binary data 
consists of enumerated values, i.e. the C data type looks like this (and is 
stored as a byte):

typedef enum _SomeEnum
{
  ENUM_1 = 0x00,
  ENUM_2 = 0x01,
  ENUM_3 = 0x02
} SomeEnum;

I can decode the enumeration to a numerical value just fine with this DFDL 
schema code (including checks for speculative parsing):

<xs:element name="SomeEnum" type="xs:unsignedByte">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:assert><![CDATA[{ . lt 3 }]]></dfdl:assert>
    </xs:appinfo>
  </xs:annotation>
</xs:element>

which translates to this XML with the enum field equal to 1 in this instance:

<SomeEnum>1</SomeEnum>

What I would like is to have the ability to translate the decoded enumeration 
value to a string so that the XML result looks like this:

<SomeEnum>ENUM_2</SomeEnum>





Brandon Sloane (Daffodil dev) then responded to the post (also edited, just to 
highlight the preferred solution):

The newest release of Daffodil (2.4.0) includes a DFDL extension designed 
specifically for this problem. Some documentation is available on the Daffodil 
wiki: https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Feature+to+support+enumerations+and+typeValueCalc

The theory here is that you can define a simple type that is a restriction on 
xs:string as an xsd enumeration; then supply the corresponding binary values as 
a DFDL annotation:

<xs:simpleType name="uint8" dfdl:length="1">
  <xs:restriction base="xs:unsignedInt"/>
</xs:simpleType>

<xs:simpleType name="SomeEnumType" dfdlx:repType="tns:uint8">
  <xs:restriction base="xs:string">
    <xs:enumeration value="ENUM_1" dfdlx:repValues="0" />
    <xs:enumeration value="ENUM_2" dfdlx:repValues="1" />
    <xs:enumeration value="ENUM_3" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>

<xs:element name="SomeEnum" type="tns:SomeEnumType" />

The benefit here is that the schema is much more maintainable, and Daffodil 
will perform the lookup using a direct hash-table lookup, instead of needing 
to walk through an if-else tree.



I then ran into some issues with the above recommendation:



Daffodil produces the following error for the above schema:

[error] Schema Definition Error: When lengthKind='implicit', both minLength and 
maxLength facets must be specified.



Adding xs:minLength and xs:maxLength, the parser complains that they need to be 
the same value. Setting them the same, the parser then crashes. Not sure what 
these need to be.



I found this JIRA issue: https://issues.apache.org/jira/browse/DAFFODIL-2146. 
It uses the inputTypeCalcString function with inputValueCalc, but that just 
throws an error saying that inputTypeCalcString is an unsupported function. It 
seems these functions are deprecated in version 2.4.0, even though the fix 
version for the issue is 2.4.0.



What I have realised is that it can translate from one type to another only if 
that type is the exact same length. So this works:

<xs:simpleType name="SomeEnumType" dfdlx:repType="xs:unsignedByte">
  <xs:restriction base=" xs:unsignedByte ">
    <xs:enumeration value="55" dfdlx:repValues="0" />
    <xs:enumeration value="56" dfdlx:repValues="1" />
    <xs:enumeration value="57" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>

The value 0 is translated to 55, 1 to 56 and 2 to 57. But the moment I change 
the translated base to something else, Daffodil doesn’t like it, e.g.

<xs:simpleType name="SomeEnumType" dfdlx:repType="xs:unsignedByte">
  <xs:restriction base=" xs:unsignedShort ">
    <xs:enumeration value="55" dfdlx:repValues="0" />
    <xs:enumeration value="56" dfdlx:repValues="1" />
    <xs:enumeration value="57" dfdlx:repValues="2" />
  </xs:restriction>
</xs:simpleType>

It complains that shorts are 16-bits in length (the repValue base is 8-bits).



Any ideas/help?



Thanks





Pirow Engelbrecht | Senior Design Engineer