I've updated the VCalendar example to fix the typo, and I've wrapped an xs:sequence carrying the dfdl:ref="tns:folded" around the ProdID element.

You are correct that to create a reusable type that includes folding you have to use a complex type, since only complex types can have the xs:sequence needed to carry the layering properties. This problem is an artifact of the non-uniformity of simple/complex types in XSD, and there are lots of places in DFDL like this where you need a complex type in order to describe the representation of what one ultimately thinks of as a simple value, so you end up with the "value element problem". This needs a general fix outside the scope of this layering proposal, along the lines of allowing a simple type to carry a dfdl:hiddenGroupRef property so that the simple element can have a sequence or choice group containing child elements to hold the complex representation of that simple type.

Your second observation, I think, is also correct: after running the decoding layering algorithm, one might have more data than one needs to satisfy the parsing. When parsing, this would be ignored/skipped. Unparsing is a bit trickier, as this data may need to be provided - e.g., as padding - even though it is not carrying any information. It may just be an algorithm requirement. We certainly anticipate that data will have to be byte-oriented, that is, no final partial byte can be represented. So at least filling the final byte out with bits from fillByte may be necessary, but for many algorithms the requirement may be that the data is padded/filled to a certain byte boundary/alignment. It would be the schema author's responsibility to make sure unparsing the data provides a representation to the layering unparser that satisfies these requirements.

I will add something to this effect to the proposal page.

________________________________
From: Steve Lawrence <slawre...@apache.org>
Sent: Monday, April 9, 2018 10:18:29 AM
To: firstname.lastname@example.org; Mike Beckerle
Subject: Re: Simplified DFDL layering/base64 proposal

I like this simplified version a lot!
Some questions:

1) In the VCalendar example, ProdID is an element with a dfdl:ref (typo of dfdl:formatRef) to tns:folded, which contains dfdl:layer* properties. But layering properties are only allowed on an xs:sequence. I assume this was just an example from the old proposal that wasn't fixed up, and should be something like this instead:

    <xs:sequence dfdl:ref="tns:folded">
      <xs:element name="ProdID" type="xs:string"
                  dfdl:initiator="PRODID:" minOccurs="0"/>
    </xs:sequence>

Which raises a small issue with simple types: this layering transform now applies to the initiator/terminator of the simple type. If you do not want a layer to apply to those but only to the value, you'd need to make it a complex type with a "Value" element. I'm not sure this is a big deal, but layering on simple types might get a little messier in some cases if the initiators/terminators shouldn't be transformed.

2) What happens with unused data in an overlying layer? For example, say we have something like:

    <dfdl:defineFormat name="base64">
      <dfdl:format layerTransform="base64" layerLengthKind="explicit"
                   layerLength="8" ... />
    </dfdl:defineFormat>

    <xs:sequence>
      <xs:sequence dfdl:ref="base64">
        <xs:element name="foo" type="xs:string" dfdl:length="3" />
      </xs:sequence>
      <xs:element name="bar" type="xs:string" dfdl:length="3" />
    </xs:sequence>

Assume the data is this:

    Zm9vWA==bar

The first 8 characters are base64 encoded, and decode to "fooX". The foo element would only consume three of those characters, so the last "X" character would not be consumed by foo. The length of the layer transform was 8 characters, so bar would start parsing after that and consume the "bar" letters.

So what happens to the unconsumed "X" character? Is it just thrown away? This seems consistent with how we treat a complex element with explicit length where the children do not consume the full length. Or is this a Runtime SDE?
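To double-check the arithmetic in this example, here is a minimal Python sketch. It only mimics the slicing described above using the standard base64 module; it is not how Daffodil implements layering, and all variable names are made up for illustration.

```python
import base64

# Values taken from the example above; names are illustrative only.
data = "Zm9vWA==bar"
layer_length = 8   # layerLength="8" (characters of encoded data)
foo_length = 3     # dfdl:length="3" on foo
bar_length = 3     # dfdl:length="3" on bar

# The layer covers the first 8 characters of the underlying data.
layer_chars = data[:layer_length]
decoded = base64.b64decode(layer_chars).decode("ascii")

# foo parses from the decoded layer and takes only 3 characters.
foo = decoded[:foo_length]
unconsumed = decoded[foo_length:]   # what is left over inside the layer

# bar parses from the underlying data, right after the layer ends.
bar = data[layer_length:layer_length + bar_length]

print(repr(decoded), repr(foo), repr(unconsumed), repr(bar))
```

The leftover inside the decoded layer is the single "X"; whether that is silently dropped or treated as an error is exactly the open question.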

Related, on the unparse side, when we base64 encode "foo" it is only 4 characters, but the layerLength is 8. Are pad characters inserted to fill that out to 8? Do we need a layerPadCharacter and other related pad properties?

- Steve

On 04/06/2018 04:10 PM, Mike Beckerle wrote:
> Never mind that one. I've simplified it even further:
>
> https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Data+Layering+for+base64+-+Super+Simplified
>
> ________________________________
> From: Mike Beckerle
> Sent: Friday, April 6, 2018 3:12:31 PM
> To: email@example.com
> Subject: Simplified DFDL layering/base64 proposal
>
> On looking into implementation complexity I've come up with simplifications
> that don't reduce expressive power at all, but massively simplify
> implementation (and documentation, and testing...) burdens.
>
> https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Data+Layering+for+base64+-+Simplified