Traditionally the way to do this is with a two-pass approach: use a schema that parses just the headers and treats the payloads as xs:hexBinary blobs. After parsing, extract the hexBinary payloads, concatenate them together, and then parse that result with a different schema.
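
For the first pass, a rough sketch of that first schema might look like the following (the names "capture", "packet", "payloadLength", and "payloadPiece" and the header layout are placeholders, since I don't know your packet format):

  <element name="capture">
    <complexType>
      <sequence>
        <element name="packet" maxOccurs="unbounded"
                 dfdl:occursCountKind="implicit">
          <complexType>
            <sequence>
              <!-- header fields for one packet, whatever they really are -->
              <element name="payloadLength" type="xs:unsignedShort" />
              <!-- the payload piece is kept opaque in this pass -->
              <element name="payloadPiece" type="xs:hexBinary"
                       dfdl:lengthKind="explicit"
                       dfdl:lengthUnits="bytes"
                       dfdl:length="{ ../payloadLength }" />
            </sequence>
          </complexType>
        </element>
      </sequence>
    </complexType>
  </element>

The concatenation between the two passes happens outside of Daffodil, in whatever code drives the parse.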

A possible alternative (though I don't think this has been done before) would be to do it in one pass using a custom Layer. The new Layer would read the Headers and Extended Headers and pass through only the payloads, so that Daffodil only ever sees the reassembled payload. Your schema then just becomes something like this:

  <element name="file">
    <complexType>
      <sequence>
        <sequence dfdl:layer="PayloadReassembler">
          <element ref="ex:payload" />
        </sequence>
      </sequence>
    </complexType>
  </element>

And your "payload" element can assume the it's just parsing the assembled 
payload.
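
For example, a minimal sketch of what that global payload element could look like (the fields "nameLength" and "name" are made up; the point is just that variable-length fields no longer have to care about packet boundaries):

  <element name="payload">
    <complexType>
      <sequence>
        <element name="nameLength" type="xs:unsignedShort" />
        <element name="name" type="xs:string"
                 dfdl:lengthKind="explicit"
                 dfdl:lengthUnits="bytes"
                 dfdl:length="{ ../nameLength }" />
      </sequence>
    </complexType>
  </element>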

This would mean that parsing of the Header and Extended Header is done in code in the Layer and wouldn't even be part of the infoset, which isn't necessarily ideal (often the point of Daffodil is to avoid code specific to one format), but with small enough headers it may not be a big deal.

And on unparsing, the layer would have to recreate the Headers/Extended Headers and split the payload.


On 2024-04-10 02:07 PM, Larry Barber wrote:
Does anyone know of a way to handle data that is split into separate pieces?

I can parse the payload normally, but due to variable length fields, etc., it can span multiple packets – as shown in the diagram below.

I can’t think of a way to allow parsing of the first packet to complete without all of the data being present, and then continuing the parse in the second (or third, fourth, etc.) packet(s).

