Another option is something like the schema below. Unlike Mike's
snippet, it doesn't always put the value in a "value" element regardless
of the type, but it does avoid some duplication.

  <xs:sequence>
    <xs:element name="name" type="xs:string" dfdl:terminator=":" />
    <xs:choice>
      <xs:choice dfdl:initiatedContent="yes">
        <xs:element name="base64" dfdl:initiator=":" ... />
        <xs:element name="uri" dfdl:initiator=">" ... />
      </xs:choice>
      <xs:element name="string" type="xs:string" ... />
    </xs:choice>
  </xs:sequence>

So it uses a terminator to find the end of name, then uses nested
choices with initiators to determine whether the next thing is base64
content or a uri, defaulting to string if neither initiator is present.
As in Mike's snippet, this also changes the separator to a terminator so
that the colon is not in scope when parsing the values.
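
For anyone who wants to see that decision logic outside of DFDL, here is
a minimal Python sketch of the same three-way dispatch. It is not what a
DFDL processor does internally, only the same order of checks written
out by hand. The function name classify and the choice to strip the
whitespace after the marker are assumptions made for illustration; the
thread does not say how that space should be handled.

```python
def classify(line):
    """Split 'name<marker> value' and report how the value is encoded."""
    # Everything up to the first colon is the name (the ":" terminator).
    name, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("no colon found in line: %r" % line)
    # A second ":" right after the terminator marks base64 content,
    # a ">" marks a uri; anything else falls through to a plain string.
    if rest.startswith(":"):
        kind, value = "base64", rest[1:]
    elif rest.startswith(">"):
        kind, value = "uri", rest[1:]
    else:
        kind, value = "string", rest
    # Assumption: whitespace around the value is padding, not data.
    return name, kind, value.strip()
```

With Roger's examples, "name: value" comes back as a string, a
"name:: ..." line as base64, and "name:> file:///..." as a uri. The
colons inside the URL are left alone because only the first colon ends
the name.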
- Steve
On 03/26/2018 12:40 PM, Mike Beckerle wrote:
> I would suggest this sort of thing.
>
>
> <xs:choice>
>   <xs:sequence>
>     <xs:element name="uri" type="tns:upaDummy"/>
>     <xs:element name="name" type="tns:nameType" dfdl:terminator=":>"/>
>     <xs:element name="value" type="tns:URLType" dfdl:terminator="%NL;"/>
>   </xs:sequence>
>   <xs:sequence>
>     <xs:element name="b64" type="tns:upaDummy"/>
>     <xs:element name="name" type="tns:nameType" dfdl:terminator="::"/>
>     <xs:element name="value" type="tns:Base64Type"
>                 dfdl:terminator="... whatever defines end of base64 ..."/>
>   </xs:sequence>
>   <xs:sequence>
>     <xs:element name="str" type="tns:upaDummy"/>
>     <xs:element name="name" type="tns:nameType" dfdl:terminator=":"/>
>     <xs:element name="value" type="xs:string" dfdl:terminator="%NL;"/>
>   </xs:sequence>
> </xs:choice>
>
>
> The uri, b64, and str elements are dummy flag elements, which are
> unfortunately unavoidable because of XSD's Unique Particle Attribution (UPA)
> restriction.
>
> The type upaDummy should define them to be fixed-length, zero-length strings.
>
>
> Conversion of separators to terminators here is not arbitrary. When parsing a
> base64 value, the above will work even if "::" were legal base64 syntax,
> because there's no separator in scope surrounding the base64 value element.
>
>
> ...mike beckerle
>
> Tresys
>
>
>
>
> --------------------------------------------------------------------------------
> *From:* Costello, Roger L. <[email protected]>
> *Sent:* Monday, March 26, 2018 11:05:18 AM
> *To:* [email protected]
> *Subject:* RE: How to parse a line that is delimited by a colon but sometimes
> has two colons?
> I would like to generalize my question a bit.
>
> Not only can there be two consecutive colons:
>
> name:: value
>
> (the second colon indicates the value is base64 text)
>
> But there can be colons within value, e.g.
>
> name:> file:///usr/local/directory/photos/fiona.jpg
>
> (the > symbol indicates the value is a url, and the url may contain a colon)
>
> So, how to express this in DFDL?
>
> /Roger
>
> -----Original Message-----
> From: Costello, Roger L.
> Sent: Monday, March 26, 2018 10:42 AM
> To: [email protected]
> Subject: How to parse a line that is delimited by a colon but sometimes has
> two colons?
>
> Hello DFDL experts!
>
> I am using DFDL to parse lines that look like this:
>
> name: value
>
> I am using this DFDL code to parse the lines:
>
> <xs:sequence dfdl:separator=":" dfdl:separatorPosition="infix">
>   <xs:element name="name" type="xs:string" />
>   <xs:element name="value" type="xs:string" />
> </xs:sequence>
>
> If the value is base64 text, then a double colon is used:
>
> name:: base64-value
>
> The above DFDL code doesn't seem to work in this situation. What's the correct
> way to write DFDL code which can handle lines with a single colon as well as
> lines with a double colon?
>
> /Roger
>