Created https://issues.apache.org/jira/browse/DAFFODIL-2876 to track this
issue.

I've sent email to the other members of dfdl-wg at Open Grid Forum for
consideration as a DFDL spec issue.

On Fri, Feb 16, 2024 at 1:18 PM Claude Mamo <claude.m...@gmail.com> wrote:

>> Is the EDI example you provided a real one? Can you give example data that
>> shows the need?
>>
>
> This example was taken from
> https://github.com/smooks/smooks-edi-cartridge/blob/v2.0.0-RC3/common-schemas/src/main/resources/EDIFACT-Common/IBM_EDI_Format.dfdl.xsd#L122-L125
> which is based on
> https://github.com/DFDLSchemas/EDIFACT/blob/master/EDIFACT-Common/IBM_EDI_Format.xsd
> . In my particular issue, the *CompositeSep* variable is set to "*^*" but
> the *extraEscapedCharacters* are hard-coded to *+ : '*. When I unparse an
> infoset, instead of obtaining the following:
>
> HDR*1*0*59.97*64.92*4.95*Wed Nov 15 13:45:28 EST 2006
> CUS*user1*Harry^Fletcher*SD
> ORD*1*1*364*The 40-Year-Old Virgin*29.98
> ORD*2*1*299*Pulp Fiction*29.99
>
> I'm getting:
>
> HDR*1*0*59.97*64.92*4.95*Wed Nov 15 13*?:*45*?:*28 EST 2006
> CUS*user1*Harry^Fletcher*SD
> ORD*1*1*364*The 40-Year-Old Virgin*29.98
> ORD*2*1*299*Pulp Fiction*29.99
>
> Note the time colon separators are escaped when they shouldn't be.
>
>> If you didn't have this, is there any sensible workaround?
>>
>
> The only easy workaround I can think of is having the user override the
> *EDIFormat* so that it references an escape scheme where the
> *extraEscapedCharacters* are set correctly but, as one can imagine, that's
> not ideal.
>
> On Fri, Feb 16, 2024 at 4:05 PM Mike Beckerle <mbecke...@apache.org>
> wrote:
>
>> We could do this. It does feel inconsistent that escapeCharacter and
>> escapeEscapeCharacter are dynamic, but extraEscapedCharacters is not.
>>
>> I think the design principle at work here was that things associated with
>> escape characters need to be dynamic, but other kinds of escape
>> schemes involving escape block start/end, are not.
>> The extraEscapedCharacters property was a late addition to the DFDL spec,
>> so I think it didn't get the scrutiny needed to notice this.
>>
>> Is the EDI example you provided a real one? Can you give example data
>> that shows the need?
>>
>> If you didn't have this, is there any sensible work around?
>>
>> I will send this over to the DFDL Workgroup mailing list as well to
>> discuss the DFDL spec amendment you are suggesting.
>>
>> In anticipation of this, what are some error situations? E.g., I expect
>> it's a Schema Definition Error if any of the extraEscapedCharacters
>>
>> * are the same as the escape or escapeEscape character
>> * are the same as the first character of any delimiter in scope
>> * are not unique
>>
>> Daffodil is, of course, open-source so if you want this feature you can
>> have it. I think it would not be terribly hard to add this to Daffodil, as
>> one would just look at how escapeCharacter works, and copy that same idea
>> for extraEscapedCharacters.
>>
>>
>>
>> On Fri, Feb 16, 2024 at 2:40 AM Claude Mamo <claude.m...@gmail.com>
>> wrote:
>>
>>> Dear community,
>>>
>>> Is there a way to set *extraEscapedCharacters* in *escapeScheme*
>>> dynamically like so:
>>>
>>> <dfdl:defineEscapeScheme name="EDIEscapeScheme">
>>>   <dfdl:escapeScheme escapeKind="escapeCharacter"
>>> escapeCharacter="{$ibmEdiFmt:EscapeChar}"
>>> escapeEscapeCharacter="{$ibmEdiFmt:EscapeChar}"
>>> extraEscapedCharacters="{$ibmEdiFmt:extraEscapedCharacters}"/>
>>> </dfdl:defineEscapeScheme>
>>>
>>> My understanding is that it doesn't accept DFDL expressions and when I
>>> attempt to reference a variable I get:
>>>
>>> Schema Definition Error: For property dfdl:extraEscapedCharacters the
>>>> length of string must be exactly 1 character
>>>
>>>
>>> Perhaps support for DFDL expressions can be added to the DFDL spec?
>>>
>>> Claude
>>>
>>>
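For readers following the thread, here is a rough Python sketch of the
unparse behavior Claude describes. It is not Daffodil code: the function
escape_on_unparse and its parameters are hypothetical, every delimiter is
assumed to be a single character, and the escape character is assumed to
be "?" (as the "?:" in the reported output suggests, reading the asterisks
around it as emphasis markers). It only illustrates why a hard-coded
extraEscapedCharacters list of + : ' escapes the time colons even though
the data uses * and ^ as delimiters.

```python
def escape_on_unparse(text, escape_char, delimiters, extra_escaped):
    """Prefix each character that needs escaping with escape_char.

    Simplified model (an assumption, not the DFDL algorithm): a character
    is escaped if it is an in-scope delimiter, one of the extra escaped
    characters, or the escape character itself.
    """
    needs_escape = set(delimiters) | set(extra_escaped) | {escape_char}
    return "".join(
        escape_char + ch if ch in needs_escape else ch for ch in text
    )


field = "Wed Nov 15 13:45:28 EST 2006"

# Extra escaped characters hard-coded to the EDIFACT defaults: + : '
hard_coded = escape_on_unparse(field, "?", delimiters="*^", extra_escaped="+:'")

# No extra escaped characters beyond the delimiters actually in use.
matched = escape_on_unparse(field, "?", delimiters="*^", extra_escaped="")

print(hard_coded)  # Wed Nov 15 13?:45?:28 EST 2006
print(matched)     # Wed Nov 15 13:45:28 EST 2006
```

With the hard-coded list the colons are escaped although nothing in the
data treats ":" as a delimiter; when the list matches the delimiters in
use, the field is written unchanged.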

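The three error situations Mike tentatively lists could be checked along
the following lines. This is only a sketch in Python: the function name
check_extra_escaped and the message wording are made up for illustration,
the checks are Mike's suggestions rather than anything the DFDL spec
currently states, and the property value is assumed to be a
whitespace-separated list of single characters.

```python
def check_extra_escaped(extra_escaped, escape_char, escape_escape_char, delimiters):
    """Return a list of problems found in an extraEscapedCharacters value.

    extra_escaped is the raw property value, e.g. "+ : '".
    delimiters is a list of delimiter strings assumed to be in scope.
    """
    problems = []
    chars = extra_escaped.split()

    # Proposed check: the characters must be unique.
    if len(set(chars)) != len(chars):
        problems.append("extraEscapedCharacters are not unique")

    for ch in sorted(set(chars)):
        # Proposed check: must differ from escape and escapeEscape characters.
        if ch in (escape_char, escape_escape_char):
            problems.append(f"{ch!r} is the escape or escapeEscape character")
        # Proposed check: must differ from the first character of any delimiter.
        if any(d.startswith(ch) for d in delimiters):
            problems.append(f"{ch!r} starts a delimiter in scope")

    return problems


print(check_extra_escaped("+ : '", "?", "?", ["*", "^"]))  # []
print(check_extra_escaped("? + +", "?", "?", ["+"]))
```

The first call reports no problems; the second reports a duplicate, a
clash with the escape character, and a clash with a delimiter.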