That sounds great, cheers!

On Fri, Feb 16, 2024 at 10:55 PM Mike Beckerle <mbecke...@apache.org> wrote:
> Created https://issues.apache.org/jira/browse/DAFFODIL-2876 to track this
> issue.
>
> I've sent email to the other members of dfdl-wg at Open Grid Forum for
> consideration as a DFDL spec issue.
>
> On Fri, Feb 16, 2024 at 1:18 PM Claude Mamo <claude.m...@gmail.com> wrote:
>
>>> Is the EDI example you provided a real one? Can you give example data
>>> that shows the need?
>>
>> This example was taken from
>> https://github.com/smooks/smooks-edi-cartridge/blob/v2.0.0-RC3/common-schemas/src/main/resources/EDIFACT-Common/IBM_EDI_Format.dfdl.xsd#L122-L125
>> which is based on
>> https://github.com/DFDLSchemas/EDIFACT/blob/master/EDIFACT-Common/IBM_EDI_Format.xsd
>>
>> In my particular issue, the *CompositeSep* variable is set to "*^*"
>> but the *extraEscapedCharacters* are hard-coded to *+ : '*. When I
>> unparse an infoset, instead of obtaining the following:
>>
>> HDR*1*0*59.97*64.92*4.95*Wed Nov 15 13:45:28 EST 2006
>> CUS*user1*Harry^Fletcher*SD
>> ORD*1*1*364*The 40-Year-Old Virgin*29.98
>> ORD*2*1*299*Pulp Fiction*29.99
>>
>> I'm getting:
>>
>> HDR*1*0*59.97*64.92*4.95*Wed Nov 15 13*?:*45*?:*28 EST 2006
>> CUS*user1*Harry^Fletcher*SD
>> ORD*1*1*364*The 40-Year-Old Virgin*29.98
>> ORD*2*1*299*Pulp Fiction*29.99
>>
>> Note the time colon separators are escaped when they shouldn't be.
>>
>>> If you didn't have this, is there any sensible work around?
>>
>> The only easy workaround I can think of is having the user override the
>> *EDIFormat* so that it references an escape scheme where the
>> *extraEscapedCharacters* are set correctly but, as one can imagine,
>> that's not ideal.
>>
>> On Fri, Feb 16, 2024 at 4:05 PM Mike Beckerle <mbecke...@apache.org>
>> wrote:
>>
>>> We could do this. It does feel inconsistent that escapeCharacter and
>>> escapeEscapeCharacter are dynamic, but extraEscapedCharacters is not.
>>>
>>> I think the design principle at work here was that things associated
>>> with escape characters need to be dynamic, but other kinds of escape
>>> schemes, those involving escape block start/end, are not.
>>> The extraEscapedCharacters property was a late addition to the DFDL
>>> spec, so I think it didn't get the scrutiny needed to notice this.
>>>
>>> Is the EDI example you provided a real one? Can you give example data
>>> that shows the need?
>>>
>>> If you didn't have this, is there any sensible work around?
>>>
>>> I will send this over to the DFDL Workgroup mailing list as well to
>>> discuss the DFDL spec amendment you are suggesting.
>>>
>>> In anticipation of this, what are some error situations? E.g., I expect
>>> it's a Schema Definition Error if any of the extraEscapedCharacters
>>>
>>> * are the same as the escape or escapeEscape character
>>> * are the same as the first character of any delimiter in scope
>>> * are not unique
>>>
>>> Daffodil is, of course, open-source so if you want this feature you can
>>> have it. I think it would not be terribly hard to add this to Daffodil, as
>>> one would just look at how escapeCharacter works, and copy that same idea
>>> for extraEscapedCharacters.
>>>
>>> On Fri, Feb 16, 2024 at 2:40 AM Claude Mamo <claude.m...@gmail.com>
>>> wrote:
>>>
>>>> Dear community,
>>>>
>>>> Is there a way to set *extraEscapedCharacters* in *escapeScheme*
>>>> dynamically like so:
>>>>
>>>> <dfdl:defineEscapeScheme name="EDIEscapeScheme">
>>>>   <dfdl:escapeScheme escapeKind="escapeCharacter"
>>>>     escapeCharacter="{$ibmEdiFmt:EscapeChar}"
>>>>     escapeEscapeCharacter="{$ibmEdiFmt:EscapeChar}"
>>>>     extraEscapedCharacters="{$ibmEdiFmt:extraEscapedCharacters}"/>
>>>> </dfdl:defineEscapeScheme>
>>>>
>>>> My understanding is that it doesn't accept DFDL expressions and when I
>>>> attempt to reference a variable I get:
>>>>
>>>>> Schema Definition Error: For property dfdl:extraEscapedCharacters the
>>>>> length of string must be exactly 1 character
>>>>
>>>> Perhaps support for DFDL expressions can be added to the DFDL spec?
>>>>
>>>> Claude
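
For readers following the thread, here is a rough Python sketch of the behaviour under discussion. It is not Daffodil code and does not use any Daffodil API. The function names escape_field and check_extra_escaped are hypothetical, the escape character "?" is inferred from the escaped output quoted in the thread, and delimiters are simplified to single characters with the escape character escaping itself. The three checks are the error situations floated in the thread, not rules taken from the DFDL spec.

```python
def escape_field(value, escape_char, delimiters, extra_escaped):
    """Prefix escape_char to every character that must be escaped on unparse.

    Simplification: delimiters are single characters, and the escape
    character escapes itself (escapeEscapeCharacter == escapeCharacter,
    as in the escape scheme quoted in the thread).
    """
    must_escape = set(delimiters) | set(extra_escaped) | {escape_char}
    return "".join(escape_char + ch if ch in must_escape else ch for ch in value)


def check_extra_escaped(extra_escaped, escape_char, escape_escape_char, delimiters):
    """Return a list of problems, following the three checks floated in the thread."""
    problems = []
    if len(set(extra_escaped)) != len(extra_escaped):
        problems.append("extraEscapedCharacters are not unique")
    for ch in sorted(set(extra_escaped)):
        if ch in (escape_char, escape_escape_char):
            problems.append(f"{ch!r} is the escape or escapeEscape character")
        if any(d.startswith(ch) for d in delimiters):
            problems.append(f"{ch!r} starts a delimiter in scope")
    return problems


timestamp = "Wed Nov 15 13:45:28 EST 2006"
in_scope = ["*", "^"]  # separators actually used by the sample data

# Hard-coded extras (+ : '): the colons in the time are escaped needlessly.
hard_coded = escape_field(timestamp, "?", in_scope, ["+", ":", "'"])

# Extras supplied at runtime (assumed empty here, since the real separators
# are already in scope): the timestamp passes through unchanged.
dynamic = escape_field(timestamp, "?", in_scope, [])

print(hard_coded)
print(dynamic)
```

The point of the sketch is only that the set of extra escaped characters has to follow the separators actually in use. When it is fixed in the schema while the separators come from variables, data such as the timestamp above is escaped when it should not be.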