To use a pattern to discriminate a choice, put a dfdl:discriminator
statement with testKind='pattern' on each branch of the choice. The
discriminator applies a regex to the data stream and fails if the data at
the current position doesn't start with a non-zero-length match of the pattern.

E.g., something like:

<xs:choice>
   <xs:sequence>
      <xs:annotation><xs:appinfo ...>
          <!-- Must begin with from 1 to 10 'a' characters -->
          <dfdl:discriminator testKind="pattern">a{1,10}</dfdl:discriminator>
      </xs:appinfo></xs:annotation>
      ... rest of 'a's branch...
   </xs:sequence>
   <xs:sequence>
      <xs:annotation><xs:appinfo ...>
          <!-- Must begin with from 1 to 10 'b' characters -->
          <dfdl:discriminator testKind="pattern">b{1,10}</dfdl:discriminator>
      </xs:appinfo></xs:annotation>
      ... rest of 'b's branch...
   </xs:sequence>
   ... other branches ...
</xs:choice>
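
The selection rule described above can be sketched outside DFDL. This is a minimal illustration, not Daffodil's implementation: the class and method names are hypothetical, and java.util.regex stands in for the DFDL regex engine, which is an assumption. It shows that a branch is chosen only on a non-zero-length match anchored at the current position, and that the position itself is not advanced.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of pattern-discriminated branch selection.
class PatternDiscriminatorSketch {

    /**
     * Returns the index of the first branch whose pattern matches at pos
     * with non-zero length, or -1 if no branch matches.
     */
    static int selectBranch(List<Pattern> branchPatterns, CharSequence data, int pos) {
        for (int i = 0; i < branchPatterns.size(); i++) {
            Matcher m = branchPatterns.get(i).matcher(data);
            m.region(pos, data.length());
            // lookingAt() anchors the match at the region start but does
            // not require it to reach the end of the data.
            if (m.lookingAt() && m.end() > m.start()) {
                return i; // pos is left untouched: the test consumes nothing
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Pattern> branches =
            List.of(Pattern.compile("a{1,10}"), Pattern.compile("b{1,10}"));
        System.out.println(selectBranch(branches, "bbb123", 0)); // prints 1
        System.out.println(selectBranch(branches, "xyz", 0));    // prints -1
    }
}
```

The caller still has to parse the matched characters after a branch is selected, since only the branch index is returned.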

Did that address your need?

One detail: matching the discriminator pattern does not consume any data.
When the branch is selected and parsing starts, you are still positioned at
the start of the data, not after the pattern match. An element has to
absorb these characters before the remainder of the branch can be parsed.
Sometimes that's easy; sometimes a slightly different approach makes more sense:

<xs:choice>
   <xs:sequence>
      <!--
        Element that matches the pattern. If there is no match, this will be
        a zero-length string.
      -->
      <xs:element name="aaBranchTag" type="xs:string"
            dfdl:lengthKind="pattern" dfdl:lengthPattern="a{1,10}">
        <xs:annotation><xs:appinfo ...>
          <!--
            If the match contains 1 or more characters, this is the right
            branch. The discriminator sits on the element itself because XSD
            requires xs:annotation to be the first child of xs:sequence.
          -->
          <dfdl:discriminator>{ fn:string-length(.) gt 0 }</dfdl:discriminator>
        </xs:appinfo></xs:annotation>
      </xs:element>
      ... rest of 'a's branch...starting from after the run of 'a's.
   </xs:sequence>
   .... other branches work similarly ....
</xs:choice>
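
For comparison, the consuming variant can be sketched the same way. This is a hypothetical illustration, not Daffodil code: parseByPattern is an invented name, and java.util.regex is assumed as a stand-in for the DFDL regex engine. The tag is parsed by pattern first (empty string on no match), the branch test is a plain length check, and the rest of the branch starts after the tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of "parse the tag by pattern, then test its length".
class LengthPatternSketch {

    /** Returns the text matched at pos, or "" when the pattern does not match there. */
    static String parseByPattern(Pattern lengthPattern, CharSequence data, int pos) {
        Matcher m = lengthPattern.matcher(data);
        m.region(pos, data.length());
        return m.lookingAt() ? m.group() : "";
    }

    public static void main(String[] args) {
        String data = "aaa|rest";
        String tag = parseByPattern(Pattern.compile("a{1,10}"), data, 0);
        boolean branchSelected = tag.length() > 0; // the discriminator test
        int nextPos = tag.length();                // data after the run of 'a's
        System.out.println(branchSelected + " " + data.substring(nextPos));
    }
}
```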





________________________________
From: Patrick Grandjean <p.r.grandj...@gmail.com>
Sent: Thursday, June 18, 2020 3:28 PM
To: users@daffodil.apache.org <users@daffodil.apache.org>
Subject: dfdl:choiceDispatchKey

Hi!

I recently started using dfdl:choiceDispatchKey. According to the 
documentation, it only accepts DFDL expressions. Is it possible to use DFDL 
regular expressions instead? Or is there an alternative that would accept 
regexes?

Patrick.

On Jun 18, 2020, at 9:31 AM, Beckerle, Mike <mbecke...@tresys.com> wrote:


I don't think maxLength is usable on anything but string and hexBinary.

For integers you can use maxInclusive or maxExclusive.
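
As a rough sketch of how a digit limit translates into a maxInclusive bound: the largest non-negative integer with at most N decimal digits is 10^N - 1. The class and method names below are hypothetical, and the sketch assumes the fields really are non-negative integers, which this thread asks about but does not confirm.

```java
import java.math.BigInteger;

// Hypothetical helper: derive a maxInclusive bound from a digit limit.
class DigitBoundSketch {

    /** Largest non-negative integer with at most `digits` decimal digits: 10^digits - 1. */
    static BigInteger maxInclusiveFor(int digits) {
        return BigInteger.TEN.pow(digits).subtract(BigInteger.ONE);
    }

    /** True when value is non-negative and has at most `digits` decimal digits. */
    static boolean withinDigits(BigInteger value, int digits) {
        return value.signum() >= 0 && value.compareTo(maxInclusiveFor(digits)) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(maxInclusiveFor(10)); // prints 9999999999
        System.out.println(maxInclusiveFor(18)); // prints 999999999999999999
    }
}
```

Those two values would be the maxInclusive bounds for a 10-digit and an 18-digit unsigned field respectively.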
________________________________
From: Steve Lawrence <slawre...@apache.org>
Sent: Thursday, June 18, 2020 9:10 AM
To: users@daffodil.apache.org <users@daffodil.apache.org>
Subject: Re: java.lang.ArithmeticException: Overflow

Thanks for the contribution! I've taken a look and made a few comments
in the PR. I think the totalDigits logic is actually much more
complicated than I originally thought it would be.

Regarding the release, I'm not personally against a patch release, but
keep in mind that the Daffodil release process is fairly slow. We need
to start a DISCUSS thread on the dev mailing list which must remain open
for at least 72 hours and until discussions die down. We then must
create the release and start a VOTE thread on the dev list. The VOTE
takes a minimum of 72 hours. If that passes, we must then start a VOTE
on the incubator mailing list, which also takes a minimum of 72 hours,
but in my experience takes more like a week. And then the jars take
about a day or so to sync to maven repositories. So the whole process
generally takes about 2 weeks.

Depending on how soon you're trying to get a release out, it might make
sense to see if we can come up with an alternative to totalDigits.

Looking at the schema, to me it looks like the numeric1-10 and
numeric1-18 simple types that have this totalDigits issue are used for
fields that imply they are unsigned integer types rather than decimal
(e.g. NumberOfSegmentsInAMessage, NumberOfSecuritySegments,
LengthOfDataInOctetsOfBits). Is that the case? Are these fields allowed
to have decimal points? If not, rather than using an xs:decimal base you
could use an xs:unsignedInteger base and then use the maxLength
restriction rather than totalDigits. As far as I'm aware maxLength
doesn't have the same issue as totalDigits.

On 6/17/20 4:01 PM, Claude Mamo wrote:
> I gave it a go Steve and opened a PR: 
> https://github.com/apache/incubator-daffodil/pull/387. Assuming it's merged, 
> can we have a patch release for this? We are preparing to release the EDI 
> cartridge for Smooks 2 (https://github.com/smooks/smooks-edi-cartridge) but 
> there are many places in our generated EDIFACT schemas with totalDigits > 9.
>
> Claude
>
> On 2020/06/09 11:41:20, Steve Lawrence <slawre...@apache.org> wrote:
>> Looks like totalDigits is just plain broken for anything greater than
>> or equal to 10, which is pretty bad. Looking at the code I *think* the
>> totalDigits check will always succeed if the value being validated is
>> negative, regardless of the number of digits.
>>
>> I've created DAFFODIL-2349 [1] to track this issue.
>>
>> If you (or anyone else) are interested in getting involved in Daffodil
>> development, this would be a good one to get your feet wet. The fix
>> should be pretty self-contained to one file/function.
>>
>> Unfortunately, I'm not sure if there's a good workaround using just
>> restrictions.
>>
>> Best I can come up with is have an inputValueCalc that strips out a
>> negative sign and decimal place, and then restrict that to a length of
>> 10, e.g.
>>
>>   <xs:element name="unscaled" dfdl:inputValueCalc="{
>>     fn:concat(
>>       fn:substring-before(xs:string(fn:abs(../value)), '.'),
>>       fn:substring-after(xs:string(../value), '.')
>>     )
>>   }">
>>     <xs:simpleType>
>>       <xs:restriction base="xs:string">
>>         <xs:length value="10"/>
>>       </xs:restriction>
>>     </xs:simpleType>
>>   </xs:element>
>>
>> That's pretty terrible though. Maybe someone else can come up with a
>> better workaround?
>>
>> [1] https://issues.apache.org/jira/browse/DAFFODIL-2349
>>
>>
>> On 6/9/20 2:29 AM, Claude Mamo wrote:
>>> Hello Daffodil team,
>>>
>>> Not sure if what I'm getting is expected behaviour. I have a facet defined 
>>> as follows:
>>>
>>> ```
>>>     <xsd:simpleType dfdl:textNumberPattern="#.#" name="numeric1-10">
>>>         <xsd:restriction base="xsd:decimal">
>>>             <xsd:totalDigits value="10"/>
>>>         </xsd:restriction>
>>>     </xsd:simpleType>
>>> ```
>>>
>>> When attempting to parse a file with full validation turned on, Daffodil 
>>> 2.6 throws an exception saying:
>>>
>>> ```
>>> org.apache.daffodil.exceptions.Abort:
>>> Invariant broken. Exception thrown with mark not returned: java.lang.ArithmeticException: Overflow
>>> StackTrace:
>>> java.lang.ArithmeticException: Overflow
>>>         at java.math.BigDecimal.intValueExact(BigDecimal.java:3180)
>>>         at org.apache.daffodil.processors.SimpleTypeRuntimeData.checkTotalDigits(RuntimeData.scala:526)
>>>         at org.apache.daffodil.processors.SimpleTypeRuntimeData.$anonfun$executeFacetCheck$8(RuntimeData.scala:431)
>>>         at org.apache.daffodil.processors.SimpleTypeRuntimeData.$anonfun$executeFacetCheck$8$adapted(RuntimeData.scala:427)
>>> ```
>>>
>>> Should I create a bug report? Any suitable alternatives to "totalDigits"?
>>>
>>> Claude
>>>
>>>
>>>
>>
>>
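
The Overflow in the quoted stack trace comes from BigDecimal.intValueExact, which throws whenever the value does not fit in an int. The sketch below reproduces that in isolation and shows one way to count digits without converting to int. It does not reproduce Daffodil's checkTotalDigits: the names digitCount and fitsTotalDigits are hypothetical, and the counting rule (|i| < 10^totalDigits and scale <= totalDigits) is one reading of the XSD facet, stated here as an assumption.

```java
import java.math.BigDecimal;

// Hypothetical sketch of a totalDigits check that never narrows to int.
class TotalDigitsSketch {

    /** Digits needed to write the value, ignoring sign and trailing zeros after the point. */
    static int digitCount(BigDecimal value) {
        BigDecimal v = value.abs().stripTrailingZeros();
        if (v.scale() < 0) {
            v = v.setScale(0); // e.g. 1E+2 becomes 100, so integer zeros count
        }
        // precision covers the unscaled digits; scale covers leading
        // fractional zeros such as 0.001
        return Math.max(v.precision(), v.scale());
    }

    static boolean fitsTotalDigits(BigDecimal value, int totalDigits) {
        return digitCount(value) <= totalDigits;
    }

    public static void main(String[] args) {
        BigDecimal big = new BigDecimal("9999999999"); // 10 digits, larger than Integer.MAX_VALUE
        try {
            big.intValueExact();
        } catch (ArithmeticException e) {
            System.out.println("intValueExact failed: " + e.getMessage());
        }
        System.out.println(fitsTotalDigits(big, 10));                            // prints true
        System.out.println(fitsTotalDigits(new BigDecimal("-12345678901"), 10)); // prints false
    }
}
```

Because the sign is dropped before counting, negative values are held to the same digit limit as positive ones.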
