Is there a way to specify that a negative number may be represented by either a 
number in parentheses or by the traditional representation (negative sign 
preceding a series of digits)?



This pattern raises an error:



textNumberPattern="#;-#,(#)"



Here's the error message:



[error] Schema Definition Error: Invalid textNumberPattern: Malformed pattern 
for ICU DecimalFormat: "#;-#,(#)": Trailing grouping separator is invalid at 
position 5



/Roger
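
A possible reading of the error, offered as a sketch rather than a confirmed answer: in an ICU DecimalFormat pattern the comma is the grouping separator, not an "or" operator, and a pattern has at most one negative subpattern after the semicolon. That would explain why "-#,(#)" is reported as ending in a grouping separator. One untested workaround is an xs:choice whose branches each carry a single negative form. The element names below are hypothetical, and the fragment assumes the xs: and dfdl: prefixes are bound and that all other required properties come from a default format.

```xml
<!-- Untested sketch. MinusNegative and ParenNegative are hypothetical names. -->
<xs:choice>
  <!-- Tried first: intended to match 12 and -12 -->
  <xs:element name="MinusNegative" type="xs:integer"
      dfdl:textNumberPattern="#;-#" />
  <!-- Tried only if the first branch fails: intended to match (12) -->
  <xs:element name="ParenNegative" type="xs:integer"
      dfdl:textNumberPattern="#;(#)" />
</xs:choice>
```

The trade-off is that the element name in the infoset then depends on which notation appeared in the data.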



-----Original Message-----
From: Steve Lawrence <slawre...@apache.org>
Sent: Wednesday, August 14, 2019 7:46 AM
To: users@daffodil.apache.org
Subject: [EXT] Re: How to model a fixed-length integer that may be padded with 
space on the left?



That is a function of the value of dfdl:textNumberPattern property, which 
actually describes two subpatterns for the format of text numbers--one for 
positive values and one for negative values. The full syntax is



  dfdl:textNumberPattern="positivePattern;negativePattern"



For example, you'll sometimes see formats where the negative number is wrapped 
in parentheses instead of prefixed with a minus sign, so the property would 
look something like this:



  dfdl:textNumberPattern="#,##0.###;(#,##0.###)"



If the semicolon and negativePattern are omitted, the negative pattern is 
assumed to be the same as positivePattern but with a minus sign prefix.
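
As an illustration of the two-subpattern syntax, here is a minimal sketch of an element that uses the parenthesized negative form. The element name Balance is hypothetical, and the fragment assumes the xs: and dfdl: prefixes are bound and that any remaining required properties come from a default format. The decimal and grouping separators are set explicitly because the pattern contains "." and ",".

```xml
<!-- Sketch only. Balance is a hypothetical element name. -->
<xs:element name="Balance" type="xs:decimal"
    dfdl:representation="text"
    dfdl:textNumberRep="standard"
    dfdl:textNumberPattern="#,##0.###;(#,##0.###)"
    dfdl:textStandardDecimalSeparator="."
    dfdl:textStandardGroupingSeparator="," />
```

With a declaration along these lines, "1,234.5" would be expected to parse as a positive value and "(1,234.5)" as a negative one.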



If you require a positive sign at the beginning of the number, you want a 
pattern that's something like this:



  dfdl:textNumberPattern="+#,##0.###;-#,##0.###"



Note how the positive subpattern has a plus sign prefix and the negative 
subpattern has a minus sign prefix.



However, this now requires that positive numbers always have a plus sign, so 
your "12" will fail to parse. Unfortunately, there's no way in the pattern 
syntax to make the plus sign prefix optional.



But if you set dfdl:textNumberCheckPolicy="lax", then we do ignore leading plus 
signs in the data, and whatever pattern you're currently using should work for 
both "12" and "+12".
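
A minimal sketch of that setting, applied to the NumberOfStudents element from earlier in the thread. The pattern "#" is an assumption (the pattern actually in use is not shown here), and the remaining required properties are assumed to come from a default format.

```xml
<!-- Sketch only. The "#" pattern is an assumed placeholder. -->
<xs:element name="NumberOfStudents" type="xs:integer"
    dfdl:textNumberPattern="#"
    dfdl:textNumberCheckPolicy="lax" />
```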



On 8/14/19 7:19 AM, Costello, Roger L. wrote:

> I did some further testing after fixing the errors.

>

> Now the DFDL schema processes this input perfectly:

>

>                  12

>

> And it processes this input perfectly:

>

>                  -12

>

> But it gives an error with this input:

>

>                  +12

>

> Here's the error message:

>

> *[error] Parse Error: Convert to Unlimited Size Integer (for

> xs:integer): Unable to parse '+12' (using up all characters).*

>

> Why do I get that error? Is it illegal to use the plus sign with numbers?

>

> /Roger

>

> -----Original Message-----

> From: Steve Lawrence <slawre...@apache.org<mailto:slawre...@apache.org>>

> Sent: Tuesday, August 13, 2019 12:06 PM

> To: users@daffodil.apache.org<mailto:users@daffodil.apache.org>

> Subject: [EXT] Re: How to model a fixed-length integer that may be

> padded with space on the left?

>

> Yep, sounds correct to me.

>

> - Steve

>

> On 8/13/19 12:03 PM, Costello, Roger L. wrote:

>

>  > Steve, one more thing, please. I'd like for you to confirm my "lesson 
> learned."

>

>  >

>

>  > Lesson Learned:

>

>  >

>

>  > If the value of textNumberPattern contains a decimal point, comma,

> or exponent character, then you must define textStandardDecimalSeparator.

>

>  >

>

>  > Correct?

>

>  >

>

>  > /Roger

>

>  >

>

>  > -----Original Message-----

>

>  > From: Costello, Roger L. <coste...@mitre.org

> <mailto:coste...@mitre.org>>

>

>  > Sent: Tuesday, August 13, 2019 11:46 AM

>

>  > To: users@daffodil.apache.org<mailto:users@daffodil.apache.org> 
> <mailto:users@daffodil.apache.org>

>

>  > Subject: Re: How to model a fixed-length integer that may be padded

> with space on the left?

>

>  >

>

>  >> For the first error, I'd guess that your dfdl:textNumberPattern

> has a

>

>  >> decimal character in it.

>

>  >

>

>  > Ah! Yes it does:

>

>  >

>

>  > textNumberPattern="#,##0.###;-#,##0.###"

>

>  >

>

>  >> This makes me think the parsed string is actually "12<CRLF>"

>

>  >

>

>  > Ah! Once again, you are spot on. I had a cr and when I removed it,

> the error went away.

>

>  >

>

>  > Amazing piece of detective work Steve! Thank you!

>

>  >

>

>  > /Roger

>

>  >

>

>  > -----Original Message-----

>

>  > From: Steve Lawrence <slawre...@apache.org

> <mailto:slawre...@apache.org>>

>

>  > Sent: Tuesday, August 13, 2019 11:29 AM

>

>  > To: users@daffodil.apache.org<mailto:users@daffodil.apache.org> 
> <mailto:users@daffodil.apache.org>

>

>  > Subject: [EXT] Re: How to model a fixed-length integer that may be

> padded with space on the left?

>

>  >

>

>  > For the first error, I'd guess that your dfdl:textNumberPattern has

> a decimal character in it. Or, due to a bug in our number parsing

> library (ICU), we also require the property if there's a comma or exponent 
> character in the pattern.

> If the pattern has any of these characters, the property must be

> provided so that ICU knows how to consume decimals/groups if it comes

> across one when parsing the number, even though they might not be used for

> this particular case since it's an integer.

>

>  >

>

>  > For the second error, that means that ICU was unable to convert the

> string to a number based on the dfdl:textNumberPattern and other textNumber 
> properties.

> Common causes are that the textNumberPattern doesn't allow the

> string, or that one of the other text number properties isn't set right.

> However, in this specific case, the error message makes it look like

> there's a newline after the 12. This makes me think the parsed string

> is actually "12<CRLF>" or something similar, which will fail to parse

> if dfdl:textNumberCheckPolicy="strict". So you need to either 1)

> configure your schema to consume this trailing NL via a

> terminator/separator/padding/etc. or 2) set

> dfdl:textNumberCheckPolicy="lax", which tells Daffodil to strip off 
> leading/trailing whitespace/newlines (among other things).

>

>  >

>

>  > On 8/13/19 10:10 AM, Costello, Roger L. wrote:

>

>  >> Thank you Steve. Truly outstanding response.

>

>  >>

>

>  >> I have a few follow-up questions, please.

>

>  >>

>

>  >> I have this ultra-simple DFDL schema:

>

>  >>

>

>  >> <xs:element name="input">

>

>  >>     <xs:complexType>

>

>  >>         <xs:sequence>

>

>  >>             <xs:element name="NumberOfStudents" type="xs:integer" />

>

>  >>         </xs:sequence>

>

>  >>     </xs:complexType>

>

>  >> </xs:element>

>

>  >>

>

>  >> My input file (input.txt) contains this:

>

>  >>

>

>  >> 12

>

>  >>

>

>  >> When I run Daffodil, I get these 2 errors:

>

>  >>

>

>  >> [error] Schema Definition Error: Property

> textStandardDecimalSeparator is not defined.

>

>  >>

>

>  >> [error] Parse Error: Convert to Unlimited Size Integer (for

>

>  >> xs:integer): Unable to parse '12 ' (using up all characters).

>

>  >>

>

>  >> Why do I need to define the decimal point symbol? After all, the

> datatype is xs:integer, not xs:decimal.

>

>  >>

>

>  >> For the second error message, I have no clue what it's saying.

> What is it saying, please?

>

>  >>

>

>  >> /Roger

>

>  >>

>

>  >> -----Original Message-----

>

>  >> From: Steve Lawrence <slawre...@apache.org

> <mailto:slawre...@apache.org>>

>

>  >> Sent: Monday, August 12, 2019 12:56 PM

>

>  >> To: users@daffodil.apache.org<mailto:users@daffodil.apache.org> 
> <mailto:users@daffodil.apache.org>

>

>  >> Subject: [EXT] Re: How to model a fixed-length integer that may be

> padded with space on the left?

>

>  >>

>

>  >> The two properties aren't used unless dfdl:textTrimKind is set to 
> "padChar".

> Setting that should give you the behavior you expect. You'll also

> probably want to set textPadKind="padChar", which will add pad

> characters if needed during unparse.

>

>  >>

>

>  >> Agreed that it would be nice if our errors about unused properties

> could explain why, but that's a more difficult problem to solve than

> just saying when properties aren't used. Especially since sometimes it

> requires the combination of various properties for a property to be set.

>

>  >>

>

>  >> - Steve

>

>  >>

>

>  >> On 8/12/19 12:47 PM, Costello, Roger L. wrote:

>

>  >>> Hello DFDL community,

>

>  >>>

>

>  >>> My input contains a Length field that must be of length 4. Here is

> a sample

> input:

>

>  >>>

>

>  >>> .../ 101/...

>

>  >>>

>

>  >>> There is a space prior to 101, although it might be hard to see it.

>

>  >>> So that field is of length 4.

>

>  >>>

>

>  >>> The Length field could be nil; a dash is the nil value.

>

>  >>>

>

>  >>> I figured this is the way to declare the Length element:

>

>  >>>

>

>  >>> <xs:element name="Length"

>

>  >>>      nillable="true"

>

>  >>>      type="xs:int"

>

>  >>>      dfdl:lengthKind="explicit" dfdl:length="4"

>

>  >>>      dfdl:lengthUnits="characters"

>

>  >>>      dfdl:nilValue="%WSP*;-%WSP*;"

>

>  >>>      dfdl:textNumberPadCharacter="%SP;"

>

>  >>>      dfdl:textNumberJustification="right"/>

>

>  >>>

>

>  >>> But Daffodil gives these warning messages:

>

>  >>>

>

>  >>> *Warning: DFDL property was ignored:

>

>  >>> textNumberJustification="right"*

>

>  >>>

>

>  >>> *Warning: DFDL property was ignored:

> textNumberPadCharacter="%SP;"*

>

>  >>>

>

>  >>> How come I get those warnings?

>

>  >>>

>

>  >>> Anyway, I removed those two properties and then Daffodil simply

>

>  >>> refused to parse the Length field. How come? What is the right way to do 
> this?

>

>  >>>

>

>  >>> /Roger

>

>  >>>

>

>  >>> P.S. It would be nice if Daffodil, when outputting a warning

>

>  >>> message, gave a brief explanation of why. For example, why is

> textNumberJustification="right"

>

>  >>> ignored?

>

>  >>>

>

>  >>

>

>  >

>

