I don't know the CLI that well, but I believe the parse returned exit code 1 
because there was a validation error. A validation error does not indicate a 
failure to parse.

Data can be "well formed" but still contain invalid values. This distinction 
is important: validation needs to be strict about what's allowed, but if 
invalid data could not parse successfully, you couldn't build applications 
that display invalid data in order to show what is invalid about it. That is 
also why the record containing 'abcdef' still appears in the intermediate XML 
and unparses back to the original data.
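
As a rough illustration, a wrapper script could separate "well formed but 
invalid" from other failures by looking at the log text as well as the exit 
code. This is only a sketch: the function name is hypothetical, it is not part 
of Daffodil, and it assumes validation problems are logged on lines starting 
with "[error] Validation Error:" as in the log quoted below. It makes no claim 
about how other kinds of errors are reported.

```python
def classify_daffodil_result(exit_code: int, log_text: str) -> str:
    """Hypothetical helper: classify a `daffodil parse` run.

    Assumption: validation problems appear as log lines beginning with
    "[error] Validation Error:", as in the log quoted in this thread.
    """
    # Collect every line the CLI flagged as an error.
    error_lines = [ln for ln in log_text.splitlines() if ln.startswith("[error]")]
    # Of those, pick out the ones that are validation errors.
    validation_lines = [
        ln for ln in error_lines if ln.startswith("[error] Validation Error:")
    ]

    if exit_code == 0 and not error_lines:
        return "valid"
    if error_lines and len(validation_lines) == len(error_lines):
        # Only validation errors were logged: the data parsed,
        # but some values violate the schema's facets.
        return "well-formed-but-invalid"
    # Anything else (non-validation errors, or a nonzero exit with no
    # recognizable error lines) is left for a human to inspect.
    return "other-failure"
```

With the log from the original message, this would report 
"well-formed-but-invalid" rather than treating exit code 1 as a failed parse.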




________________________________
From: Attila Horvath <attila.j.horv...@gmail.com>
Sent: Monday, April 19, 2021 9:46 AM
To: users@daffodil.apache.org <users@daffodil.apache.org>; Beckerle, Mike 
<mbecke...@owlcyberdefense.com>
Subject: apparent anomalous behavior

When testing DFDL/Daffodil I examine the exit codes following each step of 
the following sequence...

  1.  daffodil parse
  2.  daffodil unparse
  3.  diff between parse input ASCII file and reconstituted unparse output 
ASCII file
  4.  xmllint of intermediate ~.xml file against DFDL schema

If parsing throws an error, how is it possible step three (3) above indicates 
there is "no" difference between parse input ASCII file and reconstituted 
unparse output ASCII file?

The log clearly shows value 'abcdef' violates the regex string '[A-Z]{0,11}'...
+ daffodil parse --validate=on -s cefms-gl.dfdl.xsd -r cefms-gl.dfdl.xsd 
'-DSeparator=|' header=absent -o 
out/_-_home_-_attila_-_CDES_-_trunk_-_aud-it_-_data_-_JITC_-_AUD-IT_L2H_ASCII_-_CEFMS-GL_-_Invalid_Format_Content_-_cefms-gl-fy2017-09-00001.xml
 
/home/attila/CDES/trunk/aud-it/data/JITC/AUD-IT_L2H_ASCII/CEFMS-GL/Invalid_Format_Content/cefms-gl-fy2017-09-00001.txt
[error] Validation Error: cvc-type.3.1.3: The value 'abcdef' of element 
'N_1_accrual_ind' is not valid.
Schema context: element reference {}cefms-gl.dfdl.xsd Location line 2 in 
file:/home/attila/CDES/trunk/aud-it/ascii/structured/l2h/hsg-5.x/xsd/cefms-gl.dfdl.xsd
Data location was preceding byte 20043
[error] Validation Error: cvc-pattern-valid: Value 'abcdef' is not facet-valid 
with respect to pattern '[A-Z]{0,11}' for type 
'#AnonType_N_1_accrual_indRecdType-aud-it'.
Schema context: element reference {}cefms-gl.dfdl.xsd Location line 2 in 
file:/home/attila/CDES/trunk/aud-it/ascii/structured/l2h/hsg-5.x/xsd/cefms-gl.dfdl.xsd
Data location was preceding byte 20043
[error] Validation Error: N_1_accrual_ind failed facet checks due to: facet 
pattern(s): [A-Z]{0,11}
Schema context: N_1_accrual_ind Location line 97 column 14 in 
file:/home/attila/CDES/trunk/aud-it/ascii/structured/l2h/hsg-5.x/xsd/cefms-gl.dfdl.xsd
Data location was preceding byte 6
+ parse_exit_code=1
+ echo daffodil parse exit code: 1
daffodil parse exit code: 1

So I would expect the erroneous record to |NOT| appear in intermediate ~.xml 
file which would exclude the record from the unparsed/reconstituted out ASCII 
file, yet the bad record does appear in the intermediate ~.xml...

Thx in advance

Attila

