I'll add that we do have an open issue to change the CLI so that it uses different exit codes depending on the failure type:
https://issues.apache.org/jira/browse/DAFFODIL-2335

Adding this feature would make it easier for CLI users to determine what
happened during a parse (e.g. parse error vs validation error).

On 4/19/21 10:44 AM, Beckerle, Mike wrote:
> I don't know the CLI that well, but I believe the parse produced rc 1 because
> there was a validation error. Those do not indicate a failure to parse.
>
> Data can be "well formed" but have invalid values in it. This is actually
> important because validation wants to be pretty strict about what's allowed,
> but if we didn't allow invalid data to successfully parse, then you couldn't
> build applications that displayed invalid data to exhibit what was invalid
> about them.
>
> --------------------------------------------------------------------------------
> *From:* Attila Horvath <attila.j.horv...@gmail.com>
> *Sent:* Monday, April 19, 2021 9:46 AM
> *To:* users@daffodil.apache.org <users@daffodil.apache.org>; Beckerle, Mike
> <mbecke...@owlcyberdefense.com>
> *Subject:* apparent anomalous behavior
>
> When testing DFDL/Daffodil I examine the exit codes following each step of
> following sequence...
> image.png
>
> 1. daffodil parse
> 2. daffodil unparse
> 3. diff between parse input ASCII file and reconstituted unparse output
>    ASCII file
> 4. xmllint of intermediate ~.xml file against DFDL schema
>
> If parsing throws an error, how is it possible step three (3) above indicates
> there is "no" difference between parse input ASCII file and reconstituted
> unparse output ASCII file?
>
> The log clearly shows value 'abcdef' violates the regex string
> '[A-Z]{0,11}'...
>
> + daffodil parse --validate=on -s cefms-gl.dfdl.xsd -r cefms-gl.dfdl.xsd
> '-DSeparator=|' header=absent -o
> out/_-_home_-_attila_-_CDES_-_trunk_-_aud-it_-_data_-_JITC_-_AUD-IT_L2H_ASCII_-_CEFMS-GL_-_Invalid_Format_Content_-_cefms-gl-fy2017-09-00001.xml
> /home/attila/CDES/trunk/aud-it/data/JITC/AUD-IT_L2H_ASCII/CEFMS-GL/Invalid_Format_Content/cefms-gl-fy2017-09-00001.txt
> [error] Validation Error: cvc-type.3.1.3: The value 'abcdef' of element
> 'N_1_accrual_ind' is not valid.
> Schema context: element reference {}cefms-gl.dfdl.xsd Location line 2 in
> file:/home/attila/CDES/trunk/aud-it/ascii/structured/l2h/hsg-5.x/xsd/cefms-gl.dfdl.xsd
> Data location was preceding byte 20043
> [error] Validation Error: cvc-pattern-valid: *Value '_abcdef_' is not
> facet-valid with respect to pattern '[A-Z]{0,11}' *for type
> '#AnonType_N_1_accrual_indRecdType-aud-it'.
> Schema context: element reference {}cefms-gl.dfdl.xsd Location line 2 in
> file:/home/attila/CDES/trunk/aud-it/ascii/structured/l2h/hsg-5.x/xsd/cefms-gl.dfdl.xsd
> Data location was preceding byte 20043
> [error] Validation Error: N_1_accrual_ind failed facet checks due to: facet
> pattern(s): [A-Z]{0,11}
> Schema context: N_1_accrual_ind Location line 97 column 14 in
> file:/home/attila/CDES/trunk/aud-it/ascii/structured/l2h/hsg-5.x/xsd/cefms-gl.dfdl.xsd
> Data location was preceding byte 6
> + parse_exit_code=1
> + echo daffodil parse exit code: 1
> daffodil parse exit code: 1
>
> So I would expect the erroneous record to |NOT| appear in intermediate ~.xml
> file which would exclude the record from the unparsed/reconstituted out ASCII
> file, yet the bad record does appear in the intermediate ~.xml...
> image.png
>
> Thx in advance
>
> Attila
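
Until the CLI reports distinct exit codes, a test script could tell the two
cases apart by looking at the log text as well as the exit code. The following
is only a rough sketch, not part of Daffodil: classify_parse is a hypothetical
helper, and it assumes that every error line of a validation-only failure
starts with "[error] Validation Error:" (as in the log quoted above) and that
the infoset file was still written.

```python
def classify_parse(exit_code: int, log_text: str, infoset_written: bool) -> str:
    """Heuristically classify one `daffodil parse` run.

    Returns "ok", "invalid" (parse succeeded but validation errors were
    reported), or "failed". Hypothetical helper, not a Daffodil API.
    """
    if exit_code == 0:
        return "ok"

    # Collect the lines the CLI marked as errors.
    error_lines = [
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith("[error]")
    ]

    # Assumption: a well-formed but invalid input yields only validation errors
    # and still produces an infoset.
    only_validation = bool(error_lines) and all(
        line.startswith("[error] Validation Error:") for line in error_lines
    )
    if infoset_written and only_validation:
        return "invalid"
    return "failed"
```

With the run quoted above (exit code 1, three validation errors, infoset
present) this would report "invalid", which matches the observation that the
unparsed output is identical to the original input.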