Hi Mukul,
I'm not sure whether you are automatically notified when I post a comment on 
this thread. Just in case you are not, I am forwarding the e-mail that I 
received.
Best regards,
John.
> ---------- Original Message ----------
> From: "J Morris (Jira)" <xerces-j-...@xml.apache.org>
> To: j-dev@xerces.apache.org
> Date: 25 December 2021 at 01:03
> Subject: [jira] [Commented] (XERCESJ-1726) Possible Bug: Xerces 2.12.1 for 
> XML Validation with XSD 1.1 Schema under Java
> 
> 
>     [ 
> https://issues.apache.org/jira/browse/XERCESJ-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465116#comment-17465116
>  ] 
> 
> J Morris commented on XERCESJ-1726:
> -----------------------------------
> 
> {{Hi,
> 
> This is all a long time ago now. That is not to say that I have lost interest 
> though.
> 
> My first observation is that the assert that you quoted in your comment does 
> not appear to correspond to the one in the *test1.xsd* file in the 
> *testX.zip* attached to the bug report.
> 
> The one that you quoted was:
> 
> <xs:assert test="./text()[matches(.,'^([a-z]
> {2}[a-z]?-[A-Z]{2}
> 
> \.((((com)|(lib)|(mod([a-z][a-z0-9])?)|(plg[a-z][a-z0-9](-[a-z0-9][a-z0-9])*)|(tpl))_[a-z][a-z0-9](\.sys)?(\.ini))|(ini)|(css)|(localise\.php)))|([a-z]
> {2}[a-z]?-[A-Z]{2}
> 
> \.xml)|(install\.xml)$')]"/>
> 
> whereas my one in the *test1.xsd* file was:
> 
> <xs:assert 
> test="./text()[matches(.,'^([a-z]{2}[a-z]?-[A-Z]{2}\.((((com)|(lib)|(mod(_[a-z][a-z0-9]+)?)|(plg_[a-z][a-z0-9]+(\-[a-z0-9][a-z0-9]+)*)|(tpl))_[a-z][a-z0-9]+(\.sys)?(\.ini))|(ini)|(css)|(localise\.php)))|([a-z]{2}[a-z]?-[A-Z]{2}\.xml)|(install\.xml)$')]"/>
> 
> You will notice differences in that some of the key repeat counts in the 
> regex are missing in your version, as well as some of the underscores ("_").
> 
> The intention of my version was to match lines with *text()* in any of the 
> following formats:
> 
> 1)  ^[a-z]{2}[a-z]?-[A-Z]{2}\.com_[a-z][a-z0-9]+(\.sys)?\.ini$
> 2)  ^[a-z]{2}[a-z]?-[A-Z]{2}\.lib_[a-z][a-z0-9]+(\.sys)?\.ini$
> 3)  
> ^[a-z]{2}[a-z]?-[A-Z]{2}\.mod(_[a-z][a-z0-9]+)?_[a-z][a-z0-9]+(\.sys)?\.ini$
> 4)  
> ^[a-z]{2}[a-z]?-[A-Z]{2}\.plg_[a-z][a-z0-9]+(\-[a-z0-9][a-z0-9]+)*_[a-z][a-z0-9]+(\.sys)?\.ini$
> 5)  ^[a-z]{2}[a-z]?-[A-Z]{2}\.tpl_[a-z][a-z0-9]+(\.sys)?\.ini$
> 6)  ^[a-z]{2}[a-z]?-[A-Z]{2}\.css$
> 7)  ^[a-z]{2}[a-z]?-[A-Z]{2}\.ini$
> 8)  ^[a-z]{2}[a-z]?-[A-Z]{2}\.localise\.php$
> 9)  ^[a-z]{2}[a-z]?-[A-Z]{2}\.xml$
> 10) ^install\.xml$
> 
> Note: The pattern *[a-z]{2}[a-z]?-[A-Z]{2}* is a language prefix (two or 
> three lowercase alphabetic characters, followed by a hyphen, followed by 
> exactly two uppercase alphabetic characters) in the same spirit as locale 
> settings in operating systems.
> 
> This set of 10 patterns was supposed to match the *text()* entries for ALL of 
> the lines in the *test1.xml* file, so it was a surprise to me to get any 
> errors reported.
> 
> Thank you for your continuing interest.}}
> 
> > Possible Bug: Xerces 2.12.1 for XML Validation with XSD 1.1 Schema under 
> > Java
> > -----------------------------------------------------------------------------
> >
> >                 Key: XERCESJ-1726
> >                 URL: https://issues.apache.org/jira/browse/XERCESJ-1726
> >             Project: Xerces2-J
> >          Issue Type: Bug
> >          Components: Samples
> >    Affects Versions: 2.12.1
> >         Environment: Windows 7
> > Java 1.8.0_261
> > Xerces-J 2.12.1
> >            Reporter: J Morris
> >            Priority: Major
> >              Labels: test
> >         Attachments: testX.zip, test_cases_ mukul.zip
> >
> >   Original Estimate: 72h
> >  Remaining Estimate: 72h
> >
> > I have recently been trying to validate the XML file *test1.xml* with a 
> > schema *test.xsd* containing *assert*/*assertion* constructs, using the 
> > sample program *jaxp.SourceValidator*.
> > Unexpectedly, the result was several reported errors in what appeared to be 
> > syntactically correct and valid XML lines (*test1.xml*: 9 errors).
> > After significant experimentation, it appeared that these errors were 
> > occurring at line numbers which the validation found troublesome. Inserting 
> > an extra line at one of the troublesome line numbers made the previously 
> > erroneous line (now *not* appearing at a troublesome line number) pass 
> > validation. On the other hand, the newly inserted line (occupying the 
> > troublesome line number) would fail validation.
> > I tentatively interpreted this as meaning that *the validation errors were 
> > not real* and began to try to develop a test-case, as similar as possible 
> > to *test1.xml*, but which passed validation. The result was *test2.xml*, 
> > which was generated from *test1.xml* by inserting XML comment lines at each 
> > of the troublesome line numbers, thereby displacing the previously 
> > erroneous lines to non-troublesome line numbers. Since XML comment lines 
> > do not require validation, this file passed validation for me (*test2.xml*: 
> > 0 errors).
> > I then contacted Mukul Gandhi and he re-ran my validations *but came to a 
> > different result*. He saw errors in both XML files (*test1.xml*: 9 errors; 
> > *test2.xml*: 18 errors). Despite our joint efforts to achieve convergence 
> > between our respective validation runs, we have not so far succeeded.
> > Mukul did point out a couple of things:
> > 1) The way that I was using the "matches" function in the *assert* 
> > constructs. His experience suggested that this was unreliable. However, I 
> > was not certain whether this would have led to the type of behaviour that I 
> > was seeing (apparent troublesome line numbers).
> > 2) He found that certain characters (probably the two accented French 
> > characters) in my XML files were not supported in the default XML encoding 
> > scheme, UTF-8. However, for me, no errors were reported for those by the 
> > validation program *jaxp.SourceValidator*.
> > I would be very grateful for some help in getting to the bottom of this 
> > (both the original behaviour and the discrepancies with Mukul's validation 
> > runs).
> 
> 
> 
> --
> This message was sent by Atlassian Jira
> (v8.20.1#820001)
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-dev-unsubscr...@xerces.apache.org
> For additional commands, e-mail: j-dev-h...@xerces.apache.org
>
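
For what it is worth, the ten formats listed in the quoted comment can be checked independently of Xerces. The following is only a minimal sketch: it uses java.util.regex, which is not the XPath regex flavour evaluated inside xs:assert, so it shows what the patterns are intended to accept rather than how the validator behaves. The class name, the method name and the sample file names are made up for illustration and are not taken from test1.xml.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: checks a string against the ten intended formats.
class FileNamePatternCheck {
    // Language prefix: two or three lowercase letters, a hyphen, two uppercase letters.
    private static final String LANG = "[a-z]{2}[a-z]?-[A-Z]{2}";
    // An underscore followed by a name of at least two characters.
    private static final String NAME = "_[a-z][a-z0-9]+";
    // Optional ".sys" followed by ".ini".
    private static final String INI = "(\\.sys)?\\.ini";

    // One entry per numbered format in the comment, in the same order.
    private static final List<Pattern> FORMATS = Arrays.asList(
        Pattern.compile(LANG + "\\.com" + NAME + INI),
        Pattern.compile(LANG + "\\.lib" + NAME + INI),
        Pattern.compile(LANG + "\\.mod(" + NAME + ")?" + NAME + INI),
        Pattern.compile(LANG + "\\.plg" + NAME + "(\\-[a-z0-9][a-z0-9]+)*" + NAME + INI),
        Pattern.compile(LANG + "\\.tpl" + NAME + INI),
        Pattern.compile(LANG + "\\.css"),
        Pattern.compile(LANG + "\\.ini"),
        Pattern.compile(LANG + "\\.localise\\.php"),
        Pattern.compile(LANG + "\\.xml"),
        Pattern.compile("install\\.xml")
    );

    // Matcher.matches() requires the whole string to match, which plays the
    // role of the explicit ^ and $ anchors in each numbered format.
    static boolean matchesAny(String text) {
        for (Pattern p : FORMATS) {
            if (p.matcher(text).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up sample names, not the contents of test1.xml.
        String[] samples = {
            "en-GB.com_content.sys.ini", "fr-FR.mod_menu.ini",
            "install.xml", "en-gb.ini"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + matchesAny(s));
        }
    }
}
```

Keeping the ten formats as separate, fully anchored patterns avoids one ambiguity of the single combined expression, where the leading ^ and the trailing $ each belong to one alternative only.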

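On the encoding point raised in the issue description, one possible reading (an assumption on my part, not something confirmed in the thread) is that the XML files were saved in a single-byte encoding, so that the accented characters are not valid UTF-8 byte sequences. A rough way to test that outside the validator is to decode the raw bytes strictly as UTF-8. The class and method names below are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: reports whether a byte sequence is well-formed UTF-8.
class Utf8Check {
    static boolean isValidUtf8(byte[] bytes) {
        try {
            // REPORT makes the decoder throw instead of silently
            // substituting a replacement character for bad input.
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 1) {
            // Check a file given on the command line, e.g. test1.xml.
            byte[] data = Files.readAllBytes(Paths.get(args[0]));
            System.out.println(args[0] + " valid UTF-8: " + isValidUtf8(data));
        } else {
            // In-memory demonstration with an accented character (e acute).
            byte[] latin1 = "\u00e9".getBytes(StandardCharsets.ISO_8859_1);
            byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);
            System.out.println("ISO-8859-1 bytes valid UTF-8: " + isValidUtf8(latin1));
            System.out.println("UTF-8 bytes valid UTF-8: " + isValidUtf8(utf8));
        }
    }
}
```

If the check fails for a file that has no encoding declaration, re-saving it as UTF-8 or declaring the actual encoding in the XML declaration would be the things to try, and it might also explain part of the difference between our two sets of results.
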