[ 
https://issues.apache.org/jira/browse/XERCESJ-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468685#comment-17468685
 ] 

J Morris commented on XERCESJ-1726:
-----------------------------------


{noformat}
Many thanks for your most recent messages.
I shall answer in several parts:
1) What my "test5" files were intended to illustrate
I (perhaps naively) expected the validation process to maintain the same item 
structure as in the original XML instance document. In particular, I expected 
elements with text parts only occupying a single XML source line to be 
represented in the validation as a single item.
If this were the case, then assert tests like
if (./text()[matches(., ...
with regex patterns referring to the whole of the text part of the XML source 
element line would succeed.

After discovering that this was not the case (i.e. did not succeed) for certain 
lines in the test5.xml instance document (viz., Lines 40, 158 and 195), in 
spite of the fact that these lines did satisfy the precondition that all the 
text part was on one single line, I began to investigate whether that text part 
had been divided into a sequence of multiple items in the validation. This was 
the purpose of the following line that you were asking about in the schema 
document:
<xs:assert test="if (./text()[last()>1]) then false() else true()"/>
./text() was supposed to select the text part of the current context
[last()>1] was supposed to return true() for lines with text parts which had 
been transformed into a sequence of more than one item.
then false() else true() was supposed to cause the assert to fail on such lines.
(The reason for the perverse ordering of the false() and true() settings was 
that I had syntactic problems in using the "<" operator in place of ">".)
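A likely explanation for the "<" problem, offered as an assumption rather than a diagnosis: the test expression sits inside an XML attribute, where a raw "<" is not well-formed and has to be written as "&lt;". The sketch below uses only the standard JAXP parser to show the difference; the element name "a" and the class name are hypothetical and not taken from test.xsd. XPath 2.0 also has the value comparisons "lt" and "le", which avoid the character altogether (e.g. count(./text()) le 1).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class LessThanInAttribute {

    /** Returns the "test" attribute of the root element, or null if the XML is not well-formed. */
    static String readTestAttribute(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("test");
        } catch (SAXParseException e) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // A raw "<" inside an attribute value is rejected by the XML parser.
        System.out.println(readTestAttribute("<a test=\"count(text()) < 2\"/>"));
        // Written as "&lt;", the attribute value seen after parsing contains a plain "<".
        System.out.println(readTestAttribute("<a test=\"count(text()) &lt; 2\"/>"));
    }
}
```

The first call prints null (the document is rejected); the second prints the unescaped expression.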
What you will notice is that this "sequence of multiple items" assert fails on 
exactly the same lines as the "matches" asserts (and on NO other lines).
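To make concrete what "a text part represented as more than one item" means, here is a minimal DOM sketch. It is only an illustration of the idea behind the last()>1 check: it says nothing about why, or whether, the validator itself splits the text. The element name "file", the text values and the class name are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class TextNodeCount {

    /** Counts the text-node children of an element. */
    static int countTextNodes(Element element) {
        int count = 0;
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                count++;
            }
        }
        return count;
    }

    /** Builds an element whose text is deliberately stored as two adjacent text nodes. */
    static Element buildSplitElement() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element file = doc.createElement("file");
        doc.appendChild(file);
        file.appendChild(doc.createTextNode("first-part"));
        file.appendChild(doc.createTextNode("second-part"));
        return file;
    }

    public static void main(String[] args) throws Exception {
        Element file = buildSplitElement();
        System.out.println(countTextNodes(file));   // two separate text nodes
        file.normalize();                           // merges adjacent text nodes
        System.out.println(countTextNodes(file));   // one text node
        System.out.println(file.getTextContent());  // the full text, whichever way it is stored
    }
}
```

A pattern anchored with ^ and $ and applied to each text node separately would fail on the split form even though the concatenated text is unchanged, which is the behaviour I believe I am seeing.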
The commented lines
    <!-- <xs:assert test="if (./text()[matches(.,'^_finder\.ini$')]) then true() else false()"/>
    <xs:assert test="if (./text()[matches(.,'^[a-z]{2}[a-z]?-[A-Z]{2}\.com$')]) then true() else false()"/> -->
were supposed to show the extent of each of the (two) items in the resulting 
sequence for XML line 40 in the validation (and, thereby, the location of the 
item break).
2) Your results
To be honest, I think that it is completely pointless for you to keep using 
such short XML files for your test cases. Even with my XML file, the validation 
appears to work perfectly for the first 39 lines of XML. Then something 
unexpected happens at line 40. This splitting of the text part into a sequence 
with multiple items causes the highlighted asserts to fail. In my opinion, what 
is required for a test case is an XML file with a greater number of element 
lines, each with a longer text part (even if the file consists of the same line 
repeated multiple (e.g. a few hundred) times).
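A test file of that kind could be generated rather than written by hand. The following is a rough sketch only: the element names "files" and "file", the text value, the output file name and the class name are all hypothetical and would have to be adapted to whatever schema the test uses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LongTestCaseGenerator {

    /**
     * Builds a document with lineCount identical element lines.
     * The text must not contain XML special characters such as '<' or '&'.
     */
    static String buildDocument(int lineCount, String text) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<files>\n");
        for (int i = 0; i < lineCount; i++) {
            sb.append("  <file>").append(text).append("</file>\n");
        }
        sb.append("</files>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "a-fairly-long-text-part-that-stays-on-one-source-line.ini";
        Path out = Paths.get("generated-test.xml");
        // Files.write with explicit UTF-8 bytes keeps this compatible with Java 8.
        Files.write(out, buildDocument(300, text).getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```

Varying the line count and the length of the text part should show whether the failing line numbers move with the amount of preceding text.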
3) My questions to you
I would like to ask you why this splitting of the text part is happening on 
just a few of the element lines in my XML instance file?
Is it a surprise to you that it is happening at all?
4) $value
I seem to have misinterpreted your earlier e-mail, when the suggestion of using 
$value was first proposed. I thought that the strategies that you listed in 
that e-mail were all equally acceptable alternatives. Actually, I have not 
succeeded in finding any documentation on the use of $value (or any other 
XSD/XQUERY/XPATH parameter introduced by a "$", apart from variable names). 
Could you explain what its function is? Does it, for example, concatenate 
multiple items into a single string?
Many thanks.


{noformat}


> Possible Bug: Xerces 2.12.1 for XML Validation with XSD 1.1 Schema under Java
> -----------------------------------------------------------------------------
>
>                 Key: XERCESJ-1726
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1726
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: Samples
>    Affects Versions: 2.12.1
>         Environment: Windows 7
> Java 1.8.0_261
> Xerces-J 2.12.1
>            Reporter: J Morris
>            Assignee: Mukul Gandhi
>            Priority: Major
>              Labels: test
>         Attachments: TestSecondError.zip, TestSimplified.zip, testX.zip, 
> test_cases_ mukul.zip
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> I have recently been trying to validate the XML file *test1.xml* with a 
> schema *test.xsd* containing *assert*/*assertion* constructs, using the 
> sample program *jaxp.SourceValidator*.
> Unexpectedly, the result was several reported errors in what appeared to be 
> syntactically correct and valid XML lines (*test1.xml*: 9 errors).
> After significant experimentation, it appeared that these errors were 
> occurring at line numbers which the validation found troublesome. Inserting 
> an extra line at one of the troublesome line numbers made the previously 
> erroneous line (now *not* appearing at a troublesome line number) pass 
> validation. On the other hand, the newly inserted line (occupying the 
> troublesome line number) would fail validation.
> I tentatively interpreted this as meaning that *the validation errors were 
> not real* and began to try to develop a test-case, as similar as possible to 
> *test1.xml*, but which passed validation. The result was *test2.xml*, which 
> was generated from *test1.xml* by inserting XML comment lines at each of the 
> troublesome line numbers, thereby displacing the previously erroneous lines 
> to non-troublesome line numbers. Since XML comment lines do not require 
> validation, this file passed validation for me (*test2.xml*: 0 errors).
> I then contacted Mukul Gandhi and he re-ran my validations *but came to a 
> different result*. He saw errors in both XML files (*test1.xml*: 9 errors; 
> *test2.xml*: 18 errors). Despite our joint efforts to achieve convergence 
> between our respective validation runs, we have not so far succeeded.
> Mukul did point out a couple of things:
> 1) The way that I was using the "matches" function in the *assert* 
> constructs. His experience suggested that this was unreliable. However, I was 
> not certain whether this would have led to the type of behaviour that I was 
> seeing (apparent troublesome line numbers).
> 2) He found that certain characters (probably the two accented French 
> characters) in my XML files were not supported in the default XML encoding 
> scheme, UTF-8. However, for me, no errors were reported for those by the 
> validation program *jaxp.SourceValidator*.
> I would be very grateful for some help in getting to the bottom of this 
> (both the original behaviour and the discrepancies with Mukul's validation 
> runs).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: j-dev-h...@xerces.apache.org
