[lxml] Re: Tested value modified in validation error message

Paul Higgs Sun, 02 May 2021 10:27:26 -0700

Thanks, Bob, that’s much clearer.

Rather than the current error message of
Element 'ProcessingStatusValue': [facet 'enumeration'] The value 'Processing 
complete' is not an element of the set {'Ready for English peer review', 'Ready 
for English scientific review', 'Ready for English OCCM review', 'Ready for 
Spanish peer review', 'Ready for Spanish OCCM review', 'Ready for publishing', 
'Ready for translation', 'Processing complete'}.

You would rather see
Element 'ProcessingStatusValue': [facet 'enumeration'] The value 'Processing 
&#xA;complete' is not an element of the set {'Ready for English peer review', 
'Ready for English scientific review', 'Ready for English OCCM review', 'Ready 
for Spanish peer review', 'Ready for Spanish OCCM review', 'Ready for 
publishing', 'Ready for translation', 'Processing complete'}.
for example.
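
As a rough illustration of that idea, here is a Python sketch (not the libxml2 code in xmlschemas.c). The helper names show_whitespace and enumeration_error are hypothetical, and writing a newline as &#xA; is only one possible choice of escape:

```python
from typing import Sequence


def show_whitespace(value: str) -> str:
    """Return value with tab, LF and CR written as XML character references."""
    # Hypothetical helper, not part of lxml or libxml2.
    visible = {"\t": "&#x9;", "\n": "&#xA;", "\r": "&#xD;"}
    return "".join(visible.get(ch, ch) for ch in value)


def enumeration_error(element: str, value: str, allowed: Sequence[str]) -> str:
    """Build a message shaped like the one quoted above, keeping whitespace visible."""
    members = ", ".join(f"'{show_whitespace(v)}'" for v in allowed)
    return (
        f"Element '{element}': [facet 'enumeration'] The value "
        f"'{show_whitespace(value)}' is not an element of the set {{{members}}}."
    )


# The tested value contains a newline; the enumeration member contains a space.
print(enumeration_error(
    "ProcessingStatusValue",
    "Processing\ncomplete",
    ["Ready for publishing", "Processing complete"],
))
```

With the whitespace made visible, the tested value and the enumeration member no longer look identical in the message.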

I know where this error message is being generated in libxml2 (in 
xmlschemas.c). If you could send a small XML instance and schema that fails, I 
can run xmllint under a debugger to understand the logic leading to this.

It is probably an error in libxml2 rather than in the lxml Python bindings.

Cheers
Paul

-----Original Message-----
From: Bob Kline <[email protected]> 
Sent: 02 May 2021 17:57
To: Paul Higgs <[email protected]>
Cc: lxml mailing list <[email protected]>
Subject: Re: [lxml] Tested value modified in validation error message

On Sun, May 2, 2021 at 10:37 AM Paul Higgs <[email protected]> wrote:

> If the input is "Processing\ncomplete" and the match string is "Processing 
> complete" then this will not be a match, but if your enumeration value is 
> "Pattern\scomplete" it should be OK (and it will also match "Pattern 
> complete").
> (I guess the "sometimes garbles whitespace inside text nodes" is just the 
> commercial XML editor wrapping lines that could exceed a 'pretty' length.)

Hi, Paul.

We're dealing with two bugs. The first bug is the behavior of the editor 
vendor's implementation of the DOM API, which is giving us back corrupted XML 
when we ask for the serialized document via the Document object's xml property. 
We know that it's the implementation of the serialization for that property 
which is introducing the corruption, rather than corruption of the values in 
the DOM itself, because when we recursively walk through all the nodes of the 
DOM to implement our own serialization of the document, the corruption is not 
present, and the whitespace inside all of the text nodes is intact. We have 
reported that problem to the editor vendor, and I have implemented a workaround 
to avoid this first bug.
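
A minimal sketch of that kind of recursive walk, for anyone who wants to try the same check. It uses Python's xml.dom.minidom as a stand-in, since the editor vendor's DOM API is not shown here. The serialize function and the Doc wrapper element are hypothetical, and this is not the actual workaround code:

```python
from xml.dom import Node, minidom
from xml.sax.saxutils import escape, quoteattr


def serialize(node) -> str:
    """Recursively serialize a DOM node, copying text node data unchanged."""
    if node.nodeType == Node.DOCUMENT_NODE:
        return "".join(serialize(child) for child in node.childNodes)
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        # Only markup characters are escaped; whitespace is left exactly as stored.
        return escape(node.data)
    if node.nodeType == Node.ELEMENT_NODE:
        attrs = "".join(
            f" {name}={quoteattr(value)}" for name, value in node.attributes.items()
        )
        children = "".join(serialize(child) for child in node.childNodes)
        return f"<{node.tagName}{attrs}>{children}</{node.tagName}>"
    return ""  # comments and processing instructions are skipped in this sketch


doc = minidom.parseString(
    "<Doc><ProcessingStatusValue>Processing\ncomplete</ProcessingStatusValue></Doc>"
)
text = serialize(doc)
print(repr(text))  # repr() makes the newline inside the text node visible
```

If the newline is still present in this output but not in the serialization returned by the DOM implementation, the corruption is in that implementation's serializer and not in the stored node values.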

The bug I'm asking about in this forum is the second bug, which is producing an 
incorrect error message, pretending that the value being submitted for testing 
was "Processing complete" (without a newline character) when the value being 
tested was actually "Processing\ncomplete" (with a newline character). The 
confusion this misleading error message introduced made it much more difficult 
than it should have been to track down and identify the first bug.

Does that make things clearer?

--
Bob Kline
https://www.rksystems.com
mailto:[email protected]