Hello

As we move towards more CI verification of the XML Schemas that we develop 
for MPEG DASH, we have started to leverage the flexibility of DTD ENTITY 
definitions when creating XML Schema pattern facets. Our GitHub setup uses lxml 
to verify pull requests, but it seems that the defined entities are not being used.

I have created the following example files to test this:

2-base.xsd
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE schema [
      <!ENTITY lowalpha "a-z">
      <!ENTITY hialpha "A-Z">
      <!ENTITY alpha "&lowalpha;&hialpha;">
      <!ENTITY digit "0-9">
      <!ENTITY uword 
"([&digit;]{1,4}|[1-5][&digit;]{4}|6[0-4][&digit;]{3}|65[0-4][&digit;]{2}|655[0-2][&digit;]|6553[0-5])">
      <!ENTITY Port ":&uword;">
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:paul:types2" 
targetNamespace="urn:paul:types2" elementFormDefault="qualified" 
attributeFormDefault="unqualified">
      <xs:simpleType name="PortType">
             <xs:restriction base="xs:string">
                   <xs:pattern value="&Port;"/>
             </xs:restriction>
      </xs:simpleType>
</xs:schema>


2-main.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:paul:main2" 
xmlns:t2="urn:paul:types2" targetNamespace="urn:paul:main2" 
elementFormDefault="qualified" attributeFormDefault="unqualified">
      <xs:import namespace="urn:paul:types2" schemaLocation="2-base.xsd"/>
      <xs:element name="Port" type="t2:PortType"/>
</xs:schema>

2.xml
<?xml version="1.0" encoding="UTF-8"?>
<Port xmlns="urn:paul:main2" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="urn:paul:main2 2-main.xsd">:10100</Port>


2.xml validates OK in XMLSpy. If the value is changed to, for example, 
:101001, a pattern match error is reported, so we know the pattern is being 
picked up through the xs:import.
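
To make the intended behaviour of the pattern concrete, here is a minimal sketch that expands the entities by hand and checks the two values mentioned above. It uses Python's re module rather than an XML Schema engine, on the assumption that this particular pattern means the same thing in both dialects. The names DIGIT, UWORD, PORT and port_matches are only illustrative. XSD patterns are implicitly anchored, hence re.fullmatch.

```python
import re

# Hand-expanded equivalents of the DTD entities in 2-base.xsd.
DIGIT = "0-9"
UWORD = (
    f"([{DIGIT}]{{1,4}}|[1-5][{DIGIT}]{{4}}|6[0-4][{DIGIT}]{{3}}"
    f"|65[0-4][{DIGIT}]{{2}}|655[0-2][{DIGIT}]|6553[0-5])"
)
PORT = f":{UWORD}"


def port_matches(value: str) -> bool:
    """Return True if value matches the whole expanded Port pattern."""
    return re.fullmatch(PORT, value) is not None


print(port_matches(":10100"))   # value used in 2.xml
print(port_matches(":101001"))  # six digits, outside the 0-65535 range
```

The first call should report a match and the second should not, which is the behaviour seen in XMLSpy.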

2.xml does not validate with lxml, using the following script:

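from lxml import etree

# Build the validator from the main schema (which imports 2-base.xsd).
with open('2-main.xsd', 'r') as schema_file:
    my_schema = etree.XMLSchema(etree.parse(schema_file))

# Parse the instance document and validate it; assertValid raises
# DocumentInvalid if the document does not conform to the schema.
filename = '2.xml'
with open(filename) as file:
    my_xml = etree.parse(file)
    my_schema.assertValid(my_xml)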


This gives the following error:
Traceback (most recent call last):
  File "G:\lxml-test\test1b.py", line 11, in <module>
    my_schema.assertValid(my_xml)
  File "src\lxml\etree.pyx", line 3623, in lxml.etree._Validator.assertValid
lxml.etree.DocumentInvalid: Element '{urn:paul:main1}Port': [facet 'pattern'] 
The value ':10100' is not accepted by the pattern '[]+'., line 2
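
For comparison with the pattern quoted in the error, the following sketch prints the pattern value after entity expansion by an independent parser. It uses the standard library's xml.etree.ElementTree (expat), not lxml, so it only shows what the expanded value is expected to be; it says nothing about how lxml or libxml2 load the imported schema. BASE_XSD is a copy of 2-base.xsd without the XML declaration, and that name and XS are only illustrative.

```python
import xml.etree.ElementTree as ET

# Clark-notation prefix for the XML Schema namespace.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Copy of 2-base.xsd (XML declaration omitted, whitespace tidied).
BASE_XSD = """<!DOCTYPE schema [
  <!ENTITY lowalpha "a-z">
  <!ENTITY hialpha "A-Z">
  <!ENTITY alpha "&lowalpha;&hialpha;">
  <!ENTITY digit "0-9">
  <!ENTITY uword "([&digit;]{1,4}|[1-5][&digit;]{4}|6[0-4][&digit;]{3}|65[0-4][&digit;]{2}|655[0-2][&digit;]|6553[0-5])">
  <!ENTITY Port ":&uword;">
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:paul:types2"
           targetNamespace="urn:paul:types2" elementFormDefault="qualified"
           attributeFormDefault="unqualified">
  <xs:simpleType name="PortType">
    <xs:restriction base="xs:string">
      <xs:pattern value="&Port;"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

# expat expands internal entities, including nested ones, in attribute values.
root = ET.fromstring(BASE_XSD)
pattern = root.find(f".//{XS}pattern").get("value")
print(pattern)
```

If the printed value is the fully expanded expression beginning with ":(", the entity definitions themselves are sound, and the difference lies in how the schema is being loaded during validation.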

Is this something that 'should' be supported by lxml? If so, am I missing 
something in the validation?

I would appreciate any hints or guidance.

Paul