Dear all,

Hope this email finds you well!!

I am trying to obtain multiple error messages when validating a test XML file
that I know in advance contains two or more different errors. However, when I
read lxml's error log as suggested in the docs, I get only one message (the
error that fails first). I'm a bit 'rusty' in Python (no pun intended), but the
few comments I have found around the Web (e.g. on Stack Overflow), even though
not very explicit, do suggest it is possible. Am I doing anything wrong, or did
I miss anything? (Chances are I did!)
 
Here is a code snippet (I am trying to achieve this in Databricks/Spark; the
lxml version is 5.2.2):

from lxml import etree

def validate_xml(xml_file, xsd_file):
    # Parse the XML document and the XSD schema document
    xml_doc = etree.parse(xml_file)
    xsd_doc = etree.parse(xsd_file)

    # Create an XMLSchema object from the parsed schema document
    schema = etree.XMLSchema(xsd_doc)
    err_log = ""
    # Validate the XML document against the schema
    #is_valid = schema.validate(xml_doc)
    try:
        schema.assertValid(xml_doc)
    except etree.DocumentInvalid as err:
        # Join every entry recorded in the schema's error log, one per line
        err_log = "\n".join(str(error) for error in schema.error_log)
        #err_log = str(err)
    return err_log or "Valid"

I am not sure error_log is meant to be iterated like this, but still, I was
hoping to find all the error messages by printing the entire log. What am I
missing? Isn't this possible? (Does the validator stop at the first error, like
a SAX-type parser?)
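
For reference, here is a minimal sketch of reading the log entry by entry
instead of printing it as a whole. It assumes the log entries expose the line,
column, level_name and message attributes; collect_schema_errors is a
hypothetical helper name used only for illustration, not part of lxml.

```python
from lxml import etree


def collect_schema_errors(xml_bytes, xsd_bytes):
    """Return one formatted string per entry in the schema's error log.

    Hypothetical helper for illustration; not part of lxml.
    """
    schema = etree.XMLSchema(etree.fromstring(xsd_bytes))
    doc = etree.fromstring(xml_bytes)
    if schema.validate(doc):
        return []
    # Iterate over the log; each entry describes one reported problem.
    return [
        f"line {entry.line}, column {entry.column}: "
        f"[{entry.level_name}] {entry.message}"
        for entry in schema.error_log
    ]
```

Whether the log holds one entry or several depends on what the underlying
validator reports for a given document; the sketch only shows how to read
whatever entries are there.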

Also, it looks like I cannot attach files here, so the example XML and XSD I
used are included at the end of this message.

For the kind souls out there: any hint, suggestion, or example would be much
appreciated!

Thank you so much in advance!!!
Claudio P.
-------------------------------------------------
--- XML & XSD follow:
---Schema---

<?xml version="1.0" encoding="utf-8" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="BusinessCard">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Name" type="xsd:string"/>
                <xsd:element name="phone" maxOccurs="unbounded">
                    <xsd:complexType mixed="true">
                        <xsd:attribute name="type" use="required">
                            <xsd:simpleType>
                                <xsd:restriction base="xsd:string">
                                    <xsd:enumeration value="mobile"/>
                                    <xsd:enumeration value="fax"/>
                                    <xsd:enumeration value="work"/>
                                    <xsd:enumeration value="home"/>
                                </xsd:restriction>
                            </xsd:simpleType>
                        </xsd:attribute>
                    </xsd:complexType>
                </xsd:element>
                <xsd:element name="email" type="xsd:string" minOccurs="0" />
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
----------------------------------------------------------
---xml---
<?xml version="1.0"?>
<BusinessCard>
        <phone type="mobil">(415) 555-4567</phone>
        <phone type="work">(800) 555-9876</phone>
        <phone type="fax">(510) 555-1234</phone>
        <mail>j...@joe.com</mail>
</BusinessCard>
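
Since files cannot be attached, a self-contained reproduction may be easier to
run than the two separate files. The following is a minimal sketch, assuming
the schema and document above are pasted into byte strings; run_example is a
hypothetical name, and the schema namespace is declared as
http://www.w3.org/2001/XMLSchema.

```python
from lxml import etree

XSD = b"""<?xml version="1.0" encoding="utf-8" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="BusinessCard">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Name" type="xsd:string"/>
                <xsd:element name="phone" maxOccurs="unbounded">
                    <xsd:complexType mixed="true">
                        <xsd:attribute name="type" use="required">
                            <xsd:simpleType>
                                <xsd:restriction base="xsd:string">
                                    <xsd:enumeration value="mobile"/>
                                    <xsd:enumeration value="fax"/>
                                    <xsd:enumeration value="work"/>
                                    <xsd:enumeration value="home"/>
                                </xsd:restriction>
                            </xsd:simpleType>
                        </xsd:attribute>
                    </xsd:complexType>
                </xsd:element>
                <xsd:element name="email" type="xsd:string" minOccurs="0" />
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
"""

XML = b"""<?xml version="1.0"?>
<BusinessCard>
        <phone type="mobil">(415) 555-4567</phone>
        <phone type="work">(800) 555-9876</phone>
        <phone type="fax">(510) 555-1234</phone>
        <mail>j...@joe.com</mail>
</BusinessCard>
"""


def run_example():
    """Validate the embedded document; return (is_valid, list of messages)."""
    schema = etree.XMLSchema(etree.fromstring(XSD))
    doc = etree.fromstring(XML)
    is_valid = schema.validate(doc)
    # Copy the messages out of the log so they can be inspected afterwards.
    messages = [entry.message for entry in schema.error_log]
    return is_valid, messages


if __name__ == "__main__":
    valid, messages = run_example()
    print("valid:", valid)
    for number, message in enumerate(messages, start=1):
        print(number, message)
```

Running the script prints the validation result followed by one numbered line
per log entry, which makes it easy to see how many errors were actually
reported for this document.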