Hi Henrik & Alex,

Thanks for your suggestions, but I think they work for you only
because your XML data satisfies some particular assumptions:

Henrik assumes that the ?xml declaration is present and is at the
beginning of the file.

>> (setq Lst (in "atom.xml" (and (xml?) (xml))))

Alex also assumes that the first line is junk or a comment.  However,
the whole XML document can be on a single line, e.g.

<!-- comment --><hi>123</hi>

or

<?xml version="1.0" encoding="UTF-8"?><!-- comment --><hi>123</hi>

> there might still be a problem due to the 'comment' line. Is it
> needed (or even legal)? If so, you could skip that line:

>    (setq Lst (in "file" (line) (and (xml?) (xml))))

It is legal to have comments before the root element (case b.xml).

However, it is not legal to have a comment before the declaration, as
I have just found out (see case e.xml below).

I think that it should be possible to parse all these cases (files
a.xml to d.xml) without knowing upfront which one will "arrive".  I
guess that the xml function does not handle comments properly;
otherwise it would at least be possible to write:

(or (in F (and (xml?) (xml)))
    (in F (xml)))

The files a.xml to d.xml:

==> /tmp/a.xml <==

==> /tmp/b.xml <==
<!-- comment -->

==> /tmp/c.xml <==
<?xml version="1.0" encoding="UTF-8"?>

==> /tmp/d.xml <==
<?xml version="1.0" encoding="UTF-8"?>
<!-- comment -->

The file e.xml is actually not well-formed:

==> /tmp/e.xml <==
<!-- comment -->
<?xml version="1.0" encoding="UTF-8"?>

XML Parsing Error: XML or text declaration not at start of entity
Location: file:///tmp/e.xml
Line Number 2, Column 1:
<?xml version="1.0" encoding="UTF-8"?>
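
For what it's worth, the five prolog variants can be cross-checked
outside PicoLisp with any conforming parser.  Below is a minimal sketch
in Python using the standard xml.etree.ElementTree module (built on
expat).  It is only an illustration, not a suggestion for how the xml
function should be implemented.  The root element <hi>123</hi> is an
assumption borrowed from the one-line examples, since the listings
above show no root element; the names CASES and try_parse are made up
for this sketch.

```python
import xml.etree.ElementTree as ET

DECL = b'<?xml version="1.0" encoding="UTF-8"?>'
COMMENT = b"<!-- comment -->"
ROOT = b"<hi>123</hi>"  # assumed root element, taken from the one-line examples

# Prolog variants corresponding to a.xml .. e.xml
CASES = {
    "a": ROOT,
    "b": COMMENT + b"\n" + ROOT,
    "c": DECL + b"\n" + ROOT,
    "d": DECL + b"\n" + COMMENT + b"\n" + ROOT,
    "e": COMMENT + b"\n" + DECL + b"\n" + ROOT,  # comment before declaration
}


def try_parse(data):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


results = {name: try_parse(data) for name, data in CASES.items()}
for name, text in results.items():
    print(name, "ok" if text is not None else "rejected")
```

With this parser, cases a to d go through the same call without any
knowledge of which prolog is present, and only case e is rejected.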


