You can preprocess your documents with Andrew Welch’s LexEv parser: http://andrewjwelch.com/lexev/
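
If you only need the names and counts, a rough sketch of such a preprocessing step in Python might look like the following. This is not LexEv and not part of BaseX; the function names read_doctype and count_doctypes are made up for illustration. It assumes the standard library's expat bindings, whose StartDoctypeDeclHandler reports the DOCTYPE name, system ID and public ID while the prolog is being parsed.

```python
from collections import Counter
from xml.parsers import expat


def read_doctype(xml_bytes):
    """Return (name, public_id, system_id) of the DOCTYPE, or None if absent.

    Hypothetical helper for illustration only.
    """
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # expat passes the system ID before the public ID
        found.append((name, public_id, system_id))

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(xml_bytes, True)
    except expat.ExpatError:
        # The body may be malformed or truncated; the DOCTYPE, if any,
        # has usually been reported before the error occurs.
        pass
    return found[0] if found else None


def count_doctypes(documents):
    """Count DOCTYPE declarations over an iterable of documents (as bytes).

    Documents without a DOCTYPE are counted under the key None.
    """
    return Counter(read_doctype(doc) for doc in documents)
```

Each resulting tuple could then be written into its document as an attribute or a wrapper element before import, so that the information becomes queryable in the database.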

On 28.03.2014 12:25, Christian Grün wrote:
Hi Constantine,

unfortunately no, because this information is already consumed by the
XML parser (i.e., we don’t get to see it at all when the database is
being built).

Suggestions from other users with similar problems are welcome.
Christian


Hi all,

I would really like to be able to query a large corpus of documents to get
names and counts of the DTDs which are declared in the (somewhat
old-fashioned now) DOCTYPE declaration:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE converted-article PUBLIC "-//ES//DTD journal article DTD version 4.5.2//EN//XML" "art452.dtd" [
]>
<converted-article> <!-- etc -->

Is there any way to get BaseX to preserve this information? Can I rewrite
the doctype declaration into some sort of element node as the DB is being
created so that this info can be queried?

Thanks for any tips,
Constantine.

_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
