Hi Vincent,

Can you show us your catalog?

Since you mention that it chokes on finding the DTD, it might be that you need rewriteSystem instead of rewriteURI for the DTD locations.

Also, if you don't resolve by public ID and refer to the DTD by a relative path, that path will be made absolute before it is catalog-resolved, so instead of <system> you can use <systemSuffix> to match only the tail of the path.
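
To illustrate the two catalog entries mentioned above, here is a minimal sketch that writes a sample catalog file. The file name catalog-example.xml, the example.org prefix, and the example.dtd name are placeholders, not taken from your setup; relative uri and rewritePrefix values are resolved against the location of the catalog file itself.

```shell
#!/bin/sh
# Sketch only: all names below (directory, DTD, URL prefix) are hypothetical.
CATALOG_DIR="${CATALOG_DIR:-schemas}"
mkdir -p "$CATALOG_DIR"

cat > "$CATALOG_DIR/catalog-example.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Rewrite every system identifier that starts with this prefix
       so that it points into a local directory instead. -->
  <rewriteSystem systemIdStartString="http://example.org/dtd/"
                 rewritePrefix="dtd/"/>
  <!-- Match only the tail of a system identifier; useful when a
       relative path has been made absolute before resolution. -->
  <systemSuffix systemIdSuffix="/example.dtd"
                uri="dtd/example.dtd"/>
</catalog>
EOF
```

<systemSuffix> comes from XML Catalogs 1.1, so whether it is honored depends on the resolver version in use.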

On the other hand, since you say it’s running with standalone Saxon, the same DTD resolution issues should be expected there.

It may well be the case that our efforts to make DTD resolution available to xslt:transform() only focused on supporting xsl:import and xsl:include while not passing the resolver to the doc() function.

Maybe Liam can investigate this in more depth. If so, I suggest asking for a budget at Taylor & Francis. We paid Liam to explore and enable the use of catalogs for xsl:import and xsl:include, and he dug through the mess of different interfaces etc. successfully for this limited but important use case. So he is *the* expert in this field and I’d like to warmly recommend paying him so that he can explore and fix the issue.

Gerrit

On 10.07.2020 07:55, Lizzi, Vincent wrote:
Hi Liam,

Thanks for the helpful suggestions. After trying everything you suggested and then also trying a few of Saxon’s configuration options, unfortunately I’m still having the same problem. I am testing with a shell script that contains the following:

MAIN="$( cd -P "$(dirname "$FILE")/../basex" && pwd )"

CP=$MAIN/BaseX.jar:$MAIN/lib/custom/*:$MAIN/lib/*:$CLASSPATH

echo 1 Saxon

java -cp "$CP" net.sf.saxon.Transform -s:input1.xml -xsl:transform.xsl -catalog:schemas/catalog.xml

echo 2 BaseX transform

java -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) (# db:intparse false #) (# db:dtd true #) (# db:chop false #) { xslt:transform('input1.xml', 'transform.xsl') }"

echo 3 BaseX transform with Saxon features configured

java -Dhttp://saxon.sf.net/feature/entityResolverClass=org.apache.xml.resolver.tools.CatalogResolver -Dhttp://saxon.sf.net/feature/uriResolverClass=org.apache.xml.resolver.tools.CatalogResolver -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) (# db:intparse false #) (# db:dtd true #) (# db:chop false #) { xslt:transform('input1.xml', 'transform.xsl') }"

echo 4 BaseX doc to show XML Catalog is configured correctly to parse XML

java -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) (# db:intparse false #) (# db:dtd true #) (# db:chop false #) { doc('input1.xml') }"

The classpath includes BaseX 9.3.3, Saxon HE 9.9, xml-resolver-1.2.jar, and CatalogManager.properties

 1. The transformation works in Saxon and uses the catalog file to
    locate the DTD when parsing the XML input1.xml.
 2. The BaseX xslt:transform should work the same as #1, but fails
    because the DTD cannot be read
 3. Adding Saxon configuration for Entity Resolver Class and URI Resolver
    Class did not help
 4. Simply parsing the XML using doc() in BaseX with the same
    configuration shows that the XML catalog is configured correctly
    within BaseX

Using strace -f, the log shows that BaseX xslt:transform is reading the catalog.xml file from disk, and then is trying (and failing) to read the DTD from the non-working URI.

This might be a bug in xslt:transform, so the workaround of using a regular expression replace on the DOCTYPE system URI is probably the practical solution.
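
A rough sketch of that workaround, using sed to point the DOCTYPE at a local copy of the DTD before the transformation runs. It assumes the DOCTYPE declaration sits on a single line and uses double quotes; LOCAL_DTD and the output file name are placeholders.

```shell
#!/bin/sh
# Sketch only: LOCAL_DTD is a hypothetical path to a local copy of the DTD.
# It must not contain '#' or '&', which are special in the sed expression.
LOCAL_DTD="schemas/dtd/example.dtd"

# Read XML on stdin and write it to stdout with the system identifier
# of the DOCTYPE replaced by $LOCAL_DTD. Handles both
#   <!DOCTYPE x SYSTEM "...">  and  <!DOCTYPE x PUBLIC "..." "...">
# as long as the declaration is on one line and double-quoted.
rewrite_doctype() {
  sed -E "/<!DOCTYPE/ s#(SYSTEM|PUBLIC[[:space:]]+\"[^\"]*\")[[:space:]]+\"[^\"]*\"#\\1 \"$LOCAL_DTD\"#"
}

# Usage: rewrite_doctype < input1.xml > input1.local.xml
```

The rewritten copy can then be passed to xslt:transform in place of the original input.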

Many thanks,

Vincent

_____________________________________________

*Vincent M. Lizzi*

Head of Information Standards | Taylor & Francis Group

vincent.li...@taylorandfrancis.com <mailto:vincent.li...@taylorandfrancis.com>

Information Classification: General

*From:* Liam R. E. Quin <l...@fromoldbooks.org>
*Sent:* Thursday, July 9, 2020 12:55 PM
*To:* Lizzi, Vincent <vincent.li...@taylorandfrancis.com>; BaseX <basex-talk@mailman.uni-konstanz.de>
*Subject:* Re: [basex-talk] xslt:transform function not working with XML Catalog

On Thu, 2020-07-09 at 04:32 +0000, Lizzi, Vincent wrote:
 > Hi Liam,
 >
 > Thanks for the reply and suggestions. Based on your suggestion I
 > tried pragmas and strace, and had another go at
 > CatalogManager.properties, but they've not had any effect.

use, strace -f java.... >& hugelogfile.txt
and after, grep -i catalogmanager.properties hugelogfile.txt
and you should see where it's looking. If it doesn't look for that
file, check to see if it opened the jar file containing the resolver.
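
A small filter along these lines might help narrow the log down to the file opens that matter here. It is only a sketch: the function name is made up, and the patterns assume the usual strace output format for open/openat calls.

```shell
#!/bin/sh
# Sketch only: print the open()/openat() lines of a strace log that
# mention catalog files, DTDs, or the resolver jar.
# $1 is the path to the log file, e.g. hugelogfile.txt.
filter_catalog_opens() {
  grep -E 'open(at)?\(' "$1" | grep -iE 'catalog|\.dtd|resolver'
}

# Usage: filter_catalog_opens hugelogfile.txt
```

Lines ending in ENOENT show paths that were looked for but not found.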

If you're running BaseX from Oxygen, Oxygen needs to have it in its
classpath too i think.

Also, of course, see if the catalog file is actually being opened!

I actually wrote some of the code in BaseX that makes XML catalogs work
with transform(), or provided a rough draft that Christian improved :),
and debugging it was... interesting.

I'd also try an absolute path for the catalog file - if you are using
the BaseX server, relative paths will be relative to the directory
(folder) where the server itself is running. (and of course the server
needs the resolver in its classpath).
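
One way to get an absolute path in a shell script, as a sketch; the helper name abs_path is made up, and it assumes the directory containing the catalog already exists.

```shell
#!/bin/sh
# Sketch only: turn a relative file path into an absolute one by
# resolving its directory with cd/pwd and re-attaching the file name.
abs_path() {
  ( cd "$(dirname "$1")" && printf '%s/%s\n' "$(pwd -P)" "$(basename "$1")" )
}

# Usage: CATALOG="$(abs_path schemas/catalog.xml)"
# and then use "$CATALOG" wherever the catalog file is named, for
# example in the db:catfile pragma or in -Dxml.catalog.files=...
```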

Messages from the catalog manager seem to go (oddly) to standard
output interleaved with any XML output.

The command-line i used for testing this (well, one of the tests) was,

R=$HOME/lib/xmlcatalog/xml-commons-resolver-1.2/resolver.jar
MAIN=$HOME/packages/basex/basex

java -Dxml.catalog.files=saxlog.xml \
  -D'http://saxon.sf.net/feature/uriResolverClass=org.apache.xml.resolver.tools.CatalogResolver' \
  -cp $R/resolver.jar:/home/lee/packages/basex/basex/BaseX.jar:$MAIN/lib/custom/*:$MAIN/lib/*: \
  org.basex.BaseX try.xq

(Saxon was in $MAIN)

--
Liam Quin, https://www.delightfulcomputing.com/ <https://www.delightfulcomputing.com>
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsi...@le-tex.de, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt
