Re: [basex-talk] Serialization mismatch using Saxon for xslt:transform

2019-08-08 Thread Omar Siam

Hi Steve!


I take from these results that the output from Saxon xslt:transform is 
serialized according to the stylesheet, and then parsed again by BaseX 
as xml on the way to being serialized again on output from the 
function, and the error is coming from that implicit parse.


The communication between BaseX (and others) and Saxon is:

* BaseX passes XML to Saxon (it either serializes it or passes some object
that Saxon uses to read the document into its own internal representation).

* Saxon generates output according to what you configure using <xsl:output>.

* BaseX has to read and interpret that output to process it any further.

The last step means that Saxon has to generate something that BaseX can 
consume, and that is XML most of the time. The html method deliberately 
uses some constructs that are not well-formed XML and uses entities that 
are always defined in HTML, to be more compatible with some now outdated 
browsers.


In the end, because you run Saxon to transform your XML and then 
(possibly) process the result with XQuery again to generate output, BaseX 
is the tool that has to be told what the output should look like. Saxon 
has to be used in a way that BaseX understands.


So most of the time you will end up having to tell Saxon to produce XML 
or XHTML. You may be able to do this using two XSL stylesheets that 
contain only an <xsl:output> declaration and import the actual 
stylesheet.
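
The wrapper idea could also be sketched inline in XQuery instead of keeping 
separate wrapper files. This is an untested sketch: 'your.xml' is a 
placeholder input, the stylesheet path is taken from the original post, and 
it assumes that the absolute URI produced by resolve-uri() is one Saxon can 
import.

```xquery
(: Sketch only. 'your.xml' is a placeholder; the stylesheet path is assumed
   to be relative to the static base URI of this module. :)

(: Absolute URI of the real stylesheet, so the import does not depend on
   how the base URI of an inline stylesheet node is resolved. :)
declare variable $real := resolve-uri('static/xsl/html-module.xsl');

(: Wrapper stylesheet: xsl:import has to come first, and the xsl:output of
   the importing stylesheet takes precedence over the imported one. :)
declare variable $wrapper :=
  <xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:import href="{ $real }"/>
    <xsl:output method="xhtml"/>
  </xsl:stylesheet>;

xslt:transform(doc('your.xml'), $wrapper)
```

The original stylesheet stays untouched, so it can still be used with its 
html output method outside of BaseX.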



I tried xslt:transform-text() but it escapes all of the element tags.

I didn't try that, but as I understand it xslt:transform-text() should 
give you unparsed text that BaseX can't process any further. If BaseX 
outputs it as is (probably using the text method, from BaseX's 
standpoint), that should be the short circuit you are looking for.


Best regards

Omar




Re: [basex-talk] Serialization mismatch using Saxon for xslt:transform

2019-08-08 Thread Christian Grün
Hi Steve,

If you want Saxon to do the HTML serialization, you could proceed as follows:

  declare
    %rest:path('test/xdoc')
    %output:media-type("text/html")
  function test:htmldoc() {
    xslt:transform-text('your.doc', 'your.xsl')
  };

I used xslt:transform-text to retrieve the result as string (because
it won’t be valid XML anymore due to the HTML representation), and I
specified text/html as media-type (this way, your output won’t be
serialized as HTML again).

I couldn’t try your example, so you may need to tweak it a little further.

Best,
Christian


[basex-talk] Serialization mismatch using Saxon for xslt:transform

2019-08-07 Thread Majewski, Steven Dennis (sdm7g)

Continuing with my test.xqm experiments… 

I have copied Saxon jar into WEB-INF/lib/  so xslt:transform is using Saxon. 

I have the following functions declared:

  declare
    %rest:path("test/xqdoc.xml")
    %output:method('xml')
  function test:doc() {
    inspect:xqdoc( $test:module_name )
  };

  declare
    %rest:path('test/xdoc')
    %output:method('html')   (: I’ve tried various values for method and html-version here ! :)
    %output:html-version('5.0')
  function test:htmldoc() {
    (:  WTF? :)
    xslt:transform( test:doc(), 'static/xsl/html-module.xsl', map { 'source' : 'test.xqm' })
  };


Where html-module.xsl initially was:  
https://github.com/xquery/xquerydoc/blob/master/src/lib/html-module.xsl 
 


Accessing /basex/test/xdoc  returns error message: 

Stopped at /usr/local/tomcat/webapps/basex/test.xqm, 41/20:
[FODC0002] "" (Line 69): The entity "nbsp" was referenced, but not declared.

Which was initially very puzzling, as I could not find nbsp in either my 
test.xqm, html-module.xsl, or inspect:xqdoc() output. 
Turns out, even though it was encoded as a numeric character reference in 
html-module.xsl, Saxon was encoding it as “&nbsp;” on output of the transform. 

Initially, I wasn’t sure if Saxon or BaseX was doing the serialization. I guess 
it seems to be  both. 

The original output method for that stylesheet is html.


And changing output:method on the function to xhtml or html doesn’t have any 
effect on results.
Changing the output method on the stylesheet to xhtml causes Saxon to serialize 
the non-breaking space in a form that is well-formed XML instead of as “&nbsp;”, 
and the BaseX RESTXQ function then doesn’t seem to give any errors with any 
output method.

I take from these results that the output from Saxon xslt:transform is 
serialized according to the stylesheet, and then parsed again by BaseX as xml 
on the way to being serialized again on output from the function, and the error 
is coming from that implicit parse. 
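
The implicit parse can be imitated in isolation. The following is only a 
sketch with a made-up fragment: it feeds a string containing the HTML entity 
to fn:parse-xml(), which should fail because nbsp is not declared in XML, 
while the numeric character reference parses without problems.

```xquery
(: Sketch only: the <td> fragments are invented for illustration.
   In an XQuery string literal, &amp; stands for a literal ampersand. :)
let $html-style := '<td>&amp;nbsp;</td>'   (: the string <td>&nbsp;</td> :)
let $xml-style  := '<td>&amp;#160;</td>'   (: the string <td>&#160;</td> :)
return (
  (: numeric character reference: well-formed, parses fine :)
  parse-xml($xml-style),
  (: undeclared entity: not well-formed, so the error is caught and reported :)
  try {
    parse-xml($html-style)
  } catch * {
    'parse failed: ' || $err:description
  }
)
```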

I don’t suppose there is a way to short circuit the parsing of the 
xslt:transform function output and just output the results directly (?)
I tried xslt:transform-text() but it escapes all of the element tags. 
Or a way to get it to parse the results of that transform function as html ? 

Clearly changing the stylesheet is the easy solution for this test file, but 
for some of my other real cases, I may be trying to repurpose stylesheets that 
will be used in different contexts: within BaseX and outside of that context, 
so keeping different versions of the stylesheets is going to be annoying. 
I suppose I could read in the stylesheets and modify the xsl:output line in 
XQuery or XSLT before using them within BaseX.  
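
A minimal sketch of that last idea, using the copy/modify expression of XQuery 
Update. local:as-xhtml is a hypothetical helper name and 'your.xml' a 
placeholder input, and this has not been tested against html-module.xsl. One 
open assumption: relative xsl:import or xsl:include references inside the 
stylesheet may no longer resolve once it is passed as a node instead of a path.

```xquery
declare namespace xsl = "http://www.w3.org/1999/XSL/Transform";

(: Hypothetical helper: returns a copy of the stylesheet whose
   xsl:output/@method is set to 'xhtml'. The file itself is not changed. :)
declare function local:as-xhtml($xsl as document-node()) as document-node() {
  copy $c := $xsl
  modify (
    for $m in $c/*/xsl:output/@method
    return replace value of node $m with 'xhtml'
  )
  return $c
};

(: 'your.xml' is a placeholder for the document to transform. :)
xslt:transform(
  doc('your.xml'),
  local:as-xhtml(doc('static/xsl/html-module.xsl'))
)
```

If the stylesheet has no method attribute on xsl:output, the copy is returned 
unchanged, so this sketch would need an insert step for that case.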


— Steve M. 





