Re: [xslt] Use character entities to represent non-ASCII characters

2022-08-22 Thread Nick Wellnhofer via xslt

On 22/08/2022 17:59, Paul Kinnucan wrote:
> Is there a way with libxslt to override the output encoding specified by the
> stylesheet?


It should work to poke into the xsltStylesheet struct after parsing:

xmlFree(style->encoding);
style->encoding = xmlStrdup((xmlChar *) "HTML");

Nick

___
xslt mailing list, project page http://xmlsoft.org/XSLT/
xslt@gnome.org
https://mail.gnome.org/mailman/listinfo/xslt


Re: [xslt] Use character entities to represent non-ASCII characters

2022-08-22 Thread Paul Kinnucan via xslt
Hi Nick,

You are right. I didn’t notice that our output element specified HTML as the 
output method but UTF-8 as the encoding. Changing the encoding to HTML fixes 
the problem.

So libxslt is doing exactly what it is supposed to do.

Now I have to figure out why JAXP is ignoring the encoding spec. Perhaps the 
app is somehow overriding the encoding specified in the stylesheet.

Is there a way with libxslt to override the output encoding specified by the 
stylesheet?

Regards,

Paul

From: Nick Wellnhofer 
Sent: Monday, August 22, 2022 11:32 AM
To: Paul Kinnucan; The Gnome XSLT library mailing-list
Subject: Re: [xslt] Use character entities to represent non-ASCII characters

On 22/08/2022 16:46, Paul Kinnucan wrote:
> Specifying HTML as the output type does not cause libxslt to generate ASCII 
> with character entities for non-ASCII characters.

It works for me:

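$ cat t.xsl
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="HTML"/>
  <xsl:template match="/">
    <p>£</p>
  </xsl:template>
</xsl:stylesheet>
$ xsltproc t.xsl t.xsl
<p>&pound;</p>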

Note that the output *encoding* must be set to "HTML".

Nick


Re: [xslt] Use character entities to represent non-ASCII characters

2022-08-22 Thread Nick Wellnhofer via xslt

On 22/08/2022 16:46, Paul Kinnucan wrote:

> Specifying HTML as the output type does not cause libxslt to generate ASCII
> with character entities for non-ASCII characters.


It works for me:

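$ cat t.xsl
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="HTML"/>
  <xsl:template match="/">
    <p>£</p>
  </xsl:template>
</xsl:stylesheet>
$ xsltproc t.xsl t.xsl
<p>&pound;</p>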

Note that the output *encoding* must be set to "HTML".
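
To make the effect of the "HTML" encoding concrete, here is a small standalone C sketch of the idea: a character that has a named HTML entity is written as that entity, and anything else falls back to a numeric reference. This is only an illustration. The function name format_char_ref and the five-entry table are made up for this example and are not part of libxml2 or libxslt, whose real entity table is much larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A tiny sample of HTML 4 named entities (hypothetical table for this sketch). */
struct entity {
    unsigned long cp;   /* Unicode code point */
    const char *name;   /* entity name without '&' and ';' */
};

static const struct entity sample_entities[] = {
    { 0x00A0, "nbsp" },
    { 0x00A3, "pound" },
    { 0x00A9, "copy" },
    { 0x00E9, "eacute" },
    { 0x20AC, "euro" },
};

/*
 * Hypothetical helper: write a character reference for `cp` into `out`.
 * Uses the named entity when the sample table has one, otherwise a
 * decimal numeric reference. Returns the number of characters written
 * (excluding the terminating NUL), or -1 if `out` is too small.
 */
static int format_char_ref(unsigned long cp, char *out, size_t outsize)
{
    const size_t count = sizeof sample_entities / sizeof sample_entities[0];
    size_t i;
    int n;

    assert(out != NULL);

    for (i = 0; i < count; i++) {
        if (sample_entities[i].cp == cp)
            break;
    }

    if (i < count)
        n = snprintf(out, outsize, "&%s;", sample_entities[i].name);
    else
        n = snprintf(out, outsize, "&#%lu;", cp);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t) n >= outsize)
        return -1;
    return n;
}
```

Calling format_char_ref(0xA3, buf, sizeof buf) fills buf with "&pound;", while a code point missing from the table, such as 0x2603, comes out as "&#9731;".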

Nick



Re: [xslt] Use character entities to represent non-ASCII characters

2022-08-22 Thread Paul Kinnucan via xslt
Hi Nick,

Thanks for the quick response.

Specifying HTML as the output type does not cause libxslt to generate ASCII 
with character entities for non-ASCII characters.

I am porting an existing XML-based app from JAXP to libxslt. The app's existing 
tests expect character entities because that is what JAXP produces for HTML 
output. I was hoping to avoid updating the tests for libxslt.

Regards,

Paul

-Original Message-
From: Nick Wellnhofer  
Sent: Sunday, August 21, 2022 9:08 AM
To: The Gnome XSLT library mailing-list 
Cc: Paul Kinnucan 
Subject: Re: [xslt] Use character entities to represent non-ASCII characters

On 19/08/2022 19:41, Paul Kinnucan via xslt wrote:
> I am trying to use libxslt to transform an XML file that contains 
> non-ASCII characters to an HTML file. Other xslt processors, such as 
> JAXP and Xalan, replace non-ASCII characters with their character 
> entity equivalents, e.g., £
> -> &pound;. However, libxslt simply outputs the UTF-8 rendition of the
> non-ASCII character.
> 
> Is there a way to get libxslt to output the equivalent character entity 
> instead?

If the output encoding is UTF-8, there's no reason not to output non-ASCII 
characters as UTF-8 (unless you're talking about non-ASCII characters in URI 
attribute values). Setting the output encoding to "HTML" should do what you 
want:

    <xsl:output method="html" encoding="HTML"/>

This is non-standard, though. You can also set the output encoding to "ASCII", 
but this will produce numeric character references like "&#163;".

Nick



Re: [xslt] Use character entities to represent non-ASCII characters

2022-08-21 Thread Nick Wellnhofer via xslt

On 19/08/2022 19:41, Paul Kinnucan via xslt wrote:
> I am trying to use libxslt to transform an XML file that contains non-ASCII
> characters to an HTML file. Other xslt processors, such as JAXP and Xalan,
> replace non-ASCII characters with their character entity equivalents, e.g., £
> -> &pound;. However, libxslt simply outputs the UTF-8 rendition of the
> non-ASCII character.
>
> Is there a way to get libxslt to output the equivalent character entity instead?


If the output encoding is UTF-8, there's no reason not to output non-ASCII 
characters as UTF-8 (unless you're talking about non-ASCII characters in URI 
attribute values). Setting the output encoding to "HTML" should do what you want:

    <xsl:output method="html" encoding="HTML"/>

This is non-standard, though. You can also set the output encoding to "ASCII", 
but this will produce numeric character references like "&#163;".
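
To show what the "ASCII" case amounts to, here is a rough standalone C sketch that copies a UTF-8 string and replaces each non-ASCII character with a decimal numeric character reference. It is not how libxml2 implements its fallback. The function name utf8_to_ascii_ncr is invented for this example, and the sketch neither rejects overlong encodings nor escapes markup characters such as "<" or "&".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy the UTF-8 string `in` to `out`, replacing every
 * non-ASCII code point with a decimal numeric character reference.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 on malformed input or if `out` is too small.
 */
static int utf8_to_ascii_ncr(const char *in, char *out, size_t outsize)
{
    const unsigned char *p = (const unsigned char *) in;
    size_t used = 0;

    assert(in != NULL && out != NULL);
    if (outsize == 0)
        return -1;

    while (*p != 0) {
        unsigned long cp;
        int extra, n;

        /* Decode the lead byte and work out how many bytes follow. */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;               /* invalid lead byte */
        p++;

        /* Fold in the continuation bytes (10xxxxxx). */
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return -1;            /* truncated or broken sequence */
            cp = (cp << 6) | (unsigned long) (*p++ & 0x3F);
        }

        if (cp < 0x80)
            n = snprintf(out + used, outsize - used, "%c", (int) cp);
        else
            n = snprintf(out + used, outsize - used, "&#%lu;", cp);

        /* Treat truncation as failure rather than emitting a partial reference. */
        if (n < 0 || (size_t) n >= outsize - used)
            return -1;
        used += (size_t) n;
    }

    out[used] = '\0';
    return (int) used;
}
```

For the UTF-8 input "£5" (bytes C2 A3 35) this writes "&#163;5", which is the numeric form mentioned above as opposed to the named "&pound;".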


Nick



[xslt] Use character entities to represent non-ASCII characters

2022-08-19 Thread Paul Kinnucan via xslt
Hi,

I am trying to use libxslt to transform an XML file that contains non-ASCII 
characters to an HTML file. Other xslt processors, such as JAXP and Xalan, 
replace non-ASCII characters with their character entity equivalents, e.g., £ 
-> &pound;. However, libxslt simply outputs the UTF-8 rendition of the non-ASCII 
character.

Is there a way to get libxslt to output the equivalent character entity instead?

Paul
