Re: [xslt] Use character entities to represent non-ASCII characters
On 22/08/2022 17:59, Paul Kinnucan wrote:
> Is there a way with libxslt to override the output encoding specified by
> the stylesheet?

It should work to poke into the xsltStylesheet struct after parsing:

    xmlFree(style->encoding);
    style->encoding = xmlStrdup((xmlChar *) "HTML");

Nick
___
xslt mailing list, project page http://xmlsoft.org/XSLT/
xslt@gnome.org
https://mail.gnome.org/mailman/listinfo/xslt
Re: [xslt] Use character entities to represent non-ASCII characters
Hi Nick,

You are right. I didn't notice that our output element specified HTML as the output method but UTF-8 as the encoding. Changing the encoding to HTML fixes the problem. So libxslt is doing exactly what it is supposed to do. Now I have to figure out why JAXP is ignoring the encoding spec. Perhaps the app is somehow overriding the encoding specified in the stylesheet.

Is there a way with libxslt to override the output encoding specified by the stylesheet?

Regards,
Paul

From: Nick Wellnhofer
Sent: Monday, August 22, 2022 11:32 AM
To: Paul Kinnucan; The Gnome XSLT library mailing-list
Subject: Re: [xslt] Use character entities to represent non-ASCII characters

On 22/08/2022 16:46, Paul Kinnucan wrote:
> Specifying HTML as the output type does not cause libxslt to generate ASCII
> with character entities for non-ASCII characters.

It works for me:

    $ cat t.xsl
    http://www.w3.org/1999/XSL/Transform"> £
    $ xsltproc t.xsl t.xsl

Note that the output *encoding* must be set to "HTML".

Nick
Re: [xslt] Use character entities to represent non-ASCII characters
On 22/08/2022 16:46, Paul Kinnucan wrote:
> Specifying HTML as the output type does not cause libxslt to generate ASCII
> with character entities for non-ASCII characters.

It works for me:

    $ cat t.xsl
    http://www.w3.org/1999/XSL/Transform"> £
    $ xsltproc t.xsl t.xsl

Note that the output *encoding* must be set to "HTML".

Nick
Re: [xslt] Use character entities to represent non-ASCII characters
Hi Nick,

Thanks for the quick response. Specifying HTML as the output type does not cause libxslt to generate ASCII with character entities for non-ASCII characters.

I am porting an existing XML-based app from JAXP to libxslt. The app's existing tests expect character entities because that is what JAXP produces for HTML output. I was hoping to avoid updating the tests for libxslt.

Regards,
Paul

-----Original Message-----
From: Nick Wellnhofer
Sent: Sunday, August 21, 2022 9:08 AM
To: The Gnome XSLT library mailing-list
Cc: Paul Kinnucan
Subject: Re: [xslt] Use character entities to represent non-ASCII characters

On 19/08/2022 19:41, Paul Kinnucan via xslt wrote:
> I am trying to use libxslt to transform an XML file that contains
> non-ASCII characters to an HTML file. Other xslt processors, such as
> JAXP and Xalan, replace non-ASCII characters with their character
> entity equivalents, e.g., £ -> &pound;. However, libxslt simply outputs
> the UTF-8 rendition of the non-ASCII character.
>
> Is there a way to get libxslt to output the equivalent character entity
> instead?

If the output encoding is UTF-8, there's no reason not to output non-ASCII characters as UTF-8 (unless you're talking about non-ASCII characters in URI attribute values).

Setting the output encoding to "HTML" (the encoding attribute of the stylesheet's output element) should do what you want. This is non-standard, though. You can also set the output encoding to "ASCII", but this will produce numeric character references like "&#163;".

Nick
Re: [xslt] Use character entities to represent non-ASCII characters
On 19/08/2022 19:41, Paul Kinnucan via xslt wrote:
> I am trying to use libxslt to transform an XML file that contains
> non-ASCII characters to an HTML file. Other xslt processors, such as
> JAXP and Xalan, replace non-ASCII characters with their character
> entity equivalents, e.g., £ -> &pound;. However, libxslt simply outputs
> the UTF-8 rendition of the non-ASCII character.
>
> Is there a way to get libxslt to output the equivalent character entity
> instead?

If the output encoding is UTF-8, there's no reason not to output non-ASCII characters as UTF-8 (unless you're talking about non-ASCII characters in URI attribute values).

Setting the output encoding to "HTML" (the encoding attribute of the stylesheet's output element) should do what you want. This is non-standard, though. You can also set the output encoding to "ASCII", but this will produce numeric character references like "&#163;".

Nick
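
The difference between the two output forms mentioned above (named character entities versus numeric character references) can be illustrated outside libxslt. What follows is a minimal sketch in plain C and is not how libxslt or libxml2 implement it: the function names, the ENTITIES table and its four entries are made up for illustration, and the decoder does not reject overlong UTF-8 sequences. With use_names set, a code point found in the table is written as a named entity; in every other case a non-ASCII code point is written as a decimal numeric character reference.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, deliberately tiny subset of the HTML named entities. */
struct named_entity {
    unsigned long cp;
    const char *name;
};

static const struct named_entity ENTITIES[] = {
    { 0x00A3UL, "pound"  },
    { 0x00A9UL, "copy"   },
    { 0x00E9UL, "eacute" },
    { 0x20ACUL, "euro"   },
};

/* Return the entity name for cp, or NULL if the table has no entry. */
static const char *entity_name(unsigned long cp)
{
    size_t i;
    for (i = 0; i < sizeof ENTITIES / sizeof ENTITIES[0]; i++) {
        if (ENTITIES[i].cp == cp)
            return ENTITIES[i].name;
    }
    return NULL;
}

/* Decode one UTF-8 sequence starting at s (NUL-terminated input).
 * Stores the code point in *cp and returns the number of bytes consumed,
 * or 0 if the sequence is malformed. Overlong forms are not rejected. */
static size_t decode_utf8(const unsigned char *s, unsigned long *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 &&
        (s[2] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x0F) << 12) |
              ((unsigned long)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
        return 3;
    }
    if ((s[0] & 0xF8) == 0xF0 && (s[1] & 0xC0) == 0x80 &&
        (s[2] & 0xC0) == 0x80 && (s[3] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x07) << 18) |
              ((unsigned long)(s[1] & 0x3F) << 12) |
              ((unsigned long)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
        return 4;
    }
    return 0;
}

/* Copy UTF-8 text from src to dst as pure ASCII.
 * use_names != 0: write a named entity when the table has one.
 * In all other cases non-ASCII code points become decimal references.
 * Returns 0 on success, -1 on malformed input or if dst is too small. */
static int escape_non_ascii(const char *src, int use_names,
                            char *dst, size_t dst_size)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t used = 0;

    if (dst_size == 0)
        return -1;
    while (*s) {
        unsigned long cp;
        char buf[16];
        int len;
        size_t n = decode_utf8(s, &cp);

        if (n == 0)
            return -1;
        if (cp < 0x80) {
            buf[0] = (char)cp;
            len = 1;
        } else {
            const char *name = use_names ? entity_name(cp) : NULL;
            if (name != NULL)
                len = snprintf(buf, sizeof buf, "&%s;", name);
            else
                len = snprintf(buf, sizeof buf, "&#%lu;", cp);
            if (len < 0 || (size_t)len >= sizeof buf)
                return -1;
        }
        /* Keep one byte free for the terminating NUL. */
        if (used + (size_t)len + 1 > dst_size)
            return -1;
        memcpy(dst + used, buf, (size_t)len);
        used += (size_t)len;
        s += n;
    }
    dst[used] = '\0';
    return 0;
}
```

Calling escape_non_ascii("\xC2\xA3" "5", 1, out, sizeof out) on the UTF-8 bytes of "£5" leaves &pound;5 in out; the same call with use_names set to 0 gives &#163;5.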
[xslt] Use character entities to represent non-ASCII characters
Hi,

I am trying to use libxslt to transform an XML file that contains non-ASCII characters to an HTML file. Other xslt processors, such as JAXP and Xalan, replace non-ASCII characters with their character entity equivalents, e.g., £ -> &pound;. However, libxslt simply outputs the UTF-8 rendition of the non-ASCII character.

Is there a way to get libxslt to output the equivalent character entity instead?

Paul